Second derivative in Keras

Updated: 2021-12-15 15:21:52

For K.gradients() to work as a layer, you have to wrap it in a Lambda() layer. Otherwise a full Keras layer is not created, and you cannot chain it or train through it. So this code will work (tested):

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

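def grad( y, x ):
    # Wrap K.gradients() in a Lambda so the result is a full Keras layer that can be chained.
    return Lambda( lambda z: K.gradients( z[ 0 ], z[ 1 ] ), output_shape = [1] )( [ y, x ] )

def network( i, d ):
    # Computes log( i + d ); with d = 2 this is f( x ) = log( x + 2 ).
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a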

fixed_input = Input(tensor=tf.constant( [ 1.0 ] ) )
double = Input(tensor=tf.constant( [ 2.0 ] ) )

a = network( fixed_input, double )

b = grad( a, fixed_input )
c = grad( b, fixed_input )
d = grad( c, fixed_input )
e = grad( d, fixed_input )

model = Model( inputs = [ fixed_input, double ], outputs = [ a, b, c, d, e ] )

print( model.predict( x=None, steps = 1 ) )

The network function models f( x ) = log( x + 2 ) at x = 1, and the grad function is where the gradient calculation is done. This code outputs:

[array([1.0986123], dtype=float32), array([0.33333334], dtype=float32), array([-0.11111112], dtype=float32), array([0.07407408], dtype=float32), array([-0.07407409], dtype=float32)]

which are log(3), 1/3, -1/3^2, 2/3^3 and -6/3^4, that is, f and its first four derivatives evaluated at x = 1.
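The closed-form values quoted above can be checked without Keras. The following is a minimal sketch in plain Python; the helper name log_shifted_derivatives and its parameters are made up for illustration and are not part of the original answer. It relies on the standard result that the n-th derivative of log(u) is (-1)^(n-1) (n-1)! / u^n.

```python
import math

def log_shifted_derivatives(x, shift=2.0, order=4):
    """Return [f(x), f'(x), ..., f^(order)(x)] for f(x) = log(x + shift)."""
    u = x + shift
    values = [math.log(u)]
    for n in range(1, order + 1):
        # n-th derivative of log(u): (-1)^(n-1) * (n-1)! / u^n
        values.append((-1) ** (n - 1) * math.factorial(n - 1) / u ** n)
    return values

# Evaluate at x = 1, the same point the Keras model uses.
expected = log_shifted_derivatives(1.0)
print(expected)
```

The five numbers should agree with the model output to within float32 precision.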

For reference, the same code in plain TensorFlow (used for testing):

import tensorflow as tf

a = tf.constant( 1.0 )
a2 = tf.constant( 2.0 )

b = tf.log( a + a2 )
c = tf.gradients( b, a )
d = tf.gradients( c, a )
e = tf.gradients( d, a )
f = tf.gradients( e, a )

with tf.Session() as sess:
    print( sess.run( [ b, c, d, e, f ] ) )

This outputs the same values:

[1.0986123, [0.33333334], [-0.11111112], [0.07407408], [-0.07407409]]

Hessians

tf.hessians() does return the second derivative; it is shorthand for chaining two tf.gradients() calls. The Keras backend doesn't have hessians though, so you do have to chain two K.gradients() calls yourself.

If for some reason none of the above works, you might want to consider numerically approximating the second derivative by taking differences over a small distance ε. This basically triples the network for each input, so besides lacking in accuracy, this solution introduces serious efficiency considerations. Anyway, the code (tested):

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a

fixed_input = Input(tensor=tf.constant( [ 1.0 ], dtype = tf.float64 ) )
double = Input(tensor=tf.constant( [ 2.0 ], dtype = tf.float64 ) )

epsilon = Input( tensor = tf.constant( [ 1e-7 ], dtype = tf.float64 ) )
eps_reciproc = Input( tensor = tf.constant( [ 1e+7 ], dtype = tf.float64 ) )

a0 = network( Subtract()( [ fixed_input, epsilon ] ), double )
a1 = network(               fixed_input,              double )
a2 = network(      Add()( [ fixed_input, epsilon ] ), double )

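# First differences: f( x ) - f( x - ε ) and f( x + ε ) - f( x )
d0 = Subtract()( [ a1, a0 ] )
d1 = Subtract()( [ a2, a1 ] )

# Multiply by 1/ε to get the backward and forward first-derivative estimates
dv0 = Multiply()( [ d0, eps_reciproc ] )
dv1 = Multiply()( [ d1, eps_reciproc ] )

# Difference of the two estimates, again times 1/ε: the second-derivative estimate
dd0 = Multiply()( [ Subtract()( [ dv1, dv0 ] ), eps_reciproc ] )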

model = Model( inputs = [ fixed_input, double, epsilon, eps_reciproc ], outputs = [ a0, dv0, dd0 ] )

print( model.predict( x=None, steps = 1 ) )

Output:

[array([1.09861226]), array([0.33333334]), array([-0.1110223])]

(This only gets to the second derivative.)
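To see where the accuracy loss comes from, the same three-point scheme can be reproduced in plain Python. This is only a sketch under stated assumptions: the function name second_difference and the set of ε values tried are illustrative choices, not part of the original answer. In float64 the truncation error of this scheme shrinks roughly like ε², while the rounding error grows roughly like machine epsilon divided by ε², so a very small ε such as 1e-7 is not necessarily the most accurate choice.

```python
import math

def second_difference(f, x, eps):
    """Three-point estimate of f''(x), built the same way as dv0, dv1 and dd0."""
    dv0 = (f(x) - f(x - eps)) / eps   # backward first-derivative estimate
    dv1 = (f(x + eps) - f(x)) / eps   # forward first-derivative estimate
    return (dv1 - dv0) / eps          # difference of the two, divided by eps

def f(x):
    # Same function as the Keras network: f(x) = log(x + 2)
    return math.log(x + 2.0)

exact = -1.0 / 9.0  # f''(1) = -1 / (1 + 2)^2

# Absolute error of the estimate for a few illustrative step sizes.
errors = {eps: abs(second_difference(f, 1.0, eps) - exact)
          for eps in (1e-2, 1e-4, 1e-7)}
for eps, err in errors.items():
    print(f"eps={eps:g}  abs error={err:.3e}")
```

Comparing the printed errors gives a quick feel for which step size suits a given function before wiring the scheme into a Keras graph.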
