且构网


Manually converting pytorch weights to tf.keras weights for a convolutional layer

Updated: 2023-12-02 11:53:28

In tensorflow, set_weights expects a list in the same form that get_weights returns, so it is mainly meant for round-tripping that output. When setting a single tensor by hand, it is safer to call assign on the variable itself (for example, layer.kernel.assign) to avoid mistakes.

Besides, 'same' padding in tensorflow is a little complicated. For details, see my SO answer. The amount of padding depends on input_shape, kernel_size and strides, and it can be asymmetric. In your example (256x256 input, 7x7 kernel, stride 2), it translates to torch.nn.ZeroPad2d((2,3,2,3)) in pytorch, i.e. 2 pixels on the left and top, 3 on the right and bottom.
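As a rough illustration of where the (2,3,2,3) comes from, the sketch below computes the 'same' padding for one spatial dimension. It assumes a dilation rate of 1, and the helper name same_padding is made up for this example; it is not a tensorflow function.

```python
import math


def same_padding(input_size, kernel_size, stride):
    """Return (pad_before, pad_after) for one spatial dimension.

    Hypothetical helper for illustration; assumes dilation 1.
    """
    # With 'same' padding the output size is ceil(input / stride).
    output_size = math.ceil(input_size / stride)
    # Total padding needed so that the last window still fits.
    total = max((output_size - 1) * stride + kernel_size - input_size, 0)
    pad_before = total // 2
    pad_after = total - pad_before  # any odd pixel goes at the end
    return pad_before, pad_after


# The example in this answer: 256x256 input, 7x7 kernel, stride 2.
print(same_padding(256, 7, 2))  # (2, 3)
```

torch.nn.ZeroPad2d takes its padding as (left, right, top, bottom), so applying the result to both width and height gives (2, 3, 2, 3).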

Example code: from tensorflow to pytorch

import numpy as np
import tensorflow as tf
import torch

np.random.seed(88883)

# Initialize the two layers
torch_layer = torch.nn.Conv2d(
    in_channels=3,
    out_channels=64,
    kernel_size=(7, 7),
    stride=(2, 2),
    bias=False
)
# Explicit asymmetric padding reproduces tensorflow's 'same' padding
torch_model = torch.nn.Sequential(
    torch.nn.ZeroPad2d((2, 3, 2, 3)),
    torch_layer
)

tf_layer = tf.keras.layers.Conv2D(
    filters=64,
    kernel_size=(7, 7),
    strides=(2, 2),
    padding='same',
    use_bias=False
)

# Set the same weights in both layers
# pytorch layout: (out, in, height, width); Keras layout: (height, width, in, out)
torch_weights = np.random.rand(64, 3, 7, 7)
tf_weights = np.transpose(torch_weights, (2, 3, 1, 0))

with torch.no_grad():
  torch_layer.weight = torch.nn.Parameter(torch.Tensor(torch_weights))

# Call the layer once so that Keras creates the kernel before it is assigned
tf_layer(np.zeros((1, 256, 256, 3)))
tf_layer.kernel.assign(tf_weights)

# Prepare inputs (NCHW for pytorch, NHWC for tensorflow) and run inference
torch_inputs = torch.Tensor(np.random.rand(1, 3, 256, 256))
tf_inputs = np.transpose(torch_inputs.numpy(), (0, 2, 3, 1))

with torch.no_grad():
  torch_output = torch_model(torch_inputs)
tf_output = tf_layer(tf_inputs)

np.allclose(tf_output.numpy(), np.transpose(torch_output.numpy(), (0, 2, 3, 1)))  # True
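The (2, 3, 1, 0) transposition used above can be checked in isolation with numpy alone. The following is only a sketch with made-up values and arbitrarily chosen indices; it shows that the same scalar is reachable through either layout and that (3, 2, 0, 1) is the inverse permutation.

```python
import numpy as np

# pytorch Conv2d weight layout: (out_channels, in_channels, height, width)
torch_weights = np.arange(64 * 3 * 7 * 7, dtype=np.float32).reshape(64, 3, 7, 7)

# Keras Conv2D kernel layout: (height, width, in_channels, out_channels)
tf_weights = np.transpose(torch_weights, (2, 3, 1, 0))
print(tf_weights.shape)  # (7, 7, 3, 64)

# The same scalar is reachable through either indexing order.
o, i, h, w = 5, 2, 4, 6  # arbitrary indices, chosen only for illustration
assert tf_weights[h, w, i, o] == torch_weights[o, i, h, w]

# The inverse permutation maps a Keras kernel back to the pytorch layout.
restored = np.transpose(tf_weights, (3, 2, 0, 1))
assert np.array_equal(restored, torch_weights)
```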

From pytorch to tensorflow

torch_layer = torch.nn.Conv2d(
    in_channels=3,
    out_channels=64,
    kernel_size=(7, 7),
    stride=(2, 2),
    padding=(3, 3),
    bias=False
)

tf_layer = tf.keras.layers.Conv2D(
    filters=64,
    kernel_size=(7, 7),
    strides=(2, 2),
    padding='valid',
    use_bias=False
)

# Symmetric zero padding of 3 mirrors padding=(3, 3) in the torch layer
tf_model = tf.keras.Sequential([
    tf.keras.layers.ZeroPadding2D((3, 3)),
    tf_layer
])