How to load a layer from a checkpoint

Updated: 2022-11-30 15:30:42

This is currently not possible with preload_from_files in this way. I currently see these possible options:

  1. We could extend the logic of preload_from_files (and CustomCheckpointLoader) to allow something like that (some generic variable/layer name mapping).

  2. Or you could rename your layer from source_embed_raw to e.g. old_model__target_embed_raw and then use preload_from_files with the prefix option. If you do not want to rename it, you could still add a layer like old_model__target_embed_raw and then use parameter sharing in source_embed_raw.

     If the parameter in the checkpoint is actually called something like output/rec/target_embed_raw/..., you could create a SubnetworkLayer named old_model__output, inside it another SubnetworkLayer named rec, and inside that a layer named target_embed_raw.
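The prefix variant with parameter sharing could look roughly like the config fragment below. This is only a sketch under assumptions: the checkpoint path, the input names ("data", "data:classes") and n_out are placeholders, the option names (filename, prefix, init_for_train, reuse_params) are written as I understand them from the RETURNN documentation, and the checkpoint is assumed to store the parameter under target_embed_raw/... at the top level. Check all of these against your RETURNN version.

```python
# Sketch of a RETURNN config fragment; all concrete values are placeholders.
preload_from_files = {
    "old_model": {  # arbitrary key naming this preload entry
        "filename": "/path/to/old_model/checkpoint",  # placeholder path
        # Layers whose name starts with this prefix are loaded from the file,
        # with the prefix stripped to find the variable name in the checkpoint.
        "prefix": "old_model__",
        "init_for_train": True,  # only use the file to initialize training
    }
}

network = {
    # Extra layer that receives the old target_embed_raw parameters.
    "old_model__target_embed_raw": {
        "class": "linear", "activation": None, "with_bias": False,
        "n_out": 512,  # assumed size; must match the checkpoint
        "from": "data:classes",  # assumed input
    },
    # The layer actually used by the model shares the parameters loaded above.
    "source_embed_raw": {
        "class": "linear", "activation": None, "with_bias": False,
        "n_out": 512,
        "from": "data",
        "reuse_params": "old_model__target_embed_raw",
    },
}
```

With this layout the source_embed_raw layer keeps its name, so the rest of the network definition does not need to change.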

  3. You could write a script that simply loads the existing checkpoint and stores it as a new checkpoint, but with renamed variable names (this is also totally independent of RETURNN).
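A minimal sketch of such a renaming script follows, assuming a name-based TensorFlow checkpoint (as written by tf.compat.v1.train.Saver) and TensorFlow being installed. The function names and the mapping target_embed_raw to source_embed_raw are hypothetical, chosen only to illustrate this question; the renaming logic is kept separate from the TensorFlow I/O so it can be checked on its own.

```python
from typing import Callable, Dict


def rename_vars(values: Dict[str, object], rename: Callable[[str], str]) -> Dict[str, object]:
    """Return a copy of ``values`` with every key passed through ``rename``."""
    renamed = {}
    for old_name, value in values.items():
        new_name = rename(old_name)
        if new_name in renamed:
            raise ValueError("name collision after renaming: %r" % new_name)
        renamed[new_name] = value
    return renamed


def target_to_source(name: str) -> str:
    """Hypothetical mapping for this question; adapt it to your variable names."""
    return name.replace("target_embed_raw", "source_embed_raw")


def convert_checkpoint(src_path: str, dst_path: str) -> None:
    """Load all variables from ``src_path`` and save them renamed to ``dst_path``."""
    import tensorflow as tf  # imported lazily; only needed for the conversion itself

    reader = tf.train.load_checkpoint(src_path)
    values = {name: reader.get_tensor(name) for name in reader.get_variable_to_shape_map()}
    renamed = rename_vars(values, target_to_source)

    # Recreate every variable under its new name in a fresh graph and save it.
    with tf.Graph().as_default(), tf.compat.v1.Session() as session:
        variables = [tf.Variable(value, name=name) for name, value in renamed.items()]
        session.run(tf.compat.v1.variables_initializer(variables))
        tf.compat.v1.train.Saver(variables).save(session, dst_path)
```

Calling convert_checkpoint("/path/to/old/checkpoint", "/path/to/new/checkpoint") (placeholder paths) would then produce a checkpoint that a config with a source_embed_raw layer can load directly.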

  4. LinearLayer (and most other layers) lets you specify exactly how the parameters are initialized (forward_weights_init and bias_init). The parameter initialization is quite flexible. E.g. there is something like load_txt_file_initializer which can be used. Currently there is no such function to load it directly from an existing checkpoint, but we could add that. Or you could simply implement the logic inside your config (it would only be something like 5 lines of code).

  5. Instead of using preload_from_files, you could also use SubnetworkLayer with the load_on_init option, and then apply similar logic as in option 2.