Keras Embedding layer masking: why does input_dim need to be |vocabulary| + 2?

Updated: 2023-12-01 23:24:10

I believe the docs are a bit misleading there. In the normal case you are mapping your n input data indices [0, 1, 2, ..., n-1] to vectors, so your input_dim should equal the number of elements you have:

input_dim = len(vocabulary_indices)

An equivalent (but slightly confusing) way to say this, and the way the docs put it, is

1 + maximum integer index occurring in the input data.

input_dim = max(vocabulary_indices) + 1
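To make the equivalence of the two expressions concrete, here is a minimal sketch in plain Python. The toy vocabulary and the name word_to_index are assumptions made up for illustration; only vocabulary_indices comes from the answer itself. The two formulas agree only because the indices are contiguous and start at 0.

```python
# Hypothetical toy vocabulary; the words and `word_to_index` are illustrative only.
word_to_index = {"the": 0, "cat": 1, "sat": 2, "down": 3}
vocabulary_indices = list(word_to_index.values())  # [0, 1, 2, 3]

# Formulation 1: count the elements.
input_dim_by_count = len(vocabulary_indices)

# Formulation 2 (the docs' wording): 1 + largest index in the data.
input_dim_by_max = max(vocabulary_indices) + 1

# For contiguous 0-based indices both give the same number of embedding rows.
assert input_dim_by_count == input_dim_by_max
print(input_dim_by_count, input_dim_by_max)
```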

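If you enable masking (mask_zero=True), the value 0 is treated differently: it is reserved as the padding/mask value and can no longer stand for a vocabulary entry. You therefore shift your n vocabulary indices up by one, so the inputs range over [0, 1, 2, ..., n-1, n], which is n + 1 distinct values. With vocabulary_indices still denoting the original, unshifted indices, you need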

input_dim = len(vocabulary_indices) + 1

or

input_dim = max(vocabulary_indices) + 2
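The masking case can be sketched the same way. This is a plain-Python illustration under stated assumptions, not the Keras implementation: the toy vocabulary, PAD_INDEX, shifted, and the encode helper are all hypothetical names introduced here. It shifts every vocabulary index up by one, pads with 0, and checks that the required input_dim matches both formulas above.

```python
# Hypothetical toy vocabulary with the original, unshifted 0-based indices.
word_to_index = {"the": 0, "cat": 1, "sat": 2, "down": 3}
vocabulary_indices = list(word_to_index.values())  # [0, 1, 2, 3]

PAD_INDEX = 0  # reserved for padding/masking, so no word may use it

# Shift every word index up by one to free index 0.
shifted = {word: idx + 1 for word, idx in word_to_index.items()}

def encode(words, length):
    """Map words to shifted indices and right-pad with PAD_INDEX (illustrative helper)."""
    ids = [shifted[w] for w in words]
    return ids + [PAD_INDEX] * (length - len(ids))

batch = [
    encode(["the", "cat"], 4),                 # [1, 2, 0, 0]
    encode(["the", "cat", "sat", "down"], 4),  # [1, 2, 3, 4]
]

# The embedding table needs one row per possible input value 0..n.
largest_index = max(max(row) for row in batch)
input_dim = largest_index + 1

assert input_dim == len(vocabulary_indices) + 1
assert input_dim == max(vocabulary_indices) + 2
```

The resulting value is what you would pass as input_dim to the Embedding layer together with mask_zero=True.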

The docs become especially confusing here as they say

(input_dim should equal |vocabulary| + 2)

where I would interpret |x| as the cardinality of a set (equivalent to len(x)), but the authors seem to mean

2 + maximum integer index occurring in the input data.
