且构网


What is the input to each LSTM layer in a stacked LSTM network?

Updated: 2023-01-14 07:47:59

The input_shape argument is only required for the first layer. Each subsequent layer takes the output of the previous layer as its input (so its input_shape argument, if given, is ignored).

The following model

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32))

represents the following architecture, which you can inspect with model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_26 (LSTM)               (None, 5, 64)             17152     
_________________________________________________________________
lstm_27 (LSTM)               (None, 32)                12416     
=================================================================
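As a sanity check on the summary above, the parameter counts follow the standard LSTM formula 4 * ((input_dim + units) * units + units): four gates, each with a kernel over the concatenated input and hidden state plus a bias. A minimal sketch (the helper function name is mine, not a Keras API):

```python
def lstm_params(input_dim, units):
    # Four gates (input, forget, cell, output), each with:
    #   a kernel of shape (input_dim + units, units) and a bias of shape (units,)
    return 4 * ((input_dim + units) * units + units)

print(lstm_params(2, 64))   # first layer, input_dim=2  -> 17152
print(lstm_params(64, 32))  # second layer, input_dim=64 (output of layer 1) -> 12416
```

Note that the second layer's count depends only on the previous layer's output size (64), which is why overriding its input_shape changes nothing.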

Replacing the line

model.add(LSTM(32))

with

model.add(LSTM(32, input_shape=(1000000, 200000)))

will still give you the same architecture (verify with model.summary()), because input_shape is ignored: the layer takes the tensor output of the previous layer as its input.

And if you need a sequence-to-sequence architecture like the one below, you should use the following code:

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32, return_sequences=True))

which should give you the model:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_32 (LSTM)               (None, 5, 64)             17152     
_________________________________________________________________
lstm_33 (LSTM)               (None, 5, 32)             12416     
=================================================================
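The shape rule behind both summaries can be sketched in a few lines: with return_sequences=True a layer emits the hidden state at every timestep, otherwise only the last one. This is a minimal illustration of the rule, not Keras's actual shape-inference code (the function name is mine):

```python
def lstm_output_shape(input_shape, units, return_sequences):
    # input_shape: (batch, timesteps, features)
    batch, timesteps, _ = input_shape
    if return_sequences:
        return (batch, timesteps, units)  # hidden state at every timestep
    return (batch, units)                 # only the final hidden state

shape = (None, 5, 2)
shape = lstm_output_shape(shape, 64, True)   # (None, 5, 64), as in lstm_32
shape = lstm_output_shape(shape, 32, True)   # (None, 5, 32), as in lstm_33
```

Chaining the calls like this mirrors how Sequential feeds each layer's output shape into the next, which is exactly why only the first layer needs input_shape.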