How do I set the input shape for a 1D CNN + LSTM network (Keras)?

Updated: 2023-12-02 08:58:28

The problem is with your input. Its shape is (100, 64), where the first dimension is the number of timesteps. Setting that dimension aside (TimeDistributed applies the wrapped layer to each timestep separately), the input that reaches Conv1D has shape (64,).

Now refer to the Keras Conv1D documentation, which states that the input should be a 3D tensor of shape (batch_size, steps, input_dim). Ignoring batch_size, each sample should therefore be a 2D tensor of shape (steps, input_dim).

So you are providing a 1D tensor where a 2D tensor is expected. For example, if you feed natural-language input to Conv1D as words, with 64 words per sentence and each word encoded as a vector of length 50, then the input should have shape (64, 50).
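
The rank mismatch can be illustrated without Keras. The sketch below is plain Python; `check_conv1d_sample_shape` is a hypothetical helper, not a Keras function, and it only mirrors the documented rule that each sample passed to Conv1D must have shape (steps, input_dim).

```python
def check_conv1d_sample_shape(sample_shape):
    """Return (steps, input_dim) if the per-sample shape suits Conv1D.

    Hypothetical helper: it mirrors the rule that Conv1D expects a
    (batch_size, steps, input_dim) tensor, i.e. a 2D shape per sample.
    """
    if len(sample_shape) != 2:
        raise ValueError(
            "Conv1D needs a 2D sample (steps, input_dim), got rank "
            f"{len(sample_shape)}: {sample_shape}"
        )
    steps, input_dim = sample_shape
    return steps, input_dim


# 64 words, each encoded as a vector of length 50: accepted.
print(check_conv1d_sample_shape((64, 50)))

# A bare (64,) slice, as in the question: rejected.
try:
    check_conv1d_sample_shape((64,))
except ValueError as err:
    print("rejected:", err)
```

The real layer reports the mismatch with its own error message; the point here is only that the per-sample rank must be 2.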

Also, make sure that you feed the right input to the LSTM, as shown in the code below.

So, the correct code should be:

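from keras.layers import (Input, Conv1D, MaxPooling1D, BatchNormalization,
                          TimeDistributed, Reshape, LSTM)

embedding_size = 50  # Set this accordingly
# Per-sample shape: (timesteps, steps, input_dim)
mfcc_input = Input(shape=(100, 64, embedding_size), dtype='float32', name='mfcc_input')
# Conv1D runs on each timestep separately: (100, 64, 50) -> (100, 49, 64)
CNN_out = TimeDistributed(Conv1D(64, 16, activation='relu'))(mfcc_input)
CNN_out = BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True, scale=True)(CNN_out)
# Pool over all 64 - 16 + 1 = 49 positions: (100, 49, 64) -> (100, 1, 64)
CNN_out = TimeDistributed(MaxPooling1D(pool_size=(64-16+1), strides=None, padding='valid'))(CNN_out)

# Feeding CNN_out directly to the LSTM would also raise an error: counting the batch axis,
# the 3rd dimension has size 1, so drop it first: (100, 1, 64) -> (100, 64)
CNN_out = Reshape((int(CNN_out.shape[1]), int(CNN_out.shape[3])))(CNN_out)

LSTM_out = LSTM(64, return_sequences=True)(CNN_out)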

... (more code) ...
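
To see why the pooling window is 64 - 16 + 1 and why the Reshape is needed, it can help to trace the shapes by hand. The sketch below is plain Python rather than Keras; the helper names (`conv1d_valid_length`, `pool1d_valid_length`, `trace_shapes`) are hypothetical, and it assumes 'valid' padding and a convolution stride of 1, as in the code above.

```python
def conv1d_valid_length(steps, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (steps - kernel_size) // stride + 1


def pool1d_valid_length(steps, pool_size, stride=None):
    """Output length of 1D pooling with 'valid' padding.

    As in Keras, a stride of None falls back to pool_size.
    """
    stride = pool_size if stride is None else stride
    return (steps - pool_size) // stride + 1


def trace_shapes(timesteps=100, steps=64, embedding_size=50,
                 filters=64, kernel_size=16, lstm_units=64):
    """Per-sample shapes (batch axis omitted) after each layer."""
    shapes = [("mfcc_input", (timesteps, steps, embedding_size))]
    conv_len = conv1d_valid_length(steps, kernel_size)
    shapes.append(("TimeDistributed(Conv1D)", (timesteps, conv_len, filters)))
    # BatchNormalization leaves the shape unchanged, so it is skipped here.
    pool_len = pool1d_valid_length(conv_len, pool_size=conv_len)
    shapes.append(("TimeDistributed(MaxPooling1D)", (timesteps, pool_len, filters)))
    # The Reshape only works because pooling collapsed the axis to length 1.
    assert pool_len == 1
    shapes.append(("Reshape", (timesteps, filters)))
    shapes.append(("LSTM(return_sequences=True)", (timesteps, lstm_units)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:32s} {shape}")
```

The trace shows the axis of length 1 left behind by the pooling layer, which is the axis the Reshape removes so that the LSTM receives a (timesteps, features) sequence per sample.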