且构网

Keras: How should I prepare input data for an RNN?

Updated: 2023-12-02 08:11:40

If you only want to predict the output from the 5 most recent inputs, you never need to feed the model the full 600 time steps of any training sample. My suggestion is to slice each sample into overlapping 5-step windows and pass the training data in the following manner:

             t=0  t=1  t=2  t=3  t=4  t=5  ...  t=598  t=599
sample0      |---------------------|
sample0           |---------------------|
sample0                |-----------------
...
sample0                                         ----|
sample0                                         ----------|
sample1      |---------------------|
sample1           |---------------------|
sample1                |-----------------
....
....
sample6751                                      ----|
sample6751                                      ----------|

The total number of training sequences comes to

(600 - 4) * 6752 = 4024192    # (total_timesteps - discarded_trailing_timesteps) * nb_samples

Each training sequence consists of 5 time steps. At each time step of every sequence you pass all 13 elements of the feature vector. Consequently, the shape of the training data will be (4024192, 5, 13).
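The arithmetic above can be checked with a small sketch. The helper names `window_bounds` and `windowed_shape` are hypothetical and only illustrate the counting; they are not part of Keras or NumPy.

```python
def window_bounds(total_timesteps, window):
    """Return (start, end) index pairs, end exclusive, for every full window."""
    return [(i, i + window) for i in range(total_timesteps - window + 1)]


def windowed_shape(nb_samples, total_timesteps, nb_features, window):
    """Shape of the training array after slicing every sample into windows."""
    per_sample = total_timesteps - window + 1
    return (nb_samples * per_sample, window, nb_features)


bounds = window_bounds(600, 5)
print(bounds[0], bounds[-1], len(bounds))   # (0, 5) (595, 600) 596
print(windowed_shape(6752, 600, 13, 5))     # (4024192, 5, 13)
```

Each sample yields 596 windows because the last 4 start positions cannot begin a full 5-step window.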

The following loop reshapes the data:

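import numpy as np

data = np.random.rand(6752, 600, 13)  # (nb_samples, total_timesteps, nb_features)
nb_timesteps = 5                      # window length fed to the RNN

windows = []
for sample in range(data.shape[0]):
    # All overlapping windows of this sample: shape (596, 5, 13)
    tmp = np.array([data[sample, i:i + nb_timesteps, :]
                    for i in range(data.shape[1] - nb_timesteps + 1)])
    windows.append(tmp)

# Concatenate once at the end instead of growing the array inside the loop
new_input = np.concatenate(windows)   # shape (4024192, 5, 13)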
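
As an alternative to the explicit loop, the same windows can probably be built without Python-level iteration. This is a minimal sketch, assuming NumPy 1.20 or later (which provides `sliding_window_view`); the function name `make_windows` and the small stand-in array are illustrative and not from the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(data, window):
    """Slice (samples, timesteps, features) into (samples * n_windows, window, features)."""
    # sliding_window_view appends the window axis last:
    # (samples, n_windows, features, window)
    view = sliding_window_view(data, window_shape=window, axis=1)
    # Move the window axis in front of the feature axis, then merge
    # the sample and window-position axes (reshape copies the data here)
    view = np.moveaxis(view, -1, 2)
    return view.reshape(-1, window, data.shape[2])


# Small stand-in for the (6752, 600, 13) array so the example runs quickly
small = np.random.rand(4, 20, 13)
windows = make_windows(small, 5)
print(windows.shape)  # (64, 5, 13)
```

The windows come out in the same order as in the loop: all windows of sample 0 first, then all windows of sample 1, and so on. The final reshape materialises the full array, so the memory footprint for the real data is the same as with the loop.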