
Keras misinterprets training data shape

Updated: 2023-12-02 10:17:16

(Edited, according to OP's comment on this question, where they posted this link: https://github.com/fchollet/keras/issues/1920)

Your X is not a single numpy array; it is an array of arrays. (Otherwise its shape would be X.shape=(35730,513,15).)

The fit method needs a single numpy array. Because your samples have variable length, you cannot hold all your data in one numpy array; you will have to divide it into smaller arrays, each containing samples of the same length.
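To make the difference concrete, here is a small sketch with made-up shapes (the sizes are illustrative, not the OP's data). Samples of differing lengths can only sit in an object array, whose shape reports just the number of samples, while samples of equal shape stack into one regular array.

```python
import numpy as np

# Hypothetical samples: 513 rows each, but a different number of columns.
ragged = [np.zeros((513, 15)), np.zeros((513, 20)), np.zeros((513, 15))]

# An "array of arrays": one object slot per sample, so the shape is only (3,).
x_obj = np.empty(len(ragged), dtype=object)
for i, sample in enumerate(ragged):
    x_obj[i] = sample
print(x_obj.shape)      # (3,)

# Samples that share a shape stack into a single regular array.
same = [s for s in ragged if s.shape == (513, 15)]
x_single = np.stack(same)
print(x_single.shape)   # (2, 513, 15)
```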

For that, you could create a dictionary keyed by shape and loop over the dictionary manually (there may be better ways to do this):

#code in python 3.5
xByShapes = {}
yByShapes = {}
for itemX,itemY in zip(X,Y):
    if itemX.shape in xByShapes:
        xByShapes[itemX.shape].append(itemX)
        yByShapes[itemX.shape].append(itemY)
    else:
        xByShapes[itemX.shape] = [itemX] #initially a list, because we're going to append items
        yByShapes[itemX.shape] = [itemY]

Finally, you loop over this dictionary for training:

for shape in xByShapes:
    model.fit(
              np.asarray(xByShapes[shape]), 
              np.asarray(yByShapes[shape]),...
              )
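For anyone who wants to try the grouping without a model, the following is a self-contained sketch of the same idea. The toy X and Y, the snake_case names and the use of defaultdict are my own assumptions, not part of the answer above; the print call stands in for where model.fit would go.

```python
from collections import defaultdict

import numpy as np

# Toy stand-ins for the X and Y of the question (shapes are made up).
X = [np.ones((4, 3)), np.ones((6, 3)), np.ones((4, 3))]
Y = [0, 1, 1]

# Group samples and labels by the shape of each sample.
x_by_shape = defaultdict(list)
y_by_shape = defaultdict(list)
for item_x, item_y in zip(X, Y):
    x_by_shape[item_x.shape].append(item_x)
    y_by_shape[item_x.shape].append(item_y)

# One regular array per shape; each pair could be passed to model.fit.
batches = {
    shape: (np.asarray(x_by_shape[shape]), np.asarray(y_by_shape[shape]))
    for shape in x_by_shape
}
for shape, (x_arr, y_arr) in batches.items():
    print(shape, x_arr.shape, y_arr.shape)
```

Each group becomes a regular array with a leading sample axis, which is the form fit expects.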


Masking

Alternatively, you can pad your data so all samples have the same length, using zeros or some dummy value.
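A minimal padding sketch in plain NumPy follows. It assumes each sample is laid out as (timesteps, features) with the variable length on the first axis; pad_to_longest is a hypothetical helper name, not a Keras or NumPy function.

```python
import numpy as np

def pad_to_longest(samples, pad_value=0.0):
    """Pad each (timesteps, features) array along axis 0 to the longest length."""
    max_len = max(s.shape[0] for s in samples)
    n_features = samples[0].shape[1]
    padded = np.full((len(samples), max_len, n_features), pad_value, dtype=float)
    for i, s in enumerate(samples):
        padded[i, :s.shape[0], :] = s   # real data first, padding after
    return padded

samples = [np.ones((2, 3)), np.ones((4, 3))]
X_padded = pad_to_longest(samples)
print(X_padded.shape)  # (2, 4, 3)
```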

Then, as the first layer of your model, you can add a Masking layer so that the layers after it ignore these padded segments. (Warning: some layer types don't support masking.)
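As a rough illustration of what gets ignored: the Keras documentation describes Masking(mask_value=...) as skipping a timestep when every feature at that timestep equals mask_value. The NumPy sketch below reproduces only that rule on a hypothetical padded batch. It is not Keras code, and it assumes a (samples, timesteps, features) layout.

```python
import numpy as np

mask_value = 0.0

# Hypothetical padded batch: 2 samples, 4 timesteps, 3 features.
batch = np.zeros((2, 4, 3))
batch[0, :2] = 1.0   # sample 0 has 2 real timesteps, then 2 padded ones
batch[1, :4] = 1.0   # sample 1 has 4 real timesteps, no padding

# True where a timestep is kept, False where all features equal mask_value.
keep = np.any(batch != mask_value, axis=-1)
print(keep)
```

One consequence of this rule is that a genuine timestep whose features all equal the pad value would be skipped as well, so it is worth choosing a dummy value that cannot occur in the real data.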