Keras model fails to decrease loss

Your code has a single critical problem: dimensionality shuffling. The one dimension you should never touch is the batch dimension - as it, by definition, holds independent samples of your data. In your first reshape, you mix feature dimensions with the batch dimension:

Tensor("input_1:0", shape=(12, 6, 16, 16, 16, 3), dtype=float32)
Tensor("lambda/Reshape:0", shape=(72, 16, 16, 16, 3), dtype=float32)

This is like feeding 72 independent samples of shape (16,16,16,3). Further layers suffer similar problems.
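
To make the mixing concrete, here is a minimal sketch (an illustration, not code from the question) contrasting a raw reshape, which folds the extra dimension of 6 into the batch axis, with Keras' Reshape layer, which by design never touches the batch dimension:

import tensorflow as tf
from tensorflow.keras.layers import Reshape

x = tf.zeros((12, 6, 16, 16, 16, 3))       # (batch, extra dim, D, H, W, C)

# Raw reshape: merges the batch dim (12) with the extra dim (6) -> 72 "independent" samples
bad = tf.reshape(x, (-1, 16, 16, 16, 3))
print(bad.shape)                            # (72, 16, 16, 16, 3)

# Reshape layer: only non-batch dims are touched, so the batch dim (12) is preserved
ok = Reshape((6 * 16, 16, 16, 3))(x)
print(ok.shape)                             # (12, 96, 16, 16, 3)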


Solutions:
  • Instead of reshaping every step of the way (for which you should use Reshape), shape your existing Conv and pooling layers to make everything work out directly.
  • Aside from the input and output layers, it's better to title each layer something short and simple - no clarity is lost, as each line is well-defined by the layer name.
  • GlobalAveragePooling is intended to be the final layer, as it collapses the feature dimensions - in your case, like so: (12,16,16,16,3) --> (12,3); a Conv afterwards serves little purpose.
  • Per above, I replaced Conv1D with Conv3D.
  • Unless you're using variable batch sizes, always go for batch_shape= vs. shape=, as you can inspect layer dimensions in full (very helpful).
  • Your true batch_size here is 6, deduced from your comment reply.
  • kernel_size=1 and (especially) filters=1 is a very weak convolution; I replaced it accordingly - you can revert if you wish.
  • If you have only 2 classes in your intended application, I advise using Dense(1, 'sigmoid') with binary_crossentropy loss; see the sketch after this list.
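
For the last bullet, a minimal sketch of what the binary variant could look like - this is an illustration, not part of the original answer, reusing the same Conv3D backbone as the full model below; labels would then be single 0/1 scalars instead of one-hot pairs:

from tensorflow.keras.layers import Input, Conv3D, GlobalAveragePooling3D, Dense
from tensorflow.keras.models import Model

def create_binary_model(batch_size, input_shape):
    # Same backbone as create_model below, but a single sigmoid unit on top
    ipt = Input(batch_shape=(batch_size, *input_shape))
    x   = Conv3D(filters=64, kernel_size=8, strides=(2, 2, 2),
                             activation='relu', padding='same')(ipt)
    x   = Conv3D(filters=8,  kernel_size=4, strides=(2, 2, 2),
                             activation='relu', padding='same')(x)
    x   = GlobalAveragePooling3D()(x)
    out = Dense(units=1, activation='sigmoid')(x)

    model = Model(inputs=ipt, outputs=out)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
    return model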

As a last note: you can toss all of the above out except for the dimensionality shuffling advice, and still get perfect train set performance; it was the root of the problem.

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv3D, GlobalAveragePooling3D, Dense
from tensorflow.keras.models import Model


def create_model(batch_size, input_shape):

    ipt = Input(batch_shape=(batch_size, *input_shape))
    x   = Conv3D(filters=64, kernel_size=8, strides=(2, 2, 2),
                             activation='relu', padding='same')(ipt)
    x   = Conv3D(filters=8,  kernel_size=4, strides=(2, 2, 2),
                             activation='relu', padding='same')(x)
    x   = GlobalAveragePooling3D()(x)  # collapse spatial dims: (6, 4, 4, 4, 8) -> (6, 8)
    out = Dense(units=2, activation='softmax')(x)

    return Model(inputs=ipt, outputs=out)

BATCH_SIZE = 6
INPUT_SHAPE = (16, 16, 16, 3)
BATCH_SHAPE = (BATCH_SIZE, *INPUT_SHAPE)

def generate_fake_data():
    # 240 synthetic samples: roughly the first half all-ones labelled [0., 1.], the rest all-zeros labelled [1., 0.]
    for j in range(1, 240 + 1):
        if j < 120:
            yield np.ones(INPUT_SHAPE), np.array([0., 1.])
        else:
            yield np.zeros(INPUT_SHAPE), np.array([1., 0.])


def make_tfdataset(for_training=True):
    dataset = tf.data.Dataset.from_generator(generator=lambda: generate_fake_data(),
                                 output_types=(tf.float32,
                                               tf.float32),
                                 output_shapes=(tf.TensorShape(INPUT_SHAPE),
                                                tf.TensorShape([2])))
    dataset = dataset.repeat()
    if for_training:
        dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(tf.data.experimental.AUTOTUNE)
    return dataset
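
The compile/fit calls are not shown in the excerpt above; below is a minimal training sketch that would produce output like the log that follows, assuming the Adam optimizer with categorical_crossentropy and 240 // BATCH_SIZE = 40 steps per epoch:

model = create_model(BATCH_SIZE, INPUT_SHAPE)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])

train_dataset = make_tfdataset(for_training=True)
model.fit(train_dataset,
          epochs=500,
          steps_per_epoch=240 // BATCH_SIZE)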


Results:

Epoch 28/500
40/40 [==============================] - 0s 3ms/step - loss: 0.0808 - acc: 1.0000