Updated: 2023-12-02 19:11:40
I agree with @cyniikal, your network seems too complex for this dataset. With a single layer model, I was able to achieve 93.75% accuracy on the training data and 86.7% accuracy on the test data.
In my model, I used GradientDescentOptimizer to minimize cross_entropy, just as you did. I also used a batch size of 16.
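As a framework-agnostic illustration of that setup (not the code from my notebook, which is linked below), here is a minimal NumPy sketch of a single-layer softmax model trained with plain gradient descent on the cross-entropy loss, using mini-batches of 16. The synthetic data and hyper-parameters are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, roughly linearly separable data (illustrative only).
n_samples, n_features, n_classes = 256, 20, 3
X = rng.normal(size=(n_samples, n_features))
true_W = rng.normal(size=(n_features, n_classes))
y = np.argmax(X @ true_W, axis=1)
Y = np.eye(n_classes)[y]                   # one-hot labels

W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
lr, batch_size, epochs = 0.1, 16, 30       # batch size 16, as in the answer

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for _ in range(epochs):
    perm = rng.permutation(n_samples)
    for i in range(0, n_samples, batch_size):
        idx = perm[i:i + batch_size]
        probs = softmax(X[idx] @ W + b)
        grad = probs - Y[idx]              # d(cross-entropy)/d(logits) for softmax
        W -= lr * X[idx].T @ grad / len(idx)
        b -= lr * grad.mean(axis=0)

train_acc = (np.argmax(X @ W + b, axis=1) == y).mean()
```

The gradient step is the same update a GradientDescentOptimizer performs; the softmax/cross-entropy pair gives the conveniently simple `probs - labels` gradient on the logits.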
The main difference I see between your approach and mine is that I:
See this notebook with a code sample of my single-layer model.
If you would like to add layers to your neural network (the network will converge with more difficulty), I highly recommend reading this article on neural nets. Specifically, since you added sigmoid as your last activation function, I believe you are suffering from the vanishing gradient problem. See this page on how to address the vanishing gradient.
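To see why stacked sigmoids cause this, note that sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)) never exceeds 0.25, so each sigmoid layer multiplies the backpropagated gradient by a factor of at most 0.25. A quick numeric check (the depth and input value here are arbitrary, for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)       # peaks at 0.25 when x = 0

x = 0.5                        # arbitrary starting activation
grad = 1.0                     # gradient arriving from the loss
for layer in range(8):         # pretend 8 stacked sigmoid activations
    grad *= sigmoid_grad(x)    # chain-rule factor contributed by this layer
    x = sigmoid(x)             # activation fed into the next layer
# After 8 layers, grad has shrunk below 0.25**8, i.e. to roughly 1e-5 or less.
```

This is why swapping hidden-layer sigmoids for ReLU (whose derivative is 1 on the active region) typically makes deeper versions of a network like yours trainable.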