Recently I have also been working on a seq2seq model. I encountered your problem before; in my case, I solved it by changing the loss function.
You said you use a mask, so I guess you use tf.contrib.seq2seq.sequence_loss as I did.
I changed to tf.nn.softmax_cross_entropy_with_logits, and it works normally (at a higher computation cost).
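Roughly, a masked loss built by hand from tf.nn.softmax_cross_entropy_with_logits would look something like the sketch below. This is only my reconstruction, assuming TF 1.x and the same tensor names and shapes as in the sequence_loss snippet further down; num_decoder_symbols is assumed to be the vocabulary size.

crossent = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(decoder_targets, depth=num_decoder_symbols),
    logits=decoder_logits)  # per-step loss: [batch_size, sequence_length]
# zero out the loss at padded time steps, then average over real tokens only
loss = tf.reduce_sum(crossent * masks) / tf.reduce_sum(masks)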
(Edit 05/10/2018. Pardon me, I need to edit since I found an egregious mistake in my code.)
tf.contrib.seq2seq.sequence_loss can work really well, if the shapes of logits, targets, and mask are right. As defined in the official documentation of tf.contrib.seq2seq.sequence_loss:
loss = tf.contrib.seq2seq.sequence_loss(logits=decoder_logits,
                                        targets=decoder_targets,
                                        weights=masks)
# logits:  [batch_size, sequence_length, num_decoder_symbols]
# targets: [batch_size, sequence_length]
# weights: [batch_size, sequence_length]
Well, it can still run even if the shapes don't match, but the result could be weird (lots of #EOS, #PAD... etc.).
The decoder_outputs and the decoder_targets might not have the required shape (in my case, my decoder_targets had the shape [sequence_length, batch_size]). So try to use tf.transpose to help you reshape the tensor.
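For example, a minimal sketch, assuming time-major targets as in my case:

# [sequence_length, batch_size] -> [batch_size, sequence_length]
decoder_targets = tf.transpose(decoder_targets, perm=[1, 0])
# time-major logits would need perm=[1, 0, 2] instead

After this, targets has the batch-major shape that sequence_loss expects.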