If you want attention along the time dimension, then this part of your code seems correct to me:
from tensorflow.keras.layers import (Activation, Dense, Flatten, LSTM,
                                     Multiply, Permute, RepeatVector)

activations = LSTM(units, return_sequences=True)(embedded)
# compute an importance score for each time step
attention = Dense(1, activation='tanh')(activations)  # (batch_size, max_length, 1)
attention = Flatten()(attention)                      # (batch_size, max_length)
attention = Activation('softmax')(attention)          # weights over the time steps
attention = RepeatVector(units)(attention)            # (batch_size, units, max_length)
attention = Permute([2, 1])(attention)                # (batch_size, max_length, units)
# Multiply() is the Keras 2 replacement for the old merge(..., mode='mul')
sent_representation = Multiply()([activations, attention])
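For completeness, here is a minimal runnable sketch that wires the block above into a full model; the vocabulary size, embedding width, and sequence length below are placeholder assumptions, not values from your question:

from tensorflow.keras.layers import (Activation, Dense, Embedding, Flatten,
                                     Input, LSTM, Multiply, Permute,
                                     RepeatVector)
from tensorflow.keras.models import Model

max_length, vocab_size, units = 20, 1000, 32  # assumed placeholder sizes

inputs = Input(shape=(max_length,))
embedded = Embedding(vocab_size, 64)(inputs)
activations = LSTM(units, return_sequences=True)(embedded)
attention = Dense(1, activation='tanh')(activations)
attention = Flatten()(attention)
attention = Activation('softmax')(attention)
attention = RepeatVector(units)(attention)
attention = Permute([2, 1])(attention)
sent_representation = Multiply()([activations, attention])

model = Model(inputs, sent_representation)
model.summary()  # final output shape: (None, max_length, units)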
You've worked out the attention vector of shape (batch_size, max_length):
attention = Activation('softmax')(attention)
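Concretely, that softmax turns the raw per-step scores into one weight per time step, and the weights sum to 1 over the max_length axis; a tiny NumPy illustration with made-up scores:

import numpy as np

scores = np.array([[0.2, 1.5, -0.3, 0.8]])  # (batch_size=1, max_length=4), made-up scores
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
print(weights)              # one attention weight per time step
print(weights.sum(axis=1))  # [1.] -- the weights sum to 1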
I've never seen this code before, so I can't say if this one is actually correct or not:
K.sum(xin, axis=-2)
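For what it's worth, axis=-2 on a (batch_size, max_length, units) tensor is the time axis, so this sum would collapse the weighted activations into a single (batch_size, units) sentence vector; a self-contained check of that shape behaviour:

import numpy as np
from tensorflow.keras import backend as K

x = K.constant(np.ones((2, 4, 3)))  # (batch_size=2, max_length=4, units=3)
print(K.sum(x, axis=-2).shape)      # (2, 3): the time dimension is summed away

If this expression sits inside a model, it would normally be wrapped in a Lambda layer so it can be part of the graph.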
Further reading (you might have a look):
https://github.com/philipperemy/keras-attention-mechanism