Updated: 2023-12-02 22:57:34
I think your model will be much simplified if you use tf.contrib.seq2seq.AttentionWrapper with one of the provided attention implementations: BahdanauAttention or LuongAttention.
This way it'll be possible to wire the attention vector on a cell level, so that cell output is already after attention applied. Example from the seq2seq tutorial:
# TF 1.x API; tf.contrib was removed in TF 2.x
cell = tf.nn.rnn_cell.LSTMCell(512)
attention_mechanism = tf.contrib.seq2seq.LuongAttention(num_units=512, memory=encoder_outputs)
attn_cell = tf.contrib.seq2seq.AttentionWrapper(cell, attention_mechanism, attention_layer_size=256)
Note that this way you won't need a loop over window_size, because tf.nn.static_rnn or tf.nn.dynamic_rnn will instantiate the cell wrapped with attention at each time step.
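For intuition, the core of what LuongAttention computes at each decoder step can be sketched in plain NumPy. This is a simplified dot-product variant; the function name and shapes here are illustrative assumptions, not the tf.contrib API:

```python
import numpy as np

# Sketch of Luong-style (dot-product) attention for a single decoder step.
# Assumed shapes: encoder_outputs is [T, H], decoder_state is [H].
def luong_attention(decoder_state, encoder_outputs):
    scores = encoder_outputs @ decoder_state      # [T] alignment scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over time steps
    context = weights @ encoder_outputs           # [H] weighted sum of encoder outputs
    return context, weights
```

AttentionWrapper runs a step like this inside the cell and mixes the resulting context vector into the cell output, which is why the output is "already after attention applied".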
Regarding your question: you should distinguish between Python variables and TensorFlow graph nodes. You can assign last_encoder_state to a different tensor; the original graph node won't change because of this. This is flexible, but it can also be misleading in the resulting network - you might think you are connecting an LSTM to one tensor when it is actually another. In general, you shouldn't do that.