Updated: 2023-12-02 18:41:46
The following is based on https://github.com/rafaljozefowicz/lm/blob/master/language_model.py#L21
You wrap your model creation code into a _forward
function, and then call it once for each GPU:
losses, tower_grads = [], []   # initialized before the loop in the full source
for i in range(hps.num_gpus):
    with tf.device(assign_to_gpu(i, ps_device)), tf.variable_scope(
            tf.get_variable_scope(), reuse=True if i > 0 else None):
        loss = self._forward(i, xs[i], ys[i], ws[i])
        losses += [loss]
        if mode == "train":
            cur_grads = self._backward(loss, summaries=(i == hps.num_gpus - 1))
            tower_grads += [cur_grads]
self.loss = tf.add_n(losses) / len(losses)
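To make the control flow concrete, here is a minimal sketch of the same multi-tower pattern in plain Python, with hypothetical forward/backward stand-ins replacing self._forward and self._backward (these helpers and the sample shards are assumptions for illustration, not part of the original repo). Each "GPU" computes a loss on its own data shard, per-tower gradients are collected, and the final loss is the mean of the tower losses, mirroring tf.add_n(losses) / len(losses):

```python
def forward(gpu_id, x_shard, y_shard):
    # Stand-in for self._forward: mean squared error on this tower's shard.
    return sum((x - y) ** 2 for x, y in zip(x_shard, y_shard)) / len(x_shard)

def backward(loss):
    # Stand-in for self._backward: a pretend gradient derived from the loss.
    return 2 * loss

num_gpus = 4
# One input/target shard per GPU (illustrative data).
xs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
ys = [[1.0, 2.5], [3.0, 4.0], [5.5, 6.0], [7.0, 9.0]]

losses, tower_grads = [], []
for i in range(num_gpus):           # one "tower" per GPU
    loss = forward(i, xs[i], ys[i])
    losses.append(loss)
    tower_grads.append(backward(loss))

# Mirrors self.loss = tf.add_n(losses) / len(losses)
mean_loss = sum(losses) / len(losses)
```

In the TensorFlow version, the variable-scope trick (reuse=True for every tower after the first) is what makes all towers share one set of weights while each computes its own loss and gradients.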