Training API¶
deepspeed.initialize() returns a training engine as the first element of its return tuple, of type DeepSpeedEngine. This engine is used to progress training:
for step, batch in enumerate(data_loader):
    # forward propagation via the engine's forward() method
    loss = model_engine(batch)

    # run backpropagation
    model_engine.backward(loss)

    # apply the weight update
    model_engine.step()
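For context, here is a minimal sketch of how the engine above might be created. The toy model, the dataset, and the "ds_config.json" path are assumptions for illustration; deepspeed.initialize() returns a four-tuple of (engine, optimizer, dataloader, lr_scheduler), with the engine first:

import torch
import deepspeed

# Toy model and dataset; placeholders for a real training setup.
net = torch.nn.Linear(10, 2)
train_dataset = torch.utils.data.TensorDataset(
    torch.randn(64, 10), torch.randint(0, 2, (64,)))

# The engine is the first element of the returned tuple.
# "ds_config.json" is an assumed path to a DeepSpeed config file.
model_engine, optimizer, data_loader, _ = deepspeed.initialize(
    model=net,
    model_parameters=net.parameters(),
    training_data=train_dataset,
    config="ds_config.json")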
Forward Propagation¶
deepspeed.DeepSpeedEngine.forward(self, *inputs, **kwargs)¶

Execute forward propagation.

Parameters:
- *inputs – Variable-length input list
- **kwargs – Variable-length keyword arguments
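As a sketch of the forward call in context, assuming the model_engine and data_loader from the earlier sketch and a classification-style batch of (inputs, labels):

criterion = torch.nn.CrossEntropyLoss()

for step, (inputs, labels) in enumerate(data_loader):
    # Move the micro-batch to the engine's device (typically the local GPU).
    inputs = inputs.to(model_engine.device)
    labels = labels.to(model_engine.device)

    # Calling the engine invokes forward(), just like a torch.nn.Module.
    outputs = model_engine(inputs)
    loss = criterion(outputs, labels)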
Backward Propagation¶
deepspeed.DeepSpeedEngine.backward(self, loss, allreduce_gradients=True, release_loss=False)¶

Execute a backward pass on the loss.

Parameters:
- loss – Torch tensor on which to execute backward propagation
- allreduce_gradients – If False, gradient averaging across data-parallel ranks is skipped. Default is True.
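Continuing the loop sketch above, backward() is normally called with the defaults; disabling allreduce_gradients leaves each data-parallel rank with its purely local gradients, which is mainly useful for debugging:

    # Default: gradients are averaged across data-parallel ranks at
    # gradient-accumulation boundaries.
    model_engine.backward(loss)

    # Local-only gradients (no averaging across ranks); ranks will drift
    # apart if weights are then updated from these gradients.
    # model_engine.backward(loss, allreduce_gradients=False)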
Optimizer Step¶
deepspeed.DeepSpeedEngine.step(self, lr_kwargs=None)¶

Execute the weight update step after forward and backward propagation on an effective train batch.
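Note that step() also zeroes the gradients and advances the learning-rate scheduler when one was passed to deepspeed.initialize(), so no manual optimizer.step() or optimizer.zero_grad() calls are needed. Continuing the loop sketch:

    model_engine.backward(loss)
    model_engine.step()   # weight update + gradient zeroing + LR scheduling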
Gradient Accumulation¶
deepspeed.DeepSpeedEngine.is_gradient_accumulation_boundary(self)¶

Query whether the current micro-batch is at the boundary of gradient accumulation, and thus whether it will trigger gradient reductions and an optimizer step.

Returns: True if the current step is a gradient accumulation boundary, False otherwise.
Return type: bool
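A sketch of how this query might gate per-update work such as logging, assuming the loop from the earlier sketches. Under gradient accumulation, backward() is called on every micro-batch, but only boundary micro-batches reduce gradients and update the weights:

    model_engine.backward(loss)

    # True only on the micro-batch that triggers the reduction and update.
    if model_engine.is_gradient_accumulation_boundary():
        print(f"micro-step {step}: loss = {loss.item():.4f}")

    model_engine.step()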