DeepSpeed Configuration
Configurations
Training Setup
class deepspeed.config.TrainingConfig(**kwargs)
    Top-level configuration for all aspects of training with DeepSpeed.
    batch = None
        Batch configuration, see BatchConfig.
    fp16 = None
        FP16 training, see FP16Config.
class deepspeed.config.BatchConfig(**kwargs)
    Batch size related parameters.
    train_batch_size = None
        The effective training batch size: the number of data samples that contribute to one model update step. train_batch_size is the product of the batch size a single GPU processes in one forward/backward pass (train_micro_batch_size_per_gpu, a.k.a. train_step_batch_size), the number of gradient accumulation steps (gradient_accumulation_steps), and the number of GPUs.
    train_micro_batch_size_per_gpu = None
        The batch size each device processes in one forward/backward pass. When specified, gradient_accumulation_steps is calculated automatically from train_batch_size and the number of devices. Should not be specified together with gradient_accumulation_steps.
    gradient_accumulation_steps = None
        The number of training steps over which gradients are accumulated before they are averaged and applied. This can improve scalability because gradients are communicated less frequently. It also makes larger batch sizes per GPU possible. When specified, train_micro_batch_size_per_gpu (a.k.a. train_step_batch_size) is calculated automatically from train_batch_size and the number of GPUs. Should not be specified together with train_micro_batch_size_per_gpu.
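The relationship between the three batch parameters described above can be illustrated with a short calculation. This is only a sketch of the arithmetic, not DeepSpeed code; the function name effective_batch_size and the example values are assumptions made for illustration.

```python
def effective_batch_size(micro_batch_per_gpu, accumulation_steps, num_gpus):
    """Return the number of samples that contribute to one model update.

    Hypothetical helper for illustration only; it is not part of DeepSpeed.
    """
    # Each GPU processes one micro-batch per forward/backward pass, repeats
    # that for every accumulation step, and all GPUs work in parallel.
    return micro_batch_per_gpu * accumulation_steps * num_gpus


# Example values (assumed): 8 samples per GPU, 4 accumulation steps, 16 GPUs.
print(effective_batch_size(micro_batch_per_gpu=8, accumulation_steps=4, num_gpus=16))  # 512
```

Read in the other direction, the same product is what allows one of the per-GPU batch size or the accumulation step count to be derived once train_batch_size and the number of GPUs are known.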
Extending Configurations
class deepspeed.config.Config(**kwargs)
    Base class for DeepSpeed configurations.

    Config is a struct with subclassing. Instances are initialized from keyword arguments, and values can be read as attributes or by key:

    >>> c = Config(verbose=True)
    >>> c.verbose
    True
    >>> c['verbose']
    True

    You can also initialize them from dictionaries:

    >>> myconf = {'verbose': True}
    >>> c = Config.from_dict(myconf)
    >>> c.verbose
    True

    Configurations should be subclassed to group arguments by topic.
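To make the "struct with subclassing" idea concrete, here is a minimal stand-in that behaves like the examples above. It is a sketch under assumptions, not the actual Config implementation: the names ConfigSketch and LoggingSketch are hypothetical, and the real class may be built quite differently.

```python
class ConfigSketch(dict):
    """Hypothetical stand-in: a dict whose keys can also be read as attributes."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .keys() keep working; unknown names fall back to the keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    @classmethod
    def from_dict(cls, values):
        # Building from a dictionary is the same as passing keyword arguments.
        return cls(**values)


class LoggingSketch(ConfigSketch):
    """Hypothetical subclass that groups logging-related arguments by topic."""


c = LoggingSketch.from_dict({"verbose": True})
print(c.verbose, c["verbose"])  # True True
```

Subclassing a plain dict keeps the sketch short; a stricter design might declare the allowed fields on each subclass and reject unknown keys.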
    resolve()
        Infer any missing arguments, if possible. This is useful for configs such as BatchConfig, in which only a subset of arguments is required to complete a valid config.
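The kind of inference resolve() performs for a batch configuration might look roughly like the following. This is a hedged sketch that follows the rules stated in the BatchConfig field descriptions; the class BatchSketch, its num_gpus field, and the error messages are all assumptions, and the real method may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BatchSketch:
    """Hypothetical stand-in for a batch config; not the DeepSpeed class."""

    num_gpus: int  # assumed to be supplied by the caller in this sketch
    train_batch_size: Optional[int] = None
    train_micro_batch_size_per_gpu: Optional[int] = None
    gradient_accumulation_steps: Optional[int] = None

    def resolve(self):
        """Fill in whichever of the two per-step parameters was left unset."""
        total = self.train_batch_size
        micro = self.train_micro_batch_size_per_gpu
        steps = self.gradient_accumulation_steps
        if total is None:
            raise ValueError("train_batch_size is required")
        if (micro is None) == (steps is None):
            # Either both were given or neither was; the field descriptions
            # ask for exactly one of the two.
            raise ValueError("specify exactly one of micro batch size or accumulation steps")
        known = micro if micro is not None else steps
        divisor = known * self.num_gpus
        if total % divisor != 0:
            raise ValueError("train_batch_size is not divisible by the other settings")
        inferred = total // divisor
        if micro is None:
            self.train_micro_batch_size_per_gpu = inferred
        else:
            self.gradient_accumulation_steps = inferred
        return self


cfg = BatchSketch(num_gpus=4, train_batch_size=256, train_micro_batch_size_per_gpu=8).resolve()
print(cfg.gradient_accumulation_steps)  # 8
```

Raising an error when the values do not divide evenly is a design choice of this sketch; it avoids silently training with a different effective batch size than the one requested.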