Reduce ds_id name length (microsoft#5176)
Fixes issue microsoft#5087. Limits the ds_id name in ZeRO 3 to the first
and last parameters of the group instead of every parameter in the
group.
jomayeri authored and rraminen committed May 9, 2024
1 parent 077667c commit c704fc1
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion deepspeed/runtime/zero/stage3.py
@@ -864,7 +864,9 @@ def _create_fp32_partitions(self):
self.device).clone().float().detach())

self.fp32_partitioned_groups_flat[i].requires_grad = True # keep this in case internal optimizer uses it
- self.fp32_partitioned_groups_flat[i].ds_id = '_'.join(map(str, self.fp16_partitioned_groups_flat_id[i]))
+ ds_id_begin = str(self.fp16_partitioned_groups_flat_id[i][0])
+ ds_id_end = str(self.fp16_partitioned_groups_flat_id[i][-1])
+ self.fp32_partitioned_groups_flat[i].ds_id = ds_id_begin + '_' + ds_id_end

if len(swappable_fp32_tensors) > 0:
self.optimizer_swapper.initialize_parameters(parameters=swappable_fp32_tensors,
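
To illustrate the effect of this change outside DeepSpeed, here is a minimal standalone sketch comparing the two naming schemes. The helper names `ds_id_joined` and `ds_id_first_last` and the example id range are hypothetical and exist only for this illustration; only the two string expressions are taken from the diff.

```python
def ds_id_joined(param_ids):
    """Old scheme from the diff: join every parameter id in the group."""
    return '_'.join(map(str, param_ids))


def ds_id_first_last(param_ids):
    """New scheme from the diff: keep only the first and last ids."""
    ds_id_begin = str(param_ids[0])
    ds_id_end = str(param_ids[-1])
    return ds_id_begin + '_' + ds_id_end


# Hypothetical group of 1000 consecutive parameter ids (an assumption,
# not data from the commit).
group_ids = list(range(100, 1100))

old_name = ds_id_joined(group_ids)
new_name = ds_id_first_last(group_ids)

print(len(old_name))  # grows with the number of parameters in the group
print(new_name)       # fixed shape: "<first>_<last>"
```

With the new scheme the name length no longer depends on the group size. A group with a single parameter yields the same id on both sides of the underscore, since index `0` and index `-1` refer to the same element.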