Migrate GPT-2 to new tracer #875

Merged
merged 2 commits into from
Nov 30, 2023

Conversation

kwen2501
Contributor

@kwen2501 kwen2501 commented Nov 29, 2023

Description

Migrated the GPT-2 example to work with the new tracer-based PiPPy.

examples/hf/hf_utils.py contains a utility that generates inputs for HuggingFace models.

Model architecture:

GPT2ForSequenceClassification(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (score): Linear(in_features=768, out_features=2, bias=False)
)

Run

$ torchrun --nproc-per-node 4 pippy_gpt2.py
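For readers unfamiliar with torchrun: it starts one process per `--nproc-per-node` slot and passes each process its identity through the environment variables `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`. The sketch below shows one way a script could read them. The helper name `read_torchrun_env` and the fallback defaults are assumptions for illustration, not code from pippy_gpt2.py.

```python
import os


def read_torchrun_env(env=None):
    """Read the process identity that torchrun exports to each worker.

    `env` is injectable so the helper can be exercised without torchrun;
    the defaults (rank 0, world size 1) are an assumption for single-process runs.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


# Simulate what the third of four workers would see under
# `torchrun --nproc-per-node 4`.
info = read_torchrun_env({"RANK": "2", "LOCAL_RANK": "2", "WORLD_SIZE": "4"})
print(info)
```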

Output

https://gist.github.com/kwen2501/b9ed6158d8d0dc90b16824aa6abd8d72

for i in range(1, gpt2.config.n_layer // decoders_per_rank):
    annotate_split_points(
        gpt2,
        {f'transformer.h.{i * decoders_per_rank}': PipeSplitWrapper.SplitPoint.BEGINNING},

@xw285cornell xw285cornell Nov 29, 2023

Does this mean the split point is right before or right after this submodule (specified by FQN)? It's also not entirely clear to me what SplitPoint.BEGINNING means. Also curious: tracing will "flatten" the submodule into a bunch of ATen ops, so there is no longer a concept of a submodule. So does this API find the first or the last node of the submodule?

Contributor Author

BEGINNING means right before; END would mean right after.
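To make the "right before" semantics concrete, here is a minimal pure-Python sketch that does not call PiPPy. It reproduces the loop from the diff to list the annotated FQNs, then shows which decoder blocks land in which stage when each split falls before the named block. It assumes 12 blocks (from the architecture printout), 4 ranks (from the torchrun command), and `decoders_per_rank = n_layer // world_size`; that last formula and both helper names are assumptions, not taken from the example script.

```python
def split_point_fqns(n_layer, decoders_per_rank):
    """Mirror the loop in the diff: one FQN per stage boundary."""
    return [
        f"transformer.h.{i * decoders_per_rank}"
        for i in range(1, n_layer // decoders_per_rank)
    ]


def stages_from_beginning_splits(n_layer, split_layers):
    """Cut *before* each listed block index, so that block opens a new stage."""
    boundaries = [0] + sorted(split_layers) + [n_layer]
    return [list(range(lo, hi)) for lo, hi in zip(boundaries, boundaries[1:])]


n_layer = 12
world_size = 4
decoders_per_rank = n_layer // world_size  # assumption, see lead-in

fqns = split_point_fqns(n_layer, decoders_per_rank)
split_layers = [int(fqn.rsplit(".", 1)[1]) for fqn in fqns]
stages = stages_from_beginning_splits(n_layer, split_layers)

print(fqns)
for rank, blocks in enumerate(stages):
    print(f"stage {rank}: blocks {blocks}")
```

In this sketch, three split points are enough to carve the twelve blocks into four contiguous stages.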

Contributor Author

Annotation occurs before tracing. So the module structure is still there at that time.
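The point about ordering can be illustrated with a toy module tree. This is a hypothetical sketch, not PiPPy's implementation: `ToyModule`, `resolve`, and `annotate` are invented names. It only shows that an FQN such as `transformer.h.3` can be resolved and marked while the nested structure still exists, so a later flattening step could carry the marker along.

```python
class ToyModule:
    """Stand-in for a module: named children plus a split marker."""

    def __init__(self, children=None):
        self.children = dict(children or {})
        self.split_before = False


def resolve(root, fqn):
    """Walk the tree one dotted name at a time."""
    node = root
    for atom in fqn.split("."):
        node = node.children[atom]
    return node


def annotate(root, fqns):
    """Mark each named submodule; this happens before any flattening."""
    for fqn in fqns:
        resolve(root, fqn).split_before = True


# Build transformer.h.0 ... transformer.h.11, as in the printed architecture.
blocks = {str(i): ToyModule() for i in range(12)}
root = ToyModule({"transformer": ToyModule({"h": ToyModule(blocks)})})

annotate(root, ["transformer.h.3", "transformer.h.6", "transformer.h.9"])

marked = [
    name
    for name, block in resolve(root, "transformer.h").children.items()
    if block.split_before
]
print(marked)
```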

Member

How come there is no corresponding END?

@kwen2501 kwen2501 changed the base branch from pippy_1.0 to main November 29, 2023 22:38
# Input configs
example_inputs = generate_inputs_for_model(
    model_class, gpt2, model_name, args.batch_size, args.device)
input_ids = example_inputs["input_ids"]
Member

Is input_ids in this case just a single microbatch, or is it the entire minibatch?

Contributor Author

The entire minibatch.
When PipelineStage actually runs, it splits the batch internally before feeding it to the scheduler.
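As a rough illustration of what splitting a minibatch into microbatches looks like, here is a pure-Python sketch that chunks a batch along its first dimension. It is not PiPPy's internal code; the helper name `split_into_microbatches`, the chunk count, and the policy of giving leftover rows to the first chunks are all assumptions.

```python
def split_into_microbatches(batch, num_chunks):
    """Split a batch (a list of rows) into `num_chunks` contiguous microbatches.

    When the batch size is not divisible by `num_chunks`, the earliest
    chunks each receive one extra row (an assumed policy).
    """
    if num_chunks <= 0:
        raise ValueError("num_chunks must be positive")
    base, extra = divmod(len(batch), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(batch[start:start + size])
        start += size
    return chunks


# A minibatch of 8 rows split into 4 microbatches of 2 rows each.
minibatch = list(range(8))
microbatches = split_into_microbatches(minibatch, 4)
print(microbatches)
```

The caller hands over the whole minibatch and never sees the chunks, which matches the answer above that input_ids is the entire batch.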

Member

@H-Huang H-Huang left a comment

Looks good, just a few questions.

@lessw2020 lessw2020 mentioned this pull request Nov 30, 2023
@kwen2501 kwen2501 merged commit 92038fb into main Nov 30, 2023
3 checks passed

4 participants