TEAMS: Add TensorFlow 2 Model Garden Conversion Script #25177
Hi,
with this PR, a TEAMS model pretrained with the TensorFlow Model Garden can be converted into an ELECTRA-compatible model.
The TEAMS model was proposed in the paper "Training ELECTRA Augmented with Multi-word Selection", which was accepted at ACL 2021.
The TEAMS implementation can be found in the TensorFlow Model Garden repository.
Unfortunately, the authors did not release any pretrained models.
However, I pretrained a TEAMS model on German Wikipedia and released all checkpoints on the Hugging Face Model Hub. Additionally, this PR includes the conversion script needed to integrate pretrained TEAMS models into Transformers.
Closes #16466.
Implementation Details
TEAMS uses the same architecture as ELECTRA (only the pretraining approach differs). ELECTRA in Transformers comes with two models: a generator and a discriminator.
In contrast to ELECTRA, the TEAMS generator shares layers with the discriminator.
More precisely, the sharing of layers can be seen in the reference implementation:
https://github.com/tensorflow/models/blob/master/official/projects/teams/teams_task.py#L48
This shows that the generator re-uses the first n layers of the discriminator, where n is usually half of the specified total number of layers, as sketched below.
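As a rough illustration (a sketch only, with assumed layer counts, not the actual Model Garden code):

```python
# Conceptual sketch of the TEAMS layer sharing -- assumed layer counts,
# not the actual Model Garden code.
num_discriminator_layers = 12            # e.g. a base-size model (assumed)
n = num_discriminator_layers // 2        # layers re-used by the generator

discriminator_layers = list(range(num_discriminator_layers))
generator_layers = discriminator_layers[:n]  # shared weights, not copies
```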
Retrieving TensorFlow 2 Checkpoints
In order to test the conversion script, the original TensorFlow 2 checkpoints need to be downloaded from the Model Hub:
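A minimal sketch using the huggingface_hub library; the repo id is a placeholder, since the exact Hub repository is not named here:

```python
from huggingface_hub import snapshot_download

# Placeholder repo id -- substitute the actual TEAMS checkpoint repository.
checkpoint_dir = snapshot_download(repo_id="<hub-user>/teams-base-german-tf2")
print(checkpoint_dir)  # local directory with the original TF2 checkpoint files
```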
Additionally, to test the model locally, we need to download the tokenizer:
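For example (the repo id is again a placeholder):

```python
from transformers import AutoTokenizer

# Placeholder repo id -- substitute the actual tokenizer repository on the Hub.
tokenizer = AutoTokenizer.from_pretrained("<hub-user>/teams-base-german")
```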
Converting TEAMS Generator
After retrieving the original checkpoints, the generator configuration must be downloaded:
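A sketch with hf_hub_download; the repo id and filename are assumptions:

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id and assumed filename for the generator configuration.
generator_config = hf_hub_download(
    repo_id="<hub-user>/teams-base-german-tf2", filename="generator_config.json"
)
```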
After that, the conversion script can be run to convert the TEAMS generator part into an ELECTRA generator:
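A sketch of the invocation, assuming the new script follows the interface of the existing ELECTRA conversion script in Transformers (the script name and flags may differ):

```python
import subprocess

# Hypothetical script name and flags -- check the script added in this PR
# for the exact interface.
subprocess.run(
    [
        "python", "convert_teams_original_tf2_checkpoint_to_pytorch.py",
        "--tf_checkpoint_path", checkpoint_dir,
        "--config_file", generator_config,
        "--pytorch_dump_path", "./teams-german-generator",
        "--discriminator_or_generator", "generator",
    ],
    check=True,
)
```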
The generator can be tested with the fill-mask pipeline to predict a masked word. The German example below should predict the capital city of Finland, which is Helsinki:
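A minimal example, assuming the converted generator was saved to ./teams-german-generator and using the tokenizer repo placeholder from above:

```python
from transformers import pipeline

# Paths and repo id are assumptions carried over from the previous steps.
fill_mask = pipeline(
    "fill-mask",
    model="./teams-german-generator",
    tokenizer="<hub-user>/teams-base-german",
)

# German for: "The capital of Finland is [MASK]."
# The top prediction should be "Helsinki".
print(fill_mask("Die Hauptstadt von Finnland ist [MASK]."))
```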
Converting TEAMS Discriminator
After retrieving the original checkpoints, the discriminator configuration must be downloaded. After that, the conversion script can be run to convert the TEAMS discriminator part into an ELECTRA discriminator; both steps mirror the generator conversion above:
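A sketch of both steps, with the same placeholder repo id and assumed script interface as in the generator section:

```python
from huggingface_hub import hf_hub_download
import subprocess

# Placeholder repo id and assumed filename for the discriminator configuration.
discriminator_config = hf_hub_download(
    repo_id="<hub-user>/teams-base-german-tf2", filename="discriminator_config.json"
)

# Hypothetical script name and flags, mirroring the generator conversion above.
subprocess.run(
    [
        "python", "convert_teams_original_tf2_checkpoint_to_pytorch.py",
        "--tf_checkpoint_path", checkpoint_dir,
        "--config_file", discriminator_config,
        "--pytorch_dump_path", "./teams-german-discriminator",
        "--discriminator_or_generator", "discriminator",
    ],
    check=True,
)
```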
I ran experiments on downstream tasks (such as NER and text classification), and the results are superior to those of comparable BERT models (original BERT and Token Dropping BERT).
Made with 🥨 and ❤️.