gpt2-tensorflow-to-pytorch-converter

This repository contains a quick-to-use script that converts GPT-2 models from TensorFlow format to PyTorch format.

Usage

Collect all your TensorFlow model files into a single directory, i.e. these files:

model-<number>.meta
vocab.bpe
model-<number>.data-00000-of-00001
model-<number>.index
checkpoint
counter
encoder.json
hparams.json
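
Before running the conversion, it can help to confirm that the directory really holds every file listed above. The following is a minimal sketch of such a check, not part of this repository: the function name `find_missing_files` and the rule of picking the highest `<number>` when several checkpoints are present are assumptions made for illustration.

```python
import re
from pathlib import Path

# Fixed file names, taken from the list above.
FIXED_FILES = ["checkpoint", "counter", "encoder.json", "hparams.json", "vocab.bpe"]
# Suffixes of the model-<number>.* checkpoint files, taken from the list above.
CHECKPOINT_SUFFIXES = [".meta", ".index", ".data-00000-of-00001"]


def find_missing_files(model_dir):
    """Return a sorted list of expected file names that are absent from model_dir.

    Hypothetical helper: if several model-<number>.* checkpoints are present,
    only the one with the highest <number> is checked (an assumption).
    """
    names = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    missing = [name for name in FIXED_FILES if name not in names]

    # Collect the <number> part of every model-<number>.* file that is present.
    steps = {m.group(1) for n in names if (m := re.match(r"model-(\d+)\.", n))}
    if not steps:
        # No checkpoint file at all: report the pattern, since <number> is unknown.
        missing.extend(f"model-<number>{suffix}" for suffix in CHECKPOINT_SUFFIXES)
    else:
        step = max(steps, key=int)
        for suffix in CHECKPOINT_SUFFIXES:
            name = f"model-{step}{suffix}"
            if name not in names:
                missing.append(name)
    return sorted(missing)
```

Calling `find_missing_files("/path/to/your/model/files")` returns an empty list when nothing is missing; otherwise it returns the names that still need to be copied into the directory.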

Clone the repo and, if needed, install the prerequisites, e.g. with pip install -r requirements.txt.

Run the script:

python convert_model.py /path/to/your/model/files

The converted PyTorch model will be saved in the ./converted_model directory.

Notes

Have fun, I probably won't be updating this one much.

License

This project is licensed under the MIT License.

Contribute

All code improvements are welcome. The script should work on at least all TF1.x-based GPT-2 architecture models.

About

Flying from the mind of FlyingFathead
Digital ghost code by ChaosWhisperer

