From f9c7e6021e9a9a9fd3fc8bb291da9451066aeb8d Mon Sep 17 00:00:00 2001
From: wwwbai <168457988+wwwbai@users.noreply.github.com>
Date: Tue, 3 Dec 2024 03:42:40 +0800
Subject: [PATCH] Translate bertology.md into Chinese (#34908)

* bertology translation

* Update docs/source/zh/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: blueingman <15329507600@163.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

* Update docs/source/zh/bertology.md

Co-authored-by: Isotr0py <2037008807@qq.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: blueingman <15329507600@163.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
---
 docs/source/zh/_toctree.yml |  2 ++
 docs/source/zh/bertology.md | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 docs/source/zh/bertology.md

diff --git a/docs/source/zh/_toctree.yml b/docs/source/zh/_toctree.yml
index 61b74b0d7ca99d..1d8d969ce61db0 100644
--- a/docs/source/zh/_toctree.yml
+++ b/docs/source/zh/_toctree.yml
@@ -92,6 +92,8 @@
       title: Summary of the tokenizers
     - local: attention
       title: Attention mechanisms
+    - local: bertology
+      title: BERT-related research (BERTology)
     title: Conceptual guides
 - sections:
   - sections:
diff --git a/docs/source/zh/bertology.md b/docs/source/zh/bertology.md
new file mode 100644
index 00000000000000..9b39f948339474
--- /dev/null
+++ b/docs/source/zh/bertology.md
@@ -0,0 +1,33 @@
+
+# BERT-related research (BERTology)
+
+A growing field of study is investigating the inner workings of large-scale transformer models such as BERT, which some call "BERTology". Some representative examples of this field are:
+
+- BERT Rediscovers the Classical NLP Pipeline by Ian Tenney, Dipanjan Das, Ellie Pavlick:
+  https://arxiv.org/abs/1905.05950
+- Are Sixteen Heads Really Better than One? by Paul Michel, Omer Levy, Graham Neubig: https://arxiv.org/abs/1905.10650
+- What Does BERT Look At? An Analysis of BERT's Attention by Kevin Clark, Urvashi Khandelwal, Omer Levy, Christopher D.
+  Manning: https://arxiv.org/abs/1906.04341
+- CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure: https://arxiv.org/abs/2210.04633
+
+To help this emerging field develop, we have added a few extra features to the BERT/GPT/GPT-2 models that make it easy to access their internal representations. These features are mainly adapted from the excellent work of Paul Michel (https://arxiv.org/abs/1905.10650):
+
+- accessing all the hidden states of BERT/GPT/GPT-2,
+- accessing all the attention weights of every attention head of BERT/GPT/GPT-2,
+- retrieving the output values and gradients of the attention heads in order to compute head importance scores and prune heads, as described in https://arxiv.org/abs/1905.10650.
+
+To help you understand and use these features, we have added a concrete example script, [bertology.py](https://github.com/huggingface/transformers/tree/main/examples/research_projects/bertology/run_bertology.py), which extracts information from and prunes a model pretrained on the GLUE dataset.
\ No newline at end of file
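
The three features listed in the new bertology.md above (hidden states, per-head attention weights, head pruning) can be exercised with a short, self-contained sketch. This is only an illustrative example, not part of the patch and not the run_bertology.py script; the checkpoint name and example sentence are arbitrary choices, and it relies on the standard `output_hidden_states`/`output_attentions` flags and the `prune_heads` method of Transformers models.

```python
# Minimal sketch (not the run_bertology.py script itself): inspect BERT's internal
# representations using the features described above. Checkpoint and sentence are
# arbitrary illustrative choices.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained(
    "bert-base-uncased",
    output_hidden_states=True,  # return all hidden states
    output_attentions=True,     # return all attention weights
)

inputs = tokenizer("BERTology studies what BERT learns.", return_tensors="pt")
outputs = model(**inputs)

# Tuple of (num_layers + 1) tensors, each of shape [batch, seq_len, hidden_size]
print(len(outputs.hidden_states), outputs.hidden_states[-1].shape)
# Tuple of num_layers tensors, each of shape [batch, num_heads, seq_len, seq_len]
print(len(outputs.attentions), outputs.attentions[0].shape)

# Once head importance scores have been computed (see the paper above), heads can
# be pruned, e.g. drop heads 0 and 2 of layer 0 and head 5 of layer 3:
model.prune_heads({0: [0, 2], 3: [5]})
```

The run_bertology.py script referenced in the file automates this end to end on a GLUE task, computing head importance scores and pruning the least important heads.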