This is a list of BERT-related papers. Any feedback is welcome.
- Downstream task
- Generation
- Modification (multi-task, masking strategy, etc.)
- Probe
- Inside BERT
- Multi-lingual
  - Non-English models
- Domain specific
- Multi-modal
- Model compression
- Misc.
- A BERT Baseline for the Natural Questions
- MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension (ACL2019)
- Unsupervised Domain Adaptation on Reading Comprehension
- BERTQA -- Attention on Steroids
- A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning (EMNLP2019)
- SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering
- Multi-hop Question Answering via Reasoning Chains
- Select, Answer and Explain: Interpretable Multi-hop Reading Comprehension over Multiple Documents
- Multi-step Entity-centric Information Retrieval for Multi-Hop Question Answering (EMNLP2019 WS)
- End-to-End Open-Domain Question Answering with BERTserini (NAACL2019)
- Latent Retrieval for Weakly Supervised Open Domain Question Answering (ACL2019)
- Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering (EMNLP2019)
- Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering (ICLR2020)
- Learning to Ask Unanswerable Questions for Machine Reading Comprehension (ACL2019)
- Unsupervised Question Answering by Cloze Translation (ACL2019)
- Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
- A Recurrent BERT-based Model for Question Generation (EMNLP2019 WS)
- Learning to Answer by Learning to Ask: Getting the Best of GPT-2 and BERT Worlds
- Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension (ACL2019)
- Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning (CIKM2019)
- SG-Net: Syntax-Guided Machine Reading Comprehension
- MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
- Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning (EMNLP2019)
- ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning (ICLR2020)
- Robust Reading Comprehension with Linguistic Constraints via Posterior Regularization
- BAS: An Answer Selection Method Using BERT Language Model
- Beat the AI: Investigating Adversarial Human Annotations for Reading Comprehension
- A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension (ACL2019 WS)
- FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension (ACL2019 WS)
- BERT with History Answer Embedding for Conversational Question Answering (SIGIR2019)
- GraphFlow: Exploiting Conversation Flow with Graph Neural Networks for Conversational Machine Comprehension (ICML2019 WS)
- Beyond English-only Reading Comprehension: Experiments in Zero-Shot Multilingual Transfer for Bulgarian (RANLP2019)
- XQA: A Cross-lingual Open-domain Question Answering Dataset (ACL2019)
- Cross-Lingual Machine Reading Comprehension (EMNLP2019)
- Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model
- Multilingual Question Answering from Formatted Text applied to Conversational Agents
- BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels (EMNLP2019)
- MLQA: Evaluating Cross-lingual Extractive Question Answering
- Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension (TACL)
- SberQuAD - Russian Reading Comprehension Dataset: Description and Analysis
- Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension (EMNLP2019)
- BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer (Interspeech2019)
- Dialog State Tracking: A Neural Reading Comprehension Approach
- A Simple but Effective BERT Model for Dialog State Tracking on Resource-Limited Systems
- Fine-Tuning BERT for Schema-Guided Zero-Shot Dialogue State Tracking
- Goal-Oriented Multi-Task BERT-Based Dialogue State Tracker
- Domain Adaptive Training BERT for Response Selection
- BERT Goes to Law School: Quantifying the Competitive Advantage of Access to Large Legal Corpora in Contract Understanding
- BERT for Joint Intent Classification and Slot Filling
- Multi-lingual Intent Detection and Slot Filling in a Joint BERT-based Model
- A Comparison of Deep Learning Methods for Language Understanding (Interspeech2019)
- Fine-grained Information Status Classification Using Discourse Context-Aware Self-Attention
- GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge (EMNLP2019)
- Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations (EMNLP2019)
- Using BERT for Word Sense Disambiguation
- Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation (ACL2019)
- Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision (ACL2019)
- Assessing BERT’s Syntactic Abilities
- Does BERT agree? Evaluating knowledge of structure dependence through agreement relations
- Simple BERT Models for Relation Extraction and Semantic Role Labeling
- LIMIT-BERT: Linguistically Informed Multi-Task BERT
- A Simple BERT-Based Approach for Lexical Simplification
- Multi-headed Architecture Based on BERT for Grammatical Errors Correction (ACL2019 WS)
- Towards Minimal Supervision BERT-based Grammar Error Correction
- BERT-Based Arabic Social Media Author Profiling
- Sentence-Level BERT and Multi-Task Learning of Age and Gender in Social Media
- Evaluating the Factual Consistency of Abstractive Text Summarization
- NegBERT: A Transfer Learning Approach for Negation Detection and Scope Resolution
- xSLUE: A Benchmark and Analysis Platform for Cross-Style Language Understanding and Evaluation
- TabFact: A Large-scale Dataset for Table-based Fact Verification
- Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents
- LAMBERT: Layout-Aware language Modeling using BERT for information extraction
- BERT Meets Chinese Word Segmentation
- Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
- Establishing Strong Baselines for the New Decade: Sequence Tagging, Syntactic and Semantic Parsing with BERT
- Evaluating Contextualized Embeddings on 54 Languages in POS Tagging, Lemmatization and Dependency Parsing
- NEZHA: Neural Contextualized Representation for Chinese Language Understanding
- Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing -- A Tale of Two Parsers Revisited (EMNLP2019)
- Parsing as Pretraining (AAAI2020)
- Cross-Lingual BERT Transformation for Zero-Shot Dependency Parsing
- Named Entity Recognition -- Is there a glass ceiling? (CoNLL2019)
- A Unified MRC Framework for Named Entity Recognition
- Training Compact Models for Low Resource Entity Tagging using Pre-trained Language Models
- Robust Named Entity Recognition with Truecasing Pretraining (AAAI2020)
- LTP: A New Active Learning Strategy for Bert-CRF Based Named Entity Recognition
- MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition using Deep Bidirectional Transformers
- Portuguese Named Entity Recognition using BERT-CRF
- Towards Lingua Franca Named Entity Recognition with BERT
- Resolving Gendered Ambiguous Pronouns with BERT (ACL2019 WS)
- Anonymized BERT: An Augmentation Approach to the Gendered Pronoun Resolution Challenge (ACL2019 WS)
- Gendered Pronoun Resolution using BERT and an extractive question answering formulation (ACL2019 WS)
- MSnet: A BERT-based Network for Gendered Pronoun Resolution (ACL2019 WS)
- Fill the GAP: Exploiting BERT for Pronoun Resolution (ACL2019 WS)
- On GAP Coreference Resolution Shared Task: Insights from the 3rd Place Solution (ACL2019 WS)
- Look Again at the Syntax: Relational Graph Convolutional Network for Gendered Ambiguous Pronoun Resolution (ACL2019 WS)
- BERT Masked Language Modeling for Co-reference Resolution (ACL2019 WS)
- Coreference Resolution with Entity Equalization (ACL2019)
- BERT for Coreference Resolution: Baselines and Analysis (EMNLP2019) [github]
- WikiCREM: A Large Unsupervised Corpus for Coreference Resolution (EMNLP2019)
- Ellipsis and Coreference Resolution as Question Answering
- Coreference Resolution as Query-based Span Prediction
- Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (NAACL2019)
- BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis (NAACL2019)
- Exploiting BERT for End-to-End Aspect-based Sentiment Analysis (EMNLP2019 WS)
- Adapt or Get Left Behind: Domain Adaptation through BERT Language Model Finetuning for Aspect-Target Sentiment Classification
- An Investigation of Transfer Learning-Based Sentiment Analysis in Japanese (ACL2019)
- "Mask and Infill" : Applying Masked Language Model to Sentiment Transfer
- Adversarial Training for Aspect-Based Sentiment Analysis with BERT
- Utilizing BERT Intermediate Layers for Aspect Based Sentiment Analysis and Natural Language Inference
- Matching the Blanks: Distributional Similarity for Relation Learning (ACL2019)
- BERT-Based Multi-Head Selection for Joint Entity-Relation Extraction (NLPCC2019)
- Enriching Pre-trained Language Model with Entity Information for Relation Classification
- Span-based Joint Entity and Relation Extraction with Transformer Pre-training
- Fine-tune Bert for DocRED with Two-step Process
- Entity, Relation, and Event Extraction with Contextualized Span Representations (EMNLP2019)
- Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text
- KG-BERT: BERT for Knowledge Graph Completion
- Language Models as Knowledge Bases? (EMNLP2019) [github]
- BERT is Not a Knowledge Base (Yet): Factual Knowledge vs. Name-Based Reasoning in Unsupervised QA
- Inducing Relational Knowledge from BERT (AAAI2020)
- Latent Relation Language Models (AAAI2020)
- Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model (ICLR2020)
- Zero-shot Entity Linking with Dense Entity Retrieval
- Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking (CoNLL2019)
- Improving Entity Linking by Modeling Latent Entity Type Information (AAAI2020)
- How Can We Know What Language Models Know?
- REALM: Retrieval-Augmented Language Model Pre-Training
- How to Fine-Tune BERT for Text Classification?
- X-BERT: eXtreme Multi-label Text Classification with BERT
- DocBERT: BERT for Document Classification
- Enriching BERT with Knowledge Graph Embeddings for Document Classification
- Classification and Clustering of Arguments with Contextualized Word Embeddings (ACL2019)
- BERT for Evidence Retrieval and Claim Verification
- Conditional BERT Contextual Augmentation
- Stacked DeBERT: All Attention in Incomplete Data for Text Classification
- Exploring Unsupervised Pretraining and Sentence Structure Modelling for Winograd Schema Challenge
- A Surprisingly Robust Trick for the Winograd Schema Challenge
- WinoGrande: An Adversarial Winograd Schema Challenge at Scale (AAAI2020)
- Improving Natural Language Inference with a Pretrained Parser
- Adversarial NLI: A New Benchmark for Natural Language Understanding
- Adversarial Analysis of Natural Language Inference Systems (ICSC2020)
- Evaluating BERT for natural language inference: A case study on the CommitmentBank (EMNLP2019)
- CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge (NAACL2019)
- HellaSwag: Can a Machine Really Finish Your Sentence? (ACL2019) [website]
- Story Ending Prediction by Transferable BERT (IJCAI2019)
- Explain Yourself! Leveraging Language Models for Commonsense Reasoning (ACL2019)
- Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
- Informing Unsupervised Pretraining with External Linguistic Knowledge
- Commonsense Knowledge + BERT for Level 2 Reading Comprehension Ability Test
- BIG MOOD: Relating Transformers to Explicit Commonsense Knowledge
- Commonsense Knowledge Mining from Pretrained Models (EMNLP2019)
- Do Massively Pretrained Language Models Make Better Storytellers? (CoNLL2019)
- PIQA: Reasoning about Physical Commonsense in Natural Language (AAAI2020)
- Why Do Masked Neural Language Models Still Need Common Sense Knowledge?
- HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization (ACL2019)
- Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression
- Discourse-Aware Neural Extractive Model for Text Summarization
- Passage Re-ranking with BERT
- Investigating the Successes and Failures of BERT for Passage Re-Ranking
- Understanding the Behaviors of BERT in Ranking
- Document Expansion by Query Prediction
- CEDR: Contextualized Embeddings for Document Ranking (SIGIR2019)
- Deeper Text Understanding for IR with Contextual Neural Language Modeling (SIGIR2019)
- FAQ Retrieval using Query-Question Similarity and BERT-Based Query-Answer Relevance (SIGIR2019)
- Multi-Stage Document Ranking with BERT
- BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model (NAACL2019 WS)
- Pretraining-Based Natural Language Generation for Text Summarization
- Text Summarization with Pretrained Encoders (EMNLP2019) [github (original)] [github (huggingface)]
- Multi-stage Pretraining for Abstractive Summarization
- PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
- MASS: Masked Sequence to Sequence Pre-training for Language Generation (ICML2019) [github], [github]
- Unified Language Model Pre-training for Natural Language Understanding and Generation (NeurIPS2019)
- ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
- Towards Making the Most of BERT in Neural Machine Translation
- Improving Neural Machine Translation with Pre-trained Representation
- On the use of BERT for Neural Machine Translation
- Incorporating BERT into Neural Machine Translation (ICLR2020)
- Recycling a Pre-trained BERT Encoder for Neural Machine Translation
- Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
- Mask-Predict: Parallel Decoding of Conditional Masked Language Models (EMNLP2019)
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
- ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
- Cross-Lingual Natural Language Generation via Pre-Training (AAAI2020) [github]
- Multilingual Denoising Pre-training for Neural Machine Translation
- PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
- Unsupervised Pre-training for Natural Language Generation: A Literature Review
- Multi-Task Deep Neural Networks for Natural Language Understanding (ACL2019)
- The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
- BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning (ICML2019)
- Unifying Question Answering and Text Classification via Span Extraction
- ERNIE: Enhanced Language Representation with Informative Entities (ACL2019)
- ERNIE: Enhanced Representation through Knowledge Integration
- ERNIE 2.0: A Continual Pre-training Framework for Language Understanding (AAAI2020)
- Pre-Training with Whole Word Masking for Chinese BERT
- SpanBERT: Improving Pre-training by Representing and Predicting Spans [github]
- Blank Language Models
- RoBERTa: A Robustly Optimized BERT Pretraining Approach [github]
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (ICLR2020)
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (ICLR2020)
- FreeLB: Enhanced Adversarial Training for Language Understanding (ICLR2020)
- KERMIT: Generative Insertion-Based Modeling for Sequences
- DisSent: Sentence Representation Learning from Explicit Discourse Relations (ACL2019)
- StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding (ICLR2020)
- Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding
- SenseBERT: Driving Some Sense into BERT
- Semantics-aware BERT for Language Understanding (AAAI2020)
- K-BERT: Enabling Language Representation with Knowledge Graph
- Knowledge Enhanced Contextual Word Representations (EMNLP2019)
- KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (EMNLP2019)
- SBERT-WK: A Sentence Embedding Method By Dissecting BERT-based Word Models
- Universal Text Representation from BERT: An Empirical Study
- Symmetric Regularization based BERT for Pair-wise Semantic Reasoning
- Transfer Fine-Tuning: A BERT Case Study (EMNLP2019)
- Improving Pre-Trained Multilingual Models with Vocabulary Expansion (CoNLL2019)
- SesameBERT: Attention for Anywhere
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [github]
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
- Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (ACL2019) [github]
- The Evolved Transformer (ICML2019)
- Reformer: The Efficient Transformer (ICLR2020) [github]
- Transformer on a Diet
- A Structural Probe for Finding Syntax in Word Representations (NAACL2019)
- Linguistic Knowledge and Transferability of Contextual Representations (NAACL2019) [github]
- Probing What Different NLP Tasks Teach Machines about Function Word Comprehension (*SEM2019)
- BERT Rediscovers the Classical NLP Pipeline (ACL2019)
- Probing Neural Network Comprehension of Natural Language Arguments (ACL2019)
- Cracking the Contextual Commonsense Code: Understanding Commonsense Reasoning Aptitude of Deep Contextual Representations (EMNLP2019 WS)
- What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
- Quantity doesn't buy quality syntax with neural language models (EMNLP2019)
- Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction (ICLR2020)
- oLMpics -- On what Language Model Pre-training Captures
- How Much Knowledge Can You Pack Into the Parameters of a Language Model?
- What does BERT learn about the structure of language? (ACL2019)
- Open Sesame: Getting Inside BERT's Linguistic Knowledge (ACL2019 WS)
- Analyzing the Structure of Attention in a Transformer Language Model (ACL2019 WS)
- What Does BERT Look At? An Analysis of BERT's Attention (ACL2019 WS)
- Do Attention Heads in BERT Track Syntactic Dependencies?
- Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains (ACL2019 WS)
- Inducing Syntactic Trees from BERT Representations (ACL2019 WS)
- A Multiscale Visualization of Attention in the Transformer Model (ACL2019 Demo)
- Visualizing and Measuring the Geometry of BERT
- How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings (EMNLP2019)
- Are Sixteen Heads Really Better than One? (NeurIPS2019)
- On the Validity of Self-Attention as Explanation in Transformer Models
- Visualizing and Understanding the Effectiveness of BERT (EMNLP2019)
- Attention Interpretability Across NLP Tasks
- Revealing the Dark Secrets of BERT (EMNLP2019)
- Investigating BERT's Knowledge of Language: Five Analysis Methods with NPIs (EMNLP2019)
- The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives (EMNLP2019)
- A Primer in BERTology: What we know about how BERT works
- Do NLP Models Know Numbers? Probing Numeracy in Embeddings (EMNLP2019)
- How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations (CIKM2019)
- Whatcha lookin' at? DeepLIFTing BERT's Attention in Question Answering
- What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?
- exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models [github]
- Multilingual Constituency Parsing with Self-Attention and Pre-Training (ACL2019)
- Cross-lingual Language Model Pretraining (NeurIPS2019) [github]
- 75 Languages, 1 Model: Parsing Universal Dependencies Universally (EMNLP2019) [github]
- Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations (EMNLP2019 WS)
- Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT (EMNLP2019)
- How multilingual is Multilingual BERT? (ACL2019)
- How Language-Neutral is Multilingual BERT?
- Is Multilingual BERT Fluent in Language Generation?
- BERT is Not an Interlingua and the Bias of Tokenization (EMNLP2019 WS)
- Cross-Lingual Ability of Multilingual BERT: An Empirical Study (ICLR2020)
- Multilingual Alignment of Contextual Word Representations (ICLR2020)
- On the Cross-lingual Transferability of Monolingual Representations
- Unsupervised Cross-lingual Representation Learning at Scale
- Emerging Cross-lingual Structure in Pretrained Language Models
- Can Monolingual Pretrained Models Help Cross-Lingual Classification?
- Fully Unsupervised Crosslingual Semantic Textual Similarity Metric Based on BERT for Identifying Parallel Data (CoNLL2019)
- CamemBERT: a Tasty French Language Model
- FlauBERT: Unsupervised Language Model Pre-training for French
- Multilingual is not enough: BERT for Finnish
- BERTje: A Dutch BERT Model
- RobBERT: a Dutch RoBERTa-based Language Model
- Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language
- BioBERT: a pre-trained biomedical language representation model for biomedical text mining
- Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets (ACL2019 WS)
- BERT-based Ranking for Biomedical Entity Normalization
- PubMedQA: A Dataset for Biomedical Research Question Answering (EMNLP2019)
- Pre-trained Language Model for Biomedical Question Answering
- How to Pre-Train Your Model? Comparison of Different Pre-Training Models for Biomedical Question Answering
- ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
- Publicly Available Clinical BERT Embeddings (NAACL2019 WS)
- Progress Notes Classification and Keyword Extraction using Attention-based Deep Learning Models with BERT
- SciBERT: Pretrained Contextualized Embeddings for Scientific Text [github]
- PatentBERT: Patent Classification with Fine-Tuning a pre-trained BERT Model
- VideoBERT: A Joint Model for Video and Language Representation Learning (ICCV2019)
- ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks (NeurIPS2019)
- VisualBERT: A Simple and Performant Baseline for Vision and Language
- Selfie: Self-supervised Pretraining for Image Embedding
- ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
- Contrastive Bidirectional Transformer for Temporal Representation Learning
- M-BERT: Injecting Multimodal Information in the BERT Structure
- LXMERT: Learning Cross-Modality Encoder Representations from Transformers (EMNLP2019)
- Fusion of Detected Objects in Text for Visual Question Answering (EMNLP2019)
- Unified Vision-Language Pre-Training for Image Captioning and VQA [github]
- Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
- VL-BERT: Pre-training of Generic Visual-Linguistic Representations (ICLR2020)
- Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
- UNITER: Learning UNiversal Image-TExt Representations
- Supervised Multimodal Bitransformers for Classifying Images and Text
- Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
- BERT Can See Out of the Box: On the Cross-modal Transferability of Text Representations
- BERT for Large-scale Video Segment Classification with Test-time Augmentation (ICCV2019WS)
- SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering
- vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
- Effectiveness of self-supervised pre-training for speech recognition
- Understanding Semantics from Speech Through Pre-training
- Towards Transfer Learning for End-to-End Speech Synthesis from Deep Pre-Trained Language Models
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
- Patient Knowledge Distillation for BERT Model Compression (EMNLP2019)
- Small and Practical BERT Models for Sequence Labeling (EMNLP2019)
- Pruning a BERT-based Question Answering Model
- TinyBERT: Distilling BERT for Natural Language Understanding [github]
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (NeurIPS2019 WS) [github]
- PoWER-BERT: Accelerating BERT inference for Classification Tasks
- WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
- Extreme Language Model Compression with Optimal Subwords and Shared Projections
- BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
- Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
- MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
- Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
- Q8BERT: Quantized 8Bit BERT (NeurIPS2019 WS)
- Cloze-driven Pretraining of Self-attention Networks
- Learning and Evaluating General Linguistic Intelligence
- To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks (ACL2019 WS)
- BERTScore: Evaluating Text Generation with BERT (ICLR2020)
- Machine Translation Evaluation with BERT Regressor
- SumQE: a BERT-based Summary Quality Estimation Model (EMNLP2019)
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes (ICLR2020)
- Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models (ICLR2020)
- A Mutual Information Maximization Perspective of Language Representation Learning (ICLR2020)
- Is BERT Really Robust? Natural Language Attack on Text Classification and Entailment (AAAI2020)
- Thieves on Sesame Street! Model Extraction of BERT-based APIs (ICLR2020)
- Graph-Bert: Only Attention is Needed for Learning Graph Representations
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
- Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
- Extending Machine Language Models toward Human-Level Language Understanding
- Glyce: Glyph-vectors for Chinese Character Representations
- Back to the Future -- Sequential Alignment of Text Representations
- Improving Cuneiform Language Identification with BERT (NAACL2019 WS)
- BERT has a Moral Compass: Improvements of ethical and moral values of machines
- SMILES-BERT: Large Scale Unsupervised Pre-Training for Molecular Property Prediction (ACM-BCB2019)
- On the comparability of Pre-trained Language Models
- Transformers: State-of-the-art Natural Language Processing
- Evolution of transfer learning in natural language processing
- arXiv:1810.04805, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Authors: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
- arXiv:1812.06705, Conditional BERT Contextual Augmentation, Authors: Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han, Songlin Hu
- arXiv:1812.03593, SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering, Authors: Chenguang Zhu, Michael Zeng, Xuedong Huang
- arXiv:1901.02860, Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, Authors: Zihang Dai, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le and Ruslan Salakhutdinov
- arXiv:1901.04085, Passage Re-ranking with BERT, Authors: Rodrigo Nogueira, Kyunghyun Cho
- arXiv:1902.02671, BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning, Authors: Asa Cooper Stickland, Iain Murray
- arXiv:1904.02232, BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis, Authors: Hu Xu, Bing Liu, Lei Shu, Philip S. Yu, [code]
- google-research/bert, official TensorFlow code and pre-trained models for BERT,
- codertimo/BERT-pytorch, Google AI 2018 BERT pytorch implementation,
- huggingface/pytorch-pretrained-BERT, A PyTorch implementation of Google AI's BERT model with script to load Google's pre-trained models (a minimal usage sketch appears after this repository list),
- Separius/BERT-keras, Keras implementation of BERT with pre-trained weights,
- soskek/bert-chainer, Chainer implementation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding",
- innodatalabs/tbert, PyTorch port of BERT ML model
- guotong1988/BERT-tensorflow, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- dreamgonfly/BERT-pytorch, PyTorch implementation of BERT in "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
- CyberZHG/keras-bert, Implementation of BERT that could load official pre-trained models for feature extraction and prediction
- MaZhiyuanBUAA/bert-tf1.4.0, bert-tf1.4.0
- dhlee347/pytorchic-bert, Pytorch Implementation of Google BERT,
- kpot/keras-transformer, Keras library for building (Universal) Transformers, facilitating BERT and GPT models,
- miroozyx/BERT_with_keras, A Keras version of Google's BERT model,
- conda-forge/pytorch-pretrained-bert-feedstock, A conda-smithy repository for pytorch-pretrained-bert,
- Rshcaroline/BERT_Pytorch_fastNLP, A PyTorch & fastNLP implementation of Google AI's BERT model.
- nghuyong/ERNIE-Pytorch, ERNIE Pytorch Version,
- dmlc/gluon-nlp, Gluon + MXNet implementation that reproduces BERT pretraining and finetuning on the GLUE benchmark, SQuAD, etc.,
- dbiir/UER-py, UER-py is a toolkit for pre-training on general-domain corpora and fine-tuning on downstream tasks. UER-py maintains model modularity and supports research extensibility. It facilitates the use of different pre-training models (e.g. BERT) and provides interfaces for users to further extend upon.
- thunlp/ERNIE, Source code and dataset for the ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities", which improves BERT with heterogeneous information fusion.
- PaddlePaddle/LARK, LAnguage Representations Kit, a PaddlePaddle implementation of BERT. It also contains an improved version of BERT, ERNIE, for Chinese NLP tasks.
- ymcui/Chinese-BERT-wwm, Pre-Training with Whole Word Masking for Chinese BERT https://arxiv.org/abs/1906.08101,
- zihangdai/xlnet, XLNet: Generalized Autoregressive Pretraining for Language Understanding,
- kimiyoung/transformer-xl, Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, containing the code in both PyTorch and TensorFlow for the paper.
- GaoPeng97/transformer-xl-chinese, transformer-xl for Chinese text generation,
- brightmart/bert_language_understanding, Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN,
- JayYip/bert-multiple-gpu, A multiple GPU support version of BERT,
- HighCWu/keras-bert-tpu, Implementation of BERT that could load official pre-trained models for feature extraction and prediction on TPU,
- Willyoung2017/Bert_Attempt, PyTorch Pretrained Bert,
- Pydataman/bert_examples, some examples of BERT, e.g. run_classifier.py
- guotong1988/BERT-chinese, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- zhongyunuestc/bert_multitask, multi-task BERT
- Microsoft/AzureML-BERT, End-to-end walkthrough for fine-tuning BERT using Azure Machine Learning,
- bigboNed3/bert_serving, export BERT model for serving,
- yoheikikuta/bert-japanese, BERT with SentencePiece for Japanese text.
- whqwill/seq2seq-keyphrase-bert, add BERT to encoder part for https://github.com/memray/seq2seq-keyphrase-pytorch,
- algteam/bert-examples, bert-demo,
- cedrickchee/awesome-bert-nlp, A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
- brightmart/bert_customized, BERT with customized features,
- JayYip/bert-multitask-learning, BERT for Multitask Learning,
- yuanxiaosc/BERT_Paper_Chinese_Translation, Chinese translation of "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" https://yuanxiaosc.github.io/2018/12/…,
- yaserkl/BERTvsULMFIT, Comparing Text Classification results using BERT embedding and ULMFIT embedding,
- cdathuraliya/bert-inference, A helper class for Google BERT (Devlin et al., 2018) to support online prediction and model pipelining.
- gameofdimension/java-bert-predict, turn a BERT pretrain checkpoint into a saved model for a feature-extraction demo in Java
- allenai/scibert, A BERT model for scientific text. https://arxiv.org/abs/1903.10676,
- MeRajat/SolvingAlmostAnythingWithBert, BioBert Pytorch
- kexinhuang12345/clinicalBERT, ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission https://arxiv.org/abs/1904.05342
- EmilyAlsentzer/clinicalBERT, repository for Publicly Available Clinical BERT Embeddings
- zhihu/cuBERT, Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL
- xmxoxo/BERT-train2deploy, BERT model training and deployment,
- sogou/SMRCToolkit, This toolkit was designed for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes,
- benywon/ChineseBert, a Chinese BERT model specifically for question answering,
- matthew-z/R-net, R-net in PyTorch, with BERT and ELMo,
- nyu-dl/dl4marco-bert, Passage Re-ranking with BERT,
- chiayewken/bert-qa, BERT for question answering starting with HotpotQA,
- ankit-ai/BertQA-Attention-on-Steroids, BertQA - Attention on Steroids,
- NoviScl/BERT-RACE, This work is based on the Pytorch implementation of BERT (https://github.com/huggingface/pytorch-pretrained-BERT), adapting the original BERT model to multiple-choice machine comprehension.
- allenai/allennlp-bert-qa-wrapper, This is a simple wrapper on top of pretrained BERT-based QA models from pytorch-pretrained-bert to make AllenNLP model archives, so that you can serve demos from AllenNLP.
- edmondchensj/ChineseQA-with-BERT, EECS 496: Advanced Topics in Deep Learning Final Project: Chinese Question Answering with BERT (Baidu DuReader Dataset)
- graykode/toeicbert, TOEIC (Test of English for International Communication) solving using the pytorch-pretrained-BERT model,
- graykode/KorQuAD-beginner, https://github.com/graykode/KorQuAD-beginner
- krishna-sharma19/SBU-QA, This repository uses pretrained BERT embeddings for transfer learning in the QA domain
- maksna/bert-fine-tuning-for-chinese-multiclass-classification, use the Google pre-trained BERT model for fine-tuning on Chinese multiclass classification
- fooSynaptic/BERT_classifer_trial, BERT trial for Chinese corpus classification
- xieyufei1993/Bert-Pytorch-Chinese-TextClassification, PyTorch BERT fine-tuning for Chinese text classification,
- liyibo/text-classification-demos, Neural models for text classification in TensorFlow, such as cnn, dpcnn, fasttext, bert ...,
- circlePi/BERT_Chinese_Text_Class_By_pytorch, A Pytorch implementation of Chinese text classification based on BERT_Pretrained_Model,
- kaushaltrivedi/bert-toxic-comments-multilabel, Multilabel classification for the Toxic Comments challenge using BERT,
- lonePatient/BERT-chinese-text-classification-pytorch, This repo contains a PyTorch implementation of a pretrained BERT model for text classification.
- Chung-I/Douban-Sentiment-Analysis, Sentiment Analysis on the Douban Movie Short Comments Dataset using BERT.
- lynnna-xu/bert_sa, BERT sentiment analysis, TensorFlow Serving with a RESTful API
- HSLCY/ABSA-BERT-pair, Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (NAACL 2019) https://arxiv.org/abs/1903.09588,
- songyouwei/ABSA-PyTorch, Aspect Based Sentiment Analysis, PyTorch implementations,
- howardhsu/BERT-for-RRC-ABSA, code for the NAACL 2019 paper "BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis",
- brightmart/sentiment_analysis_fine_grain, Multi-label classification with BERT; fine-grained sentiment analysis from AI Challenger,
- kyzhouhzau/BERT-NER, Use Google BERT to do CoNLL-2003 NER,
- king-menin/ner-bert, NER task solution (BERT-Bi-LSTM-CRF) with Google BERT https://github.com/google-research
- macanv/BERT-BiLSMT-CRF-NER, TensorFlow solution of the NER task using a BiLSTM-CRF model with Google BERT fine-tuning,
- FuYanzhe2/Name-Entity-Recognition, Lstm-crf, Lattice-CRF, bert-ner
- mhcao916/NER_Based_on_BERT, this project is based on the Google BERT model and performs Chinese NER
- sberbank-ai/ner-bert, BERT-NER (nert-bert) with Google BERT,
- kyzhouhzau/Bert-BiLSTM-CRF, This model is based on bert-as-service. Model structure: bert-embedding, bilstm, crf,
- Hoiy/berserker, Berserker (BERt chineSE woRd toKenizER), a Chinese tokenizer built on top of Google's BERT model,
- Kyubyong/bert_ner, NER with BERT,
- jiangpinglei/BERT_ChineseWordSegment, A Chinese word segmentation model based on BERT, F1-Score 97%,
- lemonhu/NER-BERT-pytorch, PyTorch solution of the NER task using Google AI's pre-trained BERT model.
- nlpyang/BertSum, Code for the paper "Fine-tune BERT for Extractive Summarization",
- santhoshkolloju/Abstractive-Summarization-With-Transfer-Learning, Abstractive summarization using BERT as the encoder and a Transformer decoder,
- nayeon7lee/bert-summarization, Implementation of "Pretraining-Based Natural Language Generation for Text Summarization", Paper: https://arxiv.org/pdf/1902.09243.pdf
- dmmiller612/lecture-summarizer, Lecture summarizer with BERT
- asyml/texar, Toolkit for Text Generation and Beyond https://texar.io, Texar is a general-purpose text generation toolkit that also implements BERT for classification and for text generation applications by combining it with Texar's other modules.
- voidful/BertGenerate, Fine-tuning BERT for text generation,
- Tiiiger/bert_score, BERT score for language generation,
- sakuranew/BERT-AttributeExtraction, Using BERT for attribute extraction in knowledge graphs, with fine-tuning and feature extraction,
- jkszw2014/bert-kbqa-NLPCC2017, A trial of KBQA based on BERT for NLPCC2016/2017 Task 5, https://blog.csdn.net/ai_1046067944/article/details/86707784,
- yuanxiaosc/Schema-based-Knowledge-Extraction, Code for http://lic2019.ccf.org.cn/kg,
- yuanxiaosc/Entity-Relation-Extraction, Entity and relation extraction based on TensorFlow; schema-based knowledge extraction, SKE 2019 http://lic2019.ccf.org.cn,
- WenRichard/KBQA-BERT, https://zhuanlan.zhihu.com/p/62946533,
- ianycxu/RGCN-with-BERT, Gated Relational Graph Convolutional Networks (RGCN) with BERT for the coreference resolution task
- isabellebouchard/BERT_for_GAP-coreference, BERT finetuning for GAP unbiased pronoun resolution
- jessevig/bertviz, Tool for visualizing BERT's attention,
- GaoQ1/rasa_nlu_gq, turn natural language into structured data,
- yuanxiaosc/BERT-for-Sequence-Labeling-and-Text-Classification, This is the template code to use BERT for sequence labeling and text classification, in order to facilitate BERT for more tasks. Currently, the template code covers CoNLL-2003 named entity recognition, Snips slot filling and intent prediction.
- guillaume-chevalier/ReuBERT, A question-answering chatbot, simply.
- hanxiao/bert-as-service, Mapping a variable-length sentence to a fixed-length vector using a pretrained BERT model (a minimal client sketch appears after this repository list),
- Kyubyong/bert-token-embeddings, BERT pretrained token embeddings,
- xu-song/bert_as_language_model, BERT as a language model, forked from https://github.com/google-research/bert,
- yuanxiaosc/Deep_dynamic_word_representation, TensorFlow code and pre-trained models for deep dynamic word representation (DDWR); it combines the BERT model and ELMo's deep contextual word representation,
- imgarylai/bert-embedding, Token-level embeddings from the BERT model on mxnet and gluonnlp http://bert-embedding.readthedocs.io/,
- charles9n/bert-sklearn, a sklearn wrapper for Google's BERT model,
- NVIDIA/Megatron-LM, Ongoing research training transformer language models at scale, including BERT,
- hankcs/BERT-token-level-embedding, Generate BERT token-level embeddings without pain
- pengming617/bert_textMatching, a BERT-based semantic matching model built on a pre-trained Chinese model; the dataset is the official LCQMC data
- Brokenwind/BertSimilarity, Computing the similarity of two sentences with Google's BERT algorithm
- policeme/chinese_bert_similarity, BERT Chinese sentence similarity
- lonePatient/bert-sentence-similarity-pytorch, This repo contains a PyTorch implementation of a pretrained BERT model for the sentence similarity task.
- nouhadziri/DialogEntailment, The implementation of the paper "Evaluating Coherence in Dialogue Systems using Entailment" https://arxiv.org/abs/1904.03371
- jeongukjae/KR-BERT-SimCSE, https://github.com/jeongukjae/KR-BERT-SimCSE
- graykode/nlp-tutorial, Natural Language Processing Tutorial for Deep Learning Researchers https://www.reddit.com/r/MachineLearn…,
- dragen1860/TensorFlow-2.x-Tutorials, TensorFlow 2.x Tutorials and Examples, including CNN, RNN, GAN, Auto-Encoders, FasterRCNN, GPT, BERT examples, etc.,
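
For reference, here is a minimal sketch of extracting contextual embeddings with the Hugging Face library listed above (pytorch-pretrained-BERT, since renamed to transformers). The model name and the mean-pooling step are illustrative assumptions, not a prescription:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load a pretrained BERT encoder and its tokenizer (checkpoint name is an example).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT maps a sentence to contextual token vectors.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

token_embeddings = outputs.last_hidden_state       # shape (1, seq_len, 768)
sentence_embedding = token_embeddings.mean(dim=1)  # naive mean pooling over tokens
print(token_embeddings.shape, sentence_embedding.shape)
```

And a minimal client-side sketch for hanxiao/bert-as-service, assuming a server has already been started with that project's `bert-serving-start` command:

```python
from bert_serving.client import BertClient

# Connects to a locally running server, e.g. started with:
#   bert-serving-start -model_dir /path/to/bert_checkpoint -num_worker=1
bc = BertClient()
vectors = bc.encode(["First sentence.", "Second sentence."])
print(vectors.shape)  # one fixed-length vector per input sentence
```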
- (A benchmark comparison against the multilingual sentence-transformers models is provided on GitHub.) If you install the ko-sentence-transformers library, the model can be downloaded and used directly from the Hugging Face Hub (a minimal usage sketch follows this list).
- Hugging Face model: https://huggingface.co/jhgan/ko-sbert-multitask
- GitHub repository: https://github.com/jhgan00/ko-sentence-transformers
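
A minimal usage sketch for the ko-sbert-multitask model above, assuming the sentence-transformers package is installed and the checkpoint is pulled from the Hugging Face Hub as described:

```python
from sentence_transformers import SentenceTransformer, util

# Download the Korean SBERT model from the Hugging Face Hub.
model = SentenceTransformer("jhgan/ko-sbert-multitask")

sentences = ["안녕하세요?", "한국어 문장 임베딩을 위한 모델입니다."]
embeddings = model.encode(sentences)               # one vector per sentence
print(embeddings.shape)
print(util.cos_sim(embeddings[0], embeddings[1]))  # cosine similarity of the two sentences
```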
- Paper: https://arxiv.org/pdf/2207.07116v1.pdf GitHub: https://github.com/lightdxy/bootmae
- Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects, in arXiv 2023. [paper] [Website]
- A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection, in arXiv 2023. [paper] [Website]
- Time series data augmentation for deep learning: a survey, in IJCAI 2021. [paper]
- Neural temporal point processes: a review, in IJCAI 2021. [paper]
- Time-series forecasting with deep learning: a survey, in Philosophical Transactions of the Royal Society A 2021. [paper]
- Deep learning for time series forecasting: a survey, in Big Data 2021. [paper]
- Neural forecasting: Introduction and literature overview, in arXiv 2020. [paper]
- Deep learning for anomaly detection in time-series data: review, analysis, and guidelines, in Access 2021. [paper]
- A review on outlier/anomaly detection in time series data, in ACM Computing Surveys 2021. [paper]
- A unifying review of deep and shallow anomaly detection, in Proceedings of the IEEE 2021. [paper]
- Deep learning for time series classification: a review, in Data Mining and Knowledge Discovery 2019. [paper]
- More related time series surveys, tutorials, and papers can be found at this repo.
- Make Transformer Great Again for Time Series Forecasting: Channel Aligned Robust Dual Transformer, in arXiv 2023. [paper]
- A Time Series is Worth 64 Words: Long-term Forecasting with Transformers, in ICLR 2023. [paper] [code]
- Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting, in ICLR 2023. [paper]
- Scaleformer: Iterative Multi-scale Refining Transformers for Time Series Forecasting, in ICLR 2023. [paper]
- Non-stationary Transformers: Rethinking the Stationarity in Time Series Forecasting, in NeurIPS 2022. [paper]
- Learning to Rotate: Quaternion Transformer for Complicated Periodical Time Series Forecasting, in KDD 2022. [paper]
- FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting, in ICML 2022. [paper] [official code]
- TACTiS: Transformer-Attentional Copulas for Time Series, in ICML 2022. [paper]
- Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting, in ICLR 2022. [paper] [official code]
- Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting, in NeurIPS 2021. [paper] [official code]
- Informer: Beyond efficient transformer for long sequence time-series forecasting, in AAAI 2021. [paper] [official code] [dataset]
- Temporal fusion transformers for interpretable multi-horizon time series forecasting, in International Journal of Forecasting 2021. [paper] [code]
- Probabilistic Transformer For Time Series Analysis, in NeurIPS 2021. [paper]
- Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case, in arXiv 2020. [paper]
- Adversarial sparse transformer for time series forecasting, in NeurIPS 2020. [paper] [code]
- Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting, in NeurIPS 2019. [paper] [code]
- SSDNet: State Space Decomposition Neural Network for Time Series Forecasting, in ICDM 2021, [paper]
- From Known to Unknown: Knowledge-guided Transformer for Time-Series Sales Forecasting in Alibaba, in arXiv 2021. [paper]
- TCCT: Tightly-coupled convolutional transformer on time series forecasting, in Neurocomputing 2022. [paper]
- Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting, in IJCAI 2022. [paper]
- AirFormer: Predicting Nationwide Air Quality in China with Transformers, in AAAI 2023. [paper] [official code]
- Earthformer: Exploring Space-Time Transformers for Earth System Forecasting, in NeurIPS 2022. [paper] [official code]
- Bidirectional Spatial-Temporal Adaptive Transformer for Urban Traffic Flow Forecasting, in TNNLS 2022. [paper]
- Spatio-temporal graph transformer networks for pedestrian trajectory prediction, in ECCV 2020. [paper] [official code]
- Spatial-temporal transformer networks for traffic flow forecasting, in arXiv 2020. [paper] [official code]
- Traffic transformer: Capturing the continuity and periodicity of time series for traffic forecasting, in Transactions in GIS 2022. [paper]
- HYPRO: A Hybridly Normalized Probabilistic Model for Long-Horizon Prediction of Event Sequences,in NeurIPS 2022. [paper] [official code]
- Transformer Embeddings of Irregularly Spaced Events and Their Participants, in ICLR 2022. [paper] [official code]
- Self-attentive Hawkes process, in ICML 2020. [paper] [official code]
- Transformer Hawkes process, in ICML 2020. [paper] [official code]
- CAT: Beyond Efficient Transformer for Content-Aware Anomaly Detection in Event Sequences, in KDD 2022. [paper] [official code]
- DCT-GAN: Dilated Convolutional Transformer-based GAN for Time Series Anomaly Detection, in TKDE 2022. [paper]
- Concept Drift Adaptation for Time Series Anomaly Detection via Transformer, in Neural Processing Letters 2022. [paper]
- Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy, in ICLR 2022. [paper] [official code]
- TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data, in VLDB 2022. [paper] [official code]
- Learning graph structures with transformer for multivariate time series anomaly detection in IoT, in IEEE Internet of Things Journal 2021. [paper] [official code]
- Spacecraft Anomaly Detection via Transformer Reconstruction Error, in ICASSE 2019. [paper]
- Unsupervised Anomaly Detection in Multivariate Time Series through Transformer-based Variational Autoencoder, in CCDC 2021. [paper]
- Variational Transformer-based anomaly detection approach for multivariate time series, in Measurement 2022. [paper]
- TrajFormer: Efficient Trajectory Classification with Transformers, in CIKM 2022. [paper]
- TARNet : Task-Aware Reconstruction for Time-Series Transformer, in KDD 2022. [paper] [official code]
- A transformer-based framework for multivariate time series representation learning, in KDD 2021. [paper] [official code]
- Voice2series: Reprogramming acoustic models for time series classification, in ICML 2021. [paper] [official code]
- Gated Transformer Networks for Multivariate Time Series Classification, in arXiv 2021. [paper] [official code]
- Self-attention for raw optical satellite time series classification, in ISPRS Journal of Photogrammetry and Remote Sensing 2020. [paper] [official code]
- Self-supervised pretraining of transformers for satellite image time series classification, in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2020. [paper]
- Self-Supervised Transformer for Sparse and Irregularly Sampled Multivariate Clinical Time-Series, in ACM TKDD 2022. [paper] [official code]
The model is tuned on a BART base with 8.4 million translation pairs. If you have a GPU, you can use it with flash attention 2 in transformers, and there is also a ctranslate2 version, so the model runs fast enough on CPU as well. Our internal strategy is not to fine-tune English models for Korean, but to follow up quickly on English models as a base and wrap them with translation on the input and output sides; among the models I have uploaded for maintenance, the translation models keep getting downloaded steadily. https://huggingface.co/circulus/canvers-ko2en-v2 https://huggingface.co/circulus/canvers-en2ko-v2 https://huggingface.co/circulus/canvers-ko2en-ct2-v2 https://huggingface.co/circulus/canvers-en2ko-ct2-v2
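
A minimal sketch of the "wrap an English model with translation" strategy described above, assuming the circulus/canvers-ko2en-v2 checkpoint behaves as a standard BART-style seq2seq model in transformers (generation settings are illustrative):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "circulus/canvers-ko2en-v2"  # Korean -> English translation checkpoint listed above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "한국어 문장을 영어로 번역합니다."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```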