layout: home

I am a (Now() - 04/2021).ceil().ordinal() year PhD student studying natural language processing (NLP) at the University of Washington. I am fortunate to be advised by Prof. Yejin Choi and Prof. Hanna Hajishirzi. I am also a part-time researcher at the Allen Institute for AI.

My current research topics are inspecting massive text corpora, training data attribution, LM pretraining, and scaling laws. During my PhD, I have worked on commonsense knowledge generation and verification, automated theorem proving, RLHF, and text decoding.

Previously, I received a B.S. in Computer Science from the University of Illinois at Urbana-Champaign, where I worked with Prof. Julia Hockenmaier. I used to work on Facebook's Natural Language Generation (NLG) team.

My name in Chinese characters is 刘嘉程.

Email: liujc [at] cs.washington.edu

[CV] [Google Scholar] [GitHub] [Twitter] [LinkedIn]

Research and other blogs: this website and [Zhihu]

Private pilot and other personal life VLOGs: [Bilibili] [YouTube]

Personal: [Facebook]

News

(2024.07) Infini-gram and PPO-MCTS are accepted to COLM 2024.
(2023.10) PPO-MCTS is featured by 机器之心 on WeChat!
(2023.10) Vera and Crystal are accepted to EMNLP 2023 (main conference).
(2023.09) The Inverse Scaling paper is accepted to TMLR! Check out our contributed dataset, memo-trap, where LLMs demonstrate the strongest inverse scaling trends.
(2023.07) I am awarded the Qualcomm Innovation Fellowship for academic year 2023-2024.
(2023.05) Invited talk at the MLNLP Seminar: Estimating the plausibility of commonsense statements.
(2023.02) Our submission to the Inverse Scaling Challenge, memo-trap, receives one of the 11 Third Prizes!

Publications

Preprints

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text \ Ximing Lu, Melanie Sclar, Skyler Hallinan, Niloofar Mireshghallah, Jiacheng Liu, Seungju Han, Allyson Ettinger, Liwei Jiang, Khyathi Chandu, Nouha Dziri, Yejin Choi \ [Arxiv] [Demo]

Peer-Reviewed Papers

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback \ Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A Smith, Yejin Choi, Hannaneh Hajishirzi \ NeurIPS 2024 \ [Arxiv] [Code] [Models]

Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens \ Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi \ COLM 2024 (Oral Spotlight, 2%) \ [Arxiv] [Project Page] [Demo]

Don't throw away your value model! Making PPO even better via Value-Guided Monte-Carlo Tree Search decoding \ Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz \ COLM 2024 \ [Arxiv] [Code]

Are machines better at complex reasoning? Unveiling human-machine inference gaps in entailment verification \ Soumya Sanyal, Tianyi Xiao, Jiacheng Liu, Wenya Wang, Xiang Ren \ ACL 2024 (Findings) \ [Arxiv] [Model]

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts \ Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao \ ICLR 2024 (Oral); NeurIPS 2023 MATH-AI Workshop \ [Arxiv] [Project Page] [Code] [Dataset] [HF Dataset]

Crystal: Introspective Reasoners Reinforced with Self-Feedback \ Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Yejin Choi, Asli Celikyilmaz \ EMNLP 2023 (Main Conference, Oral) \ [Arxiv] [Code] [Models: large 3b 11b] [Demo]

Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements \ Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi \ EMNLP 2023 (Main Conference, Oral) \ [Arxiv] [Code] [Model] [Demo] [Dataset]

Inverse Scaling: When Bigger Isn't Better \ Ian R McKenzie, ..., Jiacheng Liu, ..., Samuel R Bowman, Ethan Perez \ TMLR (2023.10) \ [Arxiv]

Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs \ Albert Qiaochu Jiang, Sean Welleck, Jin Peng Zhou, Timothee Lacroix, Jiacheng Liu, Wenda Li, Mateja Jamnik, Guillaume Lample, Yuhuai Wu \ ICLR 2023 (Oral, 5%) \ [Arxiv]

Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering \ Jiacheng Liu, Skyler Hallinan, Ximing Lu, Pengfei He, Sean Welleck, Hannaneh Hajishirzi, Yejin Choi \ EMNLP 2022 (Main Conference) \ [Arxiv] [Code/Data] [Models: Policy Value] [Demo]

NaturalProver: Grounded Mathematical Proof Generation with Language Models \ Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi \ NeurIPS 2022 \ [Arxiv] [Code]

NaturalProver: Grounded Natural Language Proof Generation with Language Models \ Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi \ AITP 2022 (Contributed Talk) \ [Talk]

Generated Knowledge Prompting for Commonsense Reasoning \ Jiacheng Liu, Alisa Liu, Ximing Lu, Sean Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi \ ACL 2022 (Main Conference) \ [Arxiv] [Code] [Talk] [Poster]

Towards Grounded Natural Language Proof Generation \ Sean Welleck, Jiacheng Liu, Jesse Michael Han, Yejin Choi \ NeurIPS 2021 MATHAI4ED Workshop (Contributed Talk) \ [Talk] [Poster]

NaturalProofs: Mathematical Theorem Proving in Natural Language \ Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho \ NeurIPS 2021 Datasets and Benchmarks Track (Oral, 1%) \ [Arxiv] [Data/Code/Models] [Project Page] [Talk] [Slides]

NaturalProofs: Mathematics meets Natural Language \ Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho \ AITP 2021 (Contributed Talk) \ [Talk] [Slides]

Phrase Grounding by Soft-Label Chain Conditional Random Field \ Jiacheng Liu, Julia Hockenmaier \ EMNLP-IJCNLP 2019 (Oral) \ [Arxiv] [Code] [Slides]

CrossWeigh: Training Named Entity Tagger from Imperfect Annotations \ Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, Jiawei Han \ EMNLP-IJCNLP 2019 (Oral) \ [Arxiv] [Code] [Slides]
