Yafu Li (李雅夫)

I am a researcher at Shanghai / Pujiang AI Lab, working under the supervision of Prof. Yu Cheng. I earned my PhD through a joint training program at Zhejiang University and Westlake University under the guidance of Prof. Yue Zhang. During my internship at Tencent AI Lab, I collaborated closely with Dr. Leyang Cui and Dr. Wei Bi.

I earned my Bachelor's degree at Wuhan University and subsequently pursued a Master's degree at the University of Edinburgh, where I was supervised by Prof. Alex Lascarides. Prior to my PhD, I worked as an NLP researcher at Huawei Noah's Ark Lab, under the mentorship of Dr. Liangyou Li and Prof. Qun Liu.

Email  /  Google Scholar  /  Semantic Scholar  /  Twitter  /  Github

Selected Publications

My research focuses on test-time scaling, trustworthy AI and machine translation. * denotes equal contributions.

Test-Time Scaling
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
Yafu Li, Xuyang Hu, Xiaoye Qu, Linjie Li, Yu Cheng
preprint
Github / Paper

We present Test-Time Preference Optimization (TPO), which aligns LLMs during inference and surpasses strong baselines aligned during training.

From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning
Yafu Li*, Zhilin Wang*, Tingchen Fu, Ganqu Cui, Sen Yang, Yu Cheng
preprint
Github / Paper

We present Aggregation Fine-Tuning (AFT), in which the model learns to aggregate multiple drafts into a single answer.

Trustworthy AI
MAGE: Machine-generated Text Detection in the Wild
Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang
ACL, 2024
Github / Paper

We present a comprehensive benchmark dataset for assessing how well machine-generated text detectors perform in real-world scenarios.

Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text
Yafu Li*, Zhilin Wang*, Leyang Cui, Wei Bi, Shuming Shi, Yue Zhang
ACL Findings, 2024
Github / Paper

We propose a novel task to identify "AI-touched" text spans in a fine-grained manner.

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
preprint
Github / Paper

A survey of hallucination in LLMs.

Machine Translation
Explicit Syntactic Guidance for Neural Text Generation
Yafu Li, Leyang Cui, Jianhao Yan, Yongjing Yin, Wei Bi, Shuming Shi, Yue Zhang
ACL, 2023, Best Paper Nomination (1.6%)
Github / Paper

We propose a syntax-guided generation schema, which generates the sequence guided by a constituency parse tree in a top-down direction.

Multi-Granularity Optimization for Non-Autoregressive Translation
Yafu Li, Leyang Cui, Yongjing Yin, Yue Zhang
EMNLP, 2022
Github / Paper

We propose multi-granularity optimization for non-autoregressive translation, which collects model behaviors on translation segments of various granularities and integrates feedback for backpropagation.

What Have We Achieved on Non-autoregressive Translation?
Yafu Li*, Huajian Zhang*, Jianhao Yan, Yongjing Yin, Yue Zhang
ACL Findings, 2024
Github / Paper

We present a systematic and comprehensive evaluation of non-autoregressive translation (NAT) methods against autoregressive translation (AT).

On Compositional Generalization of Neural Machine Translation
Yafu Li, Yongjing Yin, Yulong Chen, Yue Zhang
ACL, 2021
Github / Paper

Neural machine translation suffers from poor compositionality.

Education

PhD in Computer Science, Zhejiang University and Westlake University (2020.9-2024.9).

Master of Science in Artificial Intelligence, University of Edinburgh (2017.9-2018.11).

Bachelor of Engineering in Electronic Information Engineering, Wuhan University (2013.9-2017.6).

Experience

Research Intern at Tencent AI Lab (2022.10-2024.5).

Algorithm Engineer at Noah's Ark Lab, Huawei (2018.12-2020.6).

Software Engineering Intern at VMware, Beijing (2016.9-2017.5).

Service

Reviewer: ACL, EMNLP, COLING, ACL ARR, IJCAI, TBD, TASLP, JAIR, TACL, TALLIP.

Honors

Outstanding Student Scholarship (Silver medal, Tencent Rhino-Bird Elite Program, 2024).

National Scholarship (Ministry of Education, 2023).

Dean's Medal (Westlake University, 2023).


Website's code is from Jon Barron.