Yuning Mao - Home

About Me

Hi, human / crawler!

I am Yuning Mao, a research scientist at Llama Post-training, Meta GenAI. I received my Ph.D. degree in Computer Science from the University of Illinois at Urbana–Champaign (UIUC), where I was a member of Prof. Jiawei Han's Group. I received my Bachelor's degree from the IEEE Class, Shanghai Jiao Tong University (SJTU).

My research goal is to help humans acquire information and knowledge more effectively and efficiently. At Meta, I have been developing large language models (Llama 2, 3, 4🦙). I am one of the founding members of the Meta GenAI org. Most recently, I've been focused on LLMs for code generation, in particular generating interactive websites, apps, and games. Previously, I worked on all things LLM safety: I was in charge of safety eval for Llama 2, then safety modeling during the initial launch of Meta AI, and in 2024 multilingual/i18n safety for Llama 3/Meta AI as well as the development of Meta AI auto-redteaming. During my PhD, I worked on a range of topics such as text summarization and generation, question answering, parameter-efficient fine-tuning, and taxonomy construction.

Misc: I'm broadly interested in anyone who's into Chinese language and culture. If you are motivated to learn Mandarin, I'm happy to help!

What's New

-------------------------------- What's not-so-new --------------------------------

[2024-04] We released Llama 3 along with Llama Guard 2 and Meta AI assistant powered by Llama 3.

[2023-12] We released Llama Guard, an LLM-based input-output safeguard, under the Purple Llama initiative.

[2023-09] We released the Meta AI assistant in Instagram, WhatsApp, and Messenger, which is also coming to Ray-Ban Meta smart glasses and Quest 3.

[2023-07] We released Llama 2, a collection of pretrained and fine-tuned LLMs [paper].

[2023-05] We released LIMA, a LLaMA-based model fine-tuned on only 1,000 curated prompts + responses, which achieves shockingly good performance [paper] [twitter] [LinkedIn].

[2022-05] We released a new summarization dataset based on citation texts, CiteSum (featured in Paperswithcode newsletter). By pre-training on CiteSum, we achieve state-of-the-art zero-shot/few-shot performance in various downstream tasks of different domains [paper].

[2021-01] Check out how to get free gains for openQA by reranking retrieved passages without any training [paper].

[2020-10] Check out how to use constrained generation to preserve the factual consistency of abstractive summarization without worsening ROUGE [paper].

[2020-09] Check out how we achieve SOTA on Natural Questions and TriviaQA using "BM25" [paper].

[2020-09] Two papers on multi-document summarization and knowledge graph reasoning are accepted to EMNLP 2020.

[2020-05] Two papers on self-supervised taxonomy enrichment and knowledge collection of product knowledge graph are accepted to KDD 2020.

[2020-04] Our new metric for summarization, FAR (facet-aware evaluation), along with a detailed analysis of CNN/Daily Mail, has been accepted to ACL 2020.

[2020-04] #WWW2020 We released the first dataset on multi-document headline generation, NewSHead, along with the NHNet model in the official TensorFlow repo.

[2018-08-25] After being stuck in Australia for 42 days due to a US visa issue, I finally got back to the States... Check out my adventure in Oz (travel notes)!

[2018-04-20] Birthday Gift: Paper on Taxonomy Induction is accepted to ACL 2018.

[2017-12-10] Performed at the annual carol concert in Foellinger Great Hall.

[2017-11-16] We had a concert at McKinley Presbyterian Church!

What I Enjoy

Code

I enjoy coding to better my (and others') life. I love both engineering work (if it's fun) and research-oriented projects. My interests lie in NLP and Gen AI. I write 🐍 cuz life is short.

To my Github

Music

My love for notes ♪ equals my love for code. I enjoy 🎤 privately (~~and in public~~). I used to play 🎻 but play 🎹 and 🎸 more often now. I also compose (~~very basic~~) music at times. (I'm spending more time on music now)

To my music

Nature

I love all kinds of creatures in various sizes. Since I was a child, I have had 🐕, 🐈, 🐓, 🦆, 🕊, 🐌, 🐛, 🕷, 🐟, 🐢, 🌸, 🌱(🍆🌶🥒...🌽🍅🍠) ... I like insects, not "bugs". I am a "professional" 🐜 keeper!

To my Ants

Selected Publications

Only selected publications during my PhD are listed below
[Full list@Google Scholar]

Yuning Mao, Ming Zhong, Jiawei Han, "CiteSum: Citation Text-guided Scientific Extreme Summarization and Low-resource Domain Adaptation", EMNLP 2022. [paper] [code]
Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa, "UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning", ACL 2022. [paper] [code]
Yiqing Xie, Jiaming Shen, Sha Li, Yuning Mao, Jiawei Han, "Eider: Evidence-enhanced Document-level Relation Extraction", Findings of ACL 2022. [paper] [code]
Yuning Mao, Wenchang Ma, Deren Lei, Jiawei Han, Xiang Ren, "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation", EMNLP 2021. [paper] [code]
Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen, "Reader-Guided Passage Reranking for Open-Domain Question Answering", Findings of ACL 2021. [paper] [code]
Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen, "Generation-Augmented Retrieval for Open-domain Question Answering", ACL 2021. [paper] [code]
Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han, "Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation", arXiv 2020. [paper] [code]
Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren and Jiawei Han, "Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning", EMNLP 2020. [paper] [code]
Yuning Mao, Tong Zhao, Andrey Kan, Chenwei Zhang, Xin Luna Dong, Christos Faloutsos and Jiawei Han, "Octet: Online Catalog Taxonomy Enrichment with Self-Supervision", KDD 2020. [paper] [video]
Yuning Mao, Liyuan Liu, Qi Zhu, Xiang Ren, Jiawei Han, "Facet-Aware Evaluation for Extractive Summarization", ACL 2020. [paper] [code]
Xiaotao Gu, Yuning Mao, Jiawei Han, Jialu Liu, You Wu, Cong Yu, Daniel Finnie, Hongkun Yu, Jiaqi Zhai, Nicholas Zukoski, "Generating Representative Headlines for News Stories", WWW 2020. [paper] [code]
Yuning Mao, Jingjing Tian, Jiawei Han and Xiang Ren, "Hierarchical Text Classification with Reinforced Label Assignment", EMNLP 2019. [paper] [code]
Yuning Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu and Jiawei Han, "End-to-End Reinforcement Learning for Automatic Taxonomy Induction", ACL 2018. [paper] [code]

Selected Experience

[Work Experience@LinkedIn]

Meta AI
Research Scientist
June 2022 -- Present
Working on some fun stuff for Llama 4🦙 now
Previously:
Led auto-redteaming for various GenAI products & models
Led multilingual/i18n safety of Llama 3 & Meta AI
Led safety modeling during Meta AI initial release
Led safety eval of Llama 2

Facebook AI
Research Intern, AI Integrity & FAIR
May 2021 -- Dec 2021
Hosts: Madian Khabsa, Scott Yih, Hao Ma
Topic: Parameter-Efficient Fine Tuning

Microsoft AI
Research Intern, D365 AI & MSR
May 2020 -- May 2021
Hosts: Pengcheng He, Xiaodong Liu, Yelong Shen, Weizhu Chen, Jianfeng Gao
Topic: Open-Domain Question Answering

Amazon Science
Research Intern, Product Graph Team
May 2019 -- Oct 2019
Hosts: Tong Zhao, Luna Dong
Topic: Taxonomy Construction

Microsoft Research
Research Intern, MSR Asia
Sept 2016 -- Feb 2017
Hosts: Dawei Zhang, Jun Yan
Topic: Relation Extraction

Awards

Yunni & Maxine Pao Memorial Fellowship 2021-2022
KDD'20 Student Travel Award 2020
China National Scholarship (twice, top 2 within major) 2015-2017
Academic Excellence Scholarship of SJTU (rank 1/72 for 3 of 6 semesters) 2014-2017
First Prize in RoboCup China (top 2 team) 2014, 2015
