~Morning~

Never settle


About Me

Hi, human / crawler!

I am Yuning Mao, a research scientist at Meta GenAI. I received my Ph.D. in Computer Science from the University of Illinois at Urbana–Champaign (UIUC), where I was a member of Prof. Jiawei Han's group. I received my Bachelor's degree from the IEEE Class at Shanghai Jiao Tong University (SJTU).

My research goal is to help humans acquire information and knowledge more effectively and efficiently. I have been working on a range of topics toward this goal, such as text summarization and generation, question answering, parameter-efficient fine-tuning, and taxonomy construction. Most recently, I have been developing Meta's large language models (Llama 2 & 3, the Meta AI assistant, etc.), with a focus on model safety.

[Intern hiring] We're looking for self-motivated research interns experienced in LLMs for Summer 2025, with a range of potential topics such as LLM safety (automatic red-teaming, multilinguality, ...), data (data selection, synthetic data, analysis of data impact on LLM performance, ...), and alignment (RLHF, steerability, agents, ...). If interested, please drop me an email with the projects/directions you want to work on.
[FTE hiring] We are hiring research scientists (IC4-IC7) with strong LLM backgrounds for post-training (across production & research, safety & helpfulness). Contact me if you're interested.

What's New

[2024-04] We released Llama 3, along with Llama Guard 2 and the Meta AI assistant powered by Llama 3.

[2023-12] We released Llama Guard, an input-output safeguard for LLMs, under the Purple Llama initiative.

[2023-09] We released the Meta AI assistant in Instagram, WhatsApp, and Messenger; it is also coming to Ray-Ban Meta smart glasses and Quest 3.

[2023-07] We released Llama 2, a collection of pretrained and fine-tuned LLMs [paper].

[2023-05] We released LIMA, a LLaMA-based model fine-tuned on only 1,000 curated prompts + responses, which produces shockingly good performance [paper] [twitter] [linkedIn].

-------------------------------- What's not-so-new --------------------------------

[2022-05] We released a new summarization dataset based on citation texts, CiteSum (featured in the Papers with Code newsletter). By pre-training on CiteSum, we achieve state-of-the-art zero-shot/few-shot performance on various downstream tasks across different domains [paper].

[2021-01] Check out how to get free gains for openQA by reranking retrieved passages without any training [paper].

[2020-10] Check out how to use constrained generation to preserve the factual consistency of abstractive summarization without worsening ROUGE [paper].

[2020-09] Check out how we achieve SOTA on Natural Questions and TriviaQA using "BM25" [paper].

[2020-09] Two papers on multi-document summarization and knowledge graph reasoning are accepted to EMNLP 2020.

[2020-05] Two papers on self-supervised taxonomy enrichment and knowledge collection for product knowledge graphs are accepted to KDD 2020.

[2020-04] Our new metric for summarization, FAR (facet-aware evaluation), along with a detailed analysis of CNN/Daily Mail, has been accepted to ACL 2020.

[2020-04] #WWW2020 We released the first dataset on multi-document headline generation, NewSHead, along with the NHNet model in the official TensorFlow repo.

[2018-08-25] After being stuck in Australia for 42 days due to a US visa issue, I finally got back to the States... Check out my adventure in Oz (travelogue)!

[2018-04-20] Birthday Gift: Paper on Taxonomy Induction is accepted to ACL 2018.

[2017-12-10] Performed at the annual carol concert in Foellinger Great Hall.

[2017-11-16] We had a concert at McKinley Presbyterian Church!

What I Enjoy

Code

I enjoy coding to better my (and others') lives. I love both engineering work (if it's fun) and research-oriented projects. My interests lie in NLP and Gen AI. I write 🐍 cuz life is short.

To my Github
Music

My love for notes ♪ equals my love for code. I enjoy 🎤 privately (and in public). I used to play 🎻 but now play 🎹 and 🎸 more often. I also compose (very basic) music at times. (I wish I had more time for music.)

To my music
Nature

I love all kinds of creatures of various sizes. Since I was a child, I have had 🐕, 🐈, 🐓, 🦆, 🕊, 🐌, 🐛, 🕷, 🐟, 🐢, 🌸, 🌱(🍆🌶🥒...🌽🍅🍠) ... I like insects, not "bugs". I am a "professional" 🐜 keeper!

To my Ants

Selected Publications

Only selected publications from my Ph.D. are listed below.
[Full list@Google Scholar]
  1. Yuning Mao, Ming Zhong, Jiawei Han, "CiteSum: Citation Text-guided Scientific Extreme Summarization and Low-resource Domain Adaptation", EMNLP 2022. [paper] [code]
  2. Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa, "UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning", ACL 2022. [paper] [code]
  3. Yiqing Xie, Jiaming Shen, Sha Li, Yuning Mao, Jiawei Han, "Eider: Evidence-enhanced Document-level Relation Extraction", Findings of ACL 2022. [paper] [code]
  4. Yuning Mao, Wenchang Ma, Deren Lei, Jiawei Han, Xiang Ren, "Extract, Denoise and Enforce: Evaluating and Improving Concept Preservation for Text-to-Text Generation", EMNLP 2021. [paper] [code]
  5. Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen, "Reader-Guided Passage Reranking for Open-Domain Question Answering", Findings of ACL 2021. [paper] [code]
  6. Yuning Mao, Pengcheng He, Xiaodong Liu, Yelong Shen, Jianfeng Gao, Jiawei Han, Weizhu Chen, "Generation-Augmented Retrieval for Open-domain Question Answering", ACL 2021. [paper] [code]
  7. Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han, "Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation", arXiv 2020. [paper] [code]
  8. Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren and Jiawei Han, "Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning", EMNLP 2020. [paper] [code]
  9. Yuning Mao, Tong Zhao, Andrey Kan, Chenwei Zhang, Xin Luna Dong, Christos Faloutsos and Jiawei Han, "Octet: Online Catalog Taxonomy Enrichment with Self-Supervision", KDD 2020. [paper] [video]
  10. Yuning Mao, Liyuan Liu, Qi Zhu, Xiang Ren, Jiawei Han, "Facet-Aware Evaluation for Extractive Summarization", ACL 2020. [paper] [code]
  11. Xiaotao Gu, Yuning Mao, Jiawei Han, Jialu Liu, You Wu, Cong Yu, Daniel Finnie, Hongkun Yu, Jiaqi Zhai, Nicholas Zukoski, "Generating Representative Headlines for News Stories", WWW 2020. [paper] [code]
  12. Yuning Mao, Jingjing Tian, Jiawei Han and Xiang Ren, "Hierarchical Text Classification with Reinforced Label Assignment", EMNLP 2019. [paper] [code]
  13. Yuning Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu and Jiawei Han, "End-to-End Reinforcement Learning for Automatic Taxonomy Induction", ACL 2018. [paper] [code]

Selected Experience

    [Work Experience@LinkedIn]
  • Meta AI
    Research Scientist
    June 2022 -- Present
    Topics: NLP, Gen AI, LLM

  • Facebook AI
    Research Intern, AI Integrity & FAIR
    May 2021 -- Dec 2021
    Hosts: Madian Khabsa, Scott Yih, Hao Ma
    Topic: Parameter-Efficient Fine-Tuning

  • Microsoft AI
    Research Intern, D365 AI & MSR
    May 2020 -- May 2021
    Hosts: Pengcheng He, Xiaodong Liu, Yelong Shen, Weizhu Chen, Jianfeng Gao
    Topic: Open-Domain Question Answering

  • Amazon Science
    Research Intern, Product Graph Team
    May 2019 -- Oct 2019
    Hosts: Tong Zhao, Luna Dong
    Topic: Taxonomy Construction

  • Microsoft Research
    Research Intern, MSR Asia
    Sept 2016 -- Feb 2017
    Hosts: Dawei Zhang, Jun Yan
    Topic: Relation Extraction

    Awards
  1. Yunni & Maxine Pao Memorial Fellowship 2021-2022
  2. KDD'20 Student Travel Award 2020
  3. China National Scholarship (twice, top 2 within major) 2015-2017
  4. Academic Excellence Scholarship of SJTU (rank 1/72 for 3 of 6 semesters) 2014-2017
  5. First Prize in RoboCup China (top 2 team) 2014, 2015