Hi, human / crawler!
I am Yuning Mao, a research scientist at Meta GenAI.
I received my Ph.D. degree in Computer Science from the University of Illinois at Urbana–Champaign (UIUC), where I was a member of Prof. Jiawei Han's group.
I received my Bachelor's degree from the IEEE Class, Shanghai Jiao Tong University (SJTU).
My research goal is to help humans acquire information and knowledge more effectively and efficiently. I have been working on a range of topics toward this goal, such as text summarization and generation, question answering, parameter-efficient fine-tuning, and taxonomy construction. Most recently, I have been developing Meta's large language models (Llama 2 & 3, the Meta AI assistant, etc.), with a focus on model safety.
[2024-04] We released Llama 3 along with Llama Guard 2 and Meta AI assistant powered by Llama 3.
[2023-12] We released Llama Guard, an LLM input–output safeguard, under the Purple Llama initiative.
[2023-09] We released the Meta AI assistant on Instagram, WhatsApp, and Messenger, which is also coming to Ray-Ban Meta smart glasses and Quest 3.
[2023-07] We released Llama 2, a collection of pretrained and fine-tuned LLMs [paper].
[2023-05] We released LIMA, a LLaMA-based model fine-tuned on only 1,000 curated prompts + responses, which produces shockingly good performance [paper] [twitter] [LinkedIn].
-------------------------------- What's not-so-new --------------------------------
[2022-05] We released a new summarization dataset based on citation texts, CiteSum (featured in the Papers with Code newsletter). By pre-training on CiteSum, we achieve state-of-the-art zero-shot/few-shot performance on various downstream tasks across different domains [paper].
[2021-01] Check out how to get free gains for open-domain QA by reranking retrieved passages without any training [paper].
[2020-10] Check out how to use constrained generation to preserve the factual consistency of abstractive summarization without worsening ROUGE [paper].
[2020-09] Check out how we achieve SOTA on Natural Questions and TriviaQA using "BM25" [paper].
[2020-09] Two papers on multi-document summarization and knowledge graph reasoning are accepted to EMNLP 2020.
[2020-05] Two papers on self-supervised taxonomy enrichment and knowledge collection for product knowledge graphs are accepted to KDD 2020.
[2020-04] Our new metric for summarization, FAR (facet-aware evaluation) along with a detailed analysis of CNN/Daily Mail, has been accepted to ACL 2020.
[2020-04] #WWW2020 We released the first dataset on multi-document headline generation, NewSHead, along with the NHNet model in the official TensorFlow repo.
[2018-08-25] After being stuck in Australia for 42 days due to a US visa issue, I finally got back to the States... Check out my adventure in Oz (travelogue)!
[2018-04-20] Birthday Gift: Paper on Taxonomy Induction is accepted to ACL 2018.
[2017-12-10] Performed at the annual carol concert in Foellinger Great Hall.
[2017-11-16] We had a concert at McKinley Presbyterian Church!
I enjoy coding to better my (and others') lives. I love both engineering work (if it's fun) and research-oriented projects. My interests lie in NLP and GenAI. I write 🐍 cuz life is short.
To my Github
My love for music ♪ is equal to my love for code. I enjoy 🎤 privately (and in public). I used to play 🎻 but now play 🎹 and 🎸 more often. I also compose (very basic) music at times. (I wish I had more time for music.)
I love all kinds of creatures of all sizes. Since I was a child, I have had 🐕, 🐈, 🐓, 🦆, 🕊, 🐌, 🐛, 🕷, 🐟, 🐢, 🌸, 🌱(🍆🌶🥒...🌽🍅🍠) ... I like insects, not "bugs". I am a "professional" 🐜 keeper!
To my Ants