DM2 Lab

Meng Jiang

I'm an Associate Professor in the Department of Computer Science and Engineering at the University of Notre Dame and a Lucy Family Institute Fellow. My research fields are data mining, machine learning, and natural language processing. My data science research focuses on graph and text data for applications such as material discovery, recommender systems, question answering, education, and mental health. [C.V.]

My recent projects focus on knowledge-augmented NLP, auto-instruct LLMs, self-correct LLMs, personalized LLMs, harm-unlearned LLMs, graph neural networks, graph data augmentation, and graph diffusion models.

I direct the Data Mining towards Decision Making (DM2) Lab, supported by the National Science Foundation (NSF), the National Institutes of Health (NIH), the Office of Naval Research (ONR), and Amazon.

What's New

April 2025: MIT News covered our recent ICLR paper on Llamole, our molecular multimodal large language model.
March 2025: I am happy to announce that I have been appointed Co-Director of the Foundation Models Lab at the Lucy Family Institute for Data and Society (with Prof. Xiangliang Zhang)!
February 2025: Three new benchmarks, Multimodal Unlearning (led by Frank), Instruction-following Hierarchy (led by Zhihan), and MultiChartQA, were accepted to NAACL!
February 2025: MLLM for Molecular Design and Learning Molecular Representations in a Cell (both led by Gang) were accepted to ICLR!
January 2025: Gang Liu received the 2024-2025 IBM PhD Fellowship for his work on Foundation Models. Congratulations!
December 2024: Qingkai Zeng successfully defended his thesis, Improving Scientific Information Extraction with Text Generation. Congratulations, Dr. Zeng!
November 2024: Sequential Recommendation (led by Gang) was accepted to KDD 2025!
November 2024: Motif-aware Graph Pre-training (led by Eric) was accepted to LoG!
September 2024: Graph Diffusion Transformer (Graph DiT) (led by Gang) was accepted to NeurIPS!
September 2024: "Personalized PEFT" and "Collaborative PEFT" (led by Zhaoxuan), "Reflection Augmentation" (led by Zhihan), "Self-correct LLM" (led by Zhenyu), "Reference-free QG Evaluation" (led by Bang), and "Complex Instruction-following" (led by Noah) were accepted to EMNLP!
July 2024: "LLM for Taxonomy Induction" (led by Qingkai) was accepted to CIKM!
May 2024: "Cross-Lingual Instruction Tuning" (led by Zhihan) and "Machine Unlearning for LLM Safety" (led by Frank) were accepted to ACL!

Latest Publications

Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench, NAACL, 2025.
IHEval: Evaluating Language Models on Following the Instruction Hierarchy, NAACL, 2025.
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems, NAACL, 2025.
Benchmarking Language Model Creativity: A Case Study on Code Generation, NAACL, 2025.
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning, ICLR, 2025.
Learning Molecular Representation in a Cell, ICLR, 2025.
Limitations of the LLM-as-a-Judge Approach for Evaluating LLM Outputs in Expert Knowledge Tasks, IUI, 2025.
Learning Attribute as Explicit Relation for Sequential Recommendation, KDD, 2025.
Motif-aware Attribute Masking for Molecular Graph Pre-training, LoG, 2024.
Graph Diffusion Transformer for Multi-Conditional Molecular Generation, NeurIPS, 2024.
Large Language Models Can Self-Correct with Key Condition Verification, EMNLP, 2024.
Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts, EMNLP, 2024.
Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning, EMNLP, 2024.
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning, EMNLP, 2024.
Reference-based Metrics Disprove Themselves in Question Generation, Findings of EMNLP, 2024.
TOWER: Tree Organized Weighting for Evaluating Complex Instructions, Findings of EMNLP, 2024.
Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples, CIKM, 2024.
PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning, ACL, 2024.
Towards Safer Large Language Models through Machine Unlearning, Findings of ACL, 2024.
Instructing Large Language Models to Identify and Ignore Irrelevant Conditions, NAACL, 2024. [project]
OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models, Findings of NAACL, 2024. [project]
Get an A in Math: Progressive Rectification Prompting, AAAI, 2024. [project]

Advised PhD Dissertations

Daheng Wang: Learning Complementarity and Dynamics for Contextual Behavior Modeling (2021)
Tong Zhao: Learning to Augment Data in Graphs (2022)
Wenhao Yu: Knowledge Augmented Methods for NLP and Beyond (2023)
Qingkai Zeng: Improving Scientific Information Extraction with Text Generation (2024)

Talks and Abstracts

Lessons Learned from Enhancing Knowledge and Reasoning for (Large) Language Models (2024)
Effective and Efficient Knowledge-Intensive NLP (2023) [abstract]: covers RACo (EMNLP 2022), GenRead (ICLR 2023), and EDMem (EMNLP 2022).
Data Augmentation for Graph Regression (2023) [abstract]: covers GREA (KDD 2022), SGIR (KDD 2023), and DCT (NeurIPS 2023).
Enhancing Language Generation with Knowledge Graphs (2022) [abstract]: covers FASum (NAACL 2021), MoKGE (ACL 2022), and EDMem (EMNLP 2022).
Novel Methods that Learn to Augment Graph Data (2021) [abstract]: covers GAug (AAAI 2021), Eland (CIKM 2021), CFLP (ICML 2022), and GREA (KDD 2022).
Structured Knowledge is Still Essential to Understand Sciences (2020) [abstract]: covers SciKG (KDD 2019), MIMO (EMNLP 2019), Tablepedia (WWW 2020), TCN (WWW 2021), and GenTaxo (KDD 2021).
Graph Learning for Behavior Modeling (2020): covers TUBE (KDD 2019), M2TUBE (TNNLS 2022), CalendarGNN (KDD 2020), CoEvoGNN (DLG 2020 Best Paper / TKDE 2021), GAL (CIKM 2021), and PamFul (TNNLS 2021), with topics including user profiling, recommendation, and fraud detection.

Last updated on April 10, 2025.