Exploring secure and trustworthy AI, from deepfake detection to robust large language models.

Based in Qingdao, China. 3rd-year Ph.D. student, School of Cyber Science and Technology, Shandong University.

Research Interests:

  • Trustworthy Machine Learning: Researching safety, robustness, and privacy across generative models and LLM agents.
  • Deepfake Forensics: Building attacks and defenses for facial forgery detection in practical pipelines.
  • Secure LLM Systems: Designing evaluation frameworks that expose risk interactions and support safer deployments.

📖 Education

  • 2023.09 - Present, Ph.D. Student, School of Cyber Science and Technology, Shandong University, Qingdao, China.
    Advisor: Xiaoyun Wang
  • 2020.09 - 2023.06, Master’s Degree, Shandong University (Thesis on robustness research for deepfake detection).
    Advisor: Shanqing Guo

🔥 News

  • 2025.11:  🎉 Featured by MIT Technology Review China - Media coverage of our latest LLM defense study.
  • 2025.10:  📄 Preprint: From Defender to Devil? - Investigating unintended risk interactions introduced by LLM defenses.
  • 2025.09:  🎉 ErrorTrace accepted at NeurIPS 2025 (spotlight) - Black-box traceability based on model family error space.
  • 2025.09:  🤝 Industry collaboration launched - Joint research project on LLM security testing and risk assessment with Topsec.
  • 2025.08:  📄 Preprint: Safe-Control - Safety patch for mitigating unsafe content in text-to-image generation models.
  • 2025.08:  🎉 DCMI accepted at CCS 2025 - Differential calibration membership inference against RAG.
  • 2025.03:  🎉 Fuzz-testing meets LLM-based agents accepted at IEEE S&P 2025 - Automated framework for jailbreaking text-to-image generation models.
  • 2024.11:  🏆 Outstanding master’s thesis - Recognized for thesis on robustness research for deepfake detection.

🎖 Honors and Awards

  • 2024.11 Outstanding master’s thesis - Recognized for thesis on robustness research for deepfake detection.

👔 Academic Services

  • 2025, Reviewer for IEEE Transactions on Information Forensics and Security (TIFS)

📝 Publications

2025

arXiv 2025

From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses

Xiangtao Meng, Tianshuo Cong, Li Wang, Wenyu Chen, Zheng Li✉, Shanqing Guo✉, Xiaoyun Wang✉

arXiv · LLM Safety Risk Analysis

Paper

  • Investigating unintended risk interactions introduced by LLM defenses.
NeurIPS 2025

ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space

Chuanchao Zang, Xiangtao Meng, Wenyu Chen, Tianshuo Cong, Zha Yaxing, Dong Qi, Zheng Li, Shanqing Guo

NeurIPS (Spotlight) · Model Provenance

Link

  • Black-box traceability based on model family error space.
arXiv 2025

Safe-Control: A Safety Patch for Mitigating Unsafe Content in Text-to-Image Generation Models

Xiangtao Meng, Yingkai Dong, Ning Yu, Li Wang, Zheng Li✉, Shanqing Guo✉

arXiv · T2I Safety Defense

Paper

  • Safety patch for mitigating unsafe content in text-to-image generation models.
CCS 2025

DCMI: A Differential Calibration Membership Inference Attack Against Retrieval-Augmented Generation

Xinyu, Xiangtao Meng✉, Yingkai Dong, Zheng Li✉, Shanqing Guo✉

CCS · RAG Security

Paper

  • Differential calibration membership inference against RAG.
IEEE S&P 2025

Fuzz-testing meets LLM-based agents: An automated and efficient framework for jailbreaking text-to-image generation models

Yingkai Dong, Xiangtao Meng, Ning Yu, Li Wang, Zheng Li✉, Shanqing Guo✉

IEEE S&P · Adversarial Testing

Paper

  • Automated framework for jailbreaking text-to-image generation models.

2024

IEEE S&P 2024

AVA: Inconspicuous Attribute Variation-based Adversarial Attack bypassing DeepFake Detection

Xiangtao Meng, Li Wang, Shanqing Guo✉, Lei Ju, Qingchuan Zhao

IEEE S&P · Deepfake Attack

Paper · Code

  • Inconspicuous attribute variation-based adversarial attack bypassing deepfake detection.
ACM TOPS

DEEPFAKER: A Unified Evaluation Platform for Facial Deepfake and Detection Models

Li Wang, Xiangtao Meng, Dan Li, Xuhong Zhang, Shouling Ji, Shanqing Guo✉

ACM TOPS · Benchmark · CCF B

Paper

  • A unified evaluation platform for facial deepfake and detection models.