Exploring secure and trustworthy AI, from deepfake detection to robust large language models.
Based in Qingdao, China. 3rd-year Ph.D. student, School of Cyber Science and Technology, Shandong University.
Research Interests:
- Trustworthy Machine Learning: Researching safety, robustness, and privacy across generative models and LLM agents.
- Deepfake Forensics: Building attacks on and defenses for facial forgery detection in practical pipelines.
- Secure LLM Systems: Designing evaluation frameworks that expose risk interactions and support safer deployments.
📖 Education
- 2023.09 - Present, Ph.D. Student, School of Cyber Science and Technology, Shandong University, Qingdao, China.
Advisor: Xiaoyun Wang
- 2020.09 - 2023.06, Master’s Degree, Shandong University (Thesis on the robustness of deepfake detection).
Advisor: Shanqing Guo
🔥 News
- 2025.11: 🎉 Featured by MIT Technology Review China - Media coverage of our latest LLM defense study.
- 2025.10: 📄 Preprint: From Defender to Devil? - Investigating unintended risk interactions introduced by LLM defenses.
- 2025.09: 🎉 ErrorTrace accepted at NeurIPS 2025 (spotlight) - Black-box traceability based on model family error space.
- 2025.09: 🤝 Industry collaboration launched - Joint research project on LLM security testing and risk assessment with Topsec.
- 2025.08: 📄 Preprint: Safe-Control - Safety patch for mitigating unsafe content in text-to-image generation models.
- 2025.08: 🎉 DCMI accepted at CCS 2025 - Differential calibration membership inference against RAG.
- 2025.03: 🎉 Fuzz-Testing Meets LLM-Based Agents accepted at IEEE S&P 2025 - Automated framework for jailbreaking text-to-image generation models.
- 2024.11: 🏆 Outstanding master’s thesis - Recognized for a thesis on the robustness of deepfake detection.
🎖 Honors and Awards
- 2024.11 Outstanding master’s thesis - Recognized for a thesis on the robustness of deepfake detection.
👔 Academic Services
- 2025, Reviewer for IEEE Transactions on Information Forensics and Security (TIFS)
📝 Publications
2025

From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses
Xiangtao Meng, Tianshuo Cong, Li Wang, Wenyu Chen, Zheng Li✉, Shanqing Guo✉, Xiaoyun Wang✉
arXiv · LLM Safety Risk Analysis
- Investigating unintended risk interactions introduced by LLM defenses.

ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space
Chuanchao Zang, Xiangtao Meng, Wenyu Chen, Tianshuo Cong, Yaxing Zha, Qi Dong, Zheng Li, Shanqing Guo
NeurIPS (Spotlight) · Model Provenance
- Black-box traceability based on model family error space.

Safe-Control: A Safety Patch for Mitigating Unsafe Content in Text-to-Image Generation Models
Xiangtao Meng, Yingkai Dong, Ning Yu, Li Wang, Zheng Li✉, Shanqing Guo✉
arXiv · T2I Safety Defense
- Safety patch for mitigating unsafe content in text-to-image generation models.

DCMI: A Differential Calibration Membership Inference Attack Against Retrieval-Augmented Generation
Xinyu, Xiangtao Meng✉, Yingkai Dong, Zheng Li✉, Shanqing Guo✉
CCS · RAG Security
- Differential calibration membership inference against RAG.

Fuzz-Testing Meets LLM-Based Agents: An Automated Framework for Jailbreaking Text-to-Image Generation Models
Yingkai Dong, Xiangtao Meng, Ning Yu, Li Wang, Zheng Li✉, Shanqing Guo✉
IEEE S&P · Adversarial Testing
- Automated framework for jailbreaking text-to-image generation models.
2024

AVA: Inconspicuous Attribute Variation-based Adversarial Attack bypassing DeepFake Detection
Xiangtao Meng, Li Wang, Shanqing Guo✉, Lei Ju, Qingchuan Zhao
IEEE S&P · Deepfake Attack
- Inconspicuous attribute variation-based adversarial attack bypassing deepfake detection.

DEEPFAKER: A Unified Evaluation Platform for Facial Deepfake and Detection Models
Li Wang, Xiangtao Meng, Dan Li, Xuhong Zhang, Shouling Ji, Shanqing Guo✉
ACM TOPS (CCF-B) · Benchmark
- A unified evaluation platform for facial deepfake and detection models.