Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs Paper • 2511.12710 • Published Nov 2025 • 36
Flames: Benchmarking Value Alignment of LLMs in Chinese Paper • 2311.06899 • Published Nov 12, 2023 • 2
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26, 2024 • 37
MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models Paper • 2406.07594 • Published Jun 11, 2024 • 1
ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models Paper • 2406.14952 • Published Jun 21, 2024
Reflection-Bench: probing AI intelligence with reflection Paper • 2410.16270 • Published Oct 21, 2024 • 6
Safety at Scale: A Comprehensive Survey of Large Model Safety Paper • 2502.05206 • Published Feb 2, 2025 • 3
SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law Paper • 2507.18576 • Published Jul 24, 2025 • 8
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports Paper • 2510.02190 • Published Oct 2, 2025 • 18
The Other Mind: How Language Models Exhibit Human Temporal Cognition Paper • 2507.15851 • Published Jul 21, 2025
A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos Paper • 2502.15806 • Published Feb 19, 2025 • 2
Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes? Paper • 2506.14805 • Published Jun 3, 2025 • 3