DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models Paper • 2406.11633 • Published Jun 17, 2024 • 1
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing Paper • 2503.04629 • Published Mar 6, 2025 • 19
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving Paper • 2504.15780 • Published Apr 22, 2025 • 6
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22, 2025 • 121
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery Paper • 2602.08990 • Published 1 day ago • 19
InternAgent Collection InternAgent aims to facilitate AI+X innovative research across diverse scientific domains • 14 items • Updated about 20 hours ago • 5
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
EarthSE: A Benchmark for Evaluating Earth Scientific Exploration Capability of LLMs Paper • 2505.17139 • Published May 22, 2025 • 2
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights Paper • 2506.16406 • Published Jun 19, 2025 • 130