Running • Automated Evaluation For VMCBench • An automated evaluation for the VMCBench test and dev sets
Running • Qworld Evaluation Criteria Generator • Generate detailed evaluation criteria for any question
Democratizing AI scientists using ToolUniverse Paper • 2509.23426 • Published Sep 27, 2025 • 40
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research Paper • 2503.13399 • Published Mar 17, 2025 • 22