Article IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST 1 day ago • 7
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks Paper • 2502.05352 • Published Feb 7, 2025 • 2
Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments +3 8 days ago • 24
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 10 items • Updated 5 days ago • 14
Article AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality 30 days ago • 31
From Benchmarks to Business Impact: Deploying IBM Generalist Agent in Enterprise Production Paper • 2510.23856 • Published Oct 27, 2025 • 5