LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published 10 days ago • 40
Humans and LLMs Diverge on Probabilistic Inferences Paper • 2602.23546 • Published 23 days ago • 13
Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA Paper • 2505.16293 • Published May 22, 2025 • 3
Privileged Information Distillation for Language Models Paper • 2602.04942 • Published Feb 4 • 26
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22, 2025 • 2
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping Paper • 2510.03230 • Published Oct 3, 2025 • 4
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning Paper • 2508.09804 • Published Aug 13, 2025
DRBench: A Realistic Benchmark for Enterprise Deep Research Paper • 2510.00172 • Published Sep 30, 2025 • 1
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
ColMate: Contrastive Late Interaction and Masked Text for Multimodal Document Retrieval Paper • 2511.00903 • Published Nov 2, 2025
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13
WebMMU Collection WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation • 2 items • Updated Sep 16, 2025 • 2
How to Train Your LLM Web Agent: A Statistical Diagnosis Paper • 2507.04103 • Published Jul 5, 2025 • 52