Canvas-to-Image: Compositional Image Generation with Multimodal Controls · 8 authors · Submitted by ydalva
ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction · 11 authors · Submitted by Inevitablevalor
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory · 12 authors · Submitted by syp115
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following · University of Maryland College Park · Submitted by txiong23
What does it mean to understand language? · Massachusetts Institute of Technology · Submitted by tellarin