UniT: Unified Multimodal Chain-of-Thought Test-time Scaling Paper • 2602.12279 • Published 12 days ago • 19
Video Instruction Tuning With Synthetic Data Paper • 2410.02713 • Published Oct 3, 2024 • 41 • 3
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 37
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Paper • 2409.12959 • Published Sep 19, 2024 • 38
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners Paper • 2408.16768 • Published Aug 29, 2024 • 28
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Paper • 2407.07895 • Published Jul 10, 2024 • 42
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding Paper • 2406.09411 • Published Jun 13, 2024 • 19
TrustLLM: Trustworthiness in Large Language Models Paper • 2401.05561 • Published Jan 10, 2024 • 69
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models Paper • 2312.02949 • Published Dec 5, 2023 • 14
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Paper • 2311.05437 • Published Nov 9, 2023 • 51
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Paper • 2311.00571 • Published Nov 1, 2023 • 43
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V Paper • 2310.11441 • Published Oct 17, 2023 • 29