SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Paper
• 2407.15841
• Published • 39
Paper
• 2407.14358
• Published • 26
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
Paper
• 2407.13976
• Published • 5
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Paper
• 2407.14329
• Published • 5
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression
Paper
• 2407.12077
• Published • 57
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Paper
• 2407.11793
• Published • 3
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Paper
• 2407.10969
• Published • 23
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
Paper
• 2408.08459
• Published • 45