Hyoung-Kyu Song

deepkyu

https://linktr.ee/deepkyu

AI & ML interests

Efficient model for image/video generation

Recent Activity

upvoted a paper 9 days ago

PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published 12 days ago • 169

upvoted 3 papers 4 months ago

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published Oct 17, 2025 • 51

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Paper • 2510.05684 • Published Oct 7, 2025 • 143

upvoted a paper 5 months ago

Lynx: Towards High-Fidelity Personalized Video Generation

Paper • 2509.15496 • Published Sep 19, 2025 • 12

upvoted a paper 7 months ago

JAM-Flow: Joint Audio-Motion Synthesis with Flow Matching

Paper • 2506.23552 • Published Jun 30, 2025 • 10

upvoted a paper 8 months ago

Seeing Voices: Generating A-Roll Video from Audio with Mirage

Paper • 2506.08279 • Published Jun 9, 2025 • 27

upvoted 2 papers 11 months ago

Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features

Paper • 2504.00557 • Published Apr 1, 2025 • 15

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published Mar 12, 2025 • 42

upvoted a collection 11 months ago

SANA-Sprint

🏃SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated Sep 13, 2025 • 43

upvoted 9 papers about 1 year ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41

OpenAI o1 System Card

Paper • 2412.16720 • Published Dec 21, 2024 • 37

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 73

FashionComposer: Compositional Fashion Image Generation

Paper • 2412.14168 • Published Dec 18, 2024 • 17

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Paper • 2411.10499 • Published Nov 15, 2024 • 13

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 56

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Paper • 2410.19355 • Published Oct 25, 2024 • 24

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 129

upvoted a paper over 1 year ago

GPT-4o System Card

Paper • 2410.21276 • Published Oct 25, 2024 • 87