In a Training Loop 🔄

Yasunori Ozaki PRO

alfredplpl

https://alfredplpl.github.io/en/index.html

AI & ML interests

Computer Vision, LLM

Recent Activity

liked a Space about 3 hours ago

Vchitect/VBench_Leaderboard

published a model about 8 hours ago

alfredplpl/z-image-modern-anime-lora

updated a model about 9 hours ago

alfredplpl/z-image-modern-anime-lora

upvoted a paper 6 days ago

SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model

Paper • 2602.21818 • Published 8 days ago • 52

upvoted a collection 9 days ago

Qwen3.5

Qwen3.5 is Qwen's new model family including Qwen3.5 Small: 0.8B, 2B, 4B, 9B and Qwen3.5 Medium: 35B-A3B, 27B, 122B-A10B and 397B-A17B. • 25 items • Updated 3 days ago • 84

upvoted 2 collections 13 days ago

GPT-OSS-Swallow-v0.1

4 items • Updated 13 days ago • 13

Qwen3-Swallow-v0.2

12 items • Updated 10 days ago • 8

upvoted an article 15 days ago

Article

NVIDIA Nemotron 2 Nano 9B Japanese: A State-of-the-Art Small Language Model Supporting Japan's Sovereign AI

15 days ago • 23

upvoted a collection 16 days ago

BitDance

BitDance: Open-source autoregressive model with binary visual tokens. A research project for building powerful multimodal autoregressive models. • 10 items • Updated 3 days ago • 11

upvoted a collection 17 days ago

Qwen3.5

21 items • Updated 2 days ago • 939

upvoted a collection 23 days ago

NEST-Ja

Japanese speech self-supervised learning model developed by SB Intuitions. • 2 items • Updated 23 days ago • 1

upvoted a collection about 2 months ago

ArrowIdeative-series

Models post-trained using only GRPO. • 5 items • Updated Jan 11 • 1

upvoted 2 papers about 2 months ago

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Paper • 2601.03872 • Published Jan 7 • 43

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 154

upvoted 2 papers 2 months ago

Yume-1.5: A Text-Controlled Interactive World Generation Model

Paper • 2512.22096 • Published Dec 26, 2025 • 60

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Paper • 2512.15745 • Published Dec 10, 2025 • 87

upvoted 2 papers 3 months ago

DeContext as Defense: Safe Image Editing in Diffusion Transformers

Paper • 2512.16625 • Published Dec 18, 2025 • 25

IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning

Paper • 2512.15635 • Published Dec 17, 2025 • 20

upvoted a collection 3 months ago

Qwen-Image

14 items • Updated Dec 31, 2025 • 80

upvoted 2 papers 3 months ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 173

Self-Improving VLM Judges Without Human Annotations

Paper • 2512.05145 • Published Dec 2, 2025 • 20

upvoted a collection 3 months ago

Z-Image

7 items • Updated Jan 28 • 142

upvoted a changelog 3 months ago

Hugging Face Changelog

Duplicate Datasets

Dec 3, 2025 • 104