JunhaSong

junha1125

AI & ML interests

None yet

Recent Activity

upvoted a paper 11 days ago

Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

commented on a paper 15 days ago

SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning

upvoted a paper 27 days ago

Grounding World Simulation Models in a Real-World Metropolis

Organizations

None yet

upvoted a paper 11 days ago

Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

Paper • 2603.25750 • Published 24 days ago • 36

commented on a paper 15 days ago

SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning

Paper • 2603.22057 • Published 21 days ago • 45

upvoted a paper 27 days ago

Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published 27 days ago • 153

upvoted a paper 3 months ago

InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

Paper • 2512.17504 • Published Dec 19, 2025 • 99

upvoted 3 papers 4 months ago

Infinite-Homography as Robust Conditioning for Camera-Controlled Video Generation

Paper • 2512.17040 • Published Dec 18, 2025 • 29

Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure

Paper • 2512.14336 • Published Dec 16, 2025 • 32

EgoX: Egocentric Video Generation from a Single Exocentric Video

Paper • 2512.08269 • Published Dec 9, 2025 • 123

liked a Space 5 months ago

Qwen Image Edit Camera Control

🎬

2.17k

Fast 4 step inference with Qwen Image Edit 2509

upvoted a paper 5 months ago

PHUMA: Physically-Grounded Humanoid Locomotion Dataset

Paper • 2510.26236 • Published Oct 30, 2025 • 30

upvoted a paper 6 months ago

ACG: Action Coherence Guidance for Flow-based VLA models

Paper • 2510.22201 • Published Oct 25, 2025 • 37

authored 2 papers 6 months ago

EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization

Paper • 2303.01904 • Published Mar 3, 2023

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published Oct 18, 2025 • 49

upvoted a paper 6 months ago

RL makes MLLMs see better than SFT

Paper • 2510.16333 • Published Oct 18, 2025 • 49

upvoted 2 papers 9 months ago

DesignLab: Designing Slides Through Iterative Detection and Correction

Paper • 2507.17202 • Published Jul 23, 2025 • 51

Token Bottleneck: One Token to Remember Dynamics

Paper • 2507.06543 • Published Jul 9, 2025 • 20

liked a model 9 months ago

AILab-CVC/seed-x-17b-instruct

Updated Sep 21, 2024 • 2 • 1

liked a model 11 months ago

nvidia/DAM-3B

Image-Text-to-Text • Updated May 7, 2025 • 31.9k • 129

upvoted a collection 12 months ago

ProLIP

Collection

Official ProLIP weights, Probabilistic Language-Image Pre-Training (ICLR 2025) • 7 items • Updated Apr 18, 2025 • 10

liked 2 models about 1 year ago

mistralai/Mistral-Small-3.1-24B-Instruct-2503

Updated Dec 22, 2025 • 532k • 1.36k

LGAI-EXAONE/EXAONE-Deep-32B

Text Generation • 32B • Updated Feb 6 • 4.34k • 300
