GuoLiangTang

Tommy930

https://github.com/TommyTang930

AI & ML interests

LLM, NLP, ML

Organizations

None yet

Recent Activity

upvoted a paper about 2 hours ago

OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning

Paper • 2603.24458 • Published about 14 hours ago • 1

upvoted 5 papers about 3 hours ago

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published about 14 hours ago • 9

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published about 14 hours ago • 2

UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Paper • 2603.24533 • Published about 13 hours ago • 1

CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare

Paper • 2603.24157 • Published about 20 hours ago • 3

When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

Paper • 2603.21289 • Published 4 days ago • 5

upvoted a paper about 24 hours ago

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Paper • 2603.23481 • Published 1 day ago • 4

upvoted 12 papers 1 day ago

RealMaster: Lifting Rendered Scenes into Photorealistic Video

Paper • 2603.23462 • Published 1 day ago • 23

SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

Paper • 2603.23386 • Published 1 day ago • 33

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

Paper • 2603.22281 • Published 3 days ago • 12

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

Paper • 2603.22446 • Published 3 days ago • 4

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published 2 days ago • 113

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Paper • 2603.22529 • Published 2 days ago • 4

ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment

Paper • 2603.23376 • Published 1 day ago • 2

SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning

Paper • 2603.23483 • Published 1 day ago • 45

UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

Paper • 2603.23500 • Published 1 day ago • 30

Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought

Paper • 2603.22847 • Published 2 days ago • 20

From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Paper • 2603.22386 • Published 3 days ago • 44

PEARL: Personalized Streaming Video Understanding Model

Paper • 2603.20422 • Published 5 days ago • 36

upvoted a paper 2 days ago

OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Paper • 2603.20278 • Published 8 days ago • 71
