ByteDance/Sa2VA-4B
Image-Text-to-Text • Updated • 158k • 95
Hugging Face Model Zoo for Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos, by ByteDance Seed CV Research
Note: Technical report for Sa2VA.