VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding • Paper • 2501.13106 • Published Jan 22, 2025 • 90
DAMO-NLP-SG/VideoLLaMA3-7B-Image • Visual Question Answering • 8B • Updated Mar 20, 2025 • 534 • 10
DAMO-NLP-SG/VideoLLaMA3-2B-Image • Visual Question Answering • 2B • Updated Mar 20, 2025 • 62 • 8
DAMO-NLP-SG/VL3-SigLIP-NaViT • Image Feature Extraction • 0.4B • Updated Mar 20, 2025 • 16.4k • 10