Beijing Academy of Artificial Intelligence

non-profit

https://www.baai.ac.cn/english.html




Papers

ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery


01Matrix

updated a dataset about 23 hours ago

BAAI/OPI-Struc

Updated about 18 hours ago • 353 • 1

wwen1997

authored 2 papers 3 days ago

Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models

Paper • 2604.26951 • Published 22 days ago • 47

Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image

Paper • 2605.14984 • Published 7 days ago • 5

Paranioar

authored 3 papers 7 days ago

DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation

Paper • 2601.22153 • Published Jan 29 • 75

VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?

Paper • 2602.04802 • Published Feb 4 • 2

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 9 days ago • 184

A-PolarBear

updated a model 8 days ago

BAAI/OpenSeek-Mid-v1

Text Generation • 11B • Updated 8 days ago • 37 • 10

Paranioar

submitted a paper to Daily Papers 8 days ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 9 days ago • 184

01Matrix

published a dataset 10 days ago

BAAI/OPI-Struc

Updated about 18 hours ago • 353 • 1

ldwang

updated a model 13 days ago

BAAI/OpenSeek-Mid-v1

Text Generation • 11B • Updated 8 days ago • 37 • 10

A-PolarBear

updated a collection 16 days ago

OpenSeek

Collection

Open-source, community-driven next generation of AI models • 11 items • Updated 16 days ago • 10

A-PolarBear

published a model 16 days ago

BAAI/OpenSeek-Mid-v1

Text Generation • 11B • Updated 8 days ago • 37 • 10

Bitterdhg

authored a paper 17 days ago

UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models

Paper • 2604.18518 • Published Apr 20 • 7

tellarin

authored a paper 17 days ago

ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control

Paper • 2604.27711 • Published 21 days ago • 41

Sri-Vigneshwar-DJ

posted an update 18 days ago

Post

124

[Image: Feather DB LongMemEval Results]

We ran Feather DB v0.8.0 on LongMemEval (ICLR 2025) — 500 questions across real multi-session conversations, up to 115K tokens each.

**Score: 0.693** · GPT-4o full-context baseline: 0.640
Full 500-question run with Gemini-Flash: **$2.40**

Per-axis breakdown:
→ Info-extraction: **0.942**
→ Knowledge-update: **0.714**
→ Multi-session: **0.606**
→ Temporal: **0.477** ← the hard one, Phase 9 addresses this

Architecture: Hybrid BM25+dense · adaptive temporal decay · embedded (no server) · p50 = 0.19ms · MIT

pip install feather-db

Raw results + audit JSONs: Hawky-ai/longmemeval-results
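
To make the "Hybrid BM25+dense · adaptive temporal decay" line in the post above concrete, here is a minimal sketch of that general idea in plain Python. It is not Feather DB's code or API: the function names, the `alpha` blend weight, the fixed `half_life`, and the toy two-dimensional embeddings are all assumptions made for illustration, and the "adaptive" part of the decay is not modeled.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized doc against the tokenized query."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    doc_freq = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def min_max(xs):
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]


def hybrid_rank(query_tokens, query_vec, memories, now, alpha=0.5, half_life=30.0):
    """Blend BM25 and cosine scores, then down-weight older memories.

    `alpha` and `half_life` (in days) are hypothetical knobs for this sketch.
    """
    lexical = min_max(bm25_scores(query_tokens, [m["tokens"] for m in memories]))
    dense = min_max([cosine(query_vec, m["vec"]) for m in memories])
    ranked = []
    for i, m in enumerate(memories):
        decay = 0.5 ** ((now - m["day"]) / half_life)  # exponential half-life decay
        ranked.append((i, (alpha * lexical[i] + (1 - alpha) * dense[i]) * decay))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy data: two near-duplicate facts of different age plus one unrelated memory.
memories = [
    {"tokens": ["user", "moved", "to", "berlin"], "vec": [1.0, 0.0], "day": 100},
    {"tokens": ["user", "moved", "to", "paris"], "vec": [0.9, 0.1], "day": 10},
    {"tokens": ["favorite", "food", "is", "ramen"], "vec": [0.0, 1.0], "day": 100},
]
ranked = hybrid_rank(["where", "user", "moved"], [1.0, 0.0], memories, now=100)
```

In this toy run the recent "berlin" memory outranks the older "paris" one even though both match the query lexically, which is the kind of behavior a knowledge-update or temporal question would need.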

tellarin

submitted a paper to Daily Papers 20 days ago

ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control

Paper • 2604.27711 • Published 21 days ago • 41

Yonghua

posted an update 25 days ago

Post

129

🚀 Run DeepSeek V4 on more AI GPUs with FlagOS

DeepSeek V4 just dropped with huge specs: 1.6T params, 1M context, MIT license.

But there’s a catch: the official weights use FP4+FP8 mixed precision, which mainly targets NVIDIA Blackwell / B200-class GPUs.

So we built DeepSeek-V4-FlagOS.

On Day 0, the FlagOS community completed multi-chip adaptation across 8 AI hardware platforms:

✅ NVIDIA H100/H20 — FP8/BF16
✅ Huawei Ascend — BF16
✅ Hygon DCU — BF16
✅ MetaX GPU — BF16
✅ Moore Threads MTT S5000 — FP8
✅ Kunlunxin XPU — BF16
✅ T-Head/Alibaba Zhenwu — BF16
✅ Iluvatar GPU — BF16

🔧 What makes it work?

1️⃣ FlagGems operator replacement
DeepSeek V4 operators — MoE routing, Attention, RMSNorm and more — are reimplemented with Triton, reducing dependency on CUDA-specific libraries.

New V4 operators include:
Act Quant, hc_split_sinkhorn, FP8 MatMul, Sparse Attention, Hadamard Transform.
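
For readers unfamiliar with two of the operators named above, the following plain-Python sketch shows the reference math for RMSNorm and for a Hadamard transform. It is not FlagGems or Triton code; it only illustrates the kind of slow, obviously correct reference that a ported kernel could be checked against. The function names and the `eps` default are assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: scale x by the reciprocal of its root mean square, then by weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(x)
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


normed = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
transformed = fwht([1, 2, 3, 4])
```

Applying `fwht` twice returns the input scaled by its length, which makes a convenient sanity check when comparing against an accelerated implementation.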

2️⃣ Flexible tensor parallelism
DeepSeek V4 uses o_groups=8, which can limit TP.
We added an independent communication group for o-groups, while allowing the rest of the model to scale to higher TP, enabling deployment on 32GB/64GB cards.
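
The post does not spell out how the independent communication group is laid out. One possible reading, sketched below purely as an assumption, is that the o-group part of the model never shards across more than `o_groups` ranks, so a larger TP world is split into contiguous sub-groups of that size while everything else uses the full TP group. The function name and the contiguous layout are hypothetical.

```python
def o_group_comm_groups(tp_size, o_groups=8):
    """Partition TP ranks into sub-groups no larger than o_groups.

    Hypothetical layout for illustration only; the real FlagOS scheme may differ.
    """
    if tp_size <= o_groups:
        if o_groups % tp_size:
            raise ValueError("tp_size must divide o_groups")
        return [list(range(tp_size))]  # a single group covers every rank
    if tp_size % o_groups:
        raise ValueError("tp_size must be a multiple of o_groups")
    return [list(range(start, start + o_groups))
            for start in range(0, tp_size, o_groups)]


groups_tp16 = o_group_comm_groups(16)
groups_tp4 = o_group_comm_groups(4)
```

Under this assumed layout, TP=16 yields two 8-rank sub-groups for the o-group communication, while TP=4 needs no split at all.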

3️⃣ FP4 → BF16 conversion
For hardware without native FP4, we provide ready-to-use BF16 conversion and pre-converted model releases.
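
As a rough illustration of what an FP4 to BF16 conversion involves, the sketch below decodes 4-bit E2M1 values (the FP4 element layout used by the OCP Microscaling formats), applies a per-block scale, and rounds to BF16 precision. Whether DeepSeek V4 checkpoints use exactly this element format, nibble order, or scaling scheme is an assumption here, and this is not the FlagOS conversion script.

```python
import struct


def decode_fp4_e2m1(nibble):
    """Decode a 4-bit E2M1 value: 1 sign bit, 2 exponent bits, 1 mantissa bit."""
    sign = -1.0 if nibble & 0b1000 else 1.0
    exponent = (nibble >> 1) & 0b11
    mantissa = nibble & 0b1
    if exponent == 0:
        magnitude = 0.5 * mantissa  # subnormal: 0 or 0.5
    else:
        magnitude = (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)
    return sign * magnitude


def round_to_bf16(value):
    """Round a float to BF16 precision (round-to-nearest-even); NaN is not handled."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bf16_bits = (bits >> 16) & 0xFFFF
    return struct.unpack("<f", struct.pack("<I", bf16_bits << 16))[0]


def fp4_block_to_bf16(packed, scale):
    """Expand packed FP4 bytes (low nibble first, an assumption) into BF16 values."""
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            out.append(round_to_bf16(decode_fp4_e2m1(nibble) * scale))
    return out


values = fp4_block_to_bf16(bytes([0x72, 0x9F]), scale=2.0)
```

Because every E2M1 value has at most two significant bits, multiplying by a power-of-two scale gives results that BF16 represents exactly; rounding only matters for other scale values.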

📦 Pre-converted models are available on Hugging Face:
V4-Pro:
FlagRelease/DeepSeek-V4-Pro-nvidia-FlagOS
FlagRelease/DeepSeek-V4-Pro-metax-FlagOS
FlagRelease/DeepSeek-V4-Pro-mthreads-FlagOS
FlagRelease/DeepSeek-V4-Pro-hygon-FlagOS
FlagRelease/DeepSeek-V4-Pro-ascend-FlagOS

V4-Flash:
FlagRelease/DeepSeek-V4-Flash-nvidia-FlagOS
FlagRelease/DeepSeek-V4-Flash-zhenwu-FlagOS
FlagRelease/DeepSeek-V4-Flash-kunlunxin-FlagOS
FlagRelease/DeepSeek-V4-Flash-iluvatar-FlagOS

⚡ Performance on NVIDIA H20, V4-Flash FP8:
FlagGems C++ Wrapper + Triton: 70.7 tok/s
DeepSeek TileLang: 62.99 tok/s

That’s 12.24% faster.
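
The percentage can be reproduced from the two throughput figures quoted above. The helper name is only illustrative.

```python
def relative_speedup_pct(new_tps, baseline_tps):
    """Percentage by which new_tps exceeds baseline_tps."""
    return (new_tps / baseline_tps - 1.0) * 100.0


speedup = relative_speedup_pct(70.7, 62.99)
print(f"{speedup:.2f}%")  # prints "12.24%"
```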

👉 Try it here:
https://github.com/flagos-ai/DeepSeek-V4-FlagOS

Open models should run on open infrastructure.