Unofficial Mistral Community [deprecated]

community


AI & ML interests

Unofficial org for community uploads of Mistral's open-source models.

Recent Activity


nielsr

submitted a paper to Daily Papers 5 days ago

Scaling Test-Time Compute for Agentic Coding

Paper • 2604.16529 • Published 13 days ago • 10

danielhanchen

posted an update 6 days ago

Post

5170

Qwen3.6-27B is out now! Run it locally on 18GB RAM. 💜

Qwen3.6-27B surpasses Qwen3.5-397B-A17B on all major coding benchmarks.

GGUFs to run: unsloth/Qwen3.6-27B-GGUF
Guide + MLX: https://unsloth.ai/docs/models/qwen3.6

danielhanchen

posted an update 12 days ago

Post

2777

Qwen3.6-35B-A3B can now be run locally! 💜

The model is the strongest mid-sized LLM on nearly all benchmarks.

Run on 23GB RAM via Unsloth Dynamic GGUFs.

GGUFs to run: unsloth/Qwen3.6-35B-A3B-GGUF
Guide: https://unsloth.ai/docs/models/qwen3.6

13 replies

nielsr

submitted a paper to Daily Papers 12 days ago

Geometric Context Transformer for Streaming 3D Reconstruction

Paper • 2604.14141 • Published 14 days ago • 19

nielsr

submitted a paper to Daily Papers 19 days ago

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

Paper • 2604.04913 • Published 23 days ago • 11

danielhanchen

posted an update 21 days ago

Post

5410

You can now fine-tune Gemma 4 for free with our notebooks! 🔥

You just need 8GB VRAM to train Gemma 4 locally!

Unsloth trains Gemma 4 1.5x faster with 50% less VRAM.
GitHub: https://github.com/unslothai/unsloth
Guide + Notebooks: https://unsloth.ai/docs/models/gemma-4/train

5 replies

nielsr

submitted a paper to Daily Papers 25 days ago

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

Paper • 2603.28130 • Published 29 days ago • 11

danielhanchen

posted an update 26 days ago

Post

3773

Google releases Gemma 4. ✨
Gemma 4 introduces 4 models: E2B, E4B, 26B-A4B, 31B.
The multimodal reasoning models are under Apache 2.0.

Run E2B and E4B on ~6GB RAM, and on phones. Run 26B-A4B and 31B on ~18GB.

GGUFs: https://huggingface.co/collections/unsloth/gemma-4
Guide: https://unsloth.ai/docs/models/gemma-4

danielhanchen

posted an update 28 days ago

Post

2740

A new way to use Unsloth.

Coming soon...

MaziyarPanahi

posted an update 28 days ago

Post

1883

Training mRNA Language Models Across 25 Species for $165

We built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. After comparing multiple transformer architectures for codon-level language modeling, CodonRoBERTa-large-v2 emerged as the clear winner with a perplexity of 4.10 and a Spearman CAI correlation of 0.40, significantly outperforming ModernBERT. We then scaled to 25 species, trained 4 production models in 55 GPU-hours, and built a species-conditioned system that no other open-source project offers. Complete results, architectural decisions, and runnable code below.

https://huggingface.co/blog/OpenMed/training-mrna-models-25-species

danielhanchen

posted an update about 1 month ago

Post

922

You don’t need to set LLM parameters anymore! 🚀

llama.cpp uses only the context length and compute your local setup needs. Unsloth also auto-applies the correct model settings.

Try in Unsloth Studio - now with precompiled llama.cpp binaries.

GitHub: https://github.com/unslothai/unsloth

2 replies

MaziyarPanahi

posted an update about 1 month ago

Post

2245

We annotated 119K medical images with two frontier VLMs (Qwen 3.5, Kimi K2.5), cross-validated at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning 3 small models (2-3B params) improved all benchmarks: best model reaches +15.0% average exact match.

Everything is open-sourced: datasets, adapters, and code.

https://huggingface.co/blog/OpenMed/synthvision

2 replies

nielsr

submitted 3 papers to Daily Papers about 1 month ago

Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

Paper • 2603.19209 • Published Mar 19 • 5

V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

Paper • 2603.14482 • Published Mar 15 • 32

Omnilingual MT: Machine Translation for 1,600 Languages

Paper • 2603.16309 • Published Mar 17 • 21

danielhanchen

posted an update about 1 month ago

Post

3401

Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: https://github.com/unslothai/unsloth
Blog and Guide: https://unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.

danielhanchen

posted an update about 2 months ago

Post

3927

We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚 Learn:

• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work

Blog: https://unsloth.ai/blog/rl-environments

4 replies

nielsr

authored a paper about 2 months ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65

MaziyarPanahi

posted an update about 2 months ago

Post

4848

DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. 🧬

In 2024, AlphaFold won the Nobel Prize in Chemistry.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! 🤗

2 replies

merve

updated a Space about 2 months ago

README

👀


Team members 24
