NVIDIA Nemotron 2 Nano 9B Japanese: State-of-the-Art Small Language Model Customized for Japanese Sovereign AI

Community Article Published February 17, 2026

NVIDIA Nemotron is advancing sovereign AI by providing not only open models but also datasets, libraries, recipes, and cookbooks, enabling developers to customize the models and adapt them to diverse use cases and languages.

Today, we're releasing NVIDIA Nemotron-Nano-9B-v2-Japanese, which achieves state-of-the-art performance in the sub-10B parameter category on the Nejumi leaderboard 4. This release represents a significant milestone for Japanese enterprise AI development, combining advanced Japanese language understanding with strong agentic capabilities in a deployment-friendly footprint.

This success builds on two key foundations: the proven architecture of Nemotron 2 Nano 9B and the high-quality Japanese synthetic data generation (SDG) methodology established by Nemotron-Personas-Japan.

By customizing our previously released Nemotron 2 Nano model for Japanese, we aim to inspire the community to develop and release such custom leading models for diverse use cases and languages. The Nemotron team is incorporating the learnings from this customization into future Nemotron releases, strengthening reasoning capabilities in the Japanese language.

Why SLMs Matter for Japanese Enterprises

The Japanese enterprise AI landscape has had a critical gap: no small language model (SLM) option has offered both strong Japanese proficiency and solid agentic task performance. This creates deployment challenges, particularly for:

On-premises deployment requirements: Enterprises handling confidential data need models that run within private networks. Sub-10B models drastically lower infrastructure barriers while maintaining production-grade performance.

Customization workflows: Starting from a strong Japanese base model with proven agentic capabilities reduces fine-tuning cycles. You can focus the compute budget on domain adaptation rather than building foundational capabilities from scratch.

Agent development velocity: The model's architecture and training enable rapid prototyping of multi-agent systems and complex workflows without the overhead of larger models.

Building on Proven Foundations

Nemotron 2 Nano: Architecture Excellence

Nemotron-Nano-9B-v2-Japanese is a fine-tuned model variant of NVIDIA Nemotron-Nano-9B-v2, which demonstrated an exceptional performance-to-size ratio across English benchmarks. Leveraging this efficient architecture as a foundation, we performed further customization to enhance its Japanese language capabilities. This architecture provides:

  • Optimized parameter efficiency for strong reasoning capabilities
  • Robust foundation for multilingual adaptation
  • Proven agentic task performance

By adapting this validated architecture to Japanese, we maintain the base model's strengths while achieving native-level Japanese capabilities.

Nemotron-Personas-Japan: The Seed for High-Quality SDG

Our data strategy centers on Nemotron-Personas-Japan, our open-source (CC BY 4.0) dataset of synthetically generated personas grounded in real-world demographic, geographic, and personality-trait distributions in Japan, capturing the diversity and richness of the population. By bootstrapping from these culturally accurate personas, we constructed a training pipeline that is highly diverse, scalable, and robust. The rich variety of personas in the seed data allowed us to expand our synthetic dataset dramatically across a wide range of scenarios and nuances, and this methodology ensured that the scaled data maintained the strict cultural alignment of the original personas while reaching the volume necessary for state-of-the-art training.

Specifically for Nemotron-Nano-9B-v2-Japanese, we utilized these personas as the seed for generating our tool-calling training data. This ensures that the model's agentic capabilities are not just functional, but grounded in culturally appropriate Japanese interactions and real-world use cases.
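To make the idea concrete, here is a minimal sketch of how a persona record could seed a tool-calling SDG prompt. The persona fields, the tool definition, and the prompt wording are all hypothetical illustrations, not the team's actual schema or pipeline.

```python
import json

# Hypothetical persona record in the spirit of Nemotron-Personas-Japan;
# the real dataset's schema may differ.
persona = {
    "occupation": "pharmacist",
    "region": "Fukuoka",
    "age": 42,
    "traits": ["detail-oriented", "cautious"],
}

# An example tool definition the generated dialogue should exercise.
tool = {
    "name": "search_drug_interactions",
    "description": "Look up known interactions between two medications.",
    "parameters": {
        "type": "object",
        "properties": {
            "drug_a": {"type": "string"},
            "drug_b": {"type": "string"},
        },
        "required": ["drug_a", "drug_b"],
    },
}

def build_sdg_prompt(persona: dict, tool: dict) -> str:
    """Compose a prompt asking a teacher model to write a Japanese
    tool-calling dialogue grounded in this persona."""
    return (
        "Write a realistic Japanese conversation in which the user, a "
        f"{persona['age']}-year-old {persona['occupation']} from "
        f"{persona['region']} ({', '.join(persona['traits'])}), asks an "
        "assistant for help, and the assistant answers by calling this "
        "tool with valid JSON arguments:\n"
        + json.dumps(tool, ensure_ascii=False, indent=2)
    )

prompt = build_sdg_prompt(persona, tool)
```

Because each persona contributes a distinct occupation, region, and personality, sweeping this template over millions of personas yields varied, culturally grounded scenarios with little duplication.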

The Nemotron-Personas collection also includes datasets for the USA, India, Singapore, and Brazil, enabling the same methodology to be replicated across regions.

Training Pipeline

The model was trained using a combination of Japanese open-source corpora and NVIDIA’s Nemotron data stack, spanning continual pre-training, synthetic data generation, and post-training alignment.

[Figure: training pipeline diagram]

Continual Pre-training datasets

  • Japanese OSS Corpus: Wikipedia, fineweb-2 Japanese, aozorabunko, sip3-ja-general-web-corpus
  • Nemotron-CC-v2.1
  • Nemotron-Pretraining-Specialized-v1

Supervised Fine-Tuning datasets

  • Tool-calling dataset generated using Nemotron-Personas-Japan as seed data
  • Nemotron-Post-Training-v3

Software used for Nemotron-Nano-9B-v2-Japanese

  • Megatron-LM for continual pre-training and supervised fine-tuning
  • NeMo Curator for accelerated data processing and filtering

To develop Nemotron-Nano-9B-v2-Japanese by enhancing Nemotron 2 Nano’s Japanese agentic AI capabilities, we adopted a two-stage training approach consisting of continual pre-training and supervised fine-tuning.

The continual pre-training stage aimed to maximize the model's Japanese potential and further enhance its proficiency in Japanese. We leveraged assets from LLM-jp, Japan's leading open-source LLM community, to the fullest extent. We also utilized Nemotron Pre-training Datasets to ensure the model's agentic capabilities were preserved.

Our tool-calling dataset used for SFT, seeded with Nemotron-Personas-Japan, proved exceptionally powerful. The performance improvements extended beyond tool calling, enhancing a wide range of capabilities, including Japanese knowledge, QA, and instruction following. Furthermore, the scale and diversity of Nemotron-Personas-Japan's 6 million personas enabled us to scale SDG effectively while covering a broad spectrum of real-world scenarios with minimal duplication. As the Nemotron-Personas collection expands across countries, the same synthetic bootstrapping methodology can be applied to other languages and regulatory environments.

For model training, we inherited the training recipe established with Nemotron 2 Nano. This enabled us to increase throughput without introducing training instability.

This approach ensures strong Japanese language modeling while developing robust tool-calling and reasoning capabilities in Japanese.
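One standard ingredient of such an SFT stage, shown here as a generic sketch rather than the team's exact recipe, is masking prompt tokens out of the loss so that gradients come only from the assistant's response:

```python
IGNORE_INDEX = -100  # label value ignored by cross-entropy loss in most frameworks

def build_labels(prompt_ids, response_ids):
    """Labels for SFT: prompt positions are masked so the loss is
    computed only on the assistant-response tokens."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)

# Toy token ids: 5 prompt tokens, 3 response tokens.
prompt_ids = [101, 2023, 2003, 1037, 3231]
response_ids = [7592, 2088, 102]
labels = build_labels(prompt_ids, response_ids)
# labels -> [-100, -100, -100, -100, -100, 7592, 2088, 102]
```

For tool-calling data, the same masking applies: the model is trained to produce the tool call and final answer, not to reproduce the user turns or tool outputs.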

Benchmark Performance

[Figure: Nejumi leaderboard 4 ranking]

Nemotron-Nano-9B-v2-Japanese ranks first among sub-10B models on the Nejumi leaderboard 4, Japan's most comprehensive LLM evaluation platform. The leaderboard evaluates models across approximately 40 benchmarks spanning:

  • Core language proficiency: Japanese understanding and generation
  • Agentic capabilities: Code, Math, Tool use, etc.
  • Alignment: Instruction following, Bias, Toxicity, Truthfulness, Robustness, etc.

This multi-dimensional evaluation makes Nejumi leaderboard the go-to reference for developers selecting base models for customization or production deployment in Japanese environments.

[Figure: benchmark summary]

Benchmark results confirm that Nemotron-Nano-9B-v2-Japanese successfully adds robust Japanese capabilities to the base Nemotron-Nano-9B-v2. These improvements extend beyond Japanese knowledge and QA to cover a wide range of tasks, including tool calling, coding, and alignment. Notably, it outperforms the similarly sized Qwen3-8B, achieving an exceptional performance-to-size ratio.

Technical Advantages

[Figure: throughput comparison]

  • Inference efficiency: By inheriting the hybrid Transformer-Mamba architecture of Nemotron 2 Nano, the model achieves up to 6x higher throughput than comparable open models while remaining deployable on edge GPUs. The figure above shows results measured in the Nemotron 2 Nano paper.
  • Context handling: Optimized for extended conversations and multi-turn tool interactions
  • Tool-calling reliability: Strong structured output generation for API calls and function execution
  • Fine-tuning efficiency: Parameter count allows full fine-tuning on accessible compute infrastructure
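In practice, tool-calling reliability means the model emits calls that parse as JSON and satisfy the declared argument schema. A minimal validator along those lines might look like this (the tool name and schema are illustrative, not part of the model's API):

```python
import json

def validate_tool_call(raw: str, schema: dict):
    """Parse a model-emitted tool call and check required arguments.
    Returns (ok, parsed_call_or_error_message)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"invalid JSON: {e}"
    args = call.get("arguments", {})
    missing = [k for k in schema.get("required", []) if k not in args]
    if missing:
        return False, f"missing required arguments: {missing}"
    return True, call

schema = {"required": ["city", "date"]}

# A well-formed call passes validation.
ok, call = validate_tool_call(
    '{"name": "get_weather", "arguments": {"city": "大阪", "date": "2025-01-15"}}',
    schema,
)

# A call missing a required argument is rejected.
bad_ok, err = validate_tool_call(
    '{"name": "get_weather", "arguments": {"city": "大阪"}}',
    schema,
)
```

A check like this is useful both at inference time (to retry malformed calls) and during SDG (to filter training examples down to schema-valid ones).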

Deployment Options

Direct Adaptation

Deploy the model as-is for applications requiring strong Japanese understanding and tool use. The pre-trained capabilities support immediate integration into agentic workflows. Due to the identical architecture, the inference engines supported by Nemotron 2 Nano can be seamlessly migrated.
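Since the architecture matches Nemotron 2 Nano, an OpenAI-compatible serving stack can typically be reused unchanged. The sketch below only constructs a chat-completions request payload; the model id, system prompt, and temperature are assumptions to verify against the model card, not confirmed values.

```python
import json

# Assumed Hugging Face model id; confirm against the actual model card.
MODEL_ID = "nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese"

def build_chat_request(user_message: str, tools=None) -> dict:
    """Construct an OpenAI-compatible /v1/chat/completions payload."""
    payload = {
        "model": MODEL_ID,
        "messages": [
            # Illustrative Japanese system prompt: "You are a helpful assistant."
            {"role": "system", "content": "あなたは親切なアシスタントです。"},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.6,
    }
    if tools:
        payload["tools"] = tools  # JSON-schema tool definitions, if any
    return payload

req = build_chat_request("明日の東京の天気を教えてください。")
body = json.dumps(req, ensure_ascii=False)  # ready to POST to the serving endpoint
```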

Domain Customization

Use Nemotron-Nano-9B-v2-Japanese as a base for domain-specific fine-tuning. The model's balanced performance across Nejumi leaderboard's diverse benchmarks provides a robust foundation for specialized applications. For customization, NeMo Framework (NeMo Megatron-Bridge, NeMo AutoModel, and NeMo-RL) is available.

Getting Started

Nemotron-Nano-9B-v2-Japanese is available now for developers building Japanese AI applications. Whether you're implementing customer-facing agents, internal automation tools, or domain-specific assistants, this model provides the performance-to-size ratio needed for production deployment.

The combination of Nemotron 2 Nano’s proven architecture and Nemotron-Personas-Japan as a seedset for a high-quality dataset makes it an efficient starting point for Japanese sovereign AI development.

We invite the community to explore our Nemotron models, datasets, recipes, and libraries, and to customize Nemotron models for even more languages and use cases. We’re excited to see what you will build!

Stay up to date on NVIDIA Nemotron by subscribing to NVIDIA news and following NVIDIA AI on LinkedIn, X, YouTube, and the Nemotron channel on Discord.

Access open Nemotron Models on Hugging Face and a collection of NIM microservices and Developer Examples on build.nvidia.com.
