Nemotron-Cascade-2-30B-A3B

High-performance reasoning and coding model optimized for efficient agent workflows.

Efficient MoE Design. A 30B parameter Mixture-of-Experts model that activates ~3B parameters per token. Delivers strong reasoning performance without the cost of full dense models.
Reasoning First. Trained with cascade RL and distillation to excel at math, logic, and code. Achieves top-tier results on benchmarks like IMO, AIME, and IOI.
Dual Mode Operation. Supports a configurable Thinking mode (with <think> reasoning traces) and a standard Instruct mode for faster responses when reasoning isn’t needed.
Built for Coding & Agents. Strong performance on competitive programming and software tasks. Works well in tool-based and agent loops (optimized for OpenHands-style setups).
Long Context Ready. Supports up to ~262k tokens, enabling multi-turn conversations and large context workflows without heavy degradation.
Simple Integration. Uses ChatML format, runs cleanly on vLLM, and supports tool calling without complex role handling.
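To make the ChatML point concrete, here is a minimal sketch of how a list of role/content messages maps onto the generic ChatML layout. The function name `render_chatml` and its `add_generation_prompt` parameter are illustrative assumptions, not part of any Norman or NVIDIA API, and the model's real chat template may add details (for example, a switch for Thinking mode) that this sketch leaves out.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render [{"role": ..., "content": ...}, ...] as a generic ChatML string.

    Hypothetical helper for illustration only; the model's own chat
    template is the source of truth for the exact prompt layout.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model it is its turn to respond.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Solve 2x + 3 = 7."},
    ]
    print(render_chatml(demo))
```

In practice a serving stack such as vLLM applies the chat template for you, so a helper like this is mainly useful for inspecting or debugging what the model actually sees.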

Why pick it for Norman AI?

Nemotron-Cascade-2 is a strong reasoning-first model for startups that need real problem-solving ability without running 70B+ models. It’s a good fit for coding agents, technical assistants, and workflows where the model needs to reason through a problem rather than just autocomplete.

# Conversation history in role/content format; the model answers the last user turn.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant",
     "content": "Sure! Here are some ways to eat bananas and dragonfruits together"},
    {"role": "user", "content": "What about solving the equation 2x + 3 = 7?"},
]

# `norman` is assumed to be an already-initialized Norman client.
# `await` requires this call to run inside an async function.
response = await norman.invoke(
    {
        "model_name": "nemotron-cascade-2-30b-a3b",
        "inputs": [
            {
                "display_title": "Prompt",
                "data": messages
            }
        ]
    }
)
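
In Thinking mode the output includes a `<think>` reasoning trace, and it is often useful to separate that trace from the final answer before showing it to a user. The sketch below assumes you have already extracted the generated text from the response as a plain string; how the response exposes that text is not shown here, and `split_reasoning` is a hypothetical helper name.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from generated text.

    Hypothetical helper: assumes `text` is the raw model output as a str.
    If no <think> block is present (e.g. Instruct mode), reasoning is "".
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2x = 7 - 3 = 4, so x = 2.</think>\nx = 2"
    reasoning, answer = split_reasoning(raw)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

Keeping the trace separate lets you log it for debugging while returning only the answer to end users.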
