Green Fern

GPT-J-6B

Modality: Text

GPT-J 6B (6B params, Apache-2.0)

Open-source workhorse that still fits on a single mid-range card.

  • Spec sheet. 28 decoder layers, a 4,096-dim hidden size, 16 attention heads, rotary position embeddings (RoPE), and a 2,048-token context window (see the loading sketch after this list).

  • Trained on The Pile. 400B mixed-domain tokens give it solid general knowledge and code skills.

  • Reasonable hardware. Needs ≈ 10.9 GB of VRAM in FP16; an int-4 quant drops that to ≈ 2.7 GB, so a 12 GB gaming GPU or a cheap cloud A10G works (see the quantized-loading sketch further below).

  • Outperforms its peers. Beats OPT/GPT-Neo models of similar size on HellaSwag, ARC, MMLU and basic code tasks.

  • Plug-and-play. First-class support in transformers, vLLM, llama.cpp (GGUF), Ollama, Triton, DeepSpeed, and more: just from_pretrained("EleutherAI/gpt-j-6b") and go, as in the sketch that follows.
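
A minimal loading sketch with transformers, assuming a CUDA GPU with enough memory for the FP16 weights and the accelerate package installed for device placement; the config fields confirm the spec-sheet numbers above.

# Sketch: load GPT-J 6B in FP16 with transformers and verify the spec sheet.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 weights
    device_map="auto",          # requires accelerate; places the model on the available GPU(s)
)

cfg = model.config
print(cfg.n_layer, cfg.n_embd, cfg.n_head, cfg.n_positions)
# -> 28 4096 16 2048 (GPT-J also uses rotary position embeddings)

prompt = "The Pile is a dataset of"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))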

Why pick it for Norman AI?

Apache license, transparent training recipe, and sub-11 GB footprints make GPT-J 6B the easy upgrade path when Tiny-tier models aren’t enough but H100s are overkill—perfect for mid-cost inference tiers, quick fine-tunes, or as a control baseline in our benchmarking suite.
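
The int-4 figure quoted above depends on the quantization scheme; one way to approximate it is 4-bit (NF4) loading via bitsandbytes, sketched below under the assumption that bitsandbytes and accelerate are installed. The reported number covers weights only, not activations or KV cache.

# Sketch: load GPT-J 6B with 4-bit quantization to fit a small GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "EleutherAI/gpt-j-6b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # store weights as 4-bit NF4
    bnb_4bit_compute_dtype=torch.float16,  # compute in FP16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Rough check of the quantized weight footprint.
print(f"{model.get_memory_footprint() / 1e9:.1f} GB")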


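# Multi-turn chat history to send as the prompt payload.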
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant",
     "content": "Sure! Here are some ways to eat bananas and dragonfruits together"},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

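# Send the chat to GPT-J 6B through the Norman client; invoke is awaited, so this
# call must run inside an async function (e.g. under asyncio.run).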
response = await norman.invoke(
    {
        "model_name": "gpt-j-6b",
        "inputs": [
            {
                "display_title": "Prompt",
                "data": messages
            }
        ]
    }
)
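
GPT-J 6B is a base model with no built-in chat template, so chat-style turns like the ones above are typically flattened into a single prompt string before generation. How the norman client renders them is internal to Norman AI; the sketch below is a hypothetical local equivalent that reuses the tokenizer and model from the FP16 loading sketch, with flatten_chat as an illustrative helper rather than a library function.

# Hypothetical helper: flatten role/content turns into one prompt for a base model.
def flatten_chat(messages):
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # cue the model to answer the last user turn
    return "\n".join(lines)

# Assumes `tokenizer` and `model` from the FP16 loading sketch above.
prompt = flatten_chat(messages)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))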