
GPT-J-6B
Text · 6 B parameters · Apache-2.0
Open-source workhorse that still fits on a single mid-range card.
Spec sheet. 28 decoder layers, 4,096-dim hidden state, 16 attention heads, rotary position embeddings (RoPE), and a 2,048-token context window; the loading sketch below reads these straight from the model config.
Trained on The Pile. 400 B mixed-domain tokens give it solid general knowledge and code skills.
Reasonable hardware. The FP16 weights alone take roughly 12 GB of VRAM, and 4-bit quantization shrinks them to roughly 3–4 GB, so a 12 GB gaming GPU (quantized) or a cheap cloud A10G (full FP16) is enough; see the 4-bit loading sketch at the end of this section.
Outperforms its peers. Matches or beats OPT and GPT-Neo models of comparable or smaller size on HellaSwag, ARC, MMLU, and basic code tasks; a reproduction sketch sits at the end of this section.
Plug-and-play. First-class support in transformers, vLLM, llama.cpp (GGUF), Ollama, Triton, DeepSpeed, etc.—just from_pretrained("EleutherAI/gpt-j-6b") and go.
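A minimal loading sketch for that plug-and-play path, assuming a recent transformers release with accelerate installed and a GPU with roughly 12 GB of free VRAM; the prompt and sampling settings are illustrative placeholders, not upstream defaults.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "EleutherAI/gpt-j-6b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # FP16 weights so the model fits on one mid-range card
        device_map="auto",          # needs accelerate; places layers on the available GPU(s)
    )

    # Sanity-check the spec-sheet numbers straight from the model config:
    # expect 28 layers, 4096-dim hidden state, 16 heads, 2048-token window, rotary_dim 64.
    cfg = model.config
    print(cfg.n_layer, cfg.n_embd, cfg.n_head, cfg.n_positions, cfg.rotary_dim)

    # Quick generation smoke test (the prompt is just a placeholder).
    inputs = tokenizer("The Pile is", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.8)
    print(tokenizer.decode(out[0], skip_special_tokens=True))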
Why pick it for Norman AI?
The Apache-2.0 license, transparent training recipe, and single-GPU footprint make GPT-J 6B the easy upgrade path when Tiny-tier models aren’t enough but H100s are overkill: perfect for mid-cost inference tiers, quick fine-tunes, or a control baseline in our benchmarking suite.
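For the mid-cost inference tier, a rough sketch of the low-VRAM route using transformers' bitsandbytes 4-bit loading (GGUF via llama.cpp is an alternative path); the exact memory figure depends on the quantization scheme and context length, so treat the printout as indicative rather than a promised number.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "EleutherAI/gpt-j-6b"

    # 4-bit weight storage with FP16 compute; requires the bitsandbytes package and a CUDA GPU.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Indicative weight memory only; KV cache and activations come on top of this.
    print(f"~{model.get_memory_footprint() / 1e9:.1f} GB of weight memory")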
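A hedged sketch of how the control-baseline numbers could be reproduced, assuming EleutherAI's lm-evaluation-harness (0.4 or newer) and its simple_evaluate entry point; the task list and batch size here are illustrative choices, not a fixed suite.

    # Sketch only: pip install lm-eval; task names follow the 0.4.x harness registry.
    from lm_eval import simple_evaluate

    results = simple_evaluate(
        model="hf",  # Hugging Face backend
        model_args="pretrained=EleutherAI/gpt-j-6b,dtype=float16",
        tasks=["hellaswag", "arc_challenge"],  # subset of the benchmarks cited above
        batch_size=8,
        device="cuda:0",
    )

    for task, metrics in results["results"].items():
        print(task, metrics)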
