Nemotron 3 Ultra (free) - API Pricing & Benchmarks
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). $0 per million input tokens, $0 per million output tokens. 1,000,000 token context window, maximum output...
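Since it's NVIDIA's own model, build has presumably been given plenty of compute, or else hardly anyone is using it yet: no wait time, 40 t/s.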
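For a rough feel of what 40 t/s means in practice, here is a minimal sketch. It assumes a steady decode rate and ignores time to first token; the function name `generation_seconds` and the sample output lengths are just illustrative, not anything from NVIDIA.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float = 40.0) -> float:
    """Estimate the time to stream `output_tokens` at a steady decode rate.

    Assumption: constant throughput, no queueing, no time-to-first-token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical response lengths, chosen only for illustration.
for n in (500, 2_000, 8_000):
    print(f"{n} tokens -> {generation_seconds(n):.1f} s")
```

So a 2,000-token answer would take a bit under a minute at that rate, which is usable for chat but slow for long agentic runs.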
nemotron-3-ultra-550b-a55b Model by NVIDIA | NVIDIA NIM
Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more