Google Cloud Blog
TPU 8t and TPU 8i technical deep dive | Google Cloud Blog
The 8th generation TPUs are engineered with system-level co-design to accelerate the AI lifecycle. TPU 8t is built for frontier-model training and TPU 8i is built for large-scale inference and reinforcement learning.
Seeing the insane speed of Gemini 3.5 Flash, I was suddenly reminded of the 8th-generation TPUs released a little over a month ago.
A reasonable guess is that its inference runs on TPU 8i.
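There is even reason to guess that the actual inference cost of Gemini 3.5 Flash may be far lower than that of Gemini 3 Flash. If that is really the case, Google could well become the first company of the AI era to make genuinely healthy profits.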
In short, speed is all you need. If they can really keep up this speed, the other players need not even bother competing. What is even left to play for?