Central Development
On April 22, Google Cloud announced two new AI accelerators—TPU8t for model training and TPU8i for inference—positioning the hardware to compete more directly with Nvidia in cloud AI infrastructure, according to TechCrunch. Ars Technica detailed the eighth-generation designation and the training/inference split. Even as it expands its TPU lineup, Google Cloud continues to offer Nvidia GPUs in its portfolio, TechCrunch reported. Separately, Thinking Machines Lab signed a multi-billion-dollar deal with Google Cloud, deepening their commercial relationship, per TechCrunch.
Why It Matters
The TPU8 launch underscores Google’s bid to narrow performance and cost gaps with Nvidia-based options for both training and serving AI models. Google says the new chips are faster and cheaper than prior TPUs, with TPU8t aiming to cut training cycles from months to weeks, according to Ars Technica. Keeping Nvidia GPUs in the portfolio while pushing TPUs suggests a broad compute strategy aimed at customer flexibility, TechCrunch noted. The Thinking Machines Lab deal signals that significant AI users are willing to commit larger workloads to Google Cloud, though the specific hardware mix was not disclosed, per TechCrunch.
Perspective
Coverage diverges in emphasis: TechCrunch frames the TPUs as a competitive play against Nvidia while noting Google’s continued Nvidia offerings, whereas Ars Technica highlights generational context (eighth-gen TPUs following 2025’s Ironwood) and Google’s training-time claims. Independent benchmarks and customer case studies will be crucial to validate Google’s performance and cost assertions.
What to Watch
- Third-party benchmarks comparing TPU8t/8i to Nvidia-based instances on training time, throughput, and cost.
- Customer adoption signals: new contracts, migration announcements, or reserved-capacity commitments.
- Availability details and pricing for TPU8-based services across regions.
- Additional large-scale partnerships that clarify hardware choices and workload distribution across TPUs and GPUs.