Ironwood is Google admitting inference is the main event now
Google's seventh-generation TPU is purpose-built for inference and scales to 9,216 chips per pod. The chip story is really a cost-and-capacity story for thinking models, agents, and the workloads that never stop running.