At its I/O 2024 conference on Tuesday, Google introduced Trillium, the sixth generation of its Cloud TPU (Tensor Processing Unit), a custom AI accelerator that underpins the company's latest generative AI models, such as Gemini 1.5 Flash, Imagen 3, and Gemma 2.0, all of which have been trained and served on TPUs.

Jeff Dean, chief scientist at Google DeepMind and Google Research, stated: "Gemini 1.5 Pro is Google’s largest and most capable AI model and it was trained using tens of thousands of TPU accelerators. Our team is excited about the announcement of the 6th generation of TPUs and we’re looking forward to the increase in performance and efficiency for training and inference at scale of our Gemini models."

Compared with the previous-generation TPU v5e, Trillium delivers a 4.7x increase in peak compute performance per chip, along with double the high-bandwidth memory (HBM) capacity and double the Interchip Interconnect (ICI) bandwidth. It is also equipped with the third-generation SparseCore, a specialised accelerator for processing the ultra-large embeddings found in advanced ranking and recommendation workloads. Google says Trillium can train the next wave of AI models faster, with reduced latency and at lower cost, and calls it the company's most sustainable TPU to date, being over 67 per cent more energy-efficient than its predecessor.

Trillium can scale up to 256 TPUs in a single high-bandwidth, low-latency pod. Beyond the pod level, it supports multislice technology, which lets Google interconnect thousands of chips into a building-scale supercomputer linked by a data centre network capable of moving petabits of data per second.

Google began work on its first TPU, v1, back in 2013 and followed it with the first Cloud TPU in 2017. Since then, TPUs have powered services such as real-time voice search, photo object recognition, and language translation, and they also underpin products from companies like Nuro, an autonomous vehicle firm.
Trillium is already part of Google's AI Hypercomputer, a supercomputing architecture designed to handle cutting-edge AI workloads, and Google is partnering with Hugging Face to optimise the hardware for open-source model training and serving.