This is an archive article published on May 15, 2024

I got a demo of Google’s Project Astra at I/O 2024, and here are the takeaways

Google calls Astra "the future of AI assistants," a universal AI agent that can "help in everyday life."

At Google I/O 2024, the company showed off Project Astra, a multimodal AI agent designed to help you. (Image credit: Anuj Bhatia/Indian Express)

As I waited for my turn to enter the demo zone to experience Project Astra, Google’s voice-operated AI assistant, at the company’s annual developer conference, I saw Google co-founder Sergey Brin enter the booth and exit exactly 10 minutes later. Brin came by twice to check out the demo, and I wondered what was going through his mind as he experienced Project Astra, a multimodal AI agent designed to help you in everyday life.

For me, the biggest highlight of this year’s Google I/O was Project Astra. During my brief experience with it at the company’s annual developer conference in Mountain View, California, I could see where Google’s prototype AI assistant could take the technology everyone is talking about.

Google co-founder Sergey Brin appeared at this year’s Google I/O developer conference. (Image credit: Anuj Bhatia/Indian Express)

Google DeepMind CEO Demis Hassabis describes Astra as “a universal agent helpful in everyday life.” Just imagine a super-charged Google Assistant with the smarts of Gemini built in and the image-recognition capabilities of Google Lens. That is exactly what Project Astra is. Simply put, Astra is a “multimodal” AI assistant, powered by an upgraded version of Google’s Gemini Ultra model: it has been trained on audio, images, video, and text, and can generate data in all those formats. It can make sense of your surroundings using the device’s camera, and it takes audio, photo, and video input to respond to your queries and follow-up questions.

Google allowed four journalists at a time to experience Astra in a highly controlled demo zone — I was among the few to get a peek in person for the first time at Google I/O. As I entered the demo area, with a huge screen set up with a camera in front, two researchers from the Project Astra team at DeepMind gave us a preview of how the voice-operated assistant works. They walked us through four modes: Storyteller, Pictionary, Free-Form, and Alliteration. I tried different modes to see how accurate Astra was in its responses and whether it could hold a conversation like humans do, as Google promised in a pre-recorded demonstration at the keynote.

During the demo, the Google team placed some soft toys in front of the camera, and the assistant could transcribe the speaker’s words and create a story based on the objects. One of the team members from Google then placed another object in front of the camera, and the AI assistant continued with the story – only this time with additional details to the scenarios created by Gemini. It felt magical, as that one additional object became a new character in the story.

I then chose the Pictionary mode. It was meant to showcase the assistant’s prowess in interpreting drawings and guessing the object being depicted. It didn’t matter whether you had limited artistic skills, as Gemini would correctly identify what was drawn and name the object.

After trying different modes, what impressed me the most was that the interaction with the assistant felt natural and engaging, something I never experienced while using Google Assistant or Siri. More importantly, Astra’s capabilities go beyond what we have seen in existing AI assistants. The Google researchers told me that Astra uses built-in “memory,” meaning that after it scans the objects, it can still “remember” where specific items were placed. As of now, Astra’s memory is limited to a short window, but if it gets expanded in the future, the possibilities are endless. If the AI assistant could remember where I left my phone on the table last night before going to bed, that would be totally insane.


Fundamentally, Google’s Project Astra does the same thing as AI devices such as Meta’s Ray-Ban glasses, the Rabbit R1, and the Humane AI Pin: it uses a camera to analyse your surroundings and provide information about what you are looking at. On those devices, however, response times are typically slow, and functionality is limited. I was surprised by how quick and snappy Astra felt during the demo. The Rabbit R1 should have been an app, and Google’s Project Astra proves why.

Visitors wait in the queue to get a demo of Google’s Project Astra. (Image credit: Anuj Bhatia/Indian Express)

But I can already imagine Google finding a way to bring Astra to some new type of wearable in the future. In fact, Google has already teased the AI assistant running on a pair of glasses. The possibilities are endless with Project Astra if Google gets it right. Maybe Google Glass will make a comeback, this time with an AI twist.

For now, though, Project Astra is still in a “research preview”, but Google already has plans to bring some of the advanced AI assistant’s capabilities into products like the Gemini app later this year.

My biggest takeaway from Project Astra is that we are moving toward more evolved versions of AI chatbots like ChatGPT and Gemini, with a layer of visuals and audio added on top. No matter how one chooses to pitch Project Astra, it is built on the concept of real-time, camera-based AI that identifies objects and, say, spins a fictional story around them. That said, none of Astra’s capabilities make it human, or make it sound like one. After all, humans interact with the physical world differently from AI chatbots, which rely on language-centric AI models whose learning comes from troves of data available on the web.


The writer is attending I/O 2024 in Mountain View, California, at the invitation of Google India.


 
