What are AI agents, which power OpenAI’s GPT-4o and Google’s Project Astra?
Known as ‘AI agents’, GPT-4o and Project Astra have been touted as far superior to conventional voice assistants such as Alexa, Siri, and Google Assistant. The launch of these models marks a new phase in AI — the transition from chatbots to multimodal interactive AI agents.
The recently launched GPT-4o by OpenAI and Project Astra by Google have one thing in common: both can process the real world through audio and visual inputs and provide intelligent responses and assistance. In other words, the new AI models can have instant real-time conversations with a user.
AI agents are sophisticated AI systems that can engage in real-time, multimodal (text, image, or voice) interactions with humans. Unlike conventional language models, which work solely on text-based inputs and outputs, AI agents can process and respond to a wide variety of inputs including voice, images, and even input from their surroundings.
“You’re not typing into a text box, waiting for a response and then reading the output. You’re actually interacting with the AI through voice just as you would a human,” according to Google CEO Sundar Pichai.
From the demonstrations by OpenAI and Google, one can say that AI agents are nimble when it comes to adapting to new situations. This makes them versatile and capable of handling a wide range of tasks.
AI agents perceive their environment via sensors, process the information using algorithms or AI models, and then take actions. Currently, they are used in fields such as gaming, robotics, virtual assistants, and autonomous vehicles.
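The perceive, process, act cycle described above can be illustrated with a toy sketch in Python. This is not how GPT-4o or Project Astra are built. The names `Percept`, `decide` and `run_agent` are hypothetical, the "sensors" are plain functions that return text, and a simple hand-written rule stands in for the AI model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Percept:
    """One reading from a (hypothetical) sensor."""
    source: str   # e.g. "microphone" or "camera"
    content: str  # simplified: the sensor reading as plain text


def decide(percepts: List[Percept]) -> str:
    """Stand-in for the 'algorithm or AI model' step: a fixed rule, not a real model."""
    heard = " ".join(p.content for p in percepts if p.source == "microphone")
    seen = " ".join(p.content for p in percepts if p.source == "camera")
    if "what is this" in heard.lower() and seen:
        return f"say: That looks like {seen}."
    return "say: Sorry, I did not catch that."


def run_agent(
    sensors: List[Callable[[], Percept]],
    act: Callable[[str], None],
    steps: int = 1,
) -> None:
    """Run the perceive, process, act cycle a fixed number of times."""
    for _ in range(steps):
        percepts = [read() for read in sensors]  # 1. perceive
        action = decide(percepts)                # 2. process
        act(action)                              # 3. act


# Usage: two fake sensors, and an "actuator" that simply records the action.
actions: List[str] = []
run_agent(
    sensors=[
        lambda: Percept("microphone", "What is this?"),
        lambda: Percept("camera", "a coffee mug"),
    ],
    act=actions.append,
)
print(actions[0])  # say: That looks like a coffee mug.
```

In a real agent, the sensor functions would wrap audio and video streams, and the decision step would call a multimodal model rather than a fixed rule.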
How are they different from large language models?
While large language models (LLMs) like GPT-3 and GPT-4 can only generate human-like text, AI agents make interactions more natural and immersive with the help of voice, vision, and environmental sensors. Unlike LLMs, AI agents are designed for instantaneous, real-time conversations, with responses much like a human’s.
LLMs lack contextual awareness, while AI agents can understand and learn from the context of interactions, allowing them to provide more relevant and personalised responses. Also, language models have no autonomy, since they only generate text output. AI agents, however, can autonomously perform complex tasks such as coding and data analysis. When integrated with robotic systems, AI agents can even perform physical actions.
What are the potential uses of AI agents?
AI agents can serve as intelligent and highly capable assistants, handling an array of tasks from offering personalised recommendations to scheduling appointments. Reports suggest that AI agents can be ideal for customer service, as they can offer seamless, natural interactions and resolve queries instantly without the need for human intervention.
In education and training, AI agents can act as personal tutors, adapt to a student’s learning style, and may even offer a tailored set of instructions. In healthcare, they could assist medical professionals by providing real-time analysis and diagnostic support, and even by monitoring patients.
Are there any risks and challenges?
While AI agents show immense potential for the future, they are not without risks. Privacy and security are key areas of concern as AI agents gain access to more personal data and environmental information. Like any AI model, AI agents can carry forward biases from their training data or algorithms, leading to harmful outcomes. As these systems become more common, appropriate regulations and governance frameworks should be laid out to ensure their responsible deployment.
Bijin Jose serves as an Assistant Editor at Indian Express Online in New Delhi. A seasoned technology journalist with a diverse portfolio, he brings over a decade of experience in the media industry to his coverage of the evolving digital landscape and emerging technologies.