MIT scientists, including those of Indian origin, have developed a new system that allows robots to understand voice commands much as artificial intelligence (AI) assistants such as Siri and Alexa do. Currently, robots are very limited in what they can do. Their inability to understand the nuances of human language makes them mostly useless for more complicated requests.
For example, if you put a specific tool in a toolbox and asked a robot to “pick it up,” it would be completely lost. Picking it up requires the robot to see and identify objects, understand commands, recognise that the “it” in question is the tool you put down, go back in time to remember the moment when you put down the tool, and distinguish that tool from others of similar shapes and sizes.
Researchers from the Massachusetts Institute of Technology (MIT) have come closer to letting robots handle this type of request. They have developed an Alexa-like system called “ComText” (short for “commands in context”) that allows robots to understand a wide range of commands that require contextual knowledge about objects and their environments.
“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds, and 3D maps generated from sensors,” said Rohan Paul. “This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say,” Paul said.
The team tested ComText on Baxter, a two-armed humanoid robot. ComText can observe a range of visuals and natural language to learn about an object’s size, shape, position and type, and even whether it belongs to somebody. From this knowledge base, it can then reason, infer meaning and respond to commands.
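As a rough illustration of what reasoning from such a knowledge base might involve, the Python sketch below keeps two kinds of records: stated facts about objects and a timeline of observed events. It is a toy under stated assumptions, not ComText's actual design; the class `KnowledgeBase`, its methods, and the object names `wrench_1` and `wrench_2` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical toy store; not ComText's real data model."""
    facts: dict = field(default_factory=dict)   # object name -> properties
    events: list = field(default_factory=list)  # chronological (action, object) pairs

    def tell(self, obj, **properties):
        """Record stated facts about an object, such as its type or owner."""
        self.facts.setdefault(obj, {}).update(properties)

    def observe(self, action, obj):
        """Append an observed event to the timeline."""
        self.events.append((action, obj))

    def resolve_it(self, action="put down"):
        """Return the object of the most recent matching event, or None."""
        for past_action, obj in reversed(self.events):
            if past_action == action:
                return obj
        return None

    def find(self, **properties):
        """Return objects whose recorded facts match every given property."""
        return [obj for obj, props in self.facts.items()
                if all(props.get(k) == v for k, v in properties.items())]


kb = KnowledgeBase()
kb.tell("wrench_1", type="wrench", owner="alice")
kb.tell("wrench_2", type="wrench")
kb.observe("put down", "wrench_2")
kb.observe("put down", "wrench_1")

target_for_it = kb.resolve_it()        # candidate referent for "pick it up"
alices_tools = kb.find(owner="alice")  # candidates for "pick up Alice's tool"
```

Here “it” resolves to the object most recently put down, which separates it from a similar-looking tool of the same type. A real system would also have to ground these symbols in camera and sensor data, which the sketch leaves out entirely.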
“The main contribution is this idea that robots should have different kinds of memory, just like people,” said Barbu. With ComText, Baxter was successful in executing the right command about 90 per cent of the time. In the future, the team hopes to enable robots to understand more complicated information, such as multi-step commands and the intent of actions, and to use properties of objects to interact with them more naturally.
By creating much less constrained interactions, this line of research could enable better communications for a range of robotic systems, from self-driving cars to household helpers. “This work is a nice step towards building robots that can interact much more naturally with people,” said Luke Zettlemoyer, an associate professor at the University of Washington in the US, who was not involved in the research.
“In particular, it will help robots better understand the names that are used to identify objects in the world, and interpret instructions that use those names to better do what users ask,” Zettlemoyer said.