October 17, 2010

A machine to mimic human understanding

Researchers at Carnegie Mellon University are fine-tuning a computer system that is trying to master semantics by learning more like a human

Give a computer a task that can be crisply defined (win at chess, predict the weather) and the machine bests humans nearly every time. Yet when problems are nuanced or ambiguous, or require combining varied sources of information, computers are no match for human intelligence.

Few challenges in computing loom larger than unravelling semantics: understanding the meaning of language. One reason is that the meaning of words and phrases hinges not only on their context, but also on background knowledge that humans learn over years, day after day.

Since the start of the year, a team of researchers at Carnegie Mellon University, supported by grants from the Defense Advanced Research Projects Agency and Google, and tapping into a research supercomputing cluster provided by Yahoo, has been fine-tuning a computer system that is trying to master semantics by learning more like a human. Its beating hardware heart is a sleek, silver-gray computer, calculating 24 hours a day, seven days a week, that resides in a basement computer centre at the university in Pittsburgh. The computer was primed by the researchers with some basic knowledge in various categories and set loose on the Web with a mission to teach itself.

"For all the advances in computer science, we still don't have a computer that can learn as humans do, cumulatively, over the long term," said the team's leader, Tom M. Mitchell, a computer scientist and chairman of the machine learning department.

The Never-Ending Language Learning system, or NELL, has made an impressive showing so far. It scans hundreds of millions of Web pages for text patterns that it uses to learn facts (390,000 to date) with an estimated accuracy of 87 per cent. These facts are grouped into semantic categories: cities, companies, sports teams, actors, universities, plants and 274 others. The category facts are things like "San Francisco is a city" and "sunflower is a plant."
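To make that structure concrete, here is a minimal sketch in Python of how category facts might be stored alongside per-fact confidence estimates. It is illustrative only, not the researchers' actual code; the class, method names and threshold are invented for this article.

```python
from collections import defaultdict

class CategoryFacts:
    """Toy store of category facts such as ("San Francisco", "city")."""

    def __init__(self):
        # Maps category name -> {entity: confidence in [0, 1]}.
        self.facts = defaultdict(dict)

    def add(self, entity, category, confidence):
        # Keep the highest confidence seen for this (entity, category) pair.
        current = self.facts[category].get(entity, 0.0)
        self.facts[category][entity] = max(current, confidence)

    def belongs_to(self, entity, category, threshold=0.8):
        # A fact "counts" only once its confidence clears a threshold.
        return self.facts[category].get(entity, 0.0) >= threshold

kb = CategoryFacts()
kb.add("San Francisco", "city", 0.95)
kb.add("sunflower", "plant", 0.91)
print(kb.belongs_to("San Francisco", "city"))  # True
```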

NELL also learns facts that are relations between members of two categories. For example, Peyton Manning is a football player (category). The Indianapolis Colts is a football team (category). By scanning text patterns, NELL can infer with a high probability that Peyton Manning plays for the Indianapolis Colts, even if it has never read that Manning plays for the Colts. "Plays for" is a relation, and there are 280 kinds of relations. The number of categories and relations has more than doubled since earlier this year, and will steadily expand.
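A hedged sketch of that kind of typed relation extraction, building on the CategoryFacts store above; the pattern and function names are hypothetical, and NELL's real extractors are far more sophisticated than a single regular expression:

```python
import re

# Hypothetical pattern, not NELL's actual rule set: "X plays for (the) Y".
PLAYS_FOR = re.compile(
    r"(?P<player>[A-Z][\w ]*?) plays for (?:the )?(?P<team>[A-Z][\w ]*)")

def extract_plays_for(sentence, kb):
    """Yield (player, team) pairs only when both sides already carry the
    expected category, mirroring the typed inference described above."""
    for m in PLAYS_FOR.finditer(sentence):
        player, team = m.group("player"), m.group("team")
        if (kb.belongs_to(player, "football player")
                and kb.belongs_to(team, "football team")):
            yield player, team

kb.add("Peyton Manning", "football player", 0.97)
kb.add("Indianapolis Colts", "football team", 0.96)
text = "Peyton Manning plays for the Indianapolis Colts."
print(list(extract_plays_for(text, kb)))  # [('Peyton Manning', 'Indianapolis Colts')]
```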

The learned facts are continuously added to NELL's growing database, which the researchers call a knowledge base. A larger pool of facts, Mitchell says, will help refine NELL's learning algorithms so that it finds facts on the Web more accurately and more efficiently over time.
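The loop Mitchell describes, in which a growing knowledge base feeds back into better extraction, can be sketched roughly as follows, continuing the toy examples above. The scoring here is deliberately crude stand-in logic, not NELL's.

```python
def bootstrap(corpus, kb, patterns, rounds=3, threshold=0.9):
    """Rough sketch of the loop the article describes: known patterns
    propose candidate facts, confident candidates join the knowledge
    base, and the larger base vets the next round."""
    for _ in range(rounds):
        # 1. Apply each (regex, category) extraction pattern to the text.
        candidates = []
        for pattern, category in patterns:
            for match in pattern.finditer(corpus):
                candidates.append((match.group(1), category))
        # 2. Score candidates by repetition and promote the confident ones.
        for entity, category in set(candidates):
            confidence = min(1.0, 0.6 + 0.2 * candidates.count((entity, category)))
            if confidence >= threshold:
                kb.add(entity, category, confidence)
        # 3. The full system would also mine new extraction patterns here,
        #    using the facts just learned as fresh training signal.

city_pattern = re.compile(r"mayor of ([A-Z]\w*)")
bootstrap("The mayor of Pittsburgh spoke, and the mayor of Pittsburgh said more.",
          kb, [(city_pattern, "city")])
print(kb.belongs_to("Pittsburgh", "city"))  # True
```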


NELL is one project in a widening field of research and investment aimed at enabling computers to better understand the meaning of language. Many of these efforts tap the Web as a rich trove of text to assemble structured ontologies (formal descriptions of concepts and relationships) to help computers mimic human understanding.

Today, ever-faster computers, an explosion of Web data and improved software techniques are opening the door to rapid progress. For example, IBM's question answering machine, Watson, shows remarkable semantic understanding in fields like history, literature and sports as it plays the quiz show Jeopardy!. Google Squared, a research project at the Internet search giant, demonstrates ample grasp of semantic categories as it finds and presents information from around the Web on search topics like "US presidents" and "cheeses."

Still, artificial intelligence experts agree that the Carnegie Mellon approach is innovative. Many semantic learning systems, they note, are more passive learners, largely hand-crafted by human programmers, while NELL is highly automated. "What's exciting and significant about it is the continuous learning, as if NELL is exercising curiosity on its own, with little human help," said Oren Etzioni, a computer scientist at the University of Washington, who leads a project called TextRunner, which reads the Web to extract facts.

Computers that understand language, experts say, promise a big payoff someday. The potential applications range from smarter search (supplying natural-language answers to search queries, not just links to Web pages) to virtual personal assistants that can reply to questions in specific disciplines or activities like health, education, travel and shopping.


"The technology is really maturing, and will increasingly be used to gain understanding," said Alfred Spector, vice president of research for Google. "We're on the verge now in this semantic world."

NELL, Mitchell explains, is designed to be able to grapple with words in different contexts, by deploying a hierarchy of rules to resolve ambiguity. This kind of nuanced judgement tends to flummox computers. "But as it turns out, a system like this works much better if you force it to learn many things, hundreds at once," he said.

For example, the text-phrase structure "I climbed XXX" very often occurs with a mountain. But when NELL reads "I climbed stairs," it has previously learned with great certainty that "stairs" belongs to the category "building part." "It self-corrects when it has more information, as it learns more," Mitchell explained. "So much of human language is background knowledge, knowledge accumulated over time. That's where NELL is headed, and the challenge is how to get that knowledge."
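That self-correction can be sketched as a simple comparison of confidences, continuing the toy knowledge base from the earlier examples; the function is hypothetical and greatly simplified relative to NELL's hierarchy of rules:

```python
def resolve_category(entity, suggested, pattern_conf, kb):
    """Sketch of the self-correction above: a text pattern suggests a
    category ("I climbed X" hints at mountain), but a prior fact held
    with greater certainty about the entity overrides the suggestion."""
    best_category, best_conf = suggested, pattern_conf
    for category, members in kb.facts.items():
        confidence = members.get(entity, 0.0)
        if confidence > best_conf:
            best_category, best_conf = category, confidence
    return best_category

kb.add("stairs", "building part", 0.98)
print(resolve_category("stairs", "mountain", 0.70, kb))   # building part
print(resolve_category("Everest", "mountain", 0.70, kb))  # mountain
```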

His ideal, Mitchell said, was a computer system that could learn continuously with no need for human assistance. "We're not there yet," he said. "But you and I don't learn in isolation either."

 
