Deep Learning Beats Human in an IQ Test

Source: Technology Review, Jun 2015

IQ tests have become a standard feature of modern life and are used to determine children’s suitability for schools and adults’ ability to perform jobs.

These tests usually contain three categories of questions: logic questions, such as patterns in sequences of images; mathematical questions, such as finding patterns in sequences of numbers; and verbal reasoning questions, which are based around analogies, classifications, synonyms and antonyms.

Pose a verbal reasoning question to a natural language processing machine and its performance will be poor, much worse than the average human ability.

Today, that changes thanks to Huazheng and pals, who have built a deep learning machine that, for the first time, outperforms the average human at answering verbal reasoning questions.

Computer scientists have used data mining techniques to analyze huge corpuses of text to find the links between the words they contain. In particular, this gives them a handle on the statistics of word patterns, such as how often a particular word appears near other words. From this it is possible to work out how words relate to each other, albeit in a huge parameter space.

The end result is that words can be thought of as vectors in this high-dimensional parameter space. The advantage is that they can then be treated mathematically: compared, added and subtracted like other vectors. This leads to vector relations like this one: king – man + woman = queen.
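To make the vector arithmetic concrete, here is a minimal sketch in Python. The three-dimensional vectors and the `analogy` helper are invented for illustration; real word vectors are learned from a corpus and have far more dimensions, so this shows only the shape of the calculation.

```python
import math

# Toy vectors made up for this example (not learned from any corpus).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "apple": [0.0, 0.1, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = (w for w in VECTORS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))

print(analogy("king", "man", "woman"))  # prints: queen
```

The answer is chosen by cosine similarity rather than exact equality, because with learned vectors the result of the arithmetic only lands near the expected word, not exactly on it.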

But this approach has a well-known shortcoming: it assumes that each word has a single meaning represented by a single vector. That is often not the case, and verbal tests tend to focus on words with more than one meaning as a way of making questions harder.

Huazheng and pals tackle this by taking each word and looking for other words that often appear nearby in a large corpus of text. They then use an algorithm to see how these words are clustered. The final step is to look up the different meanings of a word in a dictionary and then to match the clusters to each meaning.

This can be done automatically because the dictionary definition includes sample sentences in which the word is used in each different way. So by calculating the vector representation of these sentences and comparing them to the vector representation in each cluster, it is possible to match them.

The overall result is a way of recognizing the multiple different senses that some words can have.
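The cluster-then-match idea can be sketched roughly as follows. The article does not say which clustering algorithm is used or how sentence vectors are computed, so this example assumes a plain k-means and takes the vectors as given; the two-dimensional numbers for the word "bank" and the function names are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(points, k, iterations=10):
    """Plain k-means (an assumed stand-in for the unnamed clustering step)."""
    # Deterministic start: pick evenly spaced points as the initial centroids.
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def match_senses(centroids, sense_vectors):
    """Map each dictionary sense to the index of the most similar cluster."""
    return {
        sense: max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))
        for sense, vec in sense_vectors.items()
    }

# Made-up context vectors for "bank": money contexts and river contexts, mixed.
contexts = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9], [0.8, 0.0], [0.0, 0.8]]
# Made-up vectors standing in for the dictionary's sample sentences.
dictionary = {"financial institution": [1.0, 0.0], "side of a river": [0.0, 1.0]}

centroids = kmeans(contexts, k=2)
print(match_senses(centroids, dictionary))
```

Each cluster centroid then serves as the vector for one sense of the word, in place of a single vector that blurs all of its meanings together.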

Huazheng and pals have another trick up their sleeve to make it easier for a computer to answer verbal reasoning questions. This comes about because these questions fall into several categories that require slightly different approaches to solve.

So their idea is to start by identifying the category of each question so that the computer then knows which answering strategy it should employ. This is straightforward since the questions in each category have similar structures.  

Spotting each type of question is relatively straightforward for a machine learning algorithm, given enough examples to learn from. And this is exactly how Huazheng and co do it.
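As a rough illustration of this step, the sketch below trains a tiny classifier on labeled example questions and uses it to tag new ones. The article does not say which learning algorithm the team uses, so a simple naive Bayes word-count model is assumed here, and the training questions are invented for the example.

```python
import math
from collections import Counter, defaultdict

# Invented training questions; the category names follow the article.
TRAINING = [
    ("analogy", "hand is to glove as foot is to what"),
    ("analogy", "bird is to nest as bee is to what"),
    ("classification", "which word does not belong with the others"),
    ("classification", "which word is the odd one out"),
    ("synonym", "which word is closest in meaning to happy"),
    ("synonym", "which word means the same as large"),
    ("antonym", "which word is opposite in meaning to hot"),
    ("antonym", "which word means the opposite of fast"),
]

def train(examples):
    """Count words per category and remember the overall vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary

def classify(question, model):
    """Return the category with the highest naive Bayes log-score."""
    word_counts, label_counts, vocabulary = model
    # Words never seen in training carry no information, so skip them.
    tokens = [t for t in question.lower().split() if t in vocabulary]
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_examples)
        for token in tokens:
            # Add-one smoothing avoids taking the log of zero.
            score += math.log((counts[token] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAINING)
print(classify("cat is to kitten as dog is to what", model))  # prints: analogy
```

Because questions in one category share so much wording ("is to ... as", "opposite"), even word counts alone separate them well in this toy setting.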

Having identified the type of question, Huazheng and buddies then devise an algorithm for solving each one using the standard vector methods but also the multi-sense upgrade they’ve developed.
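The article does not spell out the individual solvers, so the following is only a guess at how a synonym question might be answered once each word has several sense vectors. The vectors, the `SENSES` table and the function names are all hypothetical.

```python
import math

# Hypothetical sense vectors: "bright" has two senses, the others one each.
SENSES = {
    "bright": [[1.0, 0.0], [0.0, 1.0]],  # shining, clever
    "smart":  [[0.1, 0.9]],
    "heavy":  [[-0.8, 0.1]],
    "slow":   [[-0.2, -0.9]],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def sense_similarity(word_a, word_b):
    """Best similarity over every pairing of the two words' senses."""
    return max(cosine(u, v) for u in SENSES[word_a] for v in SENSES[word_b])

def answer_synonym(target, options):
    """Pick the option whose closest sense best matches some sense of the target."""
    return max(options, key=lambda option: sense_similarity(target, option))

print(answer_synonym("bright", ["heavy", "smart", "slow"]))  # prints: smart
```

Taking the maximum over sense pairs lets the "clever" sense of "bright" match "smart" without being diluted by the "shining" sense, which is the point of the multi-sense upgrade.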

They compare this deep learning technique with other algorithmic approaches to verbal reasoning tests and also with human performance. For the human benchmark, they posed the questions to 200 people recruited via Amazon’s Mechanical Turk crowdsourcing facility and collected basic information about their ages and educational backgrounds.

And the results are impressive. “To our surprise, the average performance of human beings is a little lower than that of our proposed method,” they say.

Human performance on these tests tends to correlate with educational background. So people with a high school education tend to do least well, while those with a bachelor’s degree do better and those with a doctorate perform best. “Our model can reach the intelligence level between the people with the bachelor degrees and those with the master degrees,” say Huazheng and co.



