By Karin Kloosterman

June 13, 2010, Updated September 14, 2012

With increasing numbers of people consulting web reviews and forums prior to making a purchase, it’s more important than ever to detect sarcasm online.

Prof. Ari Rappoport: “Sarcasm can mislead you, if you don’t understand it.”

Sarcasm in emails, tweets, and online product reviews can confuse even the savviest web users. That’s why Prof. Ari Rappoport of the Hebrew University (HU) in Jerusalem is sure that RevRank, a sarcasm-detecting online tool, could be useful to both consumers and online marketing analysts.

Along with his students, Rappoport, a cognitive science and computer science expert at the HU School of Engineering and Computer Science, has developed a way to detect sarcasm on the Internet. The RevRank tool is built on a powerful algorithm and has definite commercial potential, he believes.

RevRank was presented in May at a conference in Washington, DC, with the work published in the proceedings of the Association for the Advancement of Artificial Intelligence. “It’s most useful for marketing. People use online reviews and forums where they go for people’s opinions, expressed on the web when making purchase decisions. This project is part of a larger project on automatic analysis – and sarcasm detection is important. [Sarcasm] can mislead you if you don’t understand it,” Rappoport declares.

“Sarcastic comments are usually made by intelligent people who might have something important to say. But it depends on their personal style.” Sarcastic comments, the professor tells ISRAEL21c, are considered “high-quality opinions” that provide extra information in an online world that is drowning in data. In terms of marketing potential, knowing what is sarcastic and what is genuine allows researchers and buyers to focus on high-quality information, he explains.

Trees died for this book?

It works both ways: “If you are a marketing person and want to understand what people are saying and feeling, through ‘sentiment analysis’ it’s an excellent marketing tool. There are startups already building such products,” says Rappoport, himself an entrepreneur who has returned to academia.

He lists three Israeli startups among those he is aware of that focus on sentiment analysis, one of which boasts Yaron Galai, founder of Outbrain.

At the other end of the spectrum, he adds, consumers can use sentiment analysis, in this case sarcasm detection, as an aid to decision-making.

Consider the statement: “Trees died for this book?” Anyone reading it on a book review site would probably assume that it was a sarcastic comment. But how about: “This iPod is the most brilliant in the world”?

How would you know whether the reviewer was being honest or sarcastic? Using text analysis alone, RevRank bridges the gap between computerized interpretation of text and human linguistics.

Accuracy rate of 77%

In the study, the researchers drew on posts from Twitter and on 66,000 Amazon product and book reviews. Searching for “optimal criticism” elements, their algorithm classified and ranked the data against a control set of very high-quality reviews, predicting sarcastic remarks with a 77 percent accuracy rate, the researchers report.
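For readers wondering what an accuracy rate means in practice, here is a minimal sketch in Python. It is not the researchers' algorithm: the `accuracy` function, the labels, and the classifier output are all made up for illustration. It only shows the usual arithmetic behind such a figure, which is the share of reviews where a program's verdict matches a human's.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the human labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labeled example")
    # Count the positions where the program and the human agree.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical data: 100 reviews labeled by humans (True = sarcastic).
human_labels = [True] * 50 + [False] * 50
# Invented classifier output that disagrees with the humans on 23 reviews:
# it misses 10 sarcastic reviews and wrongly flags 13 sincere ones.
classifier_output = [True] * 40 + [False] * 10 + [False] * 37 + [True] * 13

print(f"accuracy: {accuracy(classifier_output, human_labels):.0%}")
```

On this invented data the program agrees with the human labels on 77 of 100 reviews. A single accuracy number does not say which kind of mistake is more common, missed sarcasm or sincere praise wrongly flagged.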

The tool can go so far as to limit the length and depth of content in a review according to user preferences, and could affect the way we collect information from the Internet, surmise the researchers, who include students Oren Tsur and Dmitry Davidov.

The new RevRank algorithm is part of a larger study that Rappoport is conducting from the perspective of a computational linguist interested in natural language processing.

Previously, he and Tsur examined book reviews on Amazon and were able to develop an algorithm that could pick out the high-quality reviews from all the junk.

“They get buried. It’s a fact,” remarks Rappoport of Amazon’s review system, which is based on reader votes. It’s an inherently problematic approach, as the first reviews tend to garner many votes, while subsequent and potentially superior reviews are often buried deep down inside the website, making it harder for readers and potential book or product buyers to find them.

Researching language with no strings attached

The HU team developed a program to “automatically discover high-quality reviews,” states Rappoport, who is most interested in questions about language acquisition and how language is represented in cognitive science and computer science.

He says that the keyword searches employed by Google’s search engine are far from primitive, and that indicators in fact suggest the opposite, but “certainly the way computers organize information can be greatly improved.” He cites a study he authored that shows how his lab can answer numerical questions, such as the height of America’s First Lady Michelle Obama, better than Google can.

Even when exact figures are not available online, the HU team can use an existing database to compare related data that they have on file. For example, they might have comparative data on other people standing beside Obama, which they could include in their estimates.

The sarcasm detector has already been submitted to HU’s technology transfer arm, Yissum. “We don’t know how it will turn out,” says Rappoport, who a few years ago helped to build manufacturing enterprise software through the startup Proficiency, which was sold to the US company International TechneGroup.

Today, that software is being used by companies such as Honda, Boeing and Airbus. After selling his share in the company, Rappoport returned to academia, his true love. Happily ensconced at Hebrew University, he continues to ask questions about language, with no commercial entity connected to his work. “I prefer not to get funding because it comes with strings attached,” he concludes.