A team from IIT-Bombay has written a program which they claim can automatically detect if a person is drunk by reading their text messages.
“With accessible communication devices such as tablets and cellphones, the social media content reflects the day-to-day lives of people more than ever today. This includes sending messages when under the influence of alcohol. Sending text messages under the influence of alcohol has been called ‘drunk-texting’ in popular parlance. It’s a real phenomenon, with real impacts and there exists no implementation that uses a text-based analysis to predict drunk-texting. The goal of this program is to identify whether a given piece of text was written by an author under the influence of alcohol. This is a first-of-its-kind work that provides quantitative evidence that a text-based analysis may be useful for drunk-texting prediction,” said Aditya Joshi, PhD student from IITB-Monash Research Academy.
Joshi worked on the project with his supervisors, Professor Pushpak Bhattacharyya of IIT-Bombay and Professor Mark J Carman of Monash University, Melbourne. “We use a statistical classifier to predict whether a tweet is written by a user under the influence of alcohol. The current testing is on tweets, while it would typically apply to any user-generated text on social media,” said Joshi. According to him, the idea of writing a program that could predict whether a sender is drunk was sparked by a message one friend received from another one evening. The paper, co-authored by Abhijit Mishra (IIT Bombay student), Balamurali AR (researcher in Marseille and an alumnus of IIT Bombay), Bhattacharyya and Carman, was presented at the 2015 conference of the Association for Computational Linguistics (ACL), a top-rated conference, held in Beijing in July 2015.
The team used two sets of features: ‘N-gram’ based features, and ‘stylistic’ features that capture writing styles typical of drunk texts, such as strongly sentiment-bearing words, capitalisation and spelling mistakes, among others. “To obtain our labeled data set, we make use of hashtags that people add to tweets. So the writer of a tweet actually tells us through his or her hashtag whether he or she is drunk at the time of writing the tweet,” he said. According to the team, the project will be useful for a mental health professional or a relative trying to monitor a person’s behavior, and for those trying to prevent private workplace information from being leaked through their emails or texts.
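The hashtag-based labelling and the two feature sets described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the team's implementation: the hashtag lists, the function names, the small vocabulary used to approximate spelling mistakes and the specific stylistic measures are all hypothetical choices that the article does not specify.

```python
import re
from collections import Counter

# Hypothetical hashtag lists; the article does not say which hashtags were used.
DRUNK_TAGS = {"#drunk", "#drank", "#imdrunk"}
SOBER_TAGS = {"#notdrunk", "#sober", "#imnotdrunk"}

HASHTAG_RE = re.compile(r"#\w+")
WORD_RE = re.compile(r"[A-Za-z']+")


def label_from_hashtags(tweet):
    """Return 1 (drunk), 0 (sober), or None if the hashtags give no clear label."""
    tags = {t.lower() for t in HASHTAG_RE.findall(tweet)}
    drunk = bool(tags & DRUNK_TAGS)
    sober = bool(tags & SOBER_TAGS)
    if drunk == sober:  # no label hashtag, or contradictory hashtags
        return None
    return 1 if drunk else 0


def strip_hashtags(tweet):
    """Remove hashtags so the label does not leak into the features."""
    return " ".join(HASHTAG_RE.sub(" ", tweet).split())


def ngram_features(text, n_max=2):
    """Count word n-grams (unigrams and bigrams by default)."""
    tokens = [w.lower() for w in WORD_RE.findall(text)]
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def stylistic_features(text, vocabulary):
    """Assumed stylistic cues: capitalisation, out-of-vocabulary words
    (a rough stand-in for spelling mistakes) and stretched letters."""
    words = WORD_RE.findall(text)
    n = max(len(words), 1)
    return {
        "capital_ratio": sum(w.isupper() and len(w) > 1 for w in words) / n,
        "oov_ratio": sum(w.lower() not in vocabulary for w in words) / n,
        "repeated_letters": sum(bool(re.search(r"(.)\1\1", w)) for w in words),
    }


if __name__ == "__main__":
    tweet = "Great NIGHT sooo much fun #drunk"
    vocab = {"great", "night", "much", "fun"}  # toy vocabulary for the demo
    text = strip_hashtags(tweet)
    print(label_from_hashtags(tweet), ngram_features(text), stylistic_features(text, vocab))
```

Stripping the hashtags before extracting features matters in a setup like this: otherwise a classifier could simply learn the label hashtag itself rather than the writing style. The resulting feature dictionaries could then be passed to any standard statistical classifier.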
The team members further said that it would be useful in drunk driving cases too.
“We made interesting observations: spelling mistakes, the use of drunk-related words (like alcohol, tonight, drunk, etc.) and capitalisation tend to be the topmost discriminating features. Our algorithm, which is automatic and hence faster and more cost-effective, could predict for 64 per cent of the tweets whether they were written under the influence of alcohol. This shows that we were able to imitate human behavior closely.”
“Say a person is suspected to be involved in a drunk driving episode. Can we identify if the tweets sent in this period were really drunk? This forms an additional basis to justify that the person was drunk. The drunk-texting prediction system is an automatic tool that will run constantly in the background and allow, say, police officers, to identify who is sending drunk texts in their locality. With public Twitter timelines, the data is accessible through Twitter APIs (application programming interfaces),” said Mishra.