Microsoft has been a world leader in technological innovation, building some of the most important products in both the software and the hardware industries. At the Redmond, Washington-based firm, research plays a crucial role in driving breakthroughs. Its researchers, scientists and engineers have influenced every product the company has released in the last three decades, including Xbox, Cortana, Azure ML, Office, HoloLens, Skype and Windows.
Microsoft has labs around the world, where researchers make breakthroughs in artificial intelligence, security, human-computer interaction, and more. The India lab, Microsoft Research India (MSR), established in January 2005 in Bengaluru, conducts basic and applied research across several fields, including advanced computing technology, multilingual systems, sensor networks and geographical information systems. Its 50-odd researchers are curious and bold, and have a track record of pushing the frontiers of knowledge in theory as well as in real-world systems.
Sudhir Chowdhary looks at two key projects currently under way at the India lab: one addresses the tuberculosis epidemic with cloud and mobile technology; the other is in the realm of mixed-language research for better human-computer interaction.
Project 99 DOTS: Of pill packets, data corralling & cellphones
To help India eradicate tuberculosis (TB), Microsoft researchers set up a technology project called ‘99 DOTS’, which focuses on patient adherence to anti-tuberculosis drugs. Initially, the team worked closely with Operation ASHA (India’s largest NGO in TB treatment and prevention) and developed a system in which the care provider uses a fingerprint reader and a netbook computer or smartphone to log patient data, including symptoms and medication doses, during a check-up. Despite these advances, the method left gaps in treatment and monitoring that could have epidemiological consequences: a patient who does not complete the course of treatment can develop drug-resistant TB.
“TB patients have to miss their home and work responsibilities; they have to travel long distances to treatment centres, stand in line with the stigma of the disease, and so on. Patients start feeling better after a few weeks, and it’s very hard to get them to continue the full course of treatment,” said Bill Thies, who has been working on TB with the Microsoft Research lab in India since 2008. Thies and his team have developed a game-changing treatment approach called ‘99 DOTS’ that involves pill packets, data corralling, and simple cellphones.
In the 99 DOTS system, treatment programmes wrap each anti-TB blister pack in a custom envelope, which hides phone numbers behind the medication. Patients can see these hidden numbers only after dispensing their pills. After taking their daily medication, patients make a free call to the hidden phone number. The combination of the call and the patient’s caller ID yields high confidence that the dose was “in hand” and that the patient took it. Patients receive a series of daily reminders (via SMS and automated calls). Missed doses trigger SMS notifications to care providers, who follow up with personal, phone-based counselling.
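The call-logging flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 99 DOTS implementation: the `Patient` record, the `record_call` and `missed_doses` helpers, and the idea that each day's pill reveals one expected number are all hypothetical names and simplifications introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Patient:
    phone: str                      # caller ID registered for this patient
    start: date                     # first day of treatment
    hidden_numbers: list[str]       # assumed: one hidden number per daily dose, in order
    doses_logged: set[date] = field(default_factory=set)


def record_call(patients: dict[str, Patient], caller_id: str,
                dialed: str, today: date) -> bool:
    """Log a dose only if a known caller dials the number behind today's pill."""
    patient = patients.get(caller_id)
    if patient is None:
        return False                # unknown caller ID
    day_index = (today - patient.start).days
    if not 0 <= day_index < len(patient.hidden_numbers):
        return False                # outside the treatment window
    if patient.hidden_numbers[day_index] != dialed:
        return False                # number does not match today's dose
    patient.doses_logged.add(today)
    return True


def missed_doses(patient: Patient, today: date) -> list[date]:
    """Return treatment days up to today with no logged call (follow-up candidates)."""
    days_elapsed = min((today - patient.start).days + 1, len(patient.hidden_numbers))
    all_days = (patient.start + timedelta(days=d) for d in range(days_elapsed))
    return [day for day in all_days if day not in patient.doses_logged]
```

In a sketch like this, the list returned by `missed_doses` is what would drive the SMS notification to a care provider; sending the message itself is left out because the article does not describe how it is done.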
Real-time adherence reports are also available on the web. 99 DOTS offers 99% of the benefits of DOTS at a small fraction of the inconvenience and cost to patients. With this approach, patients are empowered to take control of their treatment and are motivated to follow the protocol when they know their nurses and doctors are monitoring the data. With 99 DOTS, patients can communicate with their care providers and be tracked without travelling to a clinic. And when patients stop taking the drugs early, the data, assembled in an analytics dashboard powered by Azure, shows the patterns of lapse. Providers can intervene where it is needed most, with targeted training sessions to improve the adherence metrics.
Currently, the Indian government is using 99 DOTS for every patient who is co-infected with TB and HIV. It has also launched 99 DOTS for every TB patient in Mumbai, one of the hot-spots of TB in India, with an expected load of 30,000 patients per year.
Project Mélange: Making machines more human
Developed over thousands of years, our vast array of languages now helps us do more than simply communicate. They help us express emotions, signal our identities, and convey the nature of our relationships. Monojit Choudhury, researcher at Microsoft Research India, said, “A unique element of human communication is the phenomenon of code-switching or code-mixing. Prevalent in most multilingual and multicultural societies, code-mixing is the fusion of two or more languages in everyday speech. Bilingual or multilingual people often mix words from their broad set of languages to convey messages on a deeper, more intimate level. A cacophony of languages is becoming more common on the internet as social media platforms bring us closer than ever.”
While code-mixing comes naturally to people who speak multiple languages, online tools and social platforms are yet to detect and translate code-mixed messages effectively. Researchers Kalika Bali and Monojit Choudhury at Microsoft’s Project Mélange studied this phenomenon in an attempt to build tools and machines that can go beyond the spoken word and tap into a deeper layer of human communication. By applying code-mixing and code-switching research to virtual assistants, Microsoft is trying to figure out how they might respond to a user switching between languages (English and Hindi, for example) within a sentence or conversation. The Microsoft team looks at every aspect of code-mixing, including text, speech, understanding and recognition.
A recent study by the Microsoft Research team on 1.25 million tweets from Hindi-English bilinguals revealed that the users were more likely to switch code to English when talking about formal or factual concepts. Usage of Hindi was more common when users wanted to reinforce a sentiment or be sarcastic. For example, the Hindi words in the following sentence, “Best wishes to the Indian team. Tiranga aapke saath hai!” are only used to reinforce a positive sentiment of encouragement. Mixing codes for narrative, evaluative or reinforcement purposes was most common, making up 21.64% and 19.24% of all code-mixed tweets in the data set. However, the accuracy of language detection tools varied. Code-mixing in similar or closely related languages such as Spanish and Catalan was more difficult to detect than mixes of dissimilar languages such as Hindi and English. A broader range of languages also compromised accuracy.
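To make the idea of detecting a code-mixed message concrete, here is a toy word-level tagger in Python applied to the tweet quoted above. It is a minimal sketch and not the method used by the Microsoft researchers: the two tiny word lists and the function names `tag_words`, `is_code_mixed` and `switch_points` are assumptions made up for this illustration. Real systems rely on far larger resources and statistical models.

```python
import re

# Hypothetical mini-lexicons, just large enough for the example tweet.
HINDI_WORDS = {"tiranga", "aapke", "saath", "hai"}          # romanized Hindi
ENGLISH_WORDS = {"best", "wishes", "to", "the", "indian", "team"}


def tag_words(text):
    """Label each word as 'hi', 'en' or 'unk' by simple lexicon lookup."""
    tags = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in HINDI_WORDS:
            lang = "hi"
        elif word in ENGLISH_WORDS:
            lang = "en"
        else:
            lang = "unk"
        tags.append((word, lang))
    return tags


def is_code_mixed(text):
    """A message counts as code-mixed if it contains words from both lexicons."""
    langs = {lang for _, lang in tag_words(text) if lang != "unk"}
    return len(langs) > 1


def switch_points(text):
    """Indices (among recognised words) where the language changes."""
    known = [(w, l) for w, l in tag_words(text) if l != "unk"]
    return [i for i in range(1, len(known)) if known[i][1] != known[i - 1][1]]


tweet = "Best wishes to the Indian team. Tiranga aapke saath hai!"
print(is_code_mixed(tweet))   # True
print(switch_points(tweet))   # [6]
```

A lookup like this also hints at why closely related languages are harder to separate: when two languages share much of their vocabulary, many words would appear in both lists and could not be labelled from the word alone.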
The Microsoft research team is applying more data and research to build better translators and language detection software, with the aim of improving the algorithms over time and eventually creating machines that can speak to humans on a more personal and relatable level.