Data, data everywhere, and everyone is taking notice. Not just tech companies and start-ups, but even governments are seeking to utilise the enormous amount of data being generated by the country’s epochal transition to a “Digital India”. The Niti Aayog has laid out a vision for making available anonymised data across sectors. The Economic Survey 2018-19 dedicated an entire chapter to the topic of data — “of the people, by the people, for the people” — making a bold call to harness data as a “public good” in the service of the people.
We welcome this conversation and concur with the idea that governments should harness data and digital platforms to enable more efficient service delivery, product innovation and evidence-based policy-making. But we also believe that with big data comes big responsibility. As the Cambridge Analytica scandal and numerous data breaches have demonstrated, poorly designed systems create risks for individuals, businesses and governments.
One “big idea” that the Survey discusses at length is the creation of a centralised welfare database of citizens that links different government-held data repositories about citizens. The sharing of information, facilitated by this database, can improve welfare delivery, empower citizens with information and “democratise” data. Many states have already taken steps in this direction through the creation of massive databases of information on every resident.
While this is a bold idea, we believe there is much to be cautious about as we embark on the collection and use of data at scale, as it can lead to the exposure of personal data, both intentional and unintentional. For example, Andhra Pradesh government websites recently displayed, in public view, the Aadhaar numbers of women along with their reproductive history, including whether they had had an abortion. Another website exposed the name and number of every person who purchased medicines from government-run stores, including those buying pills for erectile dysfunction.
Whether or not one believes that Indians care about privacy in general, it is obvious that no Indian would want such information to be publicly available. Researchers at CGAP, Dalberg and Dvara Research spoke to ordinary Indians across India and found overwhelming public concern about the security of the data they share with banks, hospitals and other institutions. An individual’s lack of control over data should therefore not be misinterpreted as indifference. The government must design the proposed databases in ways that allow anonymised personal data to serve its highest purpose, while protecting an individual’s agency over data.
The Survey rightly acknowledges the importance of protecting personal information and proposes an architecture that relies on obtaining individual “consent”. But evidence shows that “consent”, while noble in theory, is deeply flawed in practice. A recent survey by researchers at the National Institute of Public Finance and Policy (NIPFP) shows that even English-speaking postgraduate law students struggle to understand privacy policies before clicking “I Agree”. Therefore, any large-scale data collection must be preceded by extensive on-ground research on how consent can be made meaningful to the individual.
And we need to go beyond consent. Consent must be supplemented with a full range of individual data rights, including the right to delete one’s data. Any data collection should be subject to what lawyers call “collection limitation”, which means that a service provider should collect only the minimal personal data that is proportionate to the stated purpose.
The databases should be designed so that a department cannot see data that it does not need, irrespective of whether citizens give their consent. For example, the Ministry of Chemicals and Fertilisers does not need access to an individual’s medical records. Access to each additional data field should be carefully evaluated.
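To make the idea concrete, here is a minimal sketch in Python of field-level access under stated assumptions. The department keys, field names and the `view_for` helper are all hypothetical and are not drawn from the Survey or any real government system. The sketch shows one possible rule: consent can narrow what a department sees, but it can never widen access beyond the fields that department has been approved for.

```python
from typing import Any

# Hypothetical allowlist: the fields each department has been approved to read.
# Adding a field here is the step that should be "carefully evaluated".
DEPARTMENT_FIELDS: dict[str, frozenset[str]] = {
    "chemicals_and_fertilisers": frozenset({"resident_id", "landholding_hectares"}),
    "health": frozenset({"resident_id", "medical_history"}),
}


def view_for(department: str, record: dict[str, Any],
             consented_fields: set[str]) -> dict[str, Any]:
    """Return only the fields that are BOTH approved for the department
    and consented to by the individual."""
    # Unknown departments get an empty allowlist, so they see nothing.
    allowed = DEPARTMENT_FIELDS.get(department, frozenset())
    visible = allowed & consented_fields
    return {key: value for key, value in record.items() if key in visible}


# Illustrative record with made-up values.
record = {
    "resident_id": "R-001",
    "landholding_hectares": 1.5,
    "medical_history": "confidential",
}

# Even when the individual consents to sharing every field, the fertiliser
# department's view still excludes medical history.
fertiliser_view = view_for("chemicals_and_fertilisers", record, set(record))
```

Under this design, granting a department a new field means changing the allowlist itself, which makes each expansion of access a visible and reviewable decision rather than a by-product of a consent form.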
The Survey’s emphasis on data security and encryption is encouraging. However, encryption is not a silver bullet. The government should implement bolder technical safeguards. One such feature is decentralised storage of data, for example on the individual’s personal device, rather than in a central database. Another is anonymisation at source, wherein the data is stripped of any personal information as soon as it is created. These will prevent the creation of data honeypots that can be attacked by hackers or breached accidentally. In addition, research shows that changing the default option, so that people are asked whether they want to “opt in” to data sharing rather than being left to “opt out”, can make a big difference to how much data gets shared.
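Two of these safeguards, anonymisation at source and an opt-in default, can be illustrated with a short Python sketch. Every name in it (`PERSONAL_FIELDS`, `SharingPreference`, `record_event`) is a hypothetical chosen for illustration. It also simplifies: removing direct identifiers is only a first step, since combinations of the remaining fields can sometimes re-identify a person.

```python
from dataclasses import dataclass

# Hypothetical list of direct identifiers to remove the moment a record is created.
PERSONAL_FIELDS = frozenset({"name", "aadhaar_number", "phone"})


@dataclass
class SharingPreference:
    # Opt-in default: nothing is shared unless the person actively says yes.
    share_data: bool = False


def anonymise_at_source(event: dict) -> dict:
    """Drop direct identifiers before the record is stored or sent anywhere."""
    return {key: value for key, value in event.items() if key not in PERSONAL_FIELDS}


def record_event(event: dict, preference: SharingPreference,
                 shared_store: list) -> dict:
    """Anonymise first, then share only if the person has opted in."""
    anonymised = anonymise_at_source(event)
    if preference.share_data:
        shared_store.append(anonymised)
    return anonymised


# Illustrative purchase with made-up values.
shared_store: list = []
purchase = {"name": "A. Resident", "aadhaar_number": "0000", "medicine": "paracetamol"}

# With the default preference, the anonymised record is not shared at all.
kept_locally = record_event(purchase, SharingPreference(), shared_store)
```

Because identifiers are removed before anything is stored centrally, there is less of a honeypot to breach, and because sharing defaults to off, inaction by the individual never results in disclosure.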
And finally, citizens must have time-bound and easily accessible recourse against any data breaches or harms. They must be able to reach out to an adjudicatory body through multiple offline and online modes. This body must be empowered to penalise both public and private bodies that use the databases, and redress must be speedy.
India stands at the cusp of a major opportunity, one where data and digital platforms can become an enabler of a meaningful life for every Indian. This is also the opportunity for India to become a global leader and present a new approach that other countries can emulate. But to achieve this, the boldness of our vision must be tempered with a thoughtfulness of approach. Maximising public good while safeguarding against harm must be the mantra for the new digital India.
This article first appeared in the print edition on July 31, 2019 under the title ‘Big data, big responsibility’. The writers work at Omidyar Network India, an investment firm focussed on social impact through equity investments and grants, with an emphasis on technology.