A new data-related controversy has erupted after the government aborted the publication of the report of the household consumer expenditure survey (CES) conducted by the National Sample Survey Organisation (NSSO) during 2017-18. This survey is one of the oldest series of surveys, undertaken by NSSO since the 1950s, and is the precursor to the present Living Standard Measurement Surveys, highly favoured and supported by international agencies like the World Bank for estimating poverty. In India, the data from this survey has been the basis for estimating poverty numbers ever since the topic of poverty took centre stage in our political and economic discourse. Most Indian economists will be familiar with the CES data and its limitations. All along, there were also concerns about the potential under-reporting and reliability of the consumption data due to the increasing divergence between the household-level data and the corresponding consumption data provided by the national accounts. It appears from the government’s press note that it has also checked the report against the actual production of goods and services. The late B S Minhas, who was chairman of the NSSO governing council, was the first to explore these divergences. His findings showed that the divergences were not entirely due to under-reporting in the surveys.
It is well recognised that collecting data to arrive at the monthly household consumption expenditure estimate on all goods and services is not an easy task. Economists and survey experts have spent considerable time understanding the data limitations and improving the data collection procedures. In fact, the CES data and the survey methodology have generated a large amount of literature, some of which is documented in The Great Indian Poverty Debate edited by Angus Deaton and Valerie Kozel. The under-reporting of consumption due to lapses in recall and the adoption of an appropriate recall period were also studied in great detail by NSSO. This writer was once part of a large pilot study where the respondents were provided with containers to measure the cereals, pulses and milk they consumed and a notebook to write down the quantity consumed on a daily basis. Households were also given a packet of salt, on the premise that salt consumption was invariant to income levels. The salt remaining at the end of the week was measured to get the most accurate estimate of salt consumption as a control variable. These were genuine efforts to understand the reporting limitations raised by data users, concerns that peaked after the 1999-2000 survey, when the NSSO used two recall periods.
Now, fast-forward to the present. The CES report for the year 2017-18 has been kept pending since June 2019 for what now transpires to be an internal examination of the divergence with other sources. This examination has purportedly led to recommendations for several refinements in the survey methodology for implementation in future surveys. The ministry has, therefore, decided not to release the survey results pending these refinements. We now have to wait till possibly 2023 to know how living standards have changed since 2011-12.
The NSSO surveys are designed under the guidance of external and internal experts. The fieldwork and data processing are done by professionals and the reports are prepared following well-established procedures for data checking and cleaning. If there were data quality issues, they would have been discovered long before the report was drafted. Even assuming severe inconsistencies in the data collected, the right course would have been to publish a report with the findings and the perceived limitations, which could have been of use to researchers.
The junking of the NSSO survey also raises another question. Usually, all regular NSSO surveys are repeated by the state/UT governments following identical survey instruments and survey designs, using their own resources. The idea is that, by pooling the central and state samples, we can get estimates at the district level. In this case, it is not clear if the surveys done by the states/UTs have also been junked. Hopefully, some state governments will come out with their reports in due course.
Government statistics always come attached with conceptual limitations, data collection problems, sampling and non-sampling errors and issues of comparability with other sources. But, now, a political dimension has been added. We have become painfully aware of this extra dimension in recent times starting with the GDP data, the employment data and now the consumption data. The statistical and economic aspects of data that can be researched and debated openly are now being relegated to the background. Researchers are denied access to the data. We now see discussions aided by leaked reports and quick-fix social media comments in place of scientific data analysis. The once credible and open Indian statistical system is now turning away from objectivity and introspection. The institutions that were set up to safeguard its autonomy and independence are becoming insignificant.
The collection of data through surveys and censuses is a publicly funded exercise. Data collection costs and respondent fatigue from intrusive data gathering are on the rise. While new censuses and surveys are announced with alarming ease, no proper statistical audits are ever done for these publicly funded projects. Further, though we were an early votary of open government data, the types of data to be kept in the open remain the prerogative of the data collecting agency. We have seen no reports from the ongoing Periodic Labour Force Surveys (PLFS) since May.
These are challenging times for official statisticians. They are told that data is the new oil. But the rising reluctance of respondents, a data-guzzling media on a 24-hour watch and the data demands of international commitments like SDG monitoring make digging for this new oil extremely difficult. That this oil should be acceptable to the government makes it a bigger challenge.
This article first appeared in the print edition on November 19, 2019 under the title ‘The politics of numbers.’ The writer is former acting head of the National Statistical Commission.