Economist Pronab Sen was the first Chief Statistician of India, acting as the functional and technical head of the national statistical system in India, as well as secretary of the Ministry of Statistics & Programme Implementation, Government of India (2007-2010).
Why do we need a Chief Statistician of India (CSI)?
We have a decentralised and federated statistical system, so, other than the national-level Census and surveys, all other data has been allocated to ministries depending on what they look after. When you have this kind of decentralised system, you can have a lot of noise. You need a central coordinating mechanism to make sure that you don’t have contradictions or duplication. For the longest time we did without a CSI. I was the first CSI, appointed in 2007. Before that, this function was performed by the Central Statistical Organisation (CSO). But because the CSO itself had no authority over other ministries, the coordination function was relatively weak.
At a time of technological explosion, when anything numerical reaches people instantaneously on cellphones, what is the role of a central statistical office? Has its role diminished or increased?
The responsibility has increased, because there is so much data out there which may or may not satisfy basic statistical principles, e.g. mobile data or some transactional data. The problem is, you don’t know whether what you have captured is a particular self-selected sub-set of the population. But you use that data to assume that this is how the country is. Now that is wrong. If you make generalised comments on what is going on, you need to adhere to some very basic statistical principles, which none of these data sources do at the moment.
You were mandated with upholding statistical principles and checking the fidelity of the data and how robust it is. What is your view on how this data must be used, stored or kept? Is there a principle on that?
Yes, there is a principle, and it is in fact embodied in our Collection of Statistics Act. The moral principle is that there is a distinction between data which is collected for regulatory compliance purposes and statistical data. When a regulator or the tax department collects data, they want to know who that entity is, by name. When one converts that into statistical data, so that the data is going outside the regulatory or government system to the public at large, there must be complete anonymity. Now, because there is a requirement of anonymity, it places an additional burden on the statistical system, because no one else can verify what we have done. This means that when you do something with this data, you have to do it right, because no one else can cross-check it. The whole problem with big data is that the way the data is collected makes it personalised. Now, we have the Supreme Court going into the question of confidentiality, so the data is not confidential in that sense, but neither is it available to the country at large. This is serious. If you have data of this kind, you can say whatever you want after picking and choosing, depending upon whatever your preconceived desire is. There is no question that it is still very useful, but to make it into a statistical dataset, it needs to go through some statistical checks, which is never ever done.
What’s the relationship of the government of the day with the statistical system?
By and large, the political system at the Centre has not interfered very much. There are pressures. Not to change the data, but sometimes they say just suppress or delay release. That pressure does exist. So the data we get is the best statisticians can get out of the system. Nevertheless, it was felt that there needs to be some distance between the government and the statistical system, which is why the National Statistical Commission (NSC) was created. The NSC is the buffer. In the states, things are not as good. There is a fair amount of evidence of data manipulation, and it is quite understandable. There the issue goes beyond political imagery. Remember, a lot of the transfer of funds from the Centre to the states is dependent on the data, like school enrolment, agriculture, and so on; all allocations are driven by the data. When that happens, there is a very strong incentive for state governments to fiddle with the data. There is good, genuine economic logic. One doesn’t blame them. In agriculture data, it happens all the time: the final data will usually be okay, but states would say, "We are having a drought."
Some states want to shine and beef up statistics; you are citing other pressures too.
It depends on the political leadership. Some want to project themselves, and others want more money. These two desires are contradictory.
Why has data on farmer suicides not been out since 2016? Who would take the call on that?
These are done by the respective states. The question is where it is being compiled. This would be the agriculture ministry.
What is lost when the country does not have a Chief Statistician?
Several things. The statistics that we have been producing are practically on auto-pilot, but if you try anything new, that would be held up. If you have a ministry which doesn’t like what the Ministry of Statistics is saying, then it would want its own survey, which would normally get blocked by the CSI and the NSC. So again, you are going to get dissonance. It is coordination which is very important. The regulatory function is the NSC’s job.
How robust is India’s data? Is it changing?
It remains robust, but some things are changing. What we do is robust and of high quality. The problem is that we don’t measure a lot of things, and for what we do measure, the frequency is not good enough. For example, on employment, we are not getting any figures except every five years. That is not frequent enough. But what we put out is the gold standard. Now when we change data sources because we have a better data source, comparability is a problem. The GDP data is a good example. What you get is a certain amount of non-comparability, but it does not mean this data is not better. The data is actually better, but you can’t really compare.
Why is the GDP back-series not out, to allow for the comparability?
This is a good question and I do not know the answer to that. They were committed to bringing out the back-series, but the date has been delayed time and again. They were supposed to have come out with it last October. Ultimately the buck stops with the CSI.
You mentioned employment. How tough is it to collect statistics when there is over 90 per cent informalisation?
It seriously impacts the collection of data. In countries with just 10-15 per cent informalisation in the economy, one can go to the formal sector, survey the employer and collect data. With informalisation, it effectively becomes a household survey, which makes it a much bigger and tougher exercise.
Some countries have disbanded the Census. Is that a good thing?
It all depends upon how good your civil registration system is, which records births, deaths and marriages. If it has 100 per cent coverage, then with this and the immigration data, you can map the population perfectly. In India, since our civil registration works for only approximately 60 per cent of the population, we cannot do away with the Census so easily.