At the outset, I wholeheartedly agree with our former Chief Statistician Pronab Sen that “Statisticians aren’t stupid”. However, one must accept with all humility that statisticians can be wrong and, sadly, in his argument he is blatantly so. The simple point of my initial article (‘The Sample is Wrong’, IE, July 7) was that surveys in India systematically underestimate the level of urbanisation because of fundamental flaws in sampling methodology, and so repeatedly give us biased estimates for various indicators of interest. The former Chief Statistician published a strident rebuttal (‘Statisticians aren’t stupid’, IE, July 10) in which he put forth two main points of criticism. Let me explain why both are misleading: the first is simply incorrect, and the second is incomplete and hence inaccurate.
The main point of Sen’s article is that the National Sample Survey (NSS) and the Census differ in their estimates of the rural share of the population because the two define rural and urban differently. This is beyond bizarre. The exhaustive report of the NSS (the Golden Jubilee Publication, 2001), which is available on the MOSPI website, details the concepts and definitions used in the surveys. The report defines rural and urban clearly, and I reproduce the definition verbatim below so that readers can judge for themselves.
“2.1.6. RURAL AND URBAN AREAS
The rural and urban areas of the country are taken as adopted in the latest population census for which the required information is available with the Survey Design and Research Division of the NSSO. The lists of census villages as published in the Primary Census Abstracts (PCA) constitute the rural areas, and the lists of cities, towns, cantonments, non-municipal urban areas and notified areas constitute the urban areas.
2.1.6.1 URBAN AREA
The urban area of the country was defined in 1971 census as follow(s):
(a) All places with a municipality, corporation or cantonment and places notified as town area
(b) All other places which satisfied the following criteria:
(i) A minimum population of 5000,
(ii) At least 75 per cent of the male working population are non-agriculturists, and
(iii) A density of population of at least 1000 per sq. mile (390 per sq. km.).
However, there are urban areas which do not possess all the above characteristics uniformly. Certain areas were treated as urban on the basis of their possessing distinct urban characteristics, overall importance and contribution to the urban economy of the region.”
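As a reading aid, the quoted criteria can be written down as a simple rule. What follows is only a sketch: the class, field and function names are hypothetical, and it deliberately leaves out the discretionary clause about areas with “distinct urban characteristics”, which cannot be reduced to a threshold.

```python
from dataclasses import dataclass


@dataclass
class Place:
    # Hypothetical record; field names are illustrative, not NSS or Census variables.
    statutory: bool            # municipality, corporation, cantonment or notified town area (part a)
    population: int
    male_nonagri_share: float  # share of male working population outside agriculture, 0 to 1
    density_per_sq_km: float


def is_urban_1971(place: Place) -> bool:
    """Apply parts (a) and (b) of the quoted 1971 Census definition."""
    if place.statutory:
        return True  # part (a): statutory status alone is enough
    # part (b): all three criteria must hold together
    return (
        place.population >= 5000
        and place.male_nonagri_share >= 0.75
        and place.density_per_sq_km >= 390
    )
```

Under this reading, a non-statutory place that meets all three thresholds in part (b) counts as urban, which is the category usually called a Census Town.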
The report convincingly shows that the NSS goes beyond the Statutory Towns (part a) to include Census Towns (part b) in its definition of urban areas. At no point does the NSS document state that its definition of urban is restricted to the Statutory Towns of the Census. Moreover, the NSS goes a step further and also treats as urban those areas that contribute significantly to the urban economy of the region, which is over and above the Census definition of urban.
Based on the evidence presented, it is obviously wrong for Sen to say, “…although all surveys use Census as the sampling frame, Census Towns are treated as a part of the rural sector and are included in the rural sample.” My understanding of the Rural/Urban definition of the NSS is from the publicly available document, and I am not in a position to comment on documents that Sen might be privy to, in his past privileged position, that are not in the public domain.
Furthermore, the notes on sample design and estimation procedure of the Periodic Labour Force Survey (PLFS), another major national survey, state: “When the next Population Census details will be available, the new frame will be used only when Urban Frame Survey blocks for all newly declared Census Towns and Statutory Towns are available for preparation of sampling frame…” This once again clearly suggests that the urban frame of the sample includes the Census Towns and is not restricted to Statutory Towns. All this information is publicly available on the MOSPI website. So it is somewhat surprising that the former Chief Statistician would confidently assert otherwise.
Now to his second point of criticism, concerning response rates. Sen agrees that the estimates will indeed be biased downwards, and more so in urban areas than in rural ones. Yet he dismisses this as a generic problem and plays it down by comparing the non-response rate of the US (30 per cent) with India’s 8 per cent. The devil, as always, hides in the details. One has to go beyond the average response rate and study its distribution. For example, while the overall non-response rate of NFHS-5 was merely 5 per cent (which is great), it was as high as 36 per cent and 16 per cent for men in Chandigarh and Delhi respectively. Non-response is neither uniform nor random across areas. According to our analysis, response rates are strongly and negatively correlated with wealth.
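A small arithmetic sketch shows why a low average non-response rate can still hide a sizeable bias. The numbers are entirely hypothetical and are not taken from any survey; the only assumption carried over from the argument is that wealthier households respond less often.

```python
def true_mean(groups):
    """Population mean; groups is a list of (population_share, response_rate, mean_value)."""
    return sum(share * value for share, _, value in groups)


def respondent_mean(groups):
    """Mean among those who actually respond, with no correction for non-response."""
    responding = sum(share * rate for share, rate, _ in groups)
    return sum(share * rate * value for share, rate, value in groups) / responding


# Hypothetical groups, for illustration only.
groups = [
    (0.8, 0.98, 100.0),  # less wealthy households: nearly everyone responds
    (0.2, 0.70, 300.0),  # wealthier households: 30 per cent do not respond
]

overall_nonresponse = 1 - sum(share * rate for share, rate, _ in groups)

print(f"overall non-response: {overall_nonresponse:.1%}")
print(f"true mean: {true_mean(groups):.1f}")
print(f"respondent mean: {respondent_mean(groups):.1f}")
```

With these made-up figures the overall non-response is under 8 per cent, yet the uncorrected respondent mean comes out at roughly 130 against a true value of 140. The average looks reassuring while the estimate is biased downwards, because the missing responses are concentrated in one group.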
Data quality is a serious concern that needs dedicated and continued effort for its improvement. Our statistical systems will only make progress if we honestly recognise the problems at hand and address them with all humility and transparency. Data systems are not designed to make governments look good or bad. They are meant to provide an objective picture of the ground realities across the country to anybody interested: policymakers, scholars and citizens alike. The sampling methodology of our surveys requires urgent upgradation to keep up with the needs of India’s dynamic economy. “Statisticians aren’t stupid”, but they need to be open-minded to improvement and innovation, or else they can be very wrong.
The writer is Member, Economic Advisory Council to the Prime Minister of India.