Integrating surveys and administrative data

Administrative data are often used to update the farm register and to improve the sample design for agricultural surveys. At Statistics Canada, a possible integration of administrative data and surveys is foreseen: 'Efforts are underway to explore the wealth of data available and to determine how they can be used to better understand our respondents and to improve sample selection to reduce burden' (Korporal, 2005).

A review of statistical methodologies for integration of surveys and administrative data is given in the ESSnet Statistical Methodology Project on Integration of Surveys and Administrative Data (ESSnet ISAD, 2008), which focuses mainly on probabilistic record linkage, statistical matching and micro integration processing. For a review of the methods for measuring the quality of estimates when combining survey and administrative data, see Lavallee (2005).

Confidentiality problems frequently arise when data have to be retrieved from different registers and when survey data are combined with register data. One example is the Italian experiment on the measurement of self-employment income within EU-SILC (European Union Statistics on Income and Living Conditions); see ESSnet ISAD (2008). Since 2004, the Italian team has carried out multi-source data collection, based on face-to-face interviews and on the linkage of administrative and survey data. The aim is to improve the quality of data on income components and the related earners by imputing item non-responses and reducing measurement errors. Administrative and survey data are integrated at micro level by linking individuals through key variables. However, 'the Personal Tax Annual Register, including all the Italian tax codes, cannot be used directly by Istat (the Italian Statistical Institute). Therefore, record linkage has to be performed by the tax agency on Istat's behalf. The exact linkage performed by the Tax Agency produces 7.8% unmatched records that are partially retrieved by means of auxiliary information (1.5%).'
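
The linkage logic described above (an exact match on a key variable, followed by an attempt to retrieve unmatched records through auxiliary information) can be illustrated with a minimal sketch. This is not the procedure used by Istat or the Tax Agency, whose details are not given here. The field names (tax_code, surname, birth_date, id, reg_id), the choice of surname and date of birth as auxiliary information, and the sample records are all hypothetical and serve only to show the two-pass idea.

```python
def link_records(survey, register):
    """Two-pass deterministic linkage (illustrative sketch only).

    Pass 1: exact match on the key variable (here a hypothetical 'tax_code').
    Pass 2: for records left unmatched, try auxiliary information
            (here, assumed to be normalised surname plus date of birth)
            and accept the match only if it is unique.
    """
    # Index the register by the key variable, skipping missing codes.
    by_code = {r["tax_code"]: r for r in register if r.get("tax_code")}

    # Index the register by the auxiliary variables.
    by_aux = {}
    for r in register:
        key = (r["surname"].strip().lower(), r["birth_date"])
        by_aux.setdefault(key, []).append(r)

    linked, unmatched = [], []
    for s in survey:
        match = by_code.get(s.get("tax_code"))
        method = "exact"
        if match is None:
            key = (s["surname"].strip().lower(), s["birth_date"])
            candidates = by_aux.get(key, [])
            if len(candidates) == 1:  # ambiguous candidates are not linked
                match = candidates[0]
                method = "auxiliary"
        if match is None:
            unmatched.append(s)
        else:
            linked.append((s["id"], match["reg_id"], method))
    return linked, unmatched


# Hypothetical sample data, invented for illustration.
survey = [
    {"id": 1, "tax_code": "AAA", "surname": "Rossi", "birth_date": "1970-01-01"},
    {"id": 2, "tax_code": "BBX", "surname": "Bianchi", "birth_date": "1982-05-17"},
    {"id": 3, "tax_code": None, "surname": "Verdi", "birth_date": "1990-09-30"},
]
register = [
    {"reg_id": "R1", "tax_code": "AAA", "surname": "Rossi", "birth_date": "1970-01-01"},
    {"reg_id": "R2", "tax_code": "BBB", "surname": "BIANCHI ", "birth_date": "1982-05-17"},
]

linked, unmatched = link_records(survey, register)
unmatched_rate = len(unmatched) / len(survey)
```

In this sketch the second survey record carries a mistyped code and is retrieved only through the auxiliary variables, while the third has no counterpart in the register and remains unmatched. Requiring a unique auxiliary candidate is one cautious design choice among several; a probabilistic linkage method would instead weigh partial agreements across variables.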

