Data access and dissemination

In the framework of the FNAC, data access and dissemination represented a very important opportunity to experiment with new techniques and technologies. The Second 1 Training courses on hardware were concerned with personal computers, servers, Unix machines, networks and maintenance, while training courses on software dealt with operating systems (MS-DOS, Unix, Windows, Windows NT), programming languages, spreadsheets, word processors, databases, and statistical software (especially SAS). 2 Many...

Agriculture statistics: a centralized approach

The production and dissemination of national and provincial estimates on the agriculture sector is part of Statistics Canada's mandate. The statistical agency carries out monthly, quarterly, annual and/or seasonal data collection activities related to crop and livestock surveys and farm finances as needed. It also conducts the quinquennial Census of Agriculture in conjunction with the Census of Population (to enhance the availability of socio-economic data for this...

Preconstruction analysis

Before building a new frame, analysis is conducted to determine which states are most in need of one. Generally three to four states are selected to receive a new frame each year. Data collected from approximately 11000 segments during the JAS is used to determine the extent to which the land-use stratification has deteriorated for each state. This involves comparing the coefficients of variation for the survey estimates of major items over the life of the frame. Typically states with the...
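The comparison of coefficients of variation described above can be sketched in a few lines. This is a minimal illustration only: the state names, years and CV values are hypothetical, and ranking states by the change in CV between the first and last survey year is an assumption made for the example, not necessarily the selection rule actually applied.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the CV, in percent, of a survey estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


def cv_drift(cv_by_year):
    """Change in CV between the first and last survey year of a frame."""
    years = sorted(cv_by_year)
    return cv_by_year[years[-1]] - cv_by_year[years[0]]


def rank_states_by_deterioration(cv_history):
    """Order states from the largest to the smallest increase in CV."""
    return sorted(cv_history, key=lambda s: cv_drift(cv_history[s]), reverse=True)


# Hypothetical CVs (percent) of one major item over the life of each frame.
cv_history = {
    "State A": {2001: 4.0, 2005: 4.5, 2009: 5.1},
    "State B": {2001: 3.8, 2005: 5.9, 2009: 8.2},
    "State C": {2001: 5.0, 2005: 5.0, 2009: 4.9},
}

ranking = rank_states_by_deterioration(cv_history)
print(ranking)
```

States at the head of the ranking are those whose estimates have lost the most precision, which is one plausible signal that the stratification has deteriorated.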

Farm structure surveys

The farm structure survey (FSS) is considered in UNECE countries to be the backbone of the agricultural statistics system. Together with agricultural censuses, FSSs make it possible to undertake policy and economic analysis at a detailed geographical level. This type of analysis at regular time intervals is considered essential. In the EU, several simplifications have been made in recent years. From 2010 the frequency of FSSs will be reduced from every two to every three years. The decennial...

Small and diversified farm operations

For agricultural statistics in general, and for the EU and the NASS survey programmes in particular, the coverage (for the different crops such as acres of corn or the number of cattle represented by the farms on the frame) is a very important issue. In the regulations used for EU statistics, the desired accuracy and coverage are described in detail. Furthermore, countries are requested to provide detailed metadata and quality information. In general, active records eligible for survey samples...

A simulated exercise

The proposed algorithm has been applied to simulated data to test its performance. The first stage of the simulation experiment consisted of generating 100 bivariate observations from a variable (X, Y) on a 10 x 10 regular lattice, with X uniformly distributed U (0,1) and Y generated according to with N(0, ae 1 a), considered henceforth as the true error distribution. In more detail, we obtain different observations that are classified in a certain number K of groups, by using different values...

ABARE broadacre survey data

The data used in these case studies were obtained from the annual broadacre survey run by the Australian Bureau of Agricultural and Resource Economics (ABARE) from 1977-78 to 2006-07 (ABARE, 2003). The survey covers broadacre agriculture across all of Australia (see Figure 20.1), but excludes small and hobby farms. Broadacre agriculture involves large-scale dryland cereal cropping and grazing activities relying on extensive areas of land. Most of the information outlined in Section 20.3 was...

Accuracy, objectivity and cost-efficiency

In this section we discuss the main properties that we would like to find in an estimation method, in particular for agricultural statistics. The term accuracy corresponds in principle to the idea of small bias, but the term is often used in practice in the sense of small sampling error (Marriott, 1990). Here we use it to refer to the total error, including bias and sampling error. Non-sampling errors are generally more difficult to measure or model (Lessler and Kalsbeek, 1999; Gentle et al.,...

Accuracy of estimates

Hartley (1962) proposed to use the variance for proportional allocation in stratified sampling as an approximation of the variance of the post-stratified estimator of the population total with simple random sampling in the two frames (ignoring finite-population corrections): $$\operatorname{var}(\hat{Y}) \approx \frac{N_A^2}{n_A}\left[\sigma_a^2(1-\alpha) + p^2\sigma_{ab}^2\alpha\right] + \frac{N_B^2}{n_B}\left[\sigma_b^2(1-\beta) + q^2\sigma_{ab}^2\beta\right],$$ where $\sigma_a^2$, $\sigma_b^2$ and $\sigma_{ab}^2$ are the population variances within the three domains, $\alpha = N_{ab}/N_A$ and $\beta = N_{ab}/N_B$. Under a linear cost function, the values for $n_A/N_A$, $p$ and $n_B/N_B$...
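To make the approximation concrete, the sketch below evaluates it numerically. It assumes the usual dual-frame form of the variance, (N_A^2/n_A)[var_a(1 - alpha) + p^2 var_ab alpha] + (N_B^2/n_B)[var_b(1 - beta) + q^2 var_ab beta], with q = 1 - p, alpha = N_ab/N_A and beta = N_ab/N_B. The function name, argument names and all input values are hypothetical.

```python
def hartley_variance(N_A, N_B, N_ab, n_A, n_B, var_a, var_b, var_ab, p):
    """Approximate variance of a dual-frame estimator of a total.

    Assumes simple random sampling in both frames and ignores
    finite-population corrections. p weights the overlap-domain estimate
    from frame A; q = 1 - p weights the one from frame B.
    """
    alpha = N_ab / N_A  # share of frame A lying in the overlap domain
    beta = N_ab / N_B   # share of frame B lying in the overlap domain
    q = 1.0 - p
    term_a = (N_A ** 2 / n_A) * (var_a * (1 - alpha) + p ** 2 * var_ab * alpha)
    term_b = (N_B ** 2 / n_B) * (var_b * (1 - beta) + q ** 2 * var_ab * beta)
    return term_a + term_b


# Hypothetical frame sizes, sample sizes and domain variances.
variance = hartley_variance(
    N_A=1000, N_B=500, N_ab=200, n_A=100, n_B=50,
    var_a=4.0, var_b=9.0, var_ab=1.0, p=0.5,
)
print(variance)
```

Evaluating the function over a grid of p values is a simple way to see how the choice of p changes the variance for fixed sample sizes.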

Acknowledgements

This chapter is based on the in-depth review of agricultural statistics in the UNECE region prepared for the Conference on European Statistics (CES). In its third meeting of 2007/2008, the CES Bureau decided on an in-depth review of this topic. It was requested that the review take into account recent developments such as the increase in food prices and the impact of climate change, and incorporate the final conclusions and recommendations reached at the fourth International Conference of...

Agricultural monetary statistics

The collection and validation of agricultural monetary statistics cover the economic accounts for agriculture and forestry, the agricultural labour input statistics, and agricultural prices in absolute terms and in indices. The agricultural accounts data at the national level are regulated by legislation which prescribes the methodological concepts and definitions as well as data delivery. The accounts data at regional level and the agricultural price statistics are transmitted on the basis of gentlemen's agreements....

Agricultural Survey Methods

'G. d'Annunzio' University, Chieti-Pescara, Italy
National Institute of Statistics (ISTAT), Rome, Italy
A John Wiley and Sons, Ltd., Publication
This edition first published 2010
© 2010 John Wiley & Sons Ltd
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our...

Agri-environmental indicators

The requirement to include environmental assessments in all policy areas has led to the collection in the EU of a set of 28 agri-environmental indicators; these have been selected from a group of 75 indicators that are usually collected. Many relate to other environmental statistics already collected, except that they are broken down by the agricultural sector. Some of them relate to specific policy actions and are therefore available from administrative sources, where other indicators have been...

Algorithms for automatic localization of random errors

An overview of algorithms for solving the localization problem for random errors in numerical data automatically based on the Fellegi-Holt paradigm is given in De Waal and Coutinho (2005). In this section we briefly discuss a number of these algorithms. Fellegi and Holt (1976) describe a method for solving the error localization problem automatically. In this section we sketch their method; for details we refer to the original article by Fellegi and Holt (1976). The method is based on generating...
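The Fellegi-Holt paradigm asks for the smallest set of fields whose values can be changed so that the record passes every edit. The sketch below illustrates that idea on a small categorical record by brute-force search. It is not the implied-edit generation method that Fellegi and Holt (1976) describe, and it does not scale to realistic numerical data. The field names, domains and edits are hypothetical.

```python
from itertools import combinations, product


def passes_all_edits(record, edits):
    """True when the record satisfies every edit rule."""
    return all(edit(record) for edit in edits)


def localize_errors(record, domains, edits):
    """Return a smallest set of fields that can be changed to satisfy all edits.

    Subsets of fields are tried in order of increasing size; for each subset
    every combination of admissible values is tested. Returns None when no
    assignment of values satisfies the edits.
    """
    fields = list(record)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            for values in product(*(domains[f] for f in subset)):
                candidate = dict(record)
                candidate.update(zip(subset, values))
                if passes_all_edits(candidate, edits):
                    return set(subset)
    return None


# Hypothetical domains and edit rules for a farm questionnaire.
domains = {
    "has_dairy_cows": ("yes", "no"),
    "sells_milk": ("yes", "no"),
    "has_milking_equipment": ("yes", "no"),
}
edits = [
    lambda r: not (r["sells_milk"] == "yes" and r["has_dairy_cows"] == "no"),
    lambda r: not (r["has_milking_equipment"] == "yes" and r["has_dairy_cows"] == "no"),
]

record = {"has_dairy_cows": "no", "sells_milk": "yes", "has_milking_equipment": "yes"}
fields_to_change = localize_errors(record, domains, edits)
print(fields_to_change)
```

In this example both edits fail, yet changing the single field has_dairy_cows repairs the record, whereas changing either of the other fields alone would leave one edit violated.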

Alternatives to direct tabulation

One approach to reducing the risk of bias due to undercoverage of administrative registers and, at the same time, avoiding double data acquisition is to sample farms from a complete and updated list and perform record linkage with the register in order to capture register data corresponding to farms selected from the list. If the register is considered unreliable for some variables, related data have to be collected through interviews as well as data not found in the register due to record...

An estimate of errors of commission and omission in the IACS data

Errors of commission and omission in the IACS data were estimated in the study carried out by Consorzio ITA (AGRIT 2000) in the Italian regions of Puglia and Sicily for durum wheat in 2000. In both regions, an area frame sample survey based on segments with permanent physical boundaries was done. The ITA estimates of durum wheat cultivation were 435 487.3 ha in Puglia (with a coefficient of variation (CV) of 4.8%) and 374 658.6 ha in Sicily (CV 5.9%). Then, for each segment, the area of durum...

An overview of automatic editing

When automatic editing is applied, records are edited by computer without human intervention. Automatic editing is the opposite of the traditional interactive approach to the editing problem, where each record is edited manually. We can distinguish two kinds of errors: systematic ones and random ones. A systematic error is an error reported consistently by (some of) the respondents. It can be caused by the consistent misunderstanding of a question by (some of) the respondents. Examples are when...

Analysis

The analysis system is perhaps the module of interest to the broadest audience in the NASS. This module will provide the tools and functionality through which analysts in Headquarters and our SSOs will interact with the data. All processes prior to this point are ones with no manual intervention or, in the case of data capture, one in which only a few will touch the data. As one of our senior executives aptly put it: 'All this other stuff - data capture, edit and imputation - will happen while...

Analysis of the 2006 AGRIT data

The aim of the AGRIT survey is to produce area estimates of the main crops, as given in Table 13.1, at a national, regional and provincial level. Direct estimates of the area LCT_c covered by crop type c (c = 1, ..., C), estimated standard errors (SE) and coefficients of variation (CV) were obtained as described in Section 13.2 for the 103 Italian provinces. The auxiliary variable is the number of pixels classified as crop type c according to the satellite data in the small area d, d = 1, ..., D, with D...

Area frames

When completeness is not guaranteed by the combined use of different registers, an area frame should be adopted in order to avoid bias, since an area frame is always complete and remains useful for a long time (Carfagna, 1998; see also Chapter 11 of this book). The completeness of area frames suggests their use in many cases: if another complete frame is not available; if an existing list of sampling units changes very rapidly; if an existing frame is out of date; if an existing frame was...

Assuring quality

Measuring quality, however difficult it might be in specific cases, is only a lower level of quality ambition. Setting and meeting quality goals are, both from the user's and the producer's perspective, a higher ambition. The term 'quality assurance' is often used for the process of ensuring that quality goals are consistently met.

16.3.1 Quality assurance as an agency undertaking

Traditionally, we tend to think of quality as a property of a single variable. We measure this quality and try to...

Automatic editing of systematic errors

In this section we discuss several classes of systematic errors and ways to detect and correct them. As already mentioned, a well-known class of systematic errors is the so-called thousand errors. These are cases where a respondent replies in units rather than in the requested thousands of units. The usual way to detect such errors is - similar to selective editing - by considering 'anticipated' values, which could, for instance, be values of the same variable from a previous period or values...
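A thousand error can be screened for by comparing each reported value with an anticipated value, such as the value from a previous period. The sketch below is one simple way to do this, assuming a ratio test: a report is flagged when it is roughly 1000 times the anticipated value. The function names and the bounds of 300 and 3000 are hypothetical choices for illustration, not thresholds taken from the text.

```python
def is_thousand_error(reported, anticipated, lower=300.0, upper=3000.0):
    """Flag a report whose ratio to the anticipated value is close to 1000.

    The bounds are illustrative; in practice they would be tuned to the
    survey variable in question.
    """
    if anticipated <= 0:
        return False  # no usable reference value, so no decision is made
    ratio = reported / anticipated
    return lower <= ratio <= upper


def correct_thousand_error(reported, anticipated):
    """Divide a flagged report by 1000; leave other reports unchanged."""
    if is_thousand_error(reported, anticipated):
        return reported / 1000.0
    return reported


# Turnover requested in thousands; the previous period's value was 1300.
print(correct_thousand_error(1_250_000, 1_300))  # reported in units
print(correct_thousand_error(1_400, 1_300))      # plausible as reported
```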

Balanced sampling

The fundamental property of a balanced sample is that the HT estimators of the totals of a set of auxiliary variables are equal to the totals we wish to estimate. This idea dates back to the pioneering work by Neyman (1934) and Yates (1946). More recently, balanced sampling designs have been advocated by Royall and Herson (1973) and Scott et al. (1978), who pointed out that such sampling designs ensure the robustness of the estimators of totals, where 'robustness' essentially means 'protection...
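The balancing property can be checked directly for any given sample: the Horvitz-Thompson (HT) estimate of each auxiliary total must reproduce the corresponding population total. The sketch below performs that check. It does not select a balanced sample, and the population, inclusion probabilities and tolerance are hypothetical.

```python
def ht_total(values, inclusion_probs):
    """Horvitz-Thompson estimator of a total: sum of y_k / pi_k over the sample."""
    return sum(v / p for v, p in zip(values, inclusion_probs))


def is_balanced(sample_x, sample_pi, population_totals, tol=1e-9):
    """Check whether the HT estimates of all auxiliary totals match.

    sample_x holds one tuple of auxiliary values per sampled unit;
    population_totals holds the known total of each auxiliary variable.
    """
    for j, total in enumerate(population_totals):
        estimate = ht_total([x[j] for x in sample_x], sample_pi)
        if abs(estimate - total) > tol * max(1.0, abs(total)):
            return False
    return True


# Hypothetical population of 4 farms with auxiliary variables (1, area):
# areas 10, 20, 30, 40, so the totals are N = 4 and total area = 100.
totals = (4.0, 100.0)
pi = [0.5, 0.5]  # each sampled farm has inclusion probability 1/2

balanced = is_balanced([(1, 20.0), (1, 30.0)], pi, totals)
unbalanced = is_balanced([(1, 10.0), (1, 20.0)], pi, totals)
print(balanced, unbalanced)
```

Both samples reproduce the population size, but only the first also reproduces the total area, so only the first is balanced on both auxiliary variables.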

Calibration and regression estimators

Calibration and regression estimators combine more accurate and objective observations on a sample (e.g. ground observations) with the exhaustive knowledge of a less accurate or less objective source of information, or co-variable (classified images). A characteristic property of these...

Table 12.2 Unweighted and weighted (unbiased) confusion matrix in LUCAS photo-interpretation for stratification. Columns: photo-interpretation class, arable, permanent crops, permanent grass, forest and wood, other, total.

Calibration weighting

One of the most relevant problems encountered in large-scale business (e.g. agricultural) surveys is finding estimators that are (i) efficient and (ii) derived in accordance with criteria of internal and external consistency (see below). The methodology presented in this section satisfies these requirements. The class of calibration estimators (Deville and Särndal, 1992) is an instance of a very general approach to the use of auxiliary information in the estimation procedures in finite-...
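As a minimal illustration of calibration, the sketch below adjusts design weights so that the weighted sample total of one auxiliary variable matches its known population total. It assumes the linear (chi-square distance) case with a single auxiliary variable, for which the calibrated weights have the closed form w_k = d_k (1 + lambda x_k). The function name and all numbers are hypothetical.

```python
def linear_calibration_weights(design_weights, x, x_total):
    """Calibrate weights on one auxiliary variable (linear case).

    Minimising sum((w_k - d_k)^2 / d_k) subject to sum(w_k * x_k) = x_total
    gives w_k = d_k * (1 + lam * x_k), with
    lam = (x_total - sum(d_k * x_k)) / sum(d_k * x_k^2).
    """
    d_x = sum(d * xi for d, xi in zip(design_weights, x))
    d_xx = sum(d * xi * xi for d, xi in zip(design_weights, x))
    lam = (x_total - d_x) / d_xx
    return [d * (1.0 + lam * xi) for d, xi in zip(design_weights, x)]


# Hypothetical sample of three farms, each with design weight 10.
d = [10.0, 10.0, 10.0]
x = [2.0, 4.0, 6.0]      # auxiliary variable, e.g. area from a register
known_total = 150.0      # known population total of x

w = linear_calibration_weights(d, x, known_total)
calibrated_total = sum(wi * xi for wi, xi in zip(w, x))
print(w, calibrated_total)
```

The design weights alone give a total of 120; after calibration the weighted total equals the known value of 150, which is the external consistency requirement in its simplest form.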

Case study: the province of Foggia

Forecasting agricultural production is of critical importance for policy-makers as well as for market stakeholders. In the context of globalization, it is of great value to obtain as quickly as possible accurate overall estimates of crop production on an international scale. For example, timely and accurate estimates of durum wheat yield are an important management instrument for the European Commission's Directorate-General for Agriculture and Rural Development (Baruth et al., 2008). The main objective...

Changing concepts of quality

The most significant change is that the concepts of quality are becoming customer-driven, or externally driven, rather than determined internally by the statistical organization. The availability of official statistics on the Internet is increasing the audience of data users. In previous times, the data users were fewer in number and usually very familiar from long experience with the data they were being provided. The new audience of data users is increasingly becoming more sophisticated and also more...

Combining a list and an area frame

The most widespread way to avoid instability of estimates and to improve their precision is to adopt a multiple-frame sample survey design. For agricultural surveys, a list of very large operators and of operators that produce rare items is combined with the area frame. If this list is short, it is generally easy to construct and update. A crucial aspect of this approach is the identification of the area sample units included in the list frame. When units in the area frame and in the list...

Combining ex ante and ex post auxiliary information: a simulated approach

In this section we measure the efficiency of the estimates produced when the sampling design uses only the ex ante, only the ex post or a mixture of the two types of auxiliary information. Thus, we performed some simulation experiments whose ultimate aim is to test the efficiency of some of the selection criteria discussed above. After giving a brief description of the archive at hand, we detail the Istat survey based on this archive. Then we review the technical aspects of the sample selection...

Complex sample designs

Complex designs are generally adopted in the different frames to improve the efficiency, and this affects the estimators. Hartley (1974) and Fuller and Burmeister (1972) considered the case in which at least one of the samples is selected by a complex design, such as stratified or multistage sampling. Skinner and Rao (1996) proposed alternative estimators under complex designs where the same weights are used for all the variables. In particular, they modified the estimator suggested by Fuller...

Concept and notation

In order to apply variable sampling rates to achieve the required sample size by sector, it is useful to begin by classifying areas into groups on the basis of which particular sector predominates in the area. The basic idea is as follows. For each sector, the corresponding 'stratum of concentration' is defined to consist of the set of area units in which that sector 'predominates' in the sense defined below. One such stratum corresponds to each sector. The objective of constructing such strata...

Conclusions

One of the key issues in edit development is determining what edits are essential to ensure the integrity of the data without over-editing. This is something that the edit group and Processing Sub-Team have struggled with. The team members represent an interesting blend of cultures. The longer-term, pre-census NASS staff developed within a culture of processing the returns from its sample surveys, where every questionnaire is hand-reviewed and corrected as necessary. While there is a need for...

Contents

1 The present state of agricultural statistics in developed countries situation
1.2 Current state and political and methodological context
1.2.2 Specific agricultural statistics in the UNECE region
1.3 Governance and horizontal issues
1.3.1 The governance of agricultural statistics
1.3.2 Horizontal issues in the methodology of agricultural statistics
1.4 Development in the demand for agricultural statistics
1.5 Conclusions
Acknowledgements
Reference
Part I Census,...

Coverage Evaluation Study

The snapshot records that were neither flagged as must-gets nor found among the census records composed the survey frame for a Coverage Evaluation Study (CES). A sample of these farms was selected, prepared, collected, matched and searched much as in the FCFU, but at a later time. The farms that were confirmed as active and not counted in the census were weighted...

Table 5.1 Census of Agriculture coverage statistics, 2001 and 2006

Creating a farm register: the population

When a statistical register is created, all relevant sources should be used so that the coverage will be as good as possible. When microdata from different sources are integrated many quality issues become clear that otherwise would have escaped notice. However, a common way of working is to use only one administrative source at a time. Figure 2.5 shows the consequences of this. The Swedish Business Register has been based on only one source, the Business Register of the Swedish Tax Board....

Data and modelling issues

Farm economic surveys may fail to capture key output or input variables that are essential for economic analysis. If a database does not include the required economic information then it is of little use for economic analysis. Most econometric modelling makes the assumption of homogeneity of parameters. A number of statistical models have been developed to deal with this situation. One of the most widely used is the mixed regression model (Goldstein, 1995; Laird and Ware,...

Data by StrCon and sector, aggregated over areas

Once the StrCon has been defined for each area unit, the basic information aggregated over areas on the numbers of units classified by sector and StrCon can be represented as in Table 6.1. By definition, there is a one-to-one correspondence between the sectors and the StrCon. Subscript i (rows 1 to I) refers to the sector, and j (columns 1 to I) to StrCon. Generally, any sector is distributed over various StrCon; any StrCon contains establishments from various sectors, in fact it contains all...

Description of the basic design

Let us now consider some basic features of an area-based multi-stage sampling design for an integrated survey covering small-scale economic units of different types. The population of units comprises a number of sectors, such as different types of establishments. We assume that, on the basis of some criterion, each establishment can be assigned to one particular sector. Sample size requirements in terms of number of establishments n.i have been specified for each sector i. The available...

Design issues

Efficient methods of designing surveys for use with direct estimates of large-area totals or means have received a lot of attention over the past 50 years or so. But survey design issues that have an impact on small-area statistics should also be considered. Singh et al. (1994) proposed several methods for use at the design stage to minimize the use of indirect small-area estimates. Those methods include (i) replacing clusters by using list frames, (ii) use of many strata to provide better...

Direct tabulation of administrative data

Two interesting studies (Selander et al., 1998; Wallgren and Wallgren, 1999), financed jointly by Statistics Sweden and Eurostat, explored the possibility of producing statistics on crops and livestock through the Integrated Administrative and Control System (IACS, created for the European agricultural subsidies) and other administrative data. After a comparison of the IACS data with an updated list of farms, the first study came to the following conclusion: 'The IACS register is generally not...

Disadvantages of direct tabulation of administrative data

When administrative data are used for statistical purposes, the first problem to be faced is that the information acquired is not exactly that which is needed, since questionnaires are designed for specific administrative purposes. Statistical and administrative purposes require different kinds of data to be collected and different acquisition methods (which strongly influence the quality of data). Strict interaction between statisticians and administrative departments is essential, although it...

Durum wheat yield forecast

In this subsection a regression model is presented and used for the spatial prediction of the yield of durum wheat in the province of Foggia. The idea is very simple: the estimated regression equation obtained with a sample of observations is used for predicting the value of y for given values of the covariates. In other words, the geographical information (i.e. covariates) is available for each point of the spatial domain under investigation, and by using this information and the estimated model, a spatial...
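The two steps described above, estimating the regression on sampled points and then predicting at every point of the domain, can be sketched as follows. This is a minimal sketch assuming ordinary least squares with a single covariate; the model actually used in the chapter may differ. The covariate (a vegetation index), the yield values and the grid cells are hypothetical.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_surface(intercept, slope, covariate_by_location):
    """Predict y at every location for which the covariate is available."""
    return {loc: intercept + slope * value
            for loc, value in covariate_by_location.items()}


# Hypothetical sample: vegetation index and observed yield (t/ha).
sample_index = [0.2, 0.4, 0.6, 0.8]
sample_yield = [1.5, 2.0, 2.5, 3.0]
intercept, slope = fit_simple_regression(sample_index, sample_yield)

# Hypothetical grid cells (row, column) where only the covariate is known.
grid_index = {(0, 0): 0.3, (0, 1): 0.5, (1, 0): 0.7}
predicted_yield = predict_surface(intercept, slope, grid_index)
print(intercept, slope, predicted_yield)
```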

Economic and econometric specification

It is not uncommon for economists to depart from the proper economic and econometric specifications of the model because the observed variables are not precisely as required. For example, in an economic analysis of technology choice and efficiency on Australian dairy farms (Kompas and Che, 2006) a variable representing feed concentration was not available. Average grain feed (in kilograms per cow) was therefore used as a proxy for this variable in the inefficiency model. As a consequence, the...

Empirical strategy: 21.4.1 The Palestinian data set

The Palestinian Public Perception Survey (PPPS) is an inter-agency effort aimed at building understanding of the socio-economic conditions in the West Bank and Gaza Strip. The University of Geneva implemented the 11th PPPS in 2007, with the collaboration of several agencies, including the FAO for the food security component. Responsibility for the data collection rests with the Palestinian Central Bureau of Statistics. The PPPS provides a very rich data set, including key indicators relevant...

Estimates for small domains and areas

An issue already reflected on earlier in this chapter is the increasing demand for data for small domains. In agriculture, these small domains could be geographical areas or unique commodities. Legislators are more frequently seeking data at lower levels of aggregation. In order for survey-based estimates to be reliable, the sample sizes would be required to increase beyond the organization's capacity to pay. The NASS's approach has been to augment probability-based survey estimates with...

Estimation of a total

In a multiple-frame survey, probability samples are drawn independently from the frames $A_1, \ldots, A_Q$, $Q \geq 2$. The union of the Q frames is assumed to cover the finite population of interest, U. The frames may overlap, resulting in a possible $2^Q - 1$ non-overlapping domains. When $Q = 2$, the survey is called a dual-frame survey. For simplicity, let us consider the case of two frames (A and B), both incomplete and with some duplication, which together cover the whole population. The frames A and B...
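The count of domains follows from the fact that each domain corresponds to one non-empty combination of frames: the units that belong to exactly those frames and to no others. The short sketch below enumerates these combinations, which gives 2^Q - 1 domains for Q frames. The function name is hypothetical.

```python
from itertools import combinations


def frame_domains(frame_names):
    """List every non-empty combination of frames.

    Each combination identifies the domain of units that belong to exactly
    those frames, so Q frames give 2**Q - 1 possible domains.
    """
    domains = []
    for size in range(1, len(frame_names) + 1):
        domains.extend(combinations(frame_names, size))
    return domains


dual = frame_domains(["A", "B"])
triple = frame_domains(["A", "B", "C"])
print(dual)          # the three domains of a dual-frame survey
print(len(triple))   # number of domains with three frames
```

For two frames the three domains are the units only in A, the units only in B, and the units in both.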

Evaluation criterion: the effect of weights on sampling precision

Equation (6.2), if it can be applied, gives an equal probability or self-weighting sample for each sector i meeting the sample size requirements in terms of number of establishments n.i to be included. The constraint is that within each area different types of establishments are selected at the same rate gk determined by (6.3). This means that the establishment probabilities of selection have to vary within the same sector depending on the area units from which they come. This design is...

Examples of crop area estimation with remote sensing in large regions

The early Large Area Crop Inventory Experiment (LACIE; Heydorn, 1984) analysed samples of segments (5 x 6 nautical miles), defined as pieces of Landsat MSS images, and focused mainly on the sampling errors. It soon became clear (Sielken and Gbur, 1984) that pixel counting entailed a considerable risk of bias linked to the errors of commission and omission. Remote sensing was still too expensive in the 1980s (Allen, 1990), but the situation changed in the 1990s with the reduction of cost, both...

Examples of crop yield estimation/forecasting with remote sensing

The USDA publishes monthly crop production figures for the United States and the world. At NASS, remote sensing information is used for qualitative monitoring of the state of crops but not for quantitative official estimates. NASS research experience has shown that AVHRR-based crop yield estimates in the USA are less precise than existing surveys. Current research is centred on the use of MODIS data within biophysical models for yield simulation. The Foreign Agricultural Service (FAS)...

Expected accuracy of area estimates with the LUCAS 2006 scheme

Before launching any survey some idea is needed of the accuracy that can be achieved. The accuracy reached for the estimated area of land cover c mainly depends on the size D of the region and the proportion p of c. The results for each country have allowed simple linear regressions without intercept to be fitted for agricultural classes, $s_{LUCAS} = 0.743\, s_{ran} + \varepsilon$, $r^2 = 0.989$, $s_{LUCAS} = 1.182\, s_{ran} + \varepsilon$, $r^2 = 0.967$, where $s_{ran}(p) = D \sqrt{p(1-p)/(n-1)}$ is the standard error that would have been obtained with simple...
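A rough numerical reading of these regressions is sketched below. It assumes that the reference standard error has the usual simple random sampling form, s_ran(p) = D sqrt(p(1 - p)/(n - 1)), and that the expected standard error is the fitted slope times s_ran. Because the excerpt does not make clear which class of land cover each slope applies to, the slope is left as a parameter. The region size, proportion and sample size are hypothetical.

```python
from math import sqrt


def s_ran(region_area, p, n):
    """Standard error of an area estimate under simple random sampling.

    region_area is the size D of the region, p the proportion of the land
    cover class and n the number of sample points (assumed form).
    """
    return region_area * sqrt(p * (1.0 - p) / (n - 1))


def expected_se(region_area, p, n, slope):
    """Expected standard error from a regression through the origin on s_ran."""
    return slope * s_ran(region_area, p, n)


# Hypothetical region of 1000 area units, class proportion 0.5, 101 points.
reference = s_ran(1000.0, 0.5, 101)
with_slope = expected_se(1000.0, 0.5, 101, slope=0.743)
print(reference, with_slope)
```

A slope below 1 means the scheme is expected to do better than simple random sampling of the same size, and a slope above 1 means it is expected to do worse.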

Farm Accounts Data Network

The Farm Accounts Data Network (FADN) is a specific EU instrument, developed and managed by the Directorate-General for Agriculture. The FADN is an important source for micro-economic data relating to commercial holdings. For purposes of aggregation, the FADN sample results are linked to population results derived from the FSS using groupings based on the community typology. The creation of unique identifiers in the context of the agricultural register would enhance this linkage and, if privacy...

Farm Coverage Follow-up

A snapshot of the farm register was taken on Census Day, 14 May 2001. The snapshot included only farm operations that were coded as in business at the time. Based on previous census data and recent survey data, the majority of the farms in the snapshot records were flagged as 'must-get' and in scope for the Farm Coverage Follow-up (FCFU) operation. Shortly after the collection and capture of census questionnaires, a process similar to the one described in Section 5.4 was in progress between the...

From concept to measurement: 21.3.1 The resilience framework

Figure 21.1 summarizes the rationale for attempting to measure resilience to food insecurity. Consistent with Dercon's (2001) framework, it is assumed that the resilience of a given household at a given point in time, T0, depends primarily on the options available to that household to make a living, such as its access to assets, income-generating activities, public services and social safety nets. These options represent a precondition for the household response mechanisms in the face of a...

Funding for agricultural statistics

Agricultural statistics and especially the farm structure surveys are an expensive method of data collection. In the EU, the European Commission co-finances the data collection work of the FSSs and also finances the LUCAS survey. For the 2010-2013 round of the FSSs, the European Commission has reserved a budget of around 100 million. However, an important part of the work has to be funded by the countries themselves. The funding situation for the NASS as a national statistical institute...

General characteristics of SDA

SDA is a set of computer programs for the documentation and web-based analysis of survey data. There are also procedures for creating and downloading customized subsets of data sets. The software is maintained by the CSM. The current version of SDA is release 3.2; at the time of our experiment version 1.2 was available. All the following information relates to version 1.2. Data analysis programs were designed to be run from a web browser. SDA provides the results of the analysis very quickly...

Households as (sub)systems of a broader food system, and household resilience

Households are components of food systems and can themselves be conceived as (sub)systems. The household definition is consistent with Spedding's (1988) definition of a system as 'a group of interacting components, operating together for a common purpose, capable of reacting as a whole to external stimuli: it is affected directly by its own outputs and has a specified boundary based on the inclusion of all significant feedback'. Moreover, as the decision-making unit, the household is where the...

How does it work?

The intent of the data warehouse was to provide statisticians with direct access to current and historical data and with the capability to build their own queries or applications. Traditional transactional databases are designed using complex data models that can be difficult for anyone but power users to understand, thus requiring programmer assistance and discouraging ad hoc analysis. The database design results in many database tables (often over 100 tables), which result in many table forms...

Imputation of the missing auxiliary variables: 13.3.1 An overview of the missing data problem

Let us denote by Y (D × C) the matrix of estimates of the areas covered by crop types and by Z (D × C) the matrix containing the number of pixels classified by crop types according to the satellite data in each small area. Y is considered fully observed, while satellite images often have missing data. Outliers and missing data in satellite information are mainly due to cloudy weather that does not allow the identification or correct recognition of what is being grown from the acquired digital...

Integrated versus separate sectoral surveys

There are a number of other factors which make the design of economic surveys more complex than that of household surveys. Complexity arises from the possibility that the ultimate units used in sample selection may not be of the same type as the units involved in data collection and analysis. The two types of units may lack one-to-one correspondence. For instance, the ultimate sampling units may be (often are) households, each of which may represent no, one, or more than one establishment of...

Integrating agricultural and environmental information with LUCAS

Land cover and land use are of increasing importance in policy design and evaluation; they constitute a key element in particular for climate change studies (Feddema et al., 2005). Environmental, agricultural and regional transport policies increasingly demand two types of land cover data: maps and statistics. A large number of land cover maps have been produced with satellite images; examples at global level are Global Land Cover 2000 (known as GLC2000), based on SPOT VEGETATION images...

Introduction

Agricultural statistics in the UN Economic Commission for Europe (UNECE) region are well advanced. Information collected by farm structure surveys on the characteristics of farms, farmers' households and holdings is for example combined with a variety of information on the production of animal products, crops, etc. Agricultural accounts are produced on a regular basis and a large variety of indicators on agri-economic issues is available. At the level of the European Union as a whole the...

Issues in statistical analysis of farm survey data: 20.4.1 Multipurpose sample weighting

Since the sample of farms that contribute to a farm survey is typically a very small proportion of the farms that make up the agricultural sector of an economy, it is necessary to 'scale up' the sample data in order to properly represent the total activity of the sector. This scaling up is usually carried out by attaching a weight to each sample farm so that the weighted sum of the sample values of a survey variable is a 'good' estimate of the sector-based sum for the same variable. Methods for...
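
The scaling up described above can be sketched in a few lines. This is a minimal illustration, not a method taken from the text: the function name `weighted_total`, the sample values and the equal weights N/n are all assumptions made for the example.

```python
def weighted_total(values, weights):
    """Estimate a sector total as the weighted sum of the sample values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights))


# Hypothetical sample: 3 farms drawn with equal probability from 300 farms,
# so each sample farm "represents" N / n = 100 farms.
wheat_area = [120.0, 80.0, 100.0]   # hectares reported by each sample farm
weights = [300 / 3] * 3

estimate = weighted_total(wheat_area, weights)
print(estimate)  # 30000.0
```

In a multipurpose survey the same set of weights is applied to every survey variable, which is why the choice of weights matters more than this single-variable sketch suggests.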

Land-use stratification

The process of land-use stratification is the delineation of land areas into land-use categories (strata). The purpose of stratification is to reduce the sampling variability by creating homogeneous groups of sampling units. Although certain parts of the process are highly subjective in nature, precision work is required of the personnel stratifying the land (called stratifiers) to ensure that overlaps and omissions of land area do not occur and land is correctly stratified. The stratification...

LUCAS 2001-2003: target region, sample design and results

Table 10.1 Some area estimates of LUCAS 2003 for EU15. are based on comparing each sample element with other sample elements geographically close to it. Wolter (1984) compares several estimators of this type for the one-dimensional case, some of which had been proposed by Yates (1949), Osborne (1942) and Cochran (1946). Matern (1986) proposes similar estimators for the two-dimensional case. The usual variance estimator of the mean for...

LUCAS 2006: a two-phase sampling plan of unclustered points

After some encouraging tests of the Joint Research Centre (JRC) in collaboration with the Greek Ministry of Agriculture in 2004 and the previous experience of the Italian AGRIT program (Martino, 2003), Eurostat decided to change the sampling scheme. The new scheme used a common map projection, the Lambert azimuthal equal area projection recommended by the Infrastructure for Spatial Information in Europe (INSPIRE) initiative (Annoni et al., 2001). This decision improved the homogeneity of the sample layout, but...

Main approaches to using EO for crop area estimation

Carfagna and Gallego (2005) give a description of different ways to use remote sensing for agricultural statistics; we give a very brief reminder here. Stratification. Strata are often defined by the local abundance of agriculture. If a stratum can be linked to one specific crop, the efficiency is strongly boosted (Taylor et al., 1997). Pixel counting. Images are classified and the area of crop c is simply estimated by the area of pixels classified as c. An equivalent approach is photo-interpretation...
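
The pixel counting approach can be illustrated with a short sketch. The function name `pixel_count_area`, the class labels and the 0.25 ha pixel (a hypothetical 50 m resolution) are assumptions chosen for the example; the code applies no correction for misclassification.

```python
from collections import Counter


def pixel_count_area(classified_pixels, pixel_area_ha):
    """Area per class = number of pixels classified as that class x pixel area."""
    counts = Counter(classified_pixels)
    return {crop: n * pixel_area_ha for crop, n in counts.items()}


# Hypothetical classified image, flattened to a list of class labels.
labels = ["wheat", "maize", "wheat", "other", "wheat", "maize"]
areas = pixel_count_area(labels, pixel_area_ha=0.25)
print(areas)  # {'wheat': 0.75, 'maize': 0.5, 'other': 0.25}
```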

Managing accuracy

Processes described previously under 'relevance' determine which programmes are going to be carried out, their broad objectives, and the resource parameters within which they must operate. Within those 'programme parameters', the management of accuracy requires particular attention during three key stages of a survey process: survey design, survey implementation, and assessment of survey accuracy. These stages typically take place in a project management environment, outlined in Section 17.4,...

Managing interpretability

Statistical information that users cannot understand - or can easily misunderstand - has no value and may be a liability. Providing sufficient information to allow users to properly interpret statistical information is therefore a responsibility of the Agency. 'Information about information' has come to be known as meta-information or metadata. Metadata are at the heart of the management of the interpretability indicator, by informing users of the features that affect the quality of all data...

Managing timeliness

The desired timeliness of information derives from considerations of relevance - for what period does the information remain useful for its main purposes? The answer to this question varies with the rate of change of the phenomena being measured, with the frequency of measurement, and with how quickly users must respond using the latest data. Specific types of agriculture data require different levels of timeliness. Data on crop area, stocks and production, for example, must be available soon...

Matching different registers

Countries with a highly developed system of administrative registers can capture data from the different registers to make comparisons, to validate some data with some others and to integrate them. Of course, very good identification variables and a very sophisticated record linkage system are needed. The main registers used are the annual income verifications in which all employers give information on wages paid to all persons employed, the register of standardized accounts (based on annual...

Meat, livestock and egg statistics

These traditional animal and poultry product statistics - resulting from traditional regular livestock surveys as well as meat, milk and eggs statistics - still play a key role in the design, implementation and monitoring of the EU Common Agricultural Policy and also contribute to ensuring food and feed safety in the EU. European statistics on animals and animal products are regulated by specific EU legislation. Member states are obliged to send monthly, annual and multi-annual data to the...

Methodological approaches

The framework described above can be estimated through multivariate analysis models. Equation (21.1) is a hierarchical model in which some variables are dependent on the one side and independent on the other. Unobservable (i.e. latent) variables also have to be dealt with. Figure 21.2 shows the path diagram of the model concerned. In the causal models literature (Spirtes et al., 2000), circles represent latent variables and boxes represent observed variables. Most of the hierarchical or...

More flexible models: an empirical approach

As is clear from the above illustrations, depending on the numbers and distribution of units of different types and on the extent to which the required sampling rates by sector differ, a basic model like (6.8) may be too inflexible, and may result in large variations in design weights and hence in large losses in efficiency of the design. It may even prove impossible to satisfy the sample allocation requirements in a reasonable way. Figure 6.1 Design effect...

New data collection tools

Modern technologies for data collection for agricultural and land use statistics are being implemented in many countries. As in many surveys, the use of data collection via computer-assisted personal or telephone interviewing has become the rule rather than the exception. The NASS and many EU countries have used the Blaise software for such interviewing for many years. A more recent development is the use of Internet questionnaires mainly for annual and monthly inventories. Both NASS USDA and...

Non-sampling errors in LUCAS 2006

Non-sampling errors are generally more difficult to assess than sampling errors (Lesser and Kalsbeek, 1999). In this section we study the possible order of magnitude of the main sources of non-sampling errors. The most important source of non-sampling error in an area frame survey is identification mistakes by enumerators; this can happen as a result of (b) incorrect identification because of inadequate enumerator training - mainly for minor crops; (c) misinterpretation of rules to label land...

Numerical illustrations and more flexible models: 6.6.1 Numerical illustrations

Table 6.2 shows three simulated populations of establishments. The distribution of the establishments by sector is identical in the three populations - varying linearly from Table 6.2 Number of establishments, by economic sector and 'stratum of concentration' (three simulated populations). Stratum of concentration population 1 Stratum of...

Probability proportional to size sampling

Consider a set-up where the study variable y and a positive auxiliary variable x are strongly correlated. Intuitively, in such a framework it should be convenient to select the elements to be included in the sample with probability proportional to x. Probability proportional to size (PPS) sampling designs can be applied in two different set-ups: fixed-size designs without replacement (πps) and fixed-size designs with replacement (pps). Only πps will be considered here; an excellent reference...
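
To see why selecting with probability proportional to x is convenient, the sketch below computes first-order inclusion probabilities for a fixed-size design without replacement and applies the standard Horvitz-Thompson estimator of a total. The estimator is a textbook result rather than something quoted from this section, and the function names and data are hypothetical.

```python
def inclusion_probabilities(x, n):
    """First-order inclusion probabilities pi_i = n * x_i / sum(x)."""
    total = sum(x)
    probs = [n * xi / total for xi in x]
    if any(p > 1 for p in probs):
        raise ValueError("some units are too large for this sample size")
    return probs


def horvitz_thompson_total(y_sample, pi_sample):
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(y_sample, pi_sample))


# Hypothetical population of 4 units with size measure x and y = 3 * x.
x = [10, 20, 30, 40]
y = [3 * xi for xi in x]              # true total of y is 300
pi = inclusion_probabilities(x, n=2)  # approximately [0.2, 0.4, 0.6, 0.8]

# Whichever two units are drawn, the estimate equals the true total (up to
# rounding) because y is exactly proportional to x.
print(horvitz_thompson_total([y[1], y[3]], [pi[1], pi[3]]))
```

When y is only approximately proportional to x the estimates vary from sample to sample, but the variance is small as long as the correlation is strong.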

Probability proportional to size selection of area units

National or otherwise large-scale household surveys are typically based on multi-stage sampling designs. Firstly, a sample of area units is selected in one or more stages, and at the last stage a sample of ultimate units (dwellings, households, persons, etc.) is selected within each sample area. Increasingly - especially in developing countries - a more or less standard two-stage design is becoming common. In this design the first stage consists of the selection of area units with probability...
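
One common way to carry out the first-stage selection is systematic sampling along the cumulated size measures. The sketch below is an illustration under that assumption; the text does not specify the selection algorithm, and the function name and size measures are hypothetical.

```python
import random


def systematic_pps(sizes, n, rng=None):
    """Select n area units with probability proportional to size, systematically.

    Returns the indices of the selected units. A unit whose size exceeds the
    sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.uniform(0, step)          # random start within the first interval
    selected = []
    cumulative = 0.0                      # total size of the units before sizes[idx]
    idx = 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose cumulated size range contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical size measures (e.g. household counts) for five area units.
sizes = [50, 10, 10, 10, 20]
print(systematic_pps(sizes, n=2, rng=random.Random(1)))
```

With these sizes the sampling interval is 50, so the first unit, which holds half of the total size, is selected in every sample, and the second selection falls among the remaining four units.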

Programme content and stakeholder input

National statistical offices work very hard to understand the needs of the data user community, although the future cannot always be anticipated. As the primary statistical agency for the USDA, the NASS services the data needs of many agencies inside and outside the Department. Partnerships have been in place with state departments of agriculture and land-grant universities through cooperative agreements since 1917 to ensure statistical services meet federal, state, and local needs without...

Quality management assessment

Quality management assessment at Statistics Canada encompasses key elements of the Quality Assurance Framework (QAF), a framework for reporting on data quality and the Integrated Metadata Base (Julien and Born, 2006). Within this structure, a systematic assessment of surveys, focusing on the standard set of processes used to carry them out, is crucial. It has several objectives: to raise and maintain awareness of quality at the survey design and implementation levels; to provide input into...

References

Carfagna, E. (1998) Area frame sample designs: a comparison with the MARS project. In T.E. Holland and M.P.R. Van den Broecke (eds) Proceedings of Agricultural Statistics 2000, pp. 261-277. Voorburg, Netherlands: International Statistical Institute. Carfagna, E. (2001a) Multiple frame sample surveys: advantages, disadvantages and requirements. In International Statistical Institute, Proceedings, Invited papers, International Association of Survey Statisticians (IASS) Topics, Seoul, August 22-29,...

Sadasivan M 1975 Post Cluster Sampling

Bankier, M., Houle, A.M. and Luc, M. (1997) Calibration estimation in the 1991 and 1996 Canadian censuses. Proceedings of the Survey Research Methods Section, American Statistical Association, pp. 66-75. Benedetti, R., Espa, G. and Lafratta, G. (2008) A tree-based approach to forming strata in multipurpose business surveys. Survey Methodology, 34, 195-203. Benedetti, R., Bee, M. and Espa, G. (2010) A framework for cut-off sampling in business survey design. Journal of Official Statistics, in...

Registers, register systems and methodological issues

A register is a complete list of the objects belonging to a defined object set. The objects in the register are identified by identification variables. This makes it possible to update or match the register against other sources. A system of statistical registers consists of a number of registers that can be linked to each other. To make this exact linkage between records in different registers possible, the registers in the system must contain reference numbers or other identification...

Relative efficiency of the LUCAS 2006 sampling plan

We have compared the variance obtained with several single-stage sampling plans. 1. Simple random sampling (srs). We make an approximation of simple random sampling using the variance of random subsamples of the available systematic sample. 2. Pure systematic. A subsample was extracted by selecting in the LUCAS 2006 sampling plan the first eight replicates in all strata. We have used the non-stratified version of (10.3) and (10.4). 3. Post-stratified sample. The systematic sample of option 2...

Replicated sampling

NASS's area frames have been sampled using a replicated design since 1974. Replicated sampling is characterized by the selection of a number of independent subsamples or replicates from the same population using the same selection procedure for each replicate. Each replicate is therefore an unbiased representation of the population. A replicate for NASS's area frame sample design is a random sample of land areas (segments) selected within a land-use stratum. The sub-stratification within each...
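
A practical consequence of drawing independent replicates is that the spread of the replicate estimates gives a simple variance estimate. The sketch below shows this textbook device; it is not presented here as the NASS procedure, and the function name and figures are hypothetical.

```python
from statistics import mean, variance


def replicate_estimate(replicate_estimates):
    """Combine independent replicate estimates of the same population quantity.

    Returns (point estimate, estimated variance of the point estimate). The
    point estimate is the mean of the replicate estimates; its variance is
    estimated by the sample variance between replicates divided by their number.
    """
    r = len(replicate_estimates)
    if r < 2:
        raise ValueError("at least two replicates are needed")
    point = mean(replicate_estimates)
    var = variance(replicate_estimates) / r
    return point, var


# Hypothetical estimates of a stratum total from four replicates.
point, var = replicate_estimate([100.0, 110.0, 90.0, 100.0])
print(point)  # 100.0
print(var)    # about 16.67
```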

Requirements of sample surveys for economic analysis

One of the most important requirements of sample surveys in general, not only farm surveys, is that sample sizes are large enough to enable sufficiently accurate estimates to be produced for policy analysis. Working against this are two important objectives: the provision of timely results and the need to collect detailed economic data (Section 20.3), which often necessitates expensive face-to-face data collection methods. At the same time the sample often needs to be spread spatially (for...

Resilience

The concept of resilience was originally described in the ecological literature (Holling, 1973) and has recently been proposed as a way of exploring the relative persistence of different states in complex dynamic systems, including socio-economic ones (Levin et al., 1998). The concept has two main variants (Holling, 1996). Engineering resilience (Gunderson et al., 1997) is a system's ability to return to the steady state after a perturbation (O'Neill et al., 1986; Pimm, 1984; Tilman and Downing,...

Respondent reluctance: privacy and burden concerns

Although the agricultural sector is somewhat unique and not directly aligned with the general population on a number of levels, concerns regarding personal security and privacy of information are similar across most population subgroups in the USA, Brazil, and Europe. Due to incidences of personal information being released by businesses and government agencies, respondents now have one more reason for not responding to surveys. While this is not the only reason for increasing non-response...

Response error

The literature (Groves, 1989; Lyberg et al., 1997) is fairly rich in discussions of various components of this type of error. Self-enumeration methods can be more susceptible to certain kinds of response errors, which could be mitigated if interviewer collection were employed. Censuses, because of their large size, are often carried out through self-enumeration procedures. The Office for National Statistics in Britain (Eldridge et al., 2000) has begun to employ cognitive interviewing techniques...

Rural development statistics

These are a relatively new domain and can be seen as a consequence of the reform of the Common Agricultural Policy, which accords great importance to rural development. Eurostat has started collecting indicators for a wide range of subjects - such as demography (migration), economy (human capital), accessibility to services (infrastructure), social well-being - from almost all member states at regional level. Most of the indicators are not of a technical agricultural nature. Data collected...

Sample allocation

The area frame sample is used to collect data on a wide range of agricultural items such as crop acreages, livestock inventories and economic data. Therefore, the allocation of the sample across states and within states to the land-use strata is extremely important. The NASS evaluates optimum allocations of the sample to obtain the most precision in the major survey estimates for a given budget. The number of sample segments allocated to each land-use stratum and state depends on factors such...

Sample estimation

This final section will briefly discuss the approaches used to estimate agricultural production with an area frame sample of segments. The NASS uses two area frame estimators, namely the closed and weighted segment estimators. Both require that the interviewer collect data for all farms that operate land inside each segment. (A farm is defined to be all land under one operating arrangement with gross farm sales of at least $1000 a year.) The portion of the farm that is inside the segment is...
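
The difference between the two estimators can be sketched at the level of a single segment. The sketch follows the way these estimators are usually described: the closed estimator uses only what is located inside the segment, while the weighted estimator prorates whole-farm values by the share of the farm's land inside the segment. Since the passage above is truncated, treat these details, along with all field names and figures, as assumptions.

```python
def closed_segment_value(tracts):
    """Closed estimator: sum what is reported inside the segment boundary."""
    return sum(t["inside_value"] for t in tracts)


def weighted_segment_value(tracts):
    """Weighted estimator: whole-farm value x (acres inside segment / farm acres)."""
    return sum(t["farm_value"] * t["tract_acres"] / t["farm_acres"] for t in tracts)


# Hypothetical segment touched by two farms (values are cattle counts).
tracts = [
    {"farm_value": 200, "farm_acres": 400, "tract_acres": 100, "inside_value": 60},
    {"farm_value": 50, "farm_acres": 100, "tract_acres": 100, "inside_value": 50},
]

print(closed_segment_value(tracts))    # 110
print(weighted_segment_value(tracts))  # 100.0
```

In both cases the segment value would then be multiplied by the segment's expansion factor (the inverse of its selection probability) and summed over sample segments.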

Sample rotation

As mentioned earlier, the NASS uses a five-year rotation scheme for the sample segments. Rotation is accomplished by replacing segments from specified replicates within a land-use stratum with newly selected segments. Preferably, the number of replicates is a multiple of 5 to provide a constant workload for sample selection and preparation activities in the AFS and for data collection work in the state offices. Naturally, instances occur when the number of replicates is not a multiple of 5,...
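
The workload argument can be made concrete with a small sketch. The round-robin assignment of replicates to rotation years is an assumption made for illustration, not a rule stated in the text, and the function name is hypothetical.

```python
def rotation_schedule(n_replicates, cycle_years=5):
    """Assign replicates 1..n_replicates to rotation years 1..cycle_years.

    The replicates listed under a year are the ones whose segments are
    replaced with newly selected segments in that year of the cycle.
    """
    schedule = {year: [] for year in range(1, cycle_years + 1)}
    for rep in range(1, n_replicates + 1):
        year = (rep - 1) % cycle_years + 1
        schedule[year].append(rep)
    return schedule


# 10 replicates: two are replaced every year, a constant workload.
print(rotation_schedule(10))
# {1: [1, 6], 2: [2, 7], 3: [3, 8], 4: [4, 9], 5: [5, 10]}

# 7 replicates: years 1 and 2 replace two replicates, the others only one.
print(rotation_schedule(7))
# {1: [1, 6], 2: [2, 7], 3: [3], 4: [4], 5: [5]}
```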

Sampling different types of units in an integrated design

An integrated multi-stage design implies the selection of a common sample of areas to cover all types of units of interest in a single survey. The final sampling stage involves the selection of ultimate units (e.g. establishments) within each selected area. In such a survey, in practice it is often costly, difficult and error-prone to identify and separate out the establishments into different sectors and apply different sampling procedures or rates by sector within each sample area. Hence it...

Satellite images and vegetation indices for yield monitoring

The use of remote sensing within the crop yield forecasting process has a series of requirements for most of the applications. Information is needed for large areas while maintaining spatial and temporal integrity. Suitable spectral bands are needed that characterize the vegetation to allow for crop monitoring, whether to derive crop indicators to be used in a regression or to feed biophysical models. High temporal frequency is essential to follow crop growth during the season. Historical data...

Satellites and sensors

The main characteristics of sensors for use in agricultural statistics are as follows. Spectral resolution. Most agricultural applications use sensors that give information for a moderate number of bandwidths (four to eight bands). Near infrared (NIR), short-wave infrared (SWIR) and red are particularly important for measuring the activity of vegetation; red-edge bands (between red and NIR) also seem to be promising for crop identification (Mutanga and Skidmore, 2004). Panchromatic (black and...

Selection probabilities

There are two methods for selecting the ultimate sampling unit or segment - equal and unequal selection. Which method is used depends on the availability of adequate boundaries for segments. If good boundaries are plentiful so that segments can be made approximately the same size within a land-use stratum, then segments are selected with equal probability. If adequate boundaries are not available, then unequal probability of selection is used since segment sizes are allowed to vary greatly in...

Selective editing

Manual or interactive editing is time-consuming and therefore expensive, and adversely influences the timeliness of publications. Moreover, when manual editing involves recontacting the respondents, it also increases the response burden. Therefore, most statistical institutes have adopted selective editing strategies. This means that only records that potentially contain influential errors are edited manually, whereas the remaining records are edited automatically. In this way, manual editing...
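
A selective editing rule is often implemented through a score that measures how much a suspicious value could move the published total. The sketch below uses one simple score of that kind; the score function, the threshold and the record fields are assumptions for illustration and are not taken from the text.

```python
def split_for_editing(records, threshold):
    """Route records to manual or automatic editing using a simple score.

    score = weight * |reported - expected| / estimated total, where the
    estimated total is the weighted sum of the expected values.
    """
    total = sum(r["weight"] * r["expected"] for r in records)
    manual, automatic = [], []
    for r in records:
        score = r["weight"] * abs(r["reported"] - r["expected"]) / total
        (manual if score > threshold else automatic).append(r["id"])
    return manual, automatic


# Hypothetical records; "expected" could come from last year's value or a model.
records = [
    {"id": "A", "weight": 10, "reported": 100, "expected": 100},
    {"id": "B", "weight": 10, "reported": 1000, "expected": 100},  # possible unit error
    {"id": "C", "weight": 1, "reported": 120, "expected": 100},
]

manual, automatic = split_for_editing(records, threshold=0.05)
print(manual)     # ['B']
print(automatic)  # ['A', 'C']
```

Record C deviates from its expected value too, but its small weight means the deviation barely affects the total, so it is left to automatic editing.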

Similarities and differences from household survey design

The type of sample designs used in 'typical' household surveys provides a point of departure in this discussion of multi-stage sampling of small-scale economic units. Indeed, there may often be a one-to-one correspondence between such economic units and households, and households rather than the economic units themselves may directly serve as the ultimate sampling units. Nevertheless, despite much common ground with sampling for population-based household surveys, sampling small-scale economic...