Preconstruction analysis

Before building a new frame, analysis is conducted to determine which states are most in need of one. Generally three to four states are selected to receive a new frame each year. Data collected from approximately 11000 segments during the JAS is used to determine the extent to which the land-use stratification has deteriorated for each state. This involves comparing the coefficients of variation for the survey estimates of major items over the life of the frame. Typically states with the...
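
The comparison can be sketched as follows. This is a minimal illustration, assuming that the CV is the standard error divided by the estimate and that deterioration is summarized by the ratio of the latest CV to the CV in the frame's first year; the function names, the ranking rule and all figures are hypothetical and are not taken from the JAS.

```python
def coefficient_of_variation(estimate, standard_error):
    """CV of a survey estimate, expressed as a percentage."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


def cv_deterioration(cv_by_year):
    """Ratio of the most recent CV to the CV in the frame's first year.

    cv_by_year maps survey year -> CV (%). Values above 1 suggest that
    precision has worsened over the life of the frame.
    """
    years = sorted(cv_by_year)
    return cv_by_year[years[-1]] / cv_by_year[years[0]]


# Hypothetical CVs (%) of one major item for two states.
history = {
    "State A": {2001: 4.0, 2005: 4.4, 2009: 5.0},
    "State B": {2001: 3.0, 2005: 4.5, 2009: 6.0},
}

# States with the largest deterioration come first (illustrative rule only).
ranking = sorted(history, key=lambda s: cv_deterioration(history[s]), reverse=True)
print(ranking)
```

In practice several major items would be compared, so a single ratio per state is only a starting point.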

Farm structure surveys

The farm structure survey (FSS) is considered in UNECE countries to be the backbone of the agricultural statistics system. Together with agricultural censuses, FSSs make it possible to undertake policy and economic analysis at a detailed geographical level. This type of analysis at regular time intervals is considered essential. In the EU, several simplifications have been made in recent years. From 2010 the frequency of FSSs will be reduced from every two to every three years. The decennial...

Small and diversified farm operations

For agricultural statistics in general, and for the EU and the NASS survey programmes in particular, coverage of the different items (such as the acres of corn or the number of cattle represented by the farms on the frame) is a very important issue. In the regulations used for EU statistics, the desired accuracy and coverage are described in detail. Furthermore, countries are requested to provide detailed metadata and quality information. In general, active records eligible for survey samples...

A simulated exercise

The proposed algorithm has been applied to simulated data to test its performance. The first stage of the simulation experiment consisted of generating 100 bivariate observations from a variable (X, Y) on a 10 × 10 regular lattice, with X uniformly distributed U(0,1) and Y generated according to with N(0, ae 1 a), considered henceforth as the true error distribution. In more detail, we obtain different observations that are classified in a certain number K of groups, by using different values...

ABARE broadacre survey data

The data used in these case studies were obtained from the annual broadacre survey run by the Australian Bureau of Agricultural and Resource Economics (ABARE) from 1977-78 to 2006-07 (ABARE, 2003). The survey covers broadacre agriculture across all of Australia (see Figure 20.1), but excludes small and hobby farms. Broadacre agriculture involves large-scale dryland cereal cropping and grazing activities relying on extensive areas of land. Most of the information outlined in Section 20.3 was...

Accuracy, objectivity and cost-efficiency

In this section we discuss the main properties that we would like to find in an estimation method, in particular for agricultural statistics. The term accuracy corresponds in principle to the idea of small bias, but the term is often used in practice in the sense of small sampling error (Marriott, 1990). Here we use it to refer to the total error, including bias and sampling error. Non-sampling errors are generally more difficult to measure or model (Lessler and Kalsbeek, 1999; Gentle et al.,...

Accuracy of estimates

Hartley (1962) proposed to use the variance for proportional allocation in stratified sampling as an approximation of the variance of the post-stratified estimator of the population total with simple random sampling in the two frames (ignoring finite-population corrections):

$$\operatorname{var}(\hat{Y}) \approx \frac{N_A^2}{n_A}\left[\sigma_a^2(1-\alpha) + p^2\sigma_{ab}^2\alpha\right] + \frac{N_B^2}{n_B}\left[\sigma_b^2(1-\beta) + q^2\sigma_{ab}^2\beta\right],$$

where $\sigma_a^2$, $\sigma_b^2$ and $\sigma_{ab}^2$ are the population variances within the three domains, $\alpha = N_{ab}/N_A$ and $\beta = N_{ab}/N_B$. Under a linear cost function, the values for $n_A/N_A$, $p$ and $n_B/N_B$...
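
As an illustration, the approximate variance can be evaluated numerically. The sketch below assumes the form var(Y) ≈ (N_A²/n_A)[σ_a²(1 − α) + p²σ_ab²α] + (N_B²/n_B)[σ_b²(1 − β) + q²σ_ab²β] with q = 1 − p, α = N_ab/N_A and β = N_ab/N_B; the function name, the grid search over p and all input values are hypothetical.

```python
def hartley_variance(N_A, n_A, N_B, n_B, N_ab, var_a, var_b, var_ab, p):
    """Approximate variance of a dual-frame estimator of the total.

    Assumes simple random sampling in both frames and ignores
    finite-population corrections. p is the weight given to frame A
    in the overlap domain; frame B receives q = 1 - p.
    """
    q = 1.0 - p
    alpha = N_ab / N_A  # share of frame A lying in the overlap domain
    beta = N_ab / N_B   # share of frame B lying in the overlap domain
    term_a = N_A ** 2 / n_A * (var_a * (1 - alpha) + p ** 2 * var_ab * alpha)
    term_b = N_B ** 2 / n_B * (var_b * (1 - beta) + q ** 2 * var_ab * beta)
    return term_a + term_b


# Hypothetical frame sizes, sample sizes and domain variances.
setup = dict(N_A=100, n_A=10, N_B=50, n_B=5, N_ab=20,
             var_a=4.0, var_b=9.0, var_ab=1.0)

variance_at_half = hartley_variance(p=0.5, **setup)

# Crude grid search for the p that minimizes the variance (illustration only).
best_p = min((k / 100 for k in range(101)),
             key=lambda p: hartley_variance(p=p, **setup))
print(variance_at_half, best_p)
```

The grid search only stands in for the optimization under a cost function discussed in the text.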

Acknowledgements

This chapter is based on the in-depth review of agricultural statistics in the UNECE region prepared for the Conference on European Statistics (CES). In its third meeting of 2007/2008, the CES Bureau decided on an in-depth review of this topic. It was requested that the review take into account recent developments such as the increase in food prices and the impact of climate change, and incorporate the final conclusions and recommendations reached at the fourth International Conference of...

Agricultural monetary statistics

Collection and validation of these statistics cover the economic accounts for agriculture and forestry, the agricultural labour input statistics, and agricultural prices in absolute terms and as indices. The agricultural accounts data at the national level are regulated by legislation which prescribes the methodological concepts and definitions as well as data delivery. The accounts data at regional level and the agricultural price statistics are transmitted on the basis of gentlemen's agreements....

Agricultural Survey Methods

'G. d'Annunzio' University, Chieti-Pescara, Italy; National Institute of Statistics (ISTAT), Rome, Italy. A John Wiley and Sons, Ltd., Publication. This edition first published 2010. © 2010 John Wiley & Sons Ltd. John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom. For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our...

Algorithms for automatic localization of random errors

De Waal and Coutinho (2005) give an overview of algorithms, based on the Fellegi-Holt paradigm, for automatically solving the error localization problem for random errors in numerical data. In this section we briefly discuss a number of these algorithms. Fellegi and Holt (1976) describe a method for solving the error localization problem automatically. In this section we sketch their method; for details we refer to the original article by Fellegi and Holt (1976). The method is based on generating...

An estimate of errors of commission and omission in the IACS data

Errors of commission and omission in the IACS data were estimated in the study carried out by Consorzio ITA (AGRIT 2000) in the Italian regions of Puglia and Sicily for durum wheat in 2000. In both regions, an area frame sample survey based on segments with permanent physical boundaries was carried out. The ITA estimates of durum wheat cultivation were 435 487.3 ha in Puglia (with a coefficient of variation (CV) of 4.8%) and 374 658.6 ha in Sicily (CV 5.9%). Then, for each segment, the area of durum...

An overview of automatic editing

When automatic editing is applied, records are edited by computer without human intervention. Automatic editing is the opposite of the traditional interactive approach to the editing problem, where each record is edited manually. We can distinguish two kinds of errors: systematic ones and random ones. A systematic error is an error reported consistently by (some of) the respondents. It can be caused by the consistent misunderstanding of a question by (some of) the respondents. Examples are when...

Analysis of the 2006 AGRIT data

The aim of the AGRIT survey is to produce area estimates of the main crops, as given in Table 13.1, at a national, regional and provincial level. Direct estimates of the area LCT_c covered by crop type c (c = 1, ..., C), estimated standard errors (SE) and coefficients of variation (CV) were obtained as described in Section 13.2 for the 103 Italian provinces. The auxiliary variable is the number of pixels classified as crop type c according to the satellite data in the small area d, d = 1, ..., D, with D...

Area frames

When completeness is not guaranteed by the combined use of different registers, an area frame should be adopted in order to avoid bias, since an area frame is always complete and remains useful for a long time (Carfagna, 1998; see also Chapter 11 of this book). The completeness of area frames suggests their use in many cases: if another complete frame is not available; if an existing list of sampling units changes very rapidly; if an existing frame is out of date; if an existing frame was...

Assuring quality

Measuring quality, however difficult it might be in specific cases, is only a lower level of quality ambition. Setting and meeting quality goals are, both from the user's and the producer's perspective, a higher ambition. The term 'quality assurance' is often used for the process of ensuring that quality goals are consistently met.

16.3.1 Quality assurance as an agency undertaking

Traditionally, we tend to think of quality as a property of a single variable. We measure this quality and try to...

Automatic editing of systematic errors

In this section we discuss several classes of systematic errors and ways to detect and correct them. As already mentioned, a well-known class of systematic errors is the so-called thousand errors. These are cases where a respondent replies in units rather than in the requested thousands of units. The usual way to detect such errors is - similar to selective editing - by considering 'anticipated' values, which could, for instance, be values of the same variable from a previous period or values...
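
The detection rule for thousand errors can be sketched as a ratio test against an anticipated value. This is only an illustration: the thresholds of 300 and 3000, the function name and the example figures are assumptions, not values from the text.

```python
def correct_thousand_error(reported, anticipated, low=300.0, high=3000.0):
    """Detect and correct a suspected thousand error.

    reported    -- value supplied by the respondent
    anticipated -- reference value in the requested thousands of units,
                   for instance the same variable from a previous period
    low, high   -- hypothetical bounds on reported / anticipated within
                   which the answer is treated as given in units

    Returns a (value, flagged) pair.
    """
    if anticipated <= 0:
        # No usable reference value: leave the record for other edits.
        return reported, False
    ratio = reported / anticipated
    if low <= ratio <= high:
        return reported / 1000.0, True
    return reported, False


# Hypothetical record: last period 120 (thousands), now 125000 reported.
print(correct_thousand_error(125000, 120))
print(correct_thousand_error(130, 120))
```

A wide interval around 1000 is used because true values also change between periods; how wide it should be is a judgment call for each survey.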

Balanced sampling

The fundamental property of a balanced sample is that the HT estimators of the totals of a set of auxiliary variables are equal to the totals we wish to estimate. This idea dates back to the pioneering work by Neyman (1934) and Yates (1946). More recently, balanced sampling designs have been advocated by Royall and Herson (1973) and Scott et al. (1978), who pointed out that such sampling designs ensure the robustness of the estimators of totals, where 'robustness' essentially means 'protection...
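
The balancing property can be checked numerically for a candidate sample. The sketch below assumes the usual Horvitz-Thompson (HT) estimator, the sum of sampled values divided by their inclusion probabilities, and compares it with a known population total of one auxiliary variable; the function names, the tolerance and the toy population are hypothetical.

```python
def ht_total(values, inclusion_probs):
    """Horvitz-Thompson estimator of a population total."""
    return sum(v / p for v, p in zip(values, inclusion_probs))


def is_balanced(sample_x, sample_probs, population_total, tol=1e-9):
    """True if the HT estimate of the auxiliary total matches the known total."""
    gap = abs(ht_total(sample_x, sample_probs) - population_total)
    return gap <= tol * max(1.0, abs(population_total))


# Toy population of an auxiliary variable x with known total 22.
population_x = [1, 2, 3, 4, 5, 7]
total_x = sum(population_x)

# Two samples of size 3, each unit with inclusion probability 0.5.
probs = [0.5, 0.5, 0.5]
print(is_balanced([1, 3, 7], probs, total_x))  # HT estimate is 22
print(is_balanced([1, 2, 3], probs, total_x))  # HT estimate is 12
```

With several auxiliary variables the same check would be applied to each total.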

Calibration and regression estimators

Calibration and regression estimators combine more accurate and objective observations on a sample (e.g. ground observations) with the exhaustive knowledge of a less accurate or less objective source of information, or co-variable (classified images). A characteristic property of these...

Table 12.2 Unweighted and weighted (unbiased) confusion matrix in LUCAS photo-interpretation for stratification. Columns: Photo-interpretation, Arable, Permanent Crops, Permanent Grass, Forest & Wood, Other, Total.
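
A minimal sketch of a regression estimator of a mean, assuming a single co-variable x (for instance the proportion of pixels classified as the crop) known for the whole population and an accurate observation y on the sample only. The function name and the figures are hypothetical.

```python
from statistics import mean


def regression_estimate(y_sample, x_sample, x_population_mean):
    """Regression estimator of the population mean of y.

    y_bar_reg = y_bar + b * (X_bar - x_bar), where b is the least-squares
    slope of y on x computed from the sample.
    """
    x_bar, y_bar = mean(x_sample), mean(y_sample)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    s_xx = sum((x - x_bar) ** 2 for x in x_sample)
    slope = s_xy / s_xx
    return y_bar + slope * (x_population_mean - x_bar)


# Hypothetical sample: ground observations y and classified-image values x.
x_sample = [1.0, 2.0, 3.0]
y_sample = [2.0, 4.0, 6.0]
estimate = regression_estimate(y_sample, x_sample, x_population_mean=2.5)
print(estimate)
```

The correction term moves the sample mean of y in the direction indicated by the gap between the population and sample means of x.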

Calibration weighting

One of the most relevant problems encountered in large-scale business (e.g. agricultural) surveys is finding estimators that are (i) efficient and (ii) derived in accordance with criteria of internal and external consistency (see below). The methodology presented in this section satisfies these requirements. The class of calibration estimators (Deville and Särndal, 1992) is an instance of a very general approach to the use of auxiliary information in the estimation procedures in finite-...
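
To make the idea concrete, the sketch below calibrates design weights to the known total of one auxiliary variable using the linear (chi-square distance) case, w_k = d_k(1 + λx_k). Restricting to a single auxiliary variable, the function name and the data are simplifying assumptions for illustration.

```python
def linear_calibration(design_weights, x, x_total):
    """Calibrate design weights to a known total of one auxiliary variable.

    Linear case: w_k = d_k * (1 + lam * x_k), with lam chosen so that
    sum(w_k * x_k) equals x_total.
    """
    ht_x = sum(d * xk for d, xk in zip(design_weights, x))
    denom = sum(d * xk ** 2 for d, xk in zip(design_weights, x))
    lam = (x_total - ht_x) / denom
    return [d * (1.0 + lam * xk) for d, xk in zip(design_weights, x)]


# Hypothetical sample of three farms with equal design weights.
d = [10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0]   # auxiliary variable, known total 66 on the frame
y = [5.0, 9.0, 14.0]  # survey variable

w = linear_calibration(d, x, x_total=66.0)
calibrated_x_total = sum(wk * xk for wk, xk in zip(w, x))
calibrated_y_total = sum(wk * yk for wk, yk in zip(w, y))
print(w, calibrated_x_total, calibrated_y_total)
```

The calibrated weights reproduce the known total of x exactly, which is the external consistency requirement; the same weights are then applied to every survey variable.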

Case study: the province of Foggia

Forecasting agricultural production is of critical importance for policy-makers as well as for market stakeholders. In the context of globalization, it is of great value to obtain as quickly as possible accurate overall estimates of crop production on an international scale. For example, timely and accurate estimates of durum wheat yield are an important management instrument for the European Commission's Directorate-General for Agriculture and Rural Development (Baruth et al., 2008). The main objective...

Changing concepts of quality

The most significant event is that the concepts of quality are becoming customer-driven, or externally driven, rather than determined internally by the statistical organization. The availability of official statistics on the Internet is increasing the audience of data users. In previous times, the data users were fewer in number and usually very familiar, from long experience, with the data they were being provided. The new audience of data users is becoming increasingly sophisticated and also more...

Combined use of different frames

When various incomplete registers are available and information included in their records cannot be directly used for producing statistics, a sample survey has to be designed. Administrative data are most often used to create one single sampling frame, although on the basis of two or more lists. This approach should be used only if the different lists contribute essential information to complete the frame and the record matching gives extremely reliable results; otherwise, the frame will be...

Combining ex ante and ex post auxiliary information: a simulated approach

In this section we measure the efficiency of the estimates produced when the sampling design uses only the ex ante, only the ex post or a mixture of the two types of auxiliary information. Thus, we performed some simulation experiments whose ultimate aim is to test the efficiency of some of the selection criteria discussed above. After giving a brief description of the archive at hand, we detail the Istat survey based on this archive. Then we review the technical aspects of the sample selection...

Computation of D² from the frame

It is useful to write the above equations in terms of weights of units in the population, so that different design strategies can be evaluated without actually having to draw different samples. For sets of $N_{ki}$ units with the same uniform weight $w_k$,

$$D_i^2 = \frac{\left(\sum_k N_{ki}\, w_k\right)\left(\sum_k N_{ki}/w_k\right)}{\left(\sum_k N_{ki}\right)^2}. \qquad (6.5)$$
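
A small sketch of how a weighting effect can be computed from frame counts alone, with no sample drawn. It assumes the Kish-type factor D² = 1 + cv²(w) and units selected with probability proportional to 1/w_k, which gives D² = (Σ N_k w_k)(Σ N_k/w_k)/(Σ N_k)²; this reading of (6.5), the function name and the figures are assumptions.

```python
def weighting_effect(units_per_area, weights):
    """Variance inflation due to unequal weights, computed from the frame.

    units_per_area -- number of frame units of the sector in each area k
    weights        -- uniform weight w_k applied to all units in area k
    """
    n_total = sum(units_per_area)
    sum_w = sum(n * w for n, w in zip(units_per_area, weights))
    sum_inv_w = sum(n / w for n, w in zip(units_per_area, weights))
    return sum_w * sum_inv_w / n_total ** 2


# Hypothetical sector spread over two areas.
print(weighting_effect([100, 200], [5.0, 5.0]))  # equal weights: no inflation
print(weighting_effect([100, 100], [1.0, 4.0]))  # unequal weights: inflation
```

Because only frame counts and candidate weights enter the calculation, alternative designs can be compared by changing the `weights` argument.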

Conclusions

Data processing and dissemination constituted an important part of the FNAC activities. In order to permit the use of census data directly and to supply researchers with an instrument for complete, safe and adequate access to raw data in order to perform statistical analysis, the NBS decided to release a 1% sample of the data, also by using SDA on the World Wide Web. With SDA it was possible to produce codebooks and to analyse the census data from any remote personal computer. The SDA experiment...

Contents

1 The present state of agricultural statistics in developed countries situation
1.2 Current state and political and methodological context
1.2.2 Specific agricultural statistics in the UNECE region
1.3 Governance and horizontal issues
1.3.1 The governance of agricultural statistics
1.3.2 Horizontal issues in the methodology of agricultural statistics
1.4 Development in the demand for agricultural statistics
1.5 Conclusions
Acknowledgements
Reference
Part I Census,...

Coverage Evaluation Study

The snapshot records that were neither flagged as must-gets nor found among the census records composed the survey frame for a Coverage Evaluation Study (CES). A sample of these farms was selected, prepared, collected, matched and searched much as in the FCFU, but at a later time. The farms that were confirmed as active and not counted in the census were weighted...

Table 5.1 Census of Agriculture coverage statistics, 2001 and 2006

Creating a farm register: the population

When a statistical register is created, all relevant sources should be used so that the coverage will be as good as possible. When microdata from different sources are integrated many quality issues become clear that otherwise would have escaped notice. However, a common way of working is to use only one administrative source at a time. Figure 2.5 shows the consequences of this. The Swedish Business Register has been based on only one source, the Business Register of the Swedish Tax Board....

Data and modelling issues

Farm economic surveys may fail to capture key output or input variables that are essential for economic analysis. If a database does not include the required economic information then it is of little use for economic analysis. Most econometric modelling makes the assumption of homogeneity of parameters. A number of statistical models have been developed to deal with departures from this assumption. One of the most widely used is the mixed regression model (Goldstein, 1995; Laird and Ware,...

Data by StrCon and sector, aggregated over areas

Once the StrCon has been defined for each area unit, the basic information aggregated over areas on the numbers of units classified by sector and StrCon can be represented as in Table 6.1. By definition, there is a one-to-one correspondence between the sectors and the StrCon. Subscript i (rows 1 to I) refers to the sector, and j (columns 1 to I) to StrCon. Generally, any sector is distributed over various StrCon; any StrCon contains establishments from various sectors; in fact, it contains all...

Description of the basic design

Let us now consider some basic features of an area-based multi-stage sampling design for an integrated survey covering small-scale economic units of different types. The population of units comprises a number of sectors, such as different types of establishments. We assume that, on the basis of some criterion, each establishment can be assigned to one particular sector. Sample size requirements in terms of number of establishments n.i have been specified for each sector i. The available...

Direct tabulation of administrative data

Two interesting studies (Selander et al., 1998; Wallgren and Wallgren, 1999), financed jointly by Statistics Sweden and Eurostat, explored the possibility of producing statistics on crops and livestock through the Integrated Administrative and Control System (IACS, created for the European agricultural subsidies) and other administrative data. After a comparison of the IACS data with an updated list of farms, the first study came to the following conclusion: 'The IACS register is generally not...

Disadvantages of direct tabulation of administrative data

When administrative data are used for statistical purposes, the first problem to be faced is that the information acquired is not exactly that which is needed, since questionnaires are designed for specific administrative purposes. Statistical and administrative purposes require different kinds of data to be collected and different acquisition methods (which strongly influence the quality of data). Strict interaction between statisticians and administrative departments is essential, although it...

Durum wheat yield forecast

In this subsection a regression model is presented and used for the spatial prediction of the yield of durum wheat in the province of Foggia. The idea is very simple: the estimated regression equation obtained with a sample of observations is used for predicting the value of y for given values of x. In other words, the geographical information (i.e. covariates) is available for each point of the spatial domain under investigation, and by using this information and the estimated model, a spatial...
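
The two steps, fitting on the sample and predicting wherever the covariate is available, can be sketched as follows. This is a toy illustration with a single covariate and ordinary least squares; the covariate, the grid coordinates and all values are hypothetical and do not come from the Foggia study.

```python
def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def predict_on_grid(model, covariate_by_point):
    """Apply the fitted equation at every point where the covariate is known."""
    intercept, slope = model
    return {pt: intercept + slope * x for pt, x in covariate_by_point.items()}


# Step 1: sample of sites with observed yield (t/ha) and a covariate value.
sample_x = [0.2, 0.4, 0.6, 0.8]
sample_y = [1.5, 2.0, 2.5, 3.0]
model = fit_simple_regression(sample_x, sample_y)

# Step 2: covariate available at every grid point, sampled or not.
grid = {(0, 0): 0.5, (0, 1): 0.7, (1, 0): 0.3}
yield_map = predict_on_grid(model, grid)
print(model, yield_map)
```

With several covariates the same scheme applies, with the fit replaced by a multiple regression.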

Economic and econometric specification

It is not uncommon for economists to depart from the proper economic and econometric specifications of the model because the observed variables are not precisely as required. For example, in an economic analysis of technology choice and efficiency on Australian dairy farms (Kompas and Che, 2006) a variable representing feed concentration was not available. Average grain feed (in kilograms per cow) was therefore used as a proxy for this variable in the inefficiency model. As a consequence, the...

Empirical strategy

21.4.1 The Palestinian data set

The Palestinian Public Perception Survey (PPPS) is an inter-agency effort aimed at building understanding of the socio-economic conditions in the West Bank and Gaza Strip. The University of Geneva implemented the 11th PPPS in 2007, with the collaboration of several agencies, including the FAO for the food security component. Responsibility for the data collection rests with the Palestinian Central Bureau of Statistics. The PPPS provides a very rich data set, including key indicators relevant...

Errors in administrative registers

A pillar of sampling theory is that, when a sample survey is carried out, much care can be devoted to the collection procedure and to the data quality control, since a relatively small amount of data is collected; thus, non-sampling errors can be limited. At the same time, sampling errors can be reduced by adopting efficient sample designs. The result is that very accurate estimates can often be produced with a relatively small amount of data. The approach of administrative registers is the...

Evaluation criterion: the effect of weights on sampling precision

Equation (6.2), if it can be applied, gives an equal probability or self-weighting sample for each sector i meeting the sample size requirements in terms of number of establishments n.i to be included. The constraint is that within each area different types of establishments are selected at the same rate gk determined by (6.3). This means that the establishment probabilities of selection have to vary within the same sector depending on the area units from which they come. This design is...

Examples of crop area estimation with remote sensing in large regions

The early Large Area Crop Inventory Experiment (LACIE; Heydorn, 1984) analysed samples of segments (5 × 6 nautical miles), defined as pieces of Landsat MSS images, and focused mainly on the sampling errors. It soon became clear (Sielken and Gbur, 1984) that pixel counting entailed a considerable risk of bias linked to the errors of commission and omission. Remote sensing was still too expensive in the 1980s (Allen, 1990), but the situation changed in the 1990s with the reduction of cost, both...

Examples of crop yield estimation/forecasting with remote sensing

The USDA publishes monthly crop production figures for the United States and the world. At NASS, remote sensing information is used for qualitative monitoring of the state of crops but not for quantitative official estimates. NASS research experience has shown that AVHRR-based crop yield estimates in the USA are less precise than those from existing surveys. Current research is centred on the use of MODIS data within biophysical models for yield simulation. The Foreign Agricultural Service (FAS)...

Expected accuracy of area estimates with the LUCAS 2006 scheme

Before launching any survey some idea is needed of the accuracy that can be achieved. The accuracy reached for the estimated area of land cover c mainly depends on the size D of the region and the proportion p of c. The results for each country have allowed simple linear regressions without intercept to be fitted for agricultural classes,

$$s_{LUCAS} = 0.743\, s_{ran} + \varepsilon, \quad r^2 = 0.989,$$

$$s_{LUCAS} = 1.182\, s_{ran} + \varepsilon, \quad r^2 = 0.967,$$

where $s_{ran}(p) = D \times \sqrt{p(1-p)/(n-1)}$ is the standard error that would have been obtained with simple...
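
A quick numerical reading of these relations, as a sketch: it takes s_ran(p) = D × sqrt(p(1 − p)/(n − 1)) as the simple random sampling standard error and scales it by the first fitted coefficient, 0.743. The function names and the example region are hypothetical.

```python
from math import sqrt


def s_ran(region_area, p, n):
    """Standard error of an area estimate under simple random sampling.

    region_area -- size D of the region (any area unit)
    p           -- proportion of the region covered by the land cover class
    n           -- number of sample points
    """
    return region_area * sqrt(p * (1.0 - p) / (n - 1))


def expected_s_lucas(region_area, p, n, coefficient=0.743):
    """Expected standard error from the no-intercept regression on s_ran."""
    return coefficient * s_ran(region_area, p, n)


# Hypothetical region of 1000 km2, class covering half of it, 101 points.
baseline = s_ran(1000.0, 0.5, 101)
expected = expected_s_lucas(1000.0, 0.5, 101)
print(baseline, expected)
```

Passing `coefficient=1.182` gives the corresponding figure for the second fitted line.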

Forecasting yields

A baseline forecast can be produced as a simple average or trend of historical yield statistics. Various kinds of information are needed to improve the baseline forecast by capturing the main parameters that influence crop yield, including data on weather and weather anomalies, observations on the state of the crops and on relevant environmental conditions and reports on developments of crop markets (Piccard et al., 2002). Statistics on past regional yields are always necessary to calibrate the...

From concept to measurement

21.3.1 The resilience framework

Figure 21.1 summarizes the rationale for attempting to measure resilience to food insecurity. Consistent with Dercon's (2001) framework, it is assumed that the resilience of a given household at a given point in time, T0, depends primarily on the options available to that household to make a living, such as its access to assets, income-generating activities, public services and social safety nets. These options represent a precondition for the household response mechanisms in the face of a...

General characteristics of SDA

SDA is a set of computer programs for the documentation and web-based analysis of survey data. There are also procedures for creating and downloading customized subsets of data sets. The software is maintained by the CSM. The current version of SDA is release 3.2; at the time of our experiment version 1.2 was available. All the following information relates to version 1.2. Data analysis programs were designed to be run from a web browser. SDA provides the results of the analysis very quickly...

How does it work?

The intent of the data warehouse was to provide statisticians with direct access to current and historical data and with the capability to build their own queries or applications. Traditional transactional databases are designed using complex data models that can be difficult for anyone but power users to understand, thus requiring programmer assistance and discouraging ad hoc analysis. The database design results in many database tables (often over 100 tables), which result in many table forms...

Imputation of the missing auxiliary variables

13.3.1 An overview of the missing data problem

Let us denote by Y (D × C) the matrix of estimates of the areas covered by crop types and by Z (D × C) the matrix containing the number of pixels classified by crop types according to the satellite data in each small area. Y is considered fully observed, while satellite images often have missing data. Outliers and missing data in satellite information are mainly due to cloudy weather that does not allow the identification or correct recognition of what is being grown from the acquired digital...

Integrated economic and environmental accounting

At the aggregated level, sound indicators that give a good insight into the mechanism of agricultural society in relation to the economy and environment are needed. The integration of agricultural statistics with other statistics is a process that is tackled especially from the viewpoint of integrated economic and environmental accounting. The UNECE region is actively participating in preparations for the revision of the System of National Accounts (dating from 2008) where the relevance of...

Integrating agricultural and environmental information with LUCAS

Land cover and land use are of increasing importance in policy design and evaluation; they constitute a key element, in particular for climate change studies (Feddema et al., 2005). Environmental, agricultural and regional transport policies increasingly demand two types of land cover data: maps and statistics. A large number of land cover maps have been produced with satellite images; examples at global level are Global Land Cover 2000 (known as GLC2000), based on SPOT VEGETATION images...

International coordination

The number of international and supranational organizations involved in agricultural statistics is rather limited. The FAO and Eurostat are the main international organizations involved. The OECD and UNECE were more involved, especially via the Inter-Secretariat Working Group on Agriculture. However, the activities of these organizations are currently limited and there is presently no forum to discuss issues concerned with agricultural statistics at the UNECE level (except for forestry...

Introduction

Agricultural statistics in the UN Economic Commission for Europe (UNECE) region are well advanced. Information collected by farm structure surveys on the characteristics of farms, farmers' households and holdings is for example combined with a variety of information on the production of animal products, crops, etc. Agricultural accounts are produced on a regular basis and a large variety of indicators on agri-economic issues is available. At the level of the European Union as a whole the...

Issues in statistical analysis of farm survey data

20.4.1 Multipurpose sample weighting

Since the sample of farms that contribute to a farm survey is typically a very small proportion of the farms that make up the agricultural sector of an economy, it is necessary to 'scale up' the sample data in order to properly represent the total activity of the sector. This scaling up is usually carried out by attaching a weight to each sample farm so that the weighted sum of the sample values of a survey variable is a 'good' estimate of the sector-based sum for the same variable. Methods for...

Land-use stratification

The process of land-use stratification is the delineation of land areas into land-use categories (strata). The purpose of stratification is to reduce the sampling variability by creating homogeneous groups of sampling units. Although certain parts of the process are highly subjective in nature, precision work is required of the personnel stratifying the land (called stratifiers) to ensure that overlaps and omissions of land area do not occur and land is correctly stratified. The stratification...

LUCAS 2001-2003: target region, sample design and results

Table 10.1 Some area estimates of LUCAS 2003 for EU15.

are based on comparing each sample element with other sample elements geographically close to it. Wolter (1984) compares several estimators of this type for the one-dimensional case, some of which had been proposed by Yates (1949), Osborne (1942) and Cochran (1946). Matérn (1986) proposes similar estimators for the two-dimensional case. The usual variance estimator of the mean for...

LUCAS 2006: a two-phase sampling plan of unclustered points

After some encouraging tests by the Joint Research Centre (JRC) in collaboration with the Greek Ministry of Agriculture in 2004 and the previous experience of the Italian AGRIT program (Martino, 2003), Eurostat decided to change the sampling scheme. The new scheme used a common map projection, the Lambert azimuthal equal-area projection, recommended by the Infrastructure for Spatial Information in Europe (INSPIRE) initiative (Annoni et al., 2001). This decision improved the homogeneity of the sample layout, but...

Main approaches to using EO for crop area estimation

Carfagna and Gallego (2005) give a description of different ways to use remote sensing for agricultural statistics; we give a very brief reminder here. Stratification. Strata are often defined by the local abundance of agriculture. If a stratum can be linked to one specific crop, the efficiency is strongly boosted (Taylor et al., 1997). Pixel counting. Images are classified and the area of crop c is simply estimated by the area of pixels classified as c. An equivalent approach is photo-interpretation...
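
Pixel counting is simple enough to state in a few lines. The sketch below assumes a classified image given as a flat list of class labels and a constant pixel size; the labels and the 0.09 ha pixel (a 30 m × 30 m cell) are illustrative assumptions.

```python
from collections import Counter


def pixel_count_area(classified_pixels, pixel_area_ha):
    """Area per class estimated as (number of pixels in class) x (pixel area).

    No correction is made for errors of commission and omission, so the
    result carries whatever bias the classification has.
    """
    counts = Counter(classified_pixels)
    return {label: k * pixel_area_ha for label, k in counts.items()}


# Hypothetical classified image with five pixels.
image = ["wheat", "wheat", "maize", "other", "wheat"]
areas = pixel_count_area(image, pixel_area_ha=0.09)
print(areas)
```

The estimate depends entirely on the classification, which is why the approach is usually combined with ground data in practice.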

Managing accuracy

Processes described previously under 'relevance' determine which programmes are going to be carried out, their broad objectives, and the resource parameters within which they must operate. Within those 'programme parameters', the management of accuracy requires particular attention during three key stages of a survey process: survey design, survey implementation, and assessment of survey accuracy. These stages typically take place in a project management environment, outlined in Section 17.4,...

Managing coherence

Coherence of statistical data includes coherence between different data items pertaining to the same point in time, coherence between the same data items for different points in time, and international coherence. Three complementary approaches are used for managing coherence in Statistics Canada. The first approach to the first element is the development and use of standard frameworks, concepts, variables and classifications for all the subject-matter topics that are measured. This aims to...

Managing interpretability

Statistical information that users cannot understand - or can easily misunderstand - has no value and may be a liability. Providing sufficient information to allow users to properly interpret statistical information is therefore a responsibility of the Agency. 'Information about information' has come to be known as meta-information or metadata. Metadata are at the heart of the management of the interpretability indicator, by informing users of the features that affect the quality of all data...

Meat, livestock and egg statistics

These traditional animal and poultry product statistics - resulting from traditional regular livestock surveys as well as meat, milk and eggs statistics - still play a key role in the design, implementation and monitoring of the EU Common Agricultural Policy and also contribute to ensuring food and feed safety in the EU. European statistics on animals and animal products are regulated by specific EU legislation. Member states are obliged to send monthly, annual and multi-annual data to the...

Meeting sample size requirements

The above equations can be applied to the total population or, as done in (6.5), separately to each sector i. As noted, it is assumed that subsampling within any area k is identical for all sectors i, implying uniform weights $w_k = 1/f_k$ for all types of units in the area. The average of Df values over I sectors may be taken as an overall indicator of the inflation in variance due to weighting. The objective is to minimize this indicator by appropriately choosing the $g_k$ values satisfying the...

Methodological approaches

The framework described above can be estimated through multivariate analysis models. Equation (21.1) is a hierarchical model in which some variables are dependent on the one side and independent of the other. Unobservable (i.e. latent) variables also have to be dealt with. Figure 21.2 shows the path diagram of the model concerned. In the causal models literature (Spirtes et al. 2000), circles represent latent variables and boxes represent observed variables. Most of the hierarchical or...

Milk statistics

Milk statistics relate to milk produced by cows, ewes, goats and buffaloes. For the EU they are concerned with milk collected by dairies (monthly and annually) at national and regional level, milk produced in agricultural holdings (farms), the protein content and the supply balance sheets. Triennial statistics provide information on the structure of the dairies. Data collection and pre-validation are carried out through, for example, the use of the Web Forms system which ensures the management...

More flexible models: an empirical approach

As is clear from the above illustrations, depending on the numbers and distribution of units of different types and on the extent to which the required sampling rates by sector differ, a basic model like (6.8) may be too inflexible, and may result in large variations in design weights and hence in large losses in efficiency of the design. It may even prove impossible to satisfy the sample allocation requirements in a reasonable way. Figure 6.1 Design effect...

New data collection tools

Modern technologies for data collection for agricultural and land use statistics are being implemented in many countries. As in many surveys, the use of data collection via computer-assisted personal or telephone interviewing has become the rule rather than the exception. The NASS and many EU countries have used the Blaise software for such interviewing for many years. A more recent development is the use of Internet questionnaires mainly for annual and monthly inventories. Both NASS USDA and...

Non-sampling errors in LUCAS 2006

Non-sampling errors are generally more difficult to assess than sampling errors (Lesser and Kalsbeek, 1999). In this section we study the possible order of magnitude of the main sources of non-sampling errors. The most important source of non-sampling error in an area frame survey is identification mistakes by enumerators; this can happen as a result of (b) incorrect identification because of inadequate enumerator training - mainly for minor crops; (c) misinterpretation of rules to label land...

Numerical illustrations and more flexible models: 6.6.1 Numerical illustrations

Table 6.2 shows three simulated populations of establishments. The distribution of the establishments by sector is identical in the three populations - varying linearly from
Table 6.2 Number of establishments, by economic sector and 'stratum of concentration' (three simulated populations). Stratum of concentration population 1 Stratum of...

Probability proportional to size sampling

Consider a set-up where the study variable y and a positive auxiliary variable x are strongly correlated. Intuitively, in such a framework it should be convenient to select the elements to be included in the sample with probability proportional to x. Probability proportional to size (PPS) sampling designs can be applied in two different set-ups: fixed-size designs without replacement (πps) and fixed-size designs with replacement (pps). Only πps will be considered here; an excellent reference...
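To make the idea of fixed-size selection without replacement with probability proportional to x concrete, the following is a minimal sketch, assuming a systematic selection scheme (one of several possible schemes, and not necessarily the one discussed in the source). The function names and the example sizes are hypothetical. The first-order inclusion probabilities are taken as n times x_k divided by the sum of x, which requires every such value to be at most 1.

```python
from bisect import bisect_right
from itertools import accumulate
import random


def inclusion_probabilities(x, n):
    """First-order inclusion probabilities pi_k = n * x_k / sum(x)."""
    total = sum(x)
    return [n * xk / total for xk in x]


def systematic_pps_sample(x, n, rng):
    """Select n distinct unit indices with probability proportional to x.

    Assumption: systematic selection on the cumulated sizes, with a
    random start drawn from `rng` (a random.Random instance).
    """
    if max(inclusion_probabilities(x, n)) > 1:
        raise ValueError("a unit has pi_k > 1; select it with certainty first")
    cum = list(accumulate(x))          # cumulated sizes
    step = cum[-1] / n                 # sampling interval
    start = rng.random() * step        # random start in [0, step)
    # Unit i covers the interval [cum[i-1], cum[i]) on the cumulated scale.
    picks = [bisect_right(cum, start + j * step) for j in range(n)]
    return [min(i, len(x) - 1) for i in picks]  # guard against rounding


sizes = [1, 2, 3, 4, 10]               # hypothetical auxiliary values x_k
sample = systematic_pps_sample(sizes, 2, random.Random(1))
```

In this toy population the last unit has inclusion probability 1, so it appears in every sample, while the first four units share the remaining selection.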

Probability proportional to size selection of area units

National or otherwise large-scale household surveys are typically based on multi-stage sampling designs. Firstly, a sample of area units is selected in one or more stages, and at the last stage a sample of ultimate units (dwellings, households, persons, etc.) is selected within each sample area. Increasingly - especially in developing countries - a more or less standard two-stage design is becoming common. In this design the first stage consists of the selection of area units with probability...

Programme content and stakeholder input

National statistical offices work very hard to understand the needs of the data user community, although the future cannot always be anticipated. As the primary statistical agency for the USDA, the NASS services the data needs of many agencies inside and outside the Department. Partnerships have been in place with state departments of agriculture and land-grant universities through cooperative agreements since 1917 to ensure statistical services meet federal, state, and local needs without...

Quality management assessment

Quality management assessment at Statistics Canada encompasses key elements of the Quality Assurance Framework (QAF), a framework for reporting on data quality and the Integrated Metadata Base (Julien and Born, 2006). Within this structure, a systematic assessment of surveys, focusing on the standard set of processes used to carry them out, is crucial. It has several objectives: to raise and maintain awareness of quality at the survey design and implementation levels; to provide input into...

References

Carfagna, E. (1998) Area frame sample designs: a comparison with the MARS project. In T.E. Holland and M.P.R. Van den Broecke (eds) Proceedings of Agricultural Statistics 2000, pp. 261-277. Voorburg, Netherlands: International Statistical Institute.
Carfagna, E. (2001a) Multiple frame sample surveys: advantages, disadvantages and requirements. In International Statistical Institute, Proceedings, Invited papers, International Association of Survey Statisticians (IASS) Topics, Seoul, August 22-29,...

Registers, register systems and methodological issues

A register is a complete list of the objects belonging to a defined object set. The objects in the register are identified by identification variables. This makes it possible to update or match the register against other sources. A system of statistical registers consists of a number of registers that can be linked to each other. To make this exact linkage between records in different registers possible, the registers in the system must contain reference numbers or other identification...

Relative efficiency of the LUCAS 2006 sampling plan

We have compared the variance obtained with several single-stage sampling plans.
1. Simple random sampling (srs). We make an approximation of simple random sampling using the variance of random subsamples of the available systematic sample.
2. Pure systematic. A subsample was extracted by selecting in the LUCAS 2006 sampling plan the first eight replicates in all strata. We have used the non-stratified version of (10.3) and (10.4).
3. Post-stratified sample. The systematic sample of option 2...

Replicated sampling

NASS's area frames have been sampled using a replicated design since 1974. Replicated sampling is characterized by the selection of a number of independent subsamples or replicates from the same population using the same selection procedure for each replicate. Each replicate is therefore an unbiased representation of the population. A replicate for NASS's area frame sample design is a random sample of land areas (segments) selected within a land-use stratum. The sub-stratification within each...
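One practical consequence of replication is that the spread between replicate estimates gives a simple variance estimate. The sketch below assumes each replicate yields its own unbiased estimate of a total and that replicates can be treated as independent; it is a textbook illustration, not a description of NASS's actual estimator, and the function name and figures are hypothetical.

```python
from statistics import mean, variance


def replicate_estimate(replicate_totals):
    """Combine replicate estimates of a total.

    Returns the mean of the replicate estimates and an estimate of its
    variance, s^2 / R, where s^2 is the sample variance between the R
    replicate estimates (independence between replicates is assumed).
    """
    r = len(replicate_totals)
    estimate = mean(replicate_totals)
    var_estimate = variance(replicate_totals) / r
    return estimate, var_estimate


# Hypothetical estimates of a crop total from four replicates
est, var_est = replicate_estimate([100, 110, 90, 100])
```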

Requirements of sample surveys for economic analysis

One of the most important requirements of sample surveys in general, not only farm surveys, is that sample sizes are large enough to enable sufficiently accurate estimates to be produced for policy analysis. Working against this are two important objectives: the provision of timely results and the need to collect detailed economic data (Section 20.3), which often necessitates expensive face-to-face data collection methods. At the same time the sample often needs to be spread spatially (for...

Respondent reluctance: privacy and burden concerns

Although the agricultural sector is somewhat unique and not directly aligned with the general population on a number of levels, concerns regarding personal security and privacy of information are similar across most population subgroups in the USA, Brazil, and Europe. Due to incidences of personal information being released by businesses and government agencies, respondents now have one more reason for not responding to surveys. While this is not the only reason for increasing non-response...

Rural development statistics

These are a relatively new domain and can be seen as a consequence of the reform of the Common Agricultural Policy, which accords great importance to rural development. Eurostat has started collecting indicators for a wide range of subjects - such as demography (migration), economy (human capital), accessibility to services (infrastructure), social well-being - from almost all member states at regional level. Most of the indicators are not of a technical agricultural nature. Data collected...

Sample allocation

The area frame sample is used to collect data on a wide range of agricultural items such as crop acreages, livestock inventories and economic data. Therefore, the allocation of the sample across states and within states to the land-use strata is extremely important. The NASS evaluates optimum allocations of the sample to obtain the most precision in the major survey estimates for a given budget. The number of sample segments allocated to each land-use stratum and state depends on factors such...
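As an illustration of what an optimum allocation for a given budget can look like, the sketch below uses the classical single-variable formula in which the sample size in stratum h is proportional to N_h * S_h / sqrt(c_h). This is an assumption made for illustration: the source does not state which allocation rule the NASS uses, and a real allocation would have to balance several survey items at once. All names and numbers are hypothetical.

```python
from math import sqrt


def optimum_allocation(N, S, cost, budget):
    """Classical cost-optimal allocation for one survey variable.

    N[h]    number of sampling units in stratum h
    S[h]    standard deviation of the variable in stratum h
    cost[h] cost of surveying one unit in stratum h
    budget  total variable budget, so that sum(cost[h] * n[h]) == budget
    Returns unrounded sample sizes n[h].
    """
    denom = sum(Nh * Sh * sqrt(ch) for Nh, Sh, ch in zip(N, S, cost))
    return [budget * Nh * Sh / sqrt(ch) / denom
            for Nh, Sh, ch in zip(N, S, cost)]


# Two hypothetical land-use strata with equal unit costs
n_h = optimum_allocation(N=[100, 100], S=[1.0, 3.0], cost=[1.0, 1.0], budget=40)
```

With equal costs this reduces to Neyman allocation: the more variable stratum receives three times as many segments.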

Sampling different types of units in an integrated design

An integrated multi-stage design implies the selection of a common sample of areas to cover all types of units of interest in a single survey. The final sampling stage involves the selection of ultimate units (e.g. establishments) within each selected area. In such a survey, in practice it is often costly, difficult and error-prone to identify and separate out the establishments into different sectors and apply different sampling procedures or rates by sector within each sample area. Hence it...

Satellite images and vegetation indices for yield monitoring

The use of remote sensing within the crop yield forecasting process has a series of requirements for most of the applications. Information is needed for large areas while maintaining spatial and temporal integrity. Suitable spectral bands are needed that characterize the vegetation to allow for crop monitoring, whether to derive crop indicators to be used in a regression or to feed biophysical models. High temporal frequency is essential to follow crop growth during the season. Historical data...

Selection probabilities

There are two methods for selecting the ultimate sampling unit or segment - equal and unequal selection. Which method is used depends on the availability of adequate boundaries for segments. If good boundaries are plentiful so that segments can be made approximately the same size within a land-use stratum, then segments are selected with equal probability. If adequate boundaries are not available, then unequal probability of selection is used since segment sizes are allowed to vary greatly in...

Selective editing

Manual or interactive editing is time-consuming and therefore expensive, and adversely influences the timeliness of publications. Moreover, when manual editing involves recontacting the respondents, it also increases the response burden. Therefore, most statistical institutes have adopted selective editing strategies. This means that only records that potentially contain influential errors are edited manually, whereas the remaining records are edited automatically. In this way, manual editing...
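The routing of records to manual or automatic editing can be sketched with a simple score function. The form used here (design weight times the absolute difference between the reported value and an anticipated value) and the threshold are assumptions chosen for illustration; the source does not specify a particular score.

```python
def local_score(weight, reported, anticipated):
    """Hypothetical score: weighted distance from an anticipated value."""
    return weight * abs(reported - anticipated)


def split_for_editing(records, threshold):
    """Route records to manual or automatic editing by their score."""
    manual, automatic = [], []
    for rec in records:
        score = local_score(rec["weight"], rec["reported"], rec["anticipated"])
        (manual if score > threshold else automatic).append(rec["id"])
    return manual, automatic


# Hypothetical records: the anticipated value could be last year's report
records = [
    {"id": "A", "weight": 10, "reported": 105, "anticipated": 100},
    {"id": "B", "weight": 50, "reported": 900, "anticipated": 90},
    {"id": "C", "weight": 2, "reported": 40, "anticipated": 45},
]
manual_ids, automatic_ids = split_for_editing(records, threshold=1000)
```

Only record B, whose suspicious value would move the weighted total substantially, is sent to a human editor.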

Similarities and differences from household survey design

The type of sample designs used in 'typical' household surveys provides a point of departure in this discussion of multi-stage sampling of small-scale economic units. Indeed, there may often be a one-to-one correspondence between such economic units and households, and households rather than the economic units themselves may directly serve as the ultimate sampling units. Nevertheless, despite much common ground with sampling for population-based household surveys, sampling small-scale economic...

Small-area estimates

An important development is the need for up-to-date and accurate small-area estimates. The demand for early estimates for advance warning on crops, and for results for small domains, continues to increase. In agriculture these small domains could be geographical areas or unique commodities. Statistical methods are being used for small-area estimation that use models and modelling techniques borrowing strength from other data sources such as administrative data or other areas. The overview of...

Statistics Canada's agriculture statistics programme

Statistics Canada's agriculture statistics programme consists of a central farm register; numerous annual, sub-annual and occasional surveys; the use of administrative data, including tax data; and a census conducted every 5 years. Each of these components is described in this section. Farm register. The farm register is a database of farm operations and operators. Its creation dates back to the late 1970s, while its present design and technology were developed in the mid-1990s. It contains key...

Step 1: Match data from E to M

Every year, tax returns are matched to the farm register. Individuals' tax records are matched to operator records using key identifiers (name, sex, date of birth and address), while corporate tax records are matched to operations using other key identifiers (farm name and address). The matches are done using direct matching techniques. Very strong matches are accepted automatically, weaker matches are verified, and the weakest matches are automatically rejected. The thresholds between what is...
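The three-way treatment of matches can be sketched as a pair of thresholds on a match score. The scoring scale and the two threshold values below are hypothetical placeholders; the source does not give the actual values used.

```python
def classify_match(score, accept_threshold=0.9, reject_threshold=0.5):
    """Classify a candidate match by its score (higher means stronger).

    Both thresholds are illustrative assumptions, not production values.
    """
    if score >= accept_threshold:
        return "accept"   # very strong match: accepted automatically
    if score < reject_threshold:
        return "reject"   # weakest match: rejected automatically
    return "verify"       # in between: sent for verification


decisions = [classify_match(s) for s in (0.97, 0.72, 0.31)]
```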

Step 3: Search for the potential farms from E on M

The matching step described in step 1 is based on the principle that a tax record is not matched until proven otherwise. For the purpose of improving the coverage of the farm register, this principle is reversed. In order to bring the tax record to the next step (collection), it must be determined with confidence that it is not on the register. The tax record must be searched by clerical staff using queries or printouts to find tax records that were not found previously due to incomplete...

Stratification

Stratification is one of the most widely used techniques in finite population sampling. Strata are disjoint subdivisions of a population U, and the union of the strata coincides with the universe: $U = \bigcup_{h=1}^{H} U_h$, $U_h \cap U_i = \emptyset$, $h \neq i \in \{1, \dots, H\}$. Each group contains a portion of the sample. Many business surveys employ stratified sampling procedures where simple random sampling without replacement is performed within each stratum; see, for example, Sigman and Monsour (1995) and, for farm surveys, Vogel (1995)....
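A minimal sketch of stratified simple random sampling without replacement, together with the usual expansion estimator of a total, is given below. The stratum labels, sizes and values are hypothetical, and the estimator shown is the standard textbook one rather than anything specific to the surveys cited.

```python
import random


def stratified_srswor(strata, sizes, rng):
    """Draw a simple random sample without replacement in each stratum.

    strata maps a stratum label to its list of unit identifiers;
    sizes maps the same labels to the stratum sample sizes n_h.
    """
    return {h: rng.sample(units, sizes[h]) for h, units in strata.items()}


def estimate_total(strata, sample, y):
    """Expansion estimator: sum over strata of (N_h / n_h) * sample sum."""
    total = 0.0
    for h, units in strata.items():
        N_h, n_h = len(units), len(sample[h])
        total += N_h / n_h * sum(y[k] for k in sample[h])
    return total


# Hypothetical population: values are constant within each stratum,
# so the estimate reproduces the true total exactly.
strata = {"small": [0, 1, 2, 3], "large": [4, 5]}
y = {0: 1, 1: 1, 2: 1, 3: 1, 4: 10, 5: 10}
sample = stratified_srswor(strata, {"small": 2, "large": 1}, random.Random(7))
total_hat = estimate_total(strata, sample, y)
```

The example shows why homogeneous strata are efficient: when units within a stratum are alike, even a small sample recovers the stratum total.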

Sub-stratification

There is a further level of stratification which is applied to the frame. Sub-stratification is the process used to divide the population of sampling units within each stratum equally into categories (substrata). These substrata do not have a definition associated with them like strata do (e.g. 50% or more cultivated). Sampling units are placed into substrata based on likeness of agricultural content and, to a certain extent, location. Sub-stratification activities include ordering the PSUs,...

Synthetic and composite estimates

Suppose the population is divided into g large post-strata for which reliable direct estimates of the post-strata totals, $Y_{\cdot g}$, can be calculated from the survey data, where $Y_{\cdot g} = \sum_i Y_{ig}$ and $Y_{ig}$ is the total of the characteristic of interest, y, for the units in small area i that belong to post-stratum g. Our interest is in estimating the small-area totals $Y_i = \sum_g Y_{ig}$, $i = 1, \dots, m$, using known auxiliary totals $X_{ig}$. A synthetic estimate of $Y_i$ is given by $\hat{Y}_i^{S} = \sum_g (X_{ig} / \hat{X}_{\cdot g}) \hat{Y}_{\cdot g}$, where $\hat{Y}_{\cdot g}$ and $\hat{X}_{\cdot g}$ are reliable direct estimates...
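A small numerical sketch may help. It assumes the usual ratio-synthetic form, in which each post-stratum's direct estimate of y is shared out among the small areas in proportion to the known auxiliary totals. The function name and all figures are hypothetical.

```python
def synthetic_estimates(X, Y_hat_g, X_hat_g):
    """Ratio-synthetic estimates of small-area totals.

    X[i][g]     known auxiliary total for small area i, post-stratum g
    Y_hat_g[g]  direct estimate of the y total in post-stratum g
    X_hat_g[g]  direct estimate of the x total in post-stratum g
    Returns one estimate per small area:
        sum over g of X[i][g] / X_hat_g[g] * Y_hat_g[g]
    """
    G = len(Y_hat_g)
    return [sum(X_i[g] / X_hat_g[g] * Y_hat_g[g] for g in range(G))
            for X_i in X]


# Two small areas, two post-strata (hypothetical numbers)
X = [[10, 0],
     [30, 20]]
Y_i_hat = synthetic_estimates(X, Y_hat_g=[80, 100], X_hat_g=[40, 20])
```

Because the auxiliary totals in this toy example add up to the post-stratum estimates of x, the small-area estimates add up to the overall direct estimate of y. The estimator borrows strength across areas, at the price of bias whenever a small area's y/x ratio differs from that of its post-stratum.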

Taking advantage of administrative data for censuses

When a census or a sample survey of farms has to be carried out, administrative registers may be of considerable value in updating the list to be used. The IACS data are the most important source of information for this purpose, although in some cases the same farm is included several times for different subsidies and clusters based on auxiliary variables have to be created in order to identify the farm that corresponds to the different records. Administrative data could also be used in order...

Testing resilience measurement: 21.5.1 Model validation with CART

A cross-validation process is used to assess whether the procedures adopted for estimating the resilience index are meaningful. The cross-validation process tests the original hypothesis that sets of different variables and indicators belonging to different dimensions of food insecurity, the social sector and public services are correlated with (i.e. contribute to) the overall resilience index. The CART methodology (see Steinberg and Colla, 1995; Breiman et al., 1984) is used to estimate the...

The AGRIT survey

AGRIT is a sample survey the aim of which is to produce estimates, taking into account the economic situation, on areas and yields of the main crops and on the main land uses (see also Chapter 13 of this book). This survey is carried out using techniques of spatial sampling, particularly of point typology. Generally speaking, the method is based on the integration of data collected in ground samples, and data acquired through remote sensing. The list is constituted by a set of points (point...

The effect of random weights

The design effect, which measures the efficiency of a sample design compared to a simple random sample of the same size, can be decomposed under certain assumptions into two factors: the effect of sample weights, and the effect of other aspects of the sample design, such as clustering and stratification. We are concerned here with the first component - the effect of sample weights on precision. This effect is generally to inflate variances and reduce the overall efficiency of the design. The increase...
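A commonly used approximation for the weighting component is Kish's formula, n * sum(w^2) / (sum(w))^2, which equals one plus the squared coefficient of variation of the weights. The sketch below assumes this approximation, which treats the weights as unrelated to the survey variable; it is offered as an illustration and may differ in detail from the expression developed in the source.

```python
def weighting_design_effect(weights):
    """Kish's approximate design effect due to unequal weights.

    Equals n * sum(w^2) / (sum(w))^2, i.e. 1 + cv(w)^2 with cv computed
    using the population (divide-by-n) variance of the weights.
    """
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


equal = weighting_design_effect([1, 1, 1, 1])
unequal = weighting_design_effect([1, 3])
```

Equal weights give a factor of 1 (no loss), while the more the weights vary, the larger the inflation in variance.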

The Fellegi-Holt paradigm

In this section we describe the error localization problem for random errors as a mathematical optimization problem, using the (generalized) Fellegi-Holt paradigm. This mathematical optimization problem is solved for each record separately. For each record $(x_1, \dots, x_n)$ in the data set that is to be edited automatically, we wish to determine - or, more precisely, to ensure the existence of - a synthetic record $(x_1^*, \dots, x_n^*)$ such that $(x_1^*, \dots, x_n^*)$ satisfies all edits j ($j = 1, \dots, J$) given by (15.1) or...
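The optimization problem can be illustrated by brute force on a tiny record. The sketch assumes the usual reading of the paradigm: change the set of fields with the smallest total reliability weight such that some assignment of values to those fields satisfies every edit. Exhaustive enumeration over small discrete domains is used here only for clarity and is not the solution method of the source; the fields, domains, edits and weights are all hypothetical.

```python
from itertools import combinations, product


def localize_errors(record, domains, edits, weights):
    """Find a minimum-weight set of fields to change so all edits pass.

    Returns (cost, fields_to_change, synthetic_record), or None if no
    combination of changes within the given domains satisfies the edits.
    """
    fields = list(record)
    best = None
    for r in range(len(fields) + 1):
        for subset in combinations(fields, r):
            cost = sum(weights[f] for f in subset)
            if best is not None and cost >= best[0]:
                continue  # cannot improve on the current best solution
            # Try every assignment of new values to the chosen fields
            for values in product(*(domains[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if all(edit(candidate) for edit in edits):
                    best = (cost, set(subset), candidate)
                    break
    return best


record = {"cows": 0, "milk": 500, "area": 10}
domains = {"cows": [0, 1, 2, 5], "milk": [0, 500], "area": [5, 10]}
edits = [
    lambda r: r["milk"] == 0 or r["cows"] > 0,   # milk requires cows
    lambda r: r["cows"] <= 2 * r["area"],        # stocking density limit
]
weights = {"cows": 1, "milk": 2, "area": 3}      # higher = more trusted
solution = localize_errors(record, domains, edits, weights)
```

Here the record fails the first edit, and the cheapest repair is to change the number of cows rather than the milk production, because that field carries the lowest reliability weight.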

The GEOSS best practices document on EO for crop area estimation

GEOSS is an initiative to network earth observation data. GEOSS promotes technical standards for different aspects of remote sensing. One of them is dedicated to the use of remote sensing for crop area estimation (GEOSS, 2009); it gives a simplified summary of the methods and kinds of satellite data that can be used in operational projects and of those that still require research. Satellite data. Synthetic aperture radar (SAR) images cannot be considered for crop area estimation, except for...

The role of resilience in measuring vulnerability

In most studies of poverty, vulnerability indicators consider the probability distribution of household consumption as their objective (Dercon, 2001). They consider consumption as a stochastic variable, try to estimate the deterministic part of it through regression models, and then calculate the probability of falling below a certain threshold, usually the poverty line or a proxy for food security. Other theoretical studies consider vulnerability as a function of people's exposure to risks and...
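The calculation described above can be sketched in a few lines once a regression has supplied a predicted consumption level and a residual standard deviation for a household. The sketch assumes normally distributed residuals, which is an illustrative simplification (applied work often models log consumption instead); the function names and numbers are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def vulnerability(predicted_consumption, residual_sd, poverty_line):
    """Probability that consumption falls below the poverty line.

    Assumes consumption = predicted value + normal error with the
    given standard deviation.
    """
    return normal_cdf((poverty_line - predicted_consumption) / residual_sd)


# Household predicted at 120 with residual sd 20, poverty line at 100
p_poor = vulnerability(120, 20, 100)
```

A household predicted exactly at the poverty line has a probability of 0.5, and the probability falls as the predicted level rises above the line.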

The transect survey in LUCAS 2001-2003

In each PSU of the 2001-2003 sample, a transect was defined as the 1200 m line joining the five points located at the north of the PSU. The transect was surveyed recording the intersections with linear elements (hedges, stone walls, etc.) and changes of major land cover types. Estimating the total length of linear elements is an application of the classical Buffon's needle problem (Wood and Robertson, 1998). An unbiased estimate of the total length is where is the number of transects of length...
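The Buffon-type reasoning can be sketched numerically. For a transect of length l placed at random in a region of area A that contains linear elements of total length L, with uniformly distributed relative orientation, the expected number of intersections is 2 * l * L / (pi * A). Inverting this gives the estimator coded below. This is the classical form and is stated here as an assumption; the symbols, the function name and the figures are hypothetical and may not match the notation of the source.

```python
from math import pi


def estimate_total_length(intersections, transect_length, area):
    """Buffon-type estimate of the total length of linear elements.

    intersections    list with the number of crossings on each transect
    transect_length  length of every transect (same unit as the result)
    area             area of the region (squared unit of length)
    """
    n = len(intersections)
    mean_crossings = sum(intersections) / n
    return pi * area * mean_crossings / (2 * transect_length)


# Four hypothetical 1200 m transects in a region of 1 km^2
length_hat = estimate_total_length([2, 0, 1, 1], 1200, 1_000_000)
```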