Pattern Recognition

Pattern recognition (PR) is the process of classifying an object by analyzing the numerical data that characterize it. PR has been applied to classify objects of interest in a range of fields, including image processing, medical engineering, criminology, speech recognition, and signature identification (Duda et al., 2001). However, PR techniques have not been exploited for drought prediction. The process of PR begins with the identification of a variable that can be used to define the object under study. In the case of agricultural drought, the object can be the yield of a major crop in the region.

Various pattern recognition techniques are available in the literature (Jain and Flynn, 1993; Duda et al., 2001), but only a few are relevant to agricultural drought. First, potential variables that affect drought are derived from weather data and satellite data. Examples include average monthly temperature, total monthly precipitation, and satellite-based variables computed for the growing season. V. K. Boken, in a recently concluded analysis, derived 32 such variables to develop pattern recognition models for predicting drought in selected crop districts in Saskatchewan, Canada. Both two-variable and multiple-variable cases were considered. In the two-variable case, an error-correction (EC) procedure (Kumar et al., 1998; Duda et al., 2001) was applied. In the multiple-variable case, both a linear technique (linear discriminant analysis) and a nonlinear technique (nearest neighbor analysis) were attempted using SAS software (SAS Institute Inc., Cary, North Carolina, United States).

In the EC procedure, two variables were selected at a time to check whether a solution vector exists that linearly separates drought and nondrought events. An iterative procedure was applied using a computer program, but no solution vector was found. This underscores the complexity involved in the analysis of agricultural drought. The multiple-variable case was therefore investigated, and a subset of significant (most suitable) variables was determined for each crop district using the STEPDISC procedure of SAS software. Using these significant variables, the linear and nonlinear techniques were applied to develop models for classifying an event as drought or nondrought.
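The search for a solution vector in the two-variable case can be illustrated with a fixed-increment error-correction (perceptron) rule. The following is only a minimal sketch under stated assumptions, not the program used in the study: the function name `error_correction`, the learning rate, the epoch limit, and all sample values are hypothetical.

```python
def error_correction(samples, labels, max_epochs=1000, rate=1.0):
    """Fixed-increment error-correction (perceptron) rule for two variables.

    samples: list of (x1, x2) pairs.
    labels:  +1 for drought, -1 for nondrought (an assumed encoding).
    Returns a weight vector [w0, w1, w2] that linearly separates the two
    classes, or None if none is found within max_epochs passes.
    """
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), y in zip(samples, labels):
            a = (1.0, x1, x2)  # augmented vector: constant term plus the two variables
            g = sum(wi * ai for wi, ai in zip(w, a))
            if y * g <= 0:  # misclassified (or on the boundary): correct the weights
                w = [wi + rate * y * ai for wi, ai in zip(w, a)]
                errors += 1
        if errors == 0:  # a full pass with no corrections: solution vector found
            return w
    return None


# Hypothetical, linearly separable events.
separable = [(1, 1), (2, 1), (4, 5), (5, 4)]
separable_labels = [1, 1, -1, -1]
solution = error_correction(separable, separable_labels)

# Hypothetical events that no straight line can separate (XOR layout).
mixed = [(0, 0), (1, 1), (0, 1), (1, 0)]
mixed_labels = [1, 1, -1, -1]
no_solution = error_correction(mixed, mixed_labels, max_epochs=200)
```

When the two classes overlap in the chosen pair of variables, the loop never completes a pass without a correction and the function returns `None`, which corresponds to the "no solution vector" outcome reported above.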

Linear Discriminant and Nearest Neighbor Analysis

Applying the linear discriminant technique to the subset of significant variables yielded a function that can be used to classify a set of values of these variables as drought or nondrought. To develop such a function, the whole data set for multiple years was used: a training set was used to develop the linear discriminant function (LDF), and a testing set was used to test the classification performance of the LDF. Linear discriminant analysis assumes that the within-category distribution is normal. A nonparametric technique (nearest neighbor) was also attempted. With this technique, a set of variable values is classified as drought or nondrought based on the category of its neighboring sets, and the normality assumption is not required. An accuracy of up to 83% in classifying or predicting drought was obtained.
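The two classifiers described above can be sketched for two variables in plain Python. This is an illustrative sketch only, not the SAS analysis used in the study: the function names, the equal-prior assumption, the choice of a single nearest neighbor, and all data values (precipitation in mm, temperature in degrees C) are hypothetical.

```python
import math


def mean(rows):
    """Column-wise mean of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def fit_ldf(drought, nondrought):
    """Two-class, two-variable linear discriminant with pooled covariance.

    Returns (w, c) such that w.x + c > 0 assigns x to the drought class.
    Equal prior probabilities are assumed.
    """
    ma, mb = mean(drought), mean(nondrought)
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-class scatter matrix
    for rows, m in ((drought, ma), (nondrought, mb)):
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(drought) + len(nondrought) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = (ma[0] - mb[0], ma[1] - mb[1])
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = ((ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2)
    c = -(w[0] * mid[0] + w[1] * mid[1])  # boundary passes through the midpoint
    return w, c


def ldf_predict(model, x):
    """Classify x with the fitted linear discriminant function."""
    w, c = model
    return "drought" if w[0] * x[0] + w[1] * x[1] + c > 0 else "nondrought"


def nn_predict(train, x):
    """1-nearest-neighbor: return the label of the closest training event."""
    return min(train, key=lambda item: math.dist(item[0], x))[1]


# Hypothetical training set: (growing-season precipitation, mean temperature).
drought_years = [(20, 22), (25, 21), (18, 23), (30, 22)]
normal_years = [(60, 18), (70, 17), (65, 19), (55, 18)]
model = fit_ldf(drought_years, normal_years)
train = ([(x, "drought") for x in drought_years]
         + [(x, "nondrought") for x in normal_years])

# Hypothetical testing set, held out from model development.
test_events = [((22, 22), "drought"), ((62, 18), "nondrought")]
ldf_hits = sum(ldf_predict(model, x) == y for x, y in test_events)
nn_hits = sum(nn_predict(train, x) == y for x, y in test_events)
```

The nearest neighbor rule here uses raw Euclidean distance, so variables measured on very different scales would normally be standardized first; that step is omitted to keep the sketch short.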
