Because potential allergens cannot at present be accurately identified based on a single characteristic, the allergy assessment testing strategy, as originally proposed by the U.S. Food and Drug Administration (FDA) in 1992 and further modified by the Food and Agriculture Organization of the United Nations and World Health Organization (FAO/WHO) and the U.S. Codex Office, Food Safety and Inspection Service, U.S. Department of Agriculture (Codex) scientific panels,15-18 recommends that all proteins introduced into crops be assessed for their similarity to a variety of structural and biochemical characteristics of known allergens. Since the primary method of disease management for food-allergic people is avoidance, a core principle of these recommended strategies is to experimentally determine whether candidate proteins for genetic engineering into foods represent potential food allergens. A multilevel, weight-of-evidence approach to the allergy assessment of foods derived from biotechnology crops takes into account the following information: bioinformatics searches, in vitro digestibility assays, and IgE binding, if appropriate. Additional methods are under consideration and are described below.
The bioinformatics search process is a series of alignments at the amino acid level between a protein of interest (query sequence) and a large pool of amino acid sequences from proteins contained in public databases. The purpose of these analyses is to describe the biological and taxonomical relatedness of the query sequence to other functionally related proteins. In the context of allergy, the goal is to identify the level of amino acid similarity and structural relatedness between a protein of interest and sequences from known allergens. Sequences are aligned in a linear fashion in an attempt to describe the highest level of exact matching or similar amino acid residues between two sequences. Higher order structure may be inferred between two proteins by comparing levels of linear homology.19 The more closely related a query sequence is to an allergen, the higher the likelihood that the two proteins may share similar functions. Allergic potential may be inferred for a novel or transgenic protein sequence if there exists significant similarity of amino acid residues with a well described allergen.20 This bioinformatics approach forms a critical part of the multistep procedure in assessing the safety of biotechnology food proteins. Bioinformatic searches are an important first step in safety assessments of genetically modified (GM) foods so that known protein allergens or other significantly related proteins are avoided during the biotechnology development process.
A bioinformatic sequence search against a large inclusive database, such as the SWISSPROT protein database, can be accomplished with an identity/similarity comparison algorithm, such as FASTA.21 A broad search can be viewed as an initial strategy that provides identity for a query sequence. Sequences from the public databases that have high levels of similarity with a query sequence can indicate the protein family as well as discrete levels of taxonomic relatedness. However, the sequences in public databases are not necessarily peer-reviewed and often do not represent intact proteins; thus, the search results require careful review.
A more refined and informative allergy-based search strategy can be performed with the same match comparison programs by searching against a database containing selected allergens such as those at the online sources of www.allergenonline.com and www.allergome.org.222 The goal of curated allergen databases is to include only sequences that have supporting documentation as to their clinical relevance as allergens. High-percentage identity matches between database sequences and a query sequence would suggest a probability that the query sequence could crossreact with IgE directed against that allergen. To distinguish among many matches, criteria can be used to judge the ranked scores produced by programs such as FASTA. For example, the most recent scientific panel (Codex Alimentarius23) recommended an identity of at least 35% of amino acid residues over a segment of at least 80 residues as the lowest identity criterion for proteins derived from biotechnology that could suggest IgE crossreactivity with a known allergen.
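The identity criterion described above can be sketched as a simple filter over alignment results. This is a minimal illustration, not the output format of FASTA or of any allergen database: the `AlignmentHit` record, its field names, and the example values are hypothetical, and the thresholds are taken as stated in the text (at least 35% identity over at least 80 residues).

```python
from dataclasses import dataclass


@dataclass
class AlignmentHit:
    """Hypothetical summary of one query-versus-allergen alignment."""
    allergen_id: str
    identity_pct: float  # percent identical residues within the aligned segment
    overlap_len: int     # length of the aligned segment, in residues


def meets_identity_criterion(hit, min_identity=35.0, min_overlap=80):
    """Return True if the hit reaches both thresholds quoted in the text."""
    return hit.overlap_len >= min_overlap and hit.identity_pct >= min_identity


def flag_hits(hits):
    """Keep only the hits that would warrant follow-up under the criterion."""
    return [hit for hit in hits if meets_identity_criterion(hit)]


if __name__ == "__main__":
    # Invented values, for illustration only.
    example_hits = [
        AlignmentHit("toy_allergen_1", identity_pct=41.2, overlap_len=120),
        AlignmentHit("toy_allergen_2", identity_pct=62.0, overlap_len=40),
        AlignmentHit("toy_allergen_3", identity_pct=28.5, overlap_len=210),
    ]
    for hit in flag_hits(example_hits):
        print(hit.allergen_id)
```

Both conditions must hold together: a short stretch of high identity, or a long alignment of low identity, does not meet this particular criterion on its own.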
The quality of sequence alignments that are detected between a query protein and an allergen can also be evaluated. The E-score (expectation score) is a statistical measure of the likelihood that the observed similarity score could have occurred by chance in a search. A larger E-score indicates a lower degree of similarity between the query sequence and the sequence from the database. Typically, alignments between two sequences will need to have an E-score of 1 × 10⁻⁵ or smaller to be considered to have significant homology. E-scores of ~1 are expected to occur for alignments between random, nonhomologous sequences.15
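As a small illustration of how this threshold might be applied, the sketch below ranks hits by E-score and labels each one against the 1 × 10⁻⁵ cutoff given in the text. The function names and the (identifier, E-score) tuple layout are assumptions made for this example; the E-scores themselves would come from the search program and are not computed here.

```python
SIGNIFICANCE_CUTOFF = 1e-5  # threshold quoted in the text


def is_significant(e_score, cutoff=SIGNIFICANCE_CUTOFF):
    """Return True when the E-score is at or below the cutoff."""
    if e_score < 0:
        raise ValueError("E-scores cannot be negative")
    return e_score <= cutoff


def rank_hits(hits):
    """Sort (allergen_id, e_score) pairs from smallest to largest E-score.

    Each returned tuple gains a third element: whether the hit passes the cutoff.
    """
    ranked = sorted(hits, key=lambda hit: hit[1])
    return [(name, e_score, is_significant(e_score)) for name, e_score in ranked]


if __name__ == "__main__":
    # Invented values, for illustration only.
    for row in rank_hits([("toy_A", 0.9), ("toy_B", 3e-12), ("toy_C", 2e-4)]):
        print(row)
```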
An additional bioinformatics approach can be taken by searching for 100% identity matches along short sequences contained in the query sequence as they are compared to sequences in a database. A short amino acid sequence search (sliding search window), if compared along the whole length of the query sequence in an overlapping fashion, is intended to represent the smallest sequence that could function as an IgE-binding epitope.3,24 If any exact matches between a known allergen and a transgenic sequence were found using this strategy, it could represent the most conservative approach to predicting potential for a peptide fragment to act as an allergen. Additional IgE binding studies could be conducted to determine whether this homology represented a biologically relevant homology in terms of allergy if appropriate patients and their sera were identified for collection and testing.
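The sliding-window search can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used by any allergen database: the function names are hypothetical, the sequences are invented toy strings rather than real allergens, and the default window of eight residues is only an illustrative choice.

```python
def windows(sequence, size):
    """Return the set of all overlapping peptides of the given size."""
    return {sequence[i:i + size] for i in range(len(sequence) - size + 1)}


def exact_window_matches(query, allergens, window=8):
    """Find 100% identity matches between query windows and allergen sequences.

    `allergens` maps an identifier to an amino acid sequence.
    Returns a list of (allergen_id, query_position, peptide) tuples,
    where query_position is the 0-based start of the window in the query.
    """
    query = query.upper()
    matches = []
    for allergen_id, allergen_seq in allergens.items():
        allergen_windows = windows(allergen_seq.upper(), window)
        # Slide the window along the whole query in overlapping steps of one.
        for start in range(len(query) - window + 1):
            peptide = query[start:start + window]
            if peptide in allergen_windows:
                matches.append((allergen_id, start, peptide))
    return matches


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_query = "MKTAYIAKQRQISFVKSHFSRQ"
    toy_allergens = {"toy_A": "GGGAKQRQISFGGG", "toy_B": "PPPPPPPPPPPP"}
    print(exact_window_matches(toy_query, toy_allergens))
```

Storing the allergen peptides in a set keeps each lookup cheap, so the cost grows roughly with the length of the query plus the total length of the allergen sequences.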
Critical to this type of search algorithm is the selection of the overlapping sequence length. The shorter the window of amino acids, the greater the chance of random, false positive matches. Although different window lengths have been recommended, a length of eight amino acids has been shown to be informative without acquiring a majority of matches against irrelevant sequences.25-27 To improve epitope sequence matching, a database of confirmed IgE-binding sequential epitopes needs to be expanded for existing allergens because many allergens that bind IgE in patient sera and are known to cause clinical allergy symptoms do not have B- and T-cell epitopes described for them in the scientific literature.24
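A rough back-of-the-envelope calculation shows why shorter windows produce more chance matches. The sketch below assumes, purely for illustration, that all 20 amino acids occur independently and with equal frequency, and that the database can be treated as one long sequence; real proteins violate both assumptions, so the numbers indicate only the trend. The function name and the example sizes are hypothetical.

```python
def expected_random_matches(query_len, db_residues, window, alphabet=20):
    """Approximate number of exact window matches expected by chance alone.

    Assumes uniform, independent residue frequencies (a simplification) and
    treats the database as a single concatenated sequence.
    """
    query_windows = max(query_len - window + 1, 0)
    db_windows = max(db_residues - window + 1, 0)
    # Two random peptides of this length are identical with probability 20^-window.
    return query_windows * db_windows / alphabet ** window


if __name__ == "__main__":
    # Hypothetical sizes: a 300-residue query and 500,000 database residues.
    for window in (6, 7, 8):
        estimate = expected_random_matches(300, 500_000, window)
        print(f"window={window}: about {estimate:.4f} chance matches expected")
```

Under these assumptions, each residue added to the window reduces the expected number of chance matches by roughly a factor of 20.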
At this time there is no database of epitope sequences which can fully describe epitopes for all of the protein allergens. In addition, the variability in epitope length for existing allergen epitopes makes assessment of biotechnology food protein sequences with an epitope database impractical at this time, and such assessment is not recommended as a safety assessment strategy.27 Thus, further research regarding epitope identity and sequence length is required in order to make short amino acid search strategies informative beyond the theoretical identity matching strategy currently available.27,28 Moreover, it has to be noted that many IgE-binding epitopes are conformational. The analysis of conformational IgE epitopes is difficult and involves methods such as site-directed mutagenesis of the full length allergen,29 mimicking conformational IgE-binding sites by short phage-displayed peptides,30 or even structural analysis of allergen immune complexes.31