The primary amino acid sequence of a protein determines its secondary and tertiary structure and, therefore, its biological activity. The amino acid sequence is usually deduced from the nucleotide sequence and subsequently confirmed by N-terminal sequencing and peptide mass fingerprinting. N-terminal sequencing is based on Edman degradation chemistry, which allows the order of amino acids at a protein's N-terminus to be confirmed. Usually up to 15 amino acids can be reliably obtained by N-terminal sequencing using a relatively small amount of protein. Several issues are associated with the detection of a protein's N-terminal sequence. Removal of the N-terminal methionine, catalyzed by methionine aminopeptidase, is by far the most common modification, occurring on the vast majority of proteins.24 Methionine excision occurs co-translationally, before completion of the nascent protein chain.
The N-terminal amino acid can also be modified covalently and thus be unavailable for sequencing. The most common type of covalent modification is acetylation catalyzed by N-terminal acetyltransferases.25 N-terminal acetylation is irreversible and occurs co-translationally on most eukaryotic proteins, but rarely on prokaryotic or archaebacterial proteins.
Finally, more than one sequence can be detected due to proteolytic activities released from plant cells during the purification procedure. Numerous endopeptidases responsible for the processing of seed storage proteins during the germination process are released into solution during the protein purification procedure26 and can contribute to the nonspecific cleavage of N-terminal amino acids. The absence of a few amino acids from the N-terminus of the protein usually has no effect on protein structure or activity and thus has no impact on the outcome of the safety evaluation.
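The comparison between an Edman-derived N-terminal sequence and the sequence deduced from the nucleotide sequence can be sketched as follows. This is a minimal illustration, not a validated procedure: the function name `n_terminal_offset`, the `max_missing` limit, and the use of `X` for an uncalled Edman cycle are all assumptions made for the example. An offset of 0 indicates an intact N-terminus, 1 is consistent with methionine excision, and larger values are consistent with loss of a few further residues.

```python
def n_terminal_offset(deduced, observed, max_missing=5):
    """Return how many N-terminal residues are absent from `observed`.

    `deduced`  : full sequence predicted from the nucleotide sequence.
    `observed` : residues read by N-terminal sequencing; "X" marks a
                 cycle that could not be called (assumed convention).
    Returns None if no alignment is found within `max_missing` residues.
    """
    for offset in range(max_missing + 1):
        window = deduced[offset:offset + len(observed)]
        # The window must be full length and agree at every called position.
        if len(window) == len(observed) and all(
            o == "X" or o == d for o, d in zip(observed, window)
        ):
            return offset
    return None
```

For a hypothetical deduced sequence `MTSEKLARVDPGHQW`, an observed read of `TSEKLARV` gives an offset of 1 (methionine removed), while `EKXARVDP` gives an offset of 3 with one uncalled position tolerated.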
Peptide mass fingerprinting is another analytical technique utilized for protein identification. The protein of interest is cleaved into peptides by proteases that recognize highly specific cleavage sites (e.g., trypsin). Every unique protein will have a unique set of peptides, and hence a corresponding set of peptide masses that can serve as a unique protein identifier. The absolute masses of the peptides are determined with matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) or electrospray ionization time-of-flight (ESI-TOF) and compared to the theoretical peptide masses generated from a protein or DNA database. Identification is accomplished by matching the observed peptide masses to the theoretical masses. To unequivocally identify a protein, a minimum of five masses is required27; however, a significantly larger number of peptides are usually identified for the protein of interest using this technique.
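The matching step described above can be illustrated with a short sketch. It assumes the common trypsin rule (cleavage after K or R, except before P), neutral monoisotopic masses, no missed cleavages, and no modifications. The function names and the 0.5 Da tolerance are assumptions chosen for illustration; only the minimum of five matching masses comes from the text.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per free peptide (N-terminal H, C-terminal OH)


def tryptic_digest(sequence):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide, in Da."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(observed, theoretical, tolerance=0.5):
    """Count observed masses lying within `tolerance` Da of a theoretical mass."""
    return sum(
        1 for obs in observed
        if any(abs(obs - theo) <= tolerance for theo in theoretical)
    )


def is_identified(observed, sequence, min_matches=5, tolerance=0.5):
    """Apply the minimum-of-five-masses criterion to one candidate sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return count_matches(observed, theoretical, tolerance) >= min_matches
```

In practice the observed values are ion masses (for example, singly protonated peptides in MALDI-TOF), so they would need to be converted to neutral masses, or the theoretical list adjusted, before matching.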
10.4.2 Tests to Confirm Equivalence of Protein Produced in Heterologous Systems versus Protein Expressed in Plants
To establish the equivalence of two proteins, their physico-chemical properties are compared. The purpose of this comparison is to demonstrate that the bacteria-produced protein is appropriately equivalent to the plant-expressed protein. Two proteins are usually compared using analytical methods that can detect differences in physico-chemical properties without fully characterizing each protein in absolute terms. Sets of data are evaluated against preset criteria to draw conclusions about protein equivalence. Typical parameters considered in demonstrating equivalence between a plant-produced protein and the same protein produced in bacteria include molecular weight, post-translational modifications (e.g., level of glycosylation), immunoequivalence, and functional activity.
For proteins, molecular weight is the physico-chemical parameter that is defined by protein covalent structure, post-translational modifications, and state of aggregation. It also provides information on potential truncation and/or fragmentation of the protein of interest due to proteolytic activities. The relative molecular weights of the protein produced in bacteria and the protein purified from the plant are usually compared by SDS-polyacrylamide gel electrophoresis (PAGE). The electrophoretic mobility of the two proteins is evaluated using SDS-polyacrylamide gels of an appropriate percentage, defined molecular weight markers, robust staining procedures, and densitometric analysis. Direct determination of the molecular weight of the two proteins is typically accomplished using MALDI-TOF or ESI-TOF mass spectrometry. Although mass spectrometry is an extremely valuable tool for determining protein masses, parameters such as purity of the protein preparation, protein charge, and size can affect the effectiveness of this technique in comparative protein characterization.
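One common way to turn electrophoretic mobility into an apparent molecular weight is to fit log10(MW) of the markers against their relative mobility (Rf) and interpolate the sample bands. The sketch below assumes that approximately linear relationship, which holds only over a limited range on a given gel. The text does not prescribe this calculation, and the function names are hypothetical.

```python
import math


def fit_log_mw(markers):
    """Least-squares fit of log10(MW) against relative mobility (Rf).

    `markers` is a list of (rf, mw_kda) pairs for the molecular weight
    standards run on the same gel. Returns (slope, intercept).
    """
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw(rf, slope, intercept):
    """Apparent molecular weight (kDa) for a band at relative mobility `rf`."""
    return 10 ** (slope * rf + intercept)
```

The same fitted line would be applied to the plant-derived and bacteria-derived bands on one gel, so that the two apparent molecular weights can be compared against whatever acceptance criterion was set in advance.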
Immunoreactivity of the protein with a protein-specific antibody is another parameter that depends on protein identity and on the presence and intactness of antibody-specific epitopes. The immunoreactivity of two proteins is typically compared by Western blot analysis using a protein-specific antibody. The conclusion of equal immunoreactivity is based on the demonstration of equal band intensities at the same apparent molecular weight on blot films.11-14 The conclusion about equal intensity is commonly based on densitometric analysis using software such as Quantity One® (Bio-Rad, Hercules, CA), which allows quantification of the signal produced.
Many eukaryotic proteins are post-translationally modified with carbohydrate moieties.28 In contrast, prokaryotic organisms such as E. coli lack the biochemical "machinery" required for protein glycosylation. Post-translational modifications such as glycosylation may have an impact on the protein's allergenic potential because large carbohydrate complexes may alter the epitope structure or introduce glycan epitopes, which have been found to be cross-reactive.29 Therefore, glycosylation analysis is usually used to determine whether the protein purified from the plant is post-translationally modified with covalently bound carbohydrate moieties. Carbohydrate detection is typically performed directly on the PVDF membrane or in gels containing both plant- and bacteria-produced proteins, together with naturally glycosylated proteins that are used as markers. The ultimate criterion for equivalence with respect to glycosylation is the absence of glycosylation on the protein purified from the plant.
Functional activity is a very important parameter in establishing protein equivalence. Only proteins that have the same covalent structure, identical secondary and tertiary fold, and similar post-translational modifications essential to the protein's mode of action will exhibit equivalent functional activity. The activity tests are protein-specific and as a rule are validated for their accuracy, precision, and robustness.
The goal of the bioinformatic analysis is to determine whether the primary amino acid sequence of the introduced protein shares homology with known toxins, allergens, and pharmacologically active or antinutritional proteins. The extent of homology between the introduced protein and sequences in these databases can be assessed using the FASTA30 and BLAST31 sequence alignment tools, which use various scoring matrices to compare levels of homology. The alignment data may be used to infer similarity in higher-order structures. Proteins that share a high degree of similarity throughout the entire length of their amino acid sequence are often homologous. Homologous proteins share similar secondary and tertiary structure, a common three-dimensional fold, and related functional activity.32 Consequently, homologous proteins can potentially cross-react with IgE antibodies responsible for allergenic reactions to food. Although the criteria applied to bioinformatic searches for allergenicity assessment are relatively well established (for details, see Chapter 8), there are no specific guidelines for bioinformatic searches aimed at evaluating protein similarity to toxins and pharmacologically active proteins.
To determine whether the introduced protein has homology to any known toxin, it would usually be compared to all proteins in publicly available databases (e.g., SWISSPROT) that have the word "toxic" in their description. This is a rather conservative approach, since all protein sequences found in any toxic organism would fall into this category; it can therefore produce a large number of false positives, which need to be sorted out by thorough examination of each positive hit. The most reliable approach to evaluating protein homology is to assess the percent identity shared by the protein sequences. At 25% sequence identity, proteins may belong to the same functional class, whereas sequence identity of at least 40% is required for proteins to have exactly the same function.33
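The percent-identity assessment can be sketched for two sequences that have already been aligned (for example, by FASTA or BLAST) and padded with `-` for gaps. Definitions of percent identity differ in the choice of denominator; this sketch assumes identical positions divided by the number of alignment columns in which at least one sequence has a residue. The 25% and 40% levels are those quoted in the text, and the function names and labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as "-". Columns in which both sequences have a gap
    are ignored (an assumed convention; other denominators are in use).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(aligned_a, aligned_b))
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    columns = sum(1 for a, b in pairs if not (a == "-" and b == "-"))
    return 100.0 * matches / columns if columns else 0.0


def interpret_identity(pid):
    """Map a percent identity onto the two levels quoted in the text."""
    if pid >= 40.0:
        # Necessary for identical function according to the text, not sufficient.
        return "meets the 40% level required for identical function"
    if pid >= 25.0:
        return "may share functional class"
    return "below 25%: no functional inference"
```

For instance, the aligned pair `ACDE-FG` and `ACDQWFG` shares 5 identical residues over 7 columns. A hit that meets the higher level still calls for the hit-by-hit examination described above before any conclusion is drawn.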