Protein Structure

The structures that proteins can assume following their synthesis in the cell are described at four levels: primary, secondary, tertiary, and quaternary. The amino acid sequence of a protein (primary structure) determines its capacity to fold into the specific three-dimensional conformations that give the protein its unique structural and functional properties. Primary structure also includes the covalent disulfide bonds formed between the sulfhydryl groups of cysteine residues, each linked pair of cysteines forming a cystine (Figure 1.9). These bonds can form between cysteines on the same polypeptide chain, or between cysteines on different polypeptide chains to form multisubunit protein complexes. Disulfide bonds do not change the conformation of the protein, but do stabilize it.1

FIGURE 1.9 Disulfide bonds. This diagram illustrates how covalent disulfide bonds form between adjacent cysteine side chains. As indicated, these cross-linkages can join either two parts of the same polypeptide chain or two different polypeptide chains. Since the energy required to break one covalent bond is much larger than the energy required to break even a whole set of noncovalent bonds, a disulfide bond can have a major stabilizing effect on a protein.

FIGURE 1.10 Proteolytic cleavage in insulin assembly. The polypeptide hormone insulin cannot spontaneously re-form efficiently if its disulfide bonds are disrupted. It is synthesized as a larger protein (proinsulin) that is cleaved by a proteolytic enzyme after the protein chain has folded into a specific shape. Excision of part of the proinsulin polypeptide chain removes some of the information needed for the protein to fold spontaneously into its normal conformation once it has been denatured and its two polypeptide chains separated. Panel labels: proinsulin; connecting peptide removed, leaving complete two-chain insulin molecule; reduction irreversibly separates the two chains.

Understanding the primary structure of a protein, such as the hormone insulin, provides insight into how it is converted to its biologically active form after synthesis in the pancreas. Insulin is produced in pancreatic islet cells as a single-chain, inactive precursor, proinsulin, with the primary structure shown in Figure 1.10. The polypeptide chain contains 86 amino acids and three intrachain cystine disulfide bonds. It is transformed into biologically active insulin by proteolytic cleavage prior to its secretion from islet cells. Proteases present in the islet cells cleave two peptide bonds in proinsulin, one between amino acid residues 30 and 31 and one between residues 65 and 66. This releases a 35-amino acid segment (the C-peptide) and insulin, which consists of two polypeptide chains (A and B) of 21 and 30 amino acids, respectively, covalently joined by the same disulfide bonds present in proinsulin. The active form of insulin is then released into the circulation.
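The residue counts quoted for proinsulin processing can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the chain length and cleavage positions given in the text, and it assumes the usual B chain, C-peptide, A chain order from the amino terminus. The function names are hypothetical.

```python
# Numbers taken from the text: proinsulin has 86 residues and is cleaved
# between residues 30|31 and 65|66.
PROINSULIN_LENGTH = 86
CLEAVAGE_AFTER = (30, 65)


def cleave(length, cut_after):
    """Return (first, last) residue numbers of each fragment, 1-based inclusive."""
    fragments = []
    start = 1
    for cut in sorted(cut_after):
        fragments.append((start, cut))
        start = cut + 1
    fragments.append((start, length))
    return fragments


def fragment_lengths(fragments):
    """Number of residues in each (first, last) fragment."""
    return [last - first + 1 for first, last in fragments]


fragments = cleave(PROINSULIN_LENGTH, CLEAVAGE_AFTER)
# Assumed order along the precursor: B chain, C-peptide, A chain.
names = ["B chain", "C-peptide", "A chain"]
for name, (first, last) in zip(names, fragments):
    print(f"{name}: residues {first}-{last} ({last - first + 1} amino acids)")
```

The three fragment lengths (30, 35, and 21) add up to the 86 residues of the precursor, which is a quick consistency check on the positions quoted above.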

FIGURE 1.11 How a protein folds into a compact conformation. The polar amino acid side chains tend to gather on the outside of the protein, where they can interact with water; the nonpolar amino acid side chains are buried on the inside to form a tightly packed hydrophobic core of atoms that are hidden from water. In this schematic drawing, the protein contains only about 30 amino acids. Panel labels: unfolded polypeptide (polar side chains, nonpolar side chains); folded conformation in aqueous environment (hydrophobic core region contains nonpolar side chains; polar side chains on the outside of the molecule can form hydrogen bonds to water).

When the protein assumes its unique conformation in the cell following its synthesis on the ribosome, the nonpolar hydrophobic side chains on the amino acids tend to localize in the interior of the protein, away from the water interface. The polar side chains of amino acids that can be ionized in water are localized on the outside of the protein, where they are stabilized through interactions with water molecules (Figure 1.11).
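One rough way to make the polar versus nonpolar distinction concrete is to score a stretch of sequence with a hydropathy scale. The sketch below uses the Kyte-Doolittle index, which is not mentioned in the text and is brought in here purely for illustration; the two example peptides are made up. A positive average only suggests a segment that would tend to be buried, it does not predict a fold.

```python
# Kyte-Doolittle hydropathy index (positive = more hydrophobic).
# This scale is an outside assumption, not part of the chapter.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a peptide written in one-letter amino acid code."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence.upper()) / len(sequence)


# Hypothetical peptides chosen only to contrast the two behaviours.
nonpolar_segment = "LIVFAL"
charged_segment = "KDEERK"

for peptide in (nonpolar_segment, charged_segment):
    score = mean_hydropathy(peptide)
    tendency = "interior" if score > 0 else "surface"
    print(f"{peptide}: mean hydropathy {score:+.2f}, tends toward the {tendency}")
```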

The next level of organization is secondary structure. This includes certain folding patterns or conformations that many proteins assume, such as the α-helix, which is found in globular proteins such as myoglobin and in cell membrane proteins such as transporters and receptors.1 Another folded conformation observed in many proteins is the β-sheet, which is found in immunoglobulins that provide protection against pathogenic viruses and bacteria (Figure 1.12). Some enzymes (e.g., lactate dehydrogenase) and fibronectin (involved in cell adhesion) also contain significant amounts of β-sheet.1 Fibrous proteins, including collagen, elastin, and α-keratin, which is found in nails and hair (Figure 1.12), characteristically contain larger amounts of regular secondary structure and have a long cylindrical (rodlike) shape and low water solubility. They generally play a structural role in the cell. Collagen is present in all mammalian tissues and organs, where it provides the framework that gives the tissues their form and structural strength. As a major component of skin and bone, collagen is the most abundant protein in mammals, comprising 25% of the total protein mass.1 Its secondary structure includes large amounts of triple helix, whereas elastin, which gives tissues such as skin, blood vessels, and the lung their elasticity, has a random coil structure.
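The hydrogen-bonding rule that defines the α-helix (each backbone C=O paired with the N-H four residues further along the same chain, as the Figure 1.12 caption describes) is simple enough to enumerate. The sketch below lists those pairs for a helix of a given length; the function name and the idealized treatment of the helix ends are assumptions made for illustration.

```python
def helix_hbonds(n_residues, offset=4):
    """List (i, i + offset) pairs for an idealized helix of n_residues.

    In each pair, the backbone C=O of residue i accepts a hydrogen bond
    from the backbone N-H of residue i + offset (residues numbered from 1).
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


pairs = helix_hbonds(10)
for acceptor, donor in pairs:
    print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

A segment shorter than five residues yields no pairs, which is one way to see why very short stretches cannot form a hydrogen-bonded helical turn of this kind.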

FIGURE 1.12 The regular conformation of the polypeptide backbone observed in the α-helix and the β-sheet. (A, B, and C) The α-helix. The N-H of every peptide bond is hydrogen-bonded to the C=O of a neighboring peptide bond located four peptide bonds away in the same chain. (D, E, and F) The β-sheet. In this example, adjacent peptide chains run in opposite (antiparallel) directions. The individual polypeptide chains (strands) in a β-sheet are held together by hydrogen-bonding between peptide bonds in different strands, and the amino acid side chains in each strand alternately project above and below the plane of the sheet. (A) and (D) show all the atoms in the polypeptide backbone, but the amino acid side chains are truncated and denoted by R. In contrast, (B) and (E) show the backbone atoms only, while (C) and (F) display the shorthand symbols that are used to represent the α-helix and the β-sheet in ribbon drawings of proteins. Atom labels in the drawing: carbon, nitrogen, oxygen, hydrogen, and hydrogen bond (H-bond).

Tertiary structure refers to the three-dimensional structure of the polypeptide. It includes the conformational relationships in space of the side chains and the geometric relationship between distant regions of the polypeptide chain. Proteins that function as enzymes have one or more catalytic sites that bind the substrate to catalyze its chemical transformation into a product (Figure 1.13). Although the amino acids that form the catalytic site may be widely separated in the primary structure of the protein, the tertiary structure brings them together in space (Figure 1.14). An example is chymotrypsin, a serine protease made up of 245 amino acids that is produced in the pancreas and released into the intestinal tract to degrade ingested proteins. The functional groups that form the catalytic site of chymotrypsin include: (1) the hydroxymethyl group of serine (position 195 of the primary structure); (2) the imidazole of histidine (position 57); and (3) the side chain carboxylate of aspartate (position 102) (Figure 1.15).1

FIGURE 1.13 How enzymes work. Each enzyme has an active site to which one or two substrate molecules bind, forming an enzyme-substrate complex. A reaction occurs at the active site, producing an enzyme-product complex. The product is then released, allowing the enzyme to bind additional substrate molecules. Diagram labels, in order: molecule A (substrate); enzyme-substrate complex; enzyme-product complex; molecule B (product).

FIGURE 1.14 The binding site of a protein. (A) The folding of the polypeptide chain typically creates a crevice or cavity on the protein surface. This crevice contains a set of amino acid side chains disposed in such a way that they can make noncovalent bonds only with certain ligands. (B) A close-up of an actual binding site showing the hydrogen bonds and ionic interactions formed between a protein and its ligand (in this example, cyclic AMP is the bound ligand). Diagram labels: amino acid side chains; unfolded protein; folding; binding site; folded protein; hydrogen bond.

FIGURE 1.15 An unusually reactive amino acid at the active site of an enzyme. This example is the "catalytic triad" found in chymotrypsin, elastase, and other serine proteases. The aspartic acid side chain (Asp 102) induces the histidine (His 57) to remove the proton from serine 195. This activates the serine to form a covalent bond with the enzyme substrate, hydrolyzing a peptide bond. Diagram label: hydrogen bond rearrangements.

Quaternary structure refers to the arrangement of individual protein subunits in multisubunit protein complexes, in which the subunits interact to provide the protein's function. For example, hemoglobin (which transports oxygen in red blood cells) contains two α-globin and two β-globin subunits. Each subunit contains an oxygen-binding site that interacts cooperatively with those on the other subunits to bind oxygen and release it from the red blood cell to body tissues (Figure 1.16). Not all proteins have a quaternary structure.1
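Cooperative binding is often summarized with the empirical Hill equation, which the text does not introduce; it is used here only as a rough illustration of what "cooperatively" means in practice. The half-saturation pressure (26 mmHg) and Hill coefficient (2.8) below are commonly quoted approximate values for adult hemoglobin and should be read as assumptions, as should the function name.

```python
def fractional_saturation(p_o2, p50=26.0, hill_n=2.8):
    """Hill-equation estimate of the fraction of oxygen-binding sites occupied.

    p_o2   -- oxygen partial pressure in mmHg
    p50    -- pressure at half saturation (assumed value)
    hill_n -- Hill coefficient; 1.0 models independent, non-cooperative sites
    """
    if p_o2 < 0:
        raise ValueError("partial pressure must be non-negative")
    return p_o2 ** hill_n / (p50 ** hill_n + p_o2 ** hill_n)


# Compare cooperative binding with hypothetical independent sites.
for p in (20, 40, 100):
    cooperative = fractional_saturation(p)
    independent = fractional_saturation(p, hill_n=1.0)
    print(f"pO2 = {p:3d} mmHg: cooperative {cooperative:.2f}, "
          f"independent sites {independent:.2f}")
```

With a coefficient above 1, the curve is steeper around the half-saturation pressure, so the model protein loads more fully at high oxygen pressure and unloads more fully at low pressure than it would with independent sites.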

FIGURE 1.16 A protein formed as a symmetric assembly of two different subunits. Hemoglobin is an abundant protein in red blood cells that contains two copies of α-globin and two copies of β-globin. Each of these four polypeptide chains contains a heme molecule, which is the site where oxygen (O2) is bound. Thus, each molecule of hemoglobin in the blood carries four molecules of oxygen.

FIGURE 1.17 Protein formed from four domains. In the Src protein shown, two of the domains form a protein kinase enzyme, while the SH2 and SH3 domains perform regulatory functions. A ribbon model, with ATP substrate. Domain labels: SH3 domain; SH2 domain; small kinase domain; large kinase domain.

Large proteins consist of several distinct protein domains, structural units that fold more or less independently of each other. A domain typically contains between 40 and 350 amino acids, and larger proteins may be composed of several domains (Figure 1.17). Domains can impart different biochemical functions to the same protein.1 Protein domains are classified by class, fold, and family. The class is determined by the predominant type of secondary structure present. Some classes possess mainly α-helical structure, others primarily β-sheet, and some possess approximately equal amounts of α-helix and β-sheet. The fold classification is determined by the arrangement of secondary structure elements within the domain. The family classification is determined by the amino acid sequence identity between proteins. Proteins that are members of the same family have a common evolutionary relationship, as they are derived from the same primordial gene. Proteins of the same family have the same folding pattern and often have similar functions across species. Many large proteins have evolved by the joining of preexisting domains in new combinations, an evolutionary process called domain shuffling (Figure 1.18).1
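Because family membership is described above in terms of amino acid sequence identity, it may help to see how an identity figure is computed. The sketch below assumes the two sequences have already been aligned to the same length, with "-" marking gaps, and counts identical residues over the positions where neither sequence has a gap. Other denominators are in use, so this is one convention rather than a standard; the example strings are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Hypothetical aligned fragments, one substitution in eight positions.
seq_1 = "GDSGGPLV"
seq_2 = "GDSGGPVV"
print(f"{percent_identity(seq_1, seq_2):.1f}% identical")
```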

FIGURE 1.18 Domain shuffling. An extensive shuffling of blocks of protein sequence (protein domains) has occurred during protein evolution. Those portions of a protein denoted by the same shape and shading in this diagram are evolutionarily related. Serine proteases like chymotrypsin are formed from two domains. In the three other proteases shown, which are highly regulated and more specialized, these two protease domains are connected to one or more domains homologous to domains found in epidermal growth factor, to a calcium-binding protein (triangle), or to a "kringle" domain (box) that contains three internal disulfide bridges. Diagram labels: each chain is drawn from its amino terminus (H2N) to its carboxyl terminus (COOH); labeled proteins include chymotrypsin and urokinase.

During the course of protein evolution, changes in the amino acid content can occur due to spontaneous mutations in the DNA codons. Changes in amino acids may alter the noncovalent interactions between amino acids in a protein, altering its tertiary structure. If the amino acid that is changed is "essential" to the structural stability of the protein conformation, then the protein function may be significantly impaired or lost. A classic example of such a change is the substitution of valine for glutamate in the β-globin chain of hemoglobin. The substitution of a nonpolar amino acid (valine) for a polar amino acid (glutamate) changes the hydrophobic interactions, leading to aggregation of the hemoglobin molecules. The aggregates precipitate in the red blood cells, changing the shape of the cells to a "sickle." The sickle-shaped red blood cells hemolyze more readily (sickle cell anemia) and, because of their decreased elasticity and misshapen form, can clog small capillaries.6 The disease is manifest in persons who are homozygous for this trait. Although this mutation would normally be selected against because it causes death in homozygous carriers, heterozygous carriers of the sickle cell trait in parts of Africa are protected: they do not develop sickle cell anemia, and malarial parasites grow poorly in the red blood cells of humans who carry the trait.6

Certain positions in the amino acid sequence of proteins found in mammals are observed to vary across diverse populations. When such a variation arises from a change in a single nucleotide of the DNA codon, it is termed a single nucleotide polymorphism (SNP). SNPs may sometimes provide insight into why individuals with the same disease respond differently to therapeutic treatment.6
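A single-nucleotide change in a codon can be enough to swap one amino acid for another, as in the glutamate to valine substitution of sickle cell hemoglobin. The sketch below compares two codons position by position. The specific codons (GAG for glutamate, GTG for valine) are the ones usually cited for this substitution but are not given in the text, so treat them as an assumption; the lookup table is deliberately partial and the function name is hypothetical.

```python
# Partial codon table (DNA coding strand): glutamate and valine codons only.
CODON_TO_AMINO_ACID = {
    "GAA": "Glu", "GAG": "Glu",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
}


def nucleotide_differences(codon_a, codon_b):
    """Return (position, base_a, base_b) for each mismatch, positions from 1."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must be three nucleotides long")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(codon_a, codon_b)) if a != b]


normal_codon = "GAG"   # assumed codon for the normal glutamate
variant_codon = "GTG"  # assumed codon for the valine variant

changes = nucleotide_differences(normal_codon, variant_codon)
print(f"{normal_codon} ({CODON_TO_AMINO_ACID[normal_codon]}) -> "
      f"{variant_codon} ({CODON_TO_AMINO_ACID[variant_codon]})")
for position, old, new in changes:
    print(f"position {position}: {old} -> {new}")
```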

There are many more examples of changes in the amino acid content of proteins that have no impact because the affected residues are not essential to maintaining structural integrity. During the course of protein evolution, the amino acid content of some proteins has changed considerably across species, yet the tertiary structure has remained very similar and the proteins have related biochemical functions. For example, the large family of serine proteases, such as the digestive enzymes chymotrypsin, trypsin, and elastase, have similar amino acid sequences in the regions of the protein involved in protease activity. In other "nonessential" regions of the protease structure, significant differences in amino acid content exist. When the tertiary structures of the catalytic portions of the enzymes are compared, considerable similarity across serine proteases is observed (Figure 1.19). However, the serine proteases may differ in specificity with respect to the peptide bonds they cleave in proteins.1

FIGURE 1.19 The conformations of two serine proteases compared. The backbone conformation of elastase and chymotrypsin. Although only those amino acids in the polypeptide chain shaded are the same in the two proteins, the two conformations are very similar nearly everywhere. The active site of each enzyme is circled; this is where the peptide bonds of the proteins that serve as substrates are bound and cleaved by hydrolysis. The serine proteases derive their name from the amino acid serine, whose side chain is part of the active site of each enzyme and directly participates in the cleavage reaction. Surviving panel label: chymotrypsin structure.

There are other examples where the amino acid sequences of two proteins in different orders of organisms are quite different, yet their tertiary structures are quite similar when compared. This occurs when the proteins present in different organisms are derived from similar primordial genes.

Once proteins have been formed, they may undergo further modifications in the cell involving linkage to other molecules such as carbohydrates and lipids. Lipoproteins are multicomponent complexes of proteins and lipids that form distinct molecular aggregates. The protein and lipid in each complex are generally held together by noncovalent bonds. Lipoproteins transport lipids in the blood from tissue to tissue and also participate in lipid metabolism.6 Lipid-linked proteins are also found in cell membranes, where they fulfill a variety of enzymatic, signaling, structural, and transport functions (Figure 1.20).

FIGURE 1.20 Model of lipid rafts in the trans Golgi network. Glycosphingolipids and cholesterol are thought to form rafts in the lipid bilayer. Membrane proteins with long enough membrane-spanning segments preferentially partition into the lipid rafts and thus become sorted into transport vesicles. These rafts are subsequently packaged into transport vesicles that carry them to the apical domain of the plasma membrane. Carbohydrate-binding proteins (lectins) in the lumen of the trans Golgi network may help stabilize the rafts as shown. Diagram labels: lipid raft; cholesterol; cytosol; normal trans Golgi network membrane; GPI-anchored protein; protein with short transmembrane domain cannot enter lipid raft.

Glycoproteins contain covalently bound carbohydrates and are produced in the rough endoplasmic reticulum in the cytoplasm (Figure 1.21). Many plasma membrane proteins are glycoproteins. Some glycoproteins determine the blood antigen system (A, B, O) and the histocompatibility and transplantation determinants of an individual. Immunoglobulin antigenic recognition sites and viral and hormone receptor binding sites on plasma membranes are often glycoproteins. Carbohydrates linked to proteins on the surface of cell membranes provide a recognition site for identification by other cells and for contact inhibition in the regulation of cell growth. Changes in membrane glycoproteins have been correlated with tumorigenesis and malignant transformation of cells leading to cancer. Most plasma proteins, except albumin, are glycoproteins, including blood-clotting proteins, immunoglobulins, and many of the complement proteins. Some protein hormones, such as follicle-stimulating hormone (FSH) and thyroid-stimulating hormone (TSH), are glycoproteins. The structural proteins collagen, laminin, and fibronectin contain carbohydrate, as do proteins of mucous secretions that perform a role in lubrication and protection of epithelial tissue.

The percentage of carbohydrate in glycoproteins is variable. IgG contains small amounts of carbohydrate (4%); glycophorin of human red blood cell membranes is 60% carbohydrate and human gastric glycoprotein is 82% carbohydrate. The carbohydrate can be distributed evenly along the polypeptide chain or concentrated in defined regions. Glycoproteins with the same function but from different animal species often have homologous amino acid sequences but variable carbohydrate structures.6

FIGURE 1.21 Protein glycosylation in the rough ER. Almost as soon as a polypeptide chain enters the ER lumen, it is glycosylated on target asparagine amino acids. The precursor oligosaccharide is transferred to the asparagine as an intact unit in a reaction catalyzed by a membrane-bound oligosaccharyl transferase enzyme. As with signal peptidase, one copy of this enzyme is associated with each protein translocator in the ER membrane. (The ribosome is not shown for clarity.)
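The "target asparagine" residues mentioned in the Figure 1.21 caption are usually recognized by their sequence context. The sketch below scans a sequence for the commonly described motif asparagine, any residue except proline, then serine or threonine. That motif is not stated in the text and is included as an assumption, the test sequence is invented, and a match only marks a candidate site, not one known to be glycosylated.

```python
import re

# Assumed motif: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping candidate sites be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_candidate_sites(sequence):
    """Return 1-based positions of asparagines that start the motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence: N2 and N9/N10 fit the motif, N6 is followed by Pro.
example = "MNGTANPSNNSS"
print(find_candidate_sites(example))
```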
