Basics of Protein Structure
From Proteopedia
This tutorial illustrates some basic properties of protein structure for a general audience. For a more in-depth discussion, please visit Introduction to protein structure.

Proteins perform many important functions in living organisms, including movement, immune responses, sensing the environment, energy acquisition, and catalyzing reactions. The protein shown is insulin; when insulin isn't properly synthesized or recognized, diabetes occurs.

Proteins are long chains of amino acids, and are synthesized by the ribosome, using messenger RNA as a template. There are 20 amino acids commonly found in proteins. Amino acids contain an amino group, a central carbon atom called the alpha carbon, and a carboxylic acid group. The 20 amino acids differ by what is attached to the alpha carbon; this variable portion is referred to as the side chain. The amino acid shown is alanine; its side chain is a methyl (-CH3) group. The atoms are displayed using the standard coloring convention: carbon (C) grey, hydrogen (H) white, oxygen (O) red, and nitrogen (N) blue.

Proteins are sometimes compared to beads on a string, where each amino acid residue is a bead. These long chains form complicated structures that allow them to perform their function. Even small alterations in any level of the structure can change how the protein does its job, and can lead to diseases.

Ways of representing protein structure
Protein structures can be displayed in many different ways. In spacefilling models, all of the non-hydrogen atoms are shown as spheres with their van der Waals radii. This view makes it easiest to see holes, clefts, or other large-scale features, but it is hard to identify individual amino acids or finer structural details. In the ball and stick model, the atoms are shown as smaller balls connected by sticks; this is further simplified in the stick model, which shows only the bonds between atoms. The backbone representation shows only the N-Calpha-C=O repeating unit; the side chains are omitted.
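As a rough illustration of the backbone representation, the sketch below filters a list of atoms down to the N-Calpha-C=O repeating unit. The atom names (N, CA, C, O, CB) follow the naming commonly used in PDB files; the function name `backbone_only` and the simple (residue, atom name) layout are assumptions made for this example and do not come from any particular viewer.

```python
# Backbone atom names as commonly written in PDB files:
# N (amide nitrogen), CA (alpha carbon), C (carbonyl carbon), O (carbonyl oxygen).
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def backbone_only(atoms):
    """Keep only backbone atoms from a list of (residue, atom_name) pairs.

    The (residue, atom_name) layout is a hypothetical, simplified stand-in
    for the contents of a real structure file.
    """
    return [(res, name) for res, name in atoms if name in BACKBONE_ATOMS]


# Non-hydrogen atoms of alanine; CB is the carbon of the methyl side chain.
alanine = [("ALA", "N"), ("ALA", "CA"), ("ALA", "C"), ("ALA", "O"), ("ALA", "CB")]

backbone = backbone_only(alanine)
print([name for _, name in backbone])
```

Dropping CB here mirrors what the backbone view does for every residue: the side chain disappears and only the repeating unit remains.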
The cartoon representation is based upon the backbone, but highlights specific secondary structures (more on that later!).

Levels of Protein Structure
There are four different levels of protein structure. The primary structure is the amino acid sequence. The amino acids are connected by an amide bond (also called a peptide bond), made from the amino group (NH2) of one amino acid and the carboxylic acid group (COOH) of another amino acid. In the process of making the bond, a water molecule is removed. The amino acids are linked in a repeating pattern. The backbone of the protein is this repeating pattern, with the side chains projecting out from the backbone. The end with the free -NH2 group is called the amino or N-terminus, while the end with a free carboxylic acid is called the carboxyl or C-terminus. Notice that most protein structure representations do not show the hydrogens. The sequence of amino acids is written and numbered from the N-terminus (where protein synthesis begins) to the C-terminus (where amino acids are added during protein synthesis), so for the peptide shown, the sequence would be Val-Asn-Gln, or VNQ if one-letter abbreviations are used for the amino acids. For more practice identifying peptide bonds between amino acids, please try Peptide tutorial 1 part 1 and Peptide tutorial 1 part 2.

The second level of structure is called secondary structure: the shapes (conformations) formed by short sequences of amino acids. This level of structure is stabilized by hydrogen bonds along the backbone. Hydrogen bonds are attractions between an N, O, or F atom and a hydrogen attached to an N, O, or F. (More about hydrogen bonds.) The two most common shapes are alpha helices and beta strands. These conformations are favored because they avoid steric collisions: two atoms cannot occupy the same space. Insulin contains only alpha helices; they are shown in pink.

The third level of structure, or tertiary structure, is how the secondary structures pack together to form the overall shape of the entire peptide chain.
Side chains play an important role in tertiary structure formation, especially through the burying of hydrophobic ("water fearing") amino acids in the middle of the structure. In this view, hydrophobic residues are grey and polar atoms are shown in light purple. Water molecules are shown as red balls; notice that they tend to be close to the hydrophilic ("water loving") groups. Some proteins, like insulin, are also stabilized by covalent bonds (shown in yellow) called disulfide bonds.

Not all proteins have the fourth level of structure, quaternary structure. Quaternary structure is the association of more than one chain to form a larger structure. Insulin forms such an assembly; in this view, each insulin monomer is shown in a different color. Quaternary structure can be very important in how the protein functions. Minor changes in insulin's sequence lead to tighter or weaker association between the chains, which is the difference between long-lasting and quick-acting insulin. For a more in-depth discussion of insulin's structure and function, please visit the Insulin page.

Protein Structure Data
The Worldwide Protein Data Bank (wwPDB) is where all experimentally determined, published protein structures are made freely available. Each model has a unique accession code, called a PDB code. One model of human insulin, shown at right, has the PDB code 3i40. Many examples are illustrated in the Atlas of Macromolecules. Looking for a model of a specific protein? See Is there an empirical model? After you find a PDB code of interest, see Introduction to molecular visualization.
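Once a PDB code such as 3i40 is in hand, a small helper can check its shape and build a download address. This is a minimal sketch under stated assumptions: it only accepts the classic four-character codes (a digit followed by three letters or digits), and the URL layout of the RCSB file download service is assumed, not taken from this tutorial. The names `is_pdb_code` and `pdb_file_url` are hypothetical.

```python
import re

# Classic PDB codes: one digit followed by three letters or digits.
# Assumption: longer, extended identifiers are not handled here.
PDB_CODE_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(code):
    """Return True if the text looks like a classic four-character PDB code."""
    return PDB_CODE_PATTERN.fullmatch(code) is not None


def pdb_file_url(code):
    """Build a download URL for a PDB-format file (assumed RCSB URL layout)."""
    if not is_pdb_code(code):
        raise ValueError(f"not a four-character PDB code: {code!r}")
    return f"https://files.rcsb.org/download/{code.upper()}.pdb"


print(pdb_file_url("3i40"))
```

The helper performs no network access; it only formats the address, so the file still has to be fetched and opened in a molecular viewer.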
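The sequence notation described under primary structure (Val-Asn-Gln written as VNQ) can also be sketched in code. The example below maps the standard three-letter abbreviations of the 20 common amino acids to their one-letter codes and counts the peptide bonds in a chain, each of which releases one water molecule when it forms. The function names and the hyphen-separated input format are assumptions chosen for illustration.

```python
# Standard three-letter to one-letter abbreviations for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert a hyphen-separated three-letter sequence (N- to C-terminus)."""
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))


def peptide_bond_count(sequence):
    """A linear chain of n residues has n - 1 peptide bonds (one water lost per bond)."""
    return len(sequence.split("-")) - 1


print(to_one_letter("Val-Asn-Gln"))       # VNQ
print(peptide_bond_count("Val-Asn-Gln"))  # 2
```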