User:Eric Martz/Introduction to Structural Bioinformatics I

How to find, visualize, and understand 3D protein molecular structures
by Eric Martz, October 2 and 4, 2012
for Prof. Steven Sandler's course Microbiology 565: Laboratory in Molecular Genetics
University of Massachusetts, Amherst MA USA
Get here with 565.MolviZ.Org


I. Computer Lab Preparation (BCRC)

  1. Log in
  2. Run Firefox
  3. Go to (do NOT type www).
  4. If you see inactive plug-in, click on it and ENABLE the plug-in.
  5. Restart the browser and again go to
  6. When you see a rotating 3D molecular structure, you are prepared.
  7. Take a look around Proteopedia. Click on the PDB codes below, or the Random links to see other molecules.

II. Protein Structure and Structural Bioinformatics

1. Amino acid sequence + protein chain conformation = protein function.
A. Why do we care?
B. Conformation can be a stable fold or intrinsically unstructured. Both commonly exist in the same protein molecule.
C. Conformation is specified by sequence.
  • Folded domains fold spontaneously (Anfinsen, 1960s[1]), or with the help of chaperonins.
  • The denaturation (unfolding) of a folded domain destroys its function.

2. Backbone Representation.
A. Peptides and Backbones (within a tutorial on hemoglobin)
B. Small Protein in FirstGlance (use Vines, Cartoon)

3. Structure Knowledge.
A. Although sequence specifies fold, scientists cannot yet predict the fold from the sequence. Therefore, fold must be determined by empirical (experimental) methods. The most common methods for determining the 3D structure of a protein molecule are:
  • X-ray crystallography.
    • Cannot determine the structure of intrinsically unstructured loops or molecules.
    • Result is a single model representing the average of the molecules in the crystal.
    • Resolution reflects the degree of order or disorder in the crystal.
  • Solution nuclear magnetic resonance (NMR).
    • NMR is limited to small proteins (30 kD or smaller).
    • Result is an ensemble of models consistent with the data. Example: 2bbn
  • High resolution cryo-electron microscopy, 0.5%.
B. These methods are difficult and expensive. Less than 10% of proteins have known structure.
C. All published, empirically determined 3D macromolecular structure models are available from the Protein Data Bank (PDB; see About the PDB).
D. Each model has a unique, 4-character accession code called a PDB identification code, for example
E. Crystallographers publish the asymmetric unit of the crystal. It may be identical with the biological unit (the functional form of the molecule), or it may be only part of the biological unit, or it may contain multiple copies of the biological unit. See examples.
Interchain contacts that occur in the asymmetric unit that are absent in the biological unit are an artifact of crystallization, termed crystal contacts.
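
As a small illustration of the accession codes described in D, the sketch below checks whether a string has the shape of a 4-character PDB code (a digit followed by three letters or digits) and builds a download address from it. The function names are hypothetical, and the files.rcsb.org URL pattern is an assumption about the RCSB file server, not something stated on this page.

```python
import re

# Shape of a classic PDB code: one digit, then three letters or digits.
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(text):
    """Return True if text has the shape of a 4-character PDB code."""
    return PDB_CODE.fullmatch(text.strip()) is not None


def pdb_file_url(code):
    """Build a download URL for a PDB code (the URL pattern is an assumption)."""
    if not is_pdb_code(code):
        raise ValueError(f"not a 4-character PDB code: {code!r}")
    return f"https://files.rcsb.org/download/{code.strip().upper()}.pdb"


if __name__ == "__main__":
    for candidate in ["1d66", "2src", "hemoglobin"]:
        print(candidate, is_pdb_code(candidate))
```

A check like this only confirms the shape of the code. It does not confirm that an entry with that code exists in the PDB.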

III. Choose a Molecule to Explore

  • Choose a molecule that includes protein. It may also include ligand and/or nucleic acid, but must have protein.
  • Be sure to note the 4-character PDB code of the molecule you choose. The PDB code makes it easy to retrieve the molecule and information about it. Here are some ways to find a protein with known structure:
  1. Atlas of Macromolecules (Atlas.MolviZ.Org). Choose a "straightforward" molecule that has ligand.
  2. Structural View of Biology at the PDB.
  3. Molecule of the Month at the PDB.
  4. Topic Pages in Proteopedia.
  5. Random PDB Entry in Proteopedia (see random box at top left of this page).
  6. Search by molecule name or amino acid sequence at, but remember that less than 10% of proteins have known structure.

IV. Explore Your Molecule

1. Start in Proteopedia

Open Proteopedia in a new browser tab and enter your PDB code in the search slot at the left. We will use the following information offered by Proteopedia:

A. The title of the study, which usually includes the name of the molecule.
B. The abstract of the publication about this structure, which usually mentions the function of the molecule if known.
C. The number of polymer chains under About this Structure.
D. Full names of ligands and non-standard residues (displayed when their green links are clicked beneath the molecule). Example: 2src.
E. Evolutionary conservation.
See Introduction to Evolutionary Conservation.
F. The popup button for enlarging the molecular scene.
G. A link to display the molecule in FirstGlance in Jmol (in the Resources block under the molecule).

2. Continue in FirstGlance in Jmol

In Proteopedia, use the link to FirstGlance in the Resources block under the molecule to display your molecule in FirstGlance in Jmol.

Try out the first six views (links) at the upper left, and any other controls that interest you. In particular, we will use these capabilities of FirstGlance in the Powerpoint report:

A. Hydrophobic/Polar

  • Water-soluble proteins have polar/charged amino acids nearly everywhere on their surfaces (Examples: small 2hhd, large 1igy). Patches of hydrophobic amino acids on the surfaces of soluble proteins are usually less than ~10 Å in their smaller diameter, and usually recessed.
  • Hydrophobic surface patches may be buried in chain-to-chain contacts -- check the biological unit (example: lac repressor homodimer).
  • Large, protruding hydrophobic surface areas (>25 Å in their smaller diameter) may indicate transmembrane proteins (insoluble). Examples:

B. Charge

Most proteins have roughly equal numbers of positive and negative charges intermixed on their surfaces. Surface patches of exclusively positive charge often bind nucleic acids (negatively charged because of their phosphates). For example, examine the protein surface charges where the gal4 transcriptional regulator binds DNA (1d66).
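
For a rough feel of the hydrophobic/polar and charge ideas in A and B, the sketch below tallies residue classes in a one-letter amino acid sequence. The residue groupings are one common textbook convention and are an assumption here (FirstGlance may classify residues differently), the function name is hypothetical, and the example sequence is made up. A sequence tally says nothing about where residues sit in the fold, so it does not replace looking at the surface in 3D.

```python
# Assumed residue groupings; conventions differ between textbooks and tools.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")   # Lys, Arg (His is left out in this sketch)
NEGATIVE = set("DE")   # Asp, Glu


def composition(sequence):
    """Tally hydrophobic fraction and charged residues in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {
        "hydrophobic_fraction": sum(r in HYDROPHOBIC for r in seq) / len(seq),
        "positive": sum(r in POSITIVE for r in seq),
        "negative": sum(r in NEGATIVE for r in seq),
    }


if __name__ == "__main__":
    # Made-up 8-residue peptide, for illustration only.
    print(composition("MKKLDEAR"))
```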

V. Powerpoint Report

Save your report with the filename yourLastName-565.pptx, for example sandler-565.pptx. When completed, your Powerpoint report is to be emailed to for grading.

Each slide MUST be labeled at the top with its section number, e.g. Section 1.

Each Section below may be answered in a single slide, or multiple slides. For example, suppose you want to show two snapshots for Section 3, and make separate comments. You may choose to use two slides, labeled Section 3A and Section 3B.

This is not a test. It is to help you learn by doing. Ask for help!
Sample Completed Powerpoint Assignment (You may download it, rename the file, and use it as a template.)

Section 1: Identity

  • The label Section 1 at the top (and so forth for every slide).
  • Your name.
  • Your major; grad students, give the name of your grad program (Micro, MCB, etc.) and whose lab you work in.
  • Your PDB identification code.
  • The name of your molecule.
  • The function of your molecule.
  • The resolution or number of models (given in Proteopedia immediately under the molecule). The experimental method used to determine the structure.
    • A resolution usually implies that the method is X-ray crystallography.
    • A number of models usually implies that the method is NMR.
    • To double check, in Proteopedia, click on the link RCSB and at the RCSB PDB, look in the box at the lower right, Experimental Details.
  • The number of polymer chains (protein or nucleic acid) present. (Given in Proteopedia in the section About this Structure.)
  • A snapshot of your molecule. (See instructions for taking static snapshots, also linked at the bottom left in FirstGlance.)
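
The method, number of models, and chain count asked for above are also recorded in the PDB-format text file itself. The sketch below reads them from a few lines of PDB-format text: the EXPDTA record names the experimental method, MODEL records mark the members of an ensemble, and column 22 of each ATOM record holds the chain name. The function name and the sample lines are made up for illustration. Treat the result as a cross-check on what Proteopedia reports, not a replacement for it.

```python
# Made-up sample in PDB format: two chains, A and B.
SAMPLE_PDB = """\
EXPDTA    X-RAY DIFFRACTION
ATOM      1  N   MET A   1      11.104  13.207   2.100
ATOM      2  CA  MET A   1      12.560  13.100   2.300
ATOM      3  N   GLY B   1      21.000  15.000   5.000
"""


def summarize_pdb(text):
    """Return the method, number of MODEL records, and chain names."""
    method = None
    models = 0
    chains = []
    for line in text.splitlines():
        record = line[:6].strip()
        if record == "EXPDTA" and method is None:
            method = line[10:].strip()      # method text starts at column 11
        elif record == "MODEL":
            models += 1
        elif record == "ATOM" and len(line) > 21:
            chain = line[21]                # chain name is in column 22
            if chain not in chains:
                chains.append(chain)
    return {"method": method, "models": models, "chains": chains}


if __name__ == "__main__":
    print(summarize_pdb(SAMPLE_PDB))
```

A file with no MODEL records holds a single model, as is typical for crystal structures.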

Section 2: Ligands and Non-Standard Residues

Give the 3-letter abbreviations and full names for all ligands and non-standard residues. If none, so state. (Standard residues)

Proteopedia lists the 1 to 3-letter abbreviations for each ligand and non-standard residue in green links under the molecule. Click on each one to see its full name shown in red at the bottom of the molecule.

Section 3: Evolutionary Conservation

Does your molecule have a highly conserved region? If so, what is its function? If there is no highly conserved region, is there a highly variable region? Show a snapshot illustrating a highly conserved (or variable) region.

Click on Evolutionary Conservation in Proteopedia. Toggle the quality button to high quality. Use the popup button to enlarge the high quality image. Problems? See How to see conserved regions.

Section 4: Hydrophobic/Polar

Do you think your molecule is water soluble? Support your conclusion with a snapshot.

Section 5: Charge

Are there any areas on the surface of your molecule with only positive (or negative) charges? Show a snapshot illustrating your conclusions.

Section 6: Biological Unit

How many polymer chains (protein, DNA or RNA) are in the biological unit? The asymmetric unit?

A. The asymmetric unit is what you see in Proteopedia or FirstGlance, when you use the PDB code.
B. In a new browser tab, go to MakeMultimer.
C. Enter your PDB code. Leave all other options at their defaults. Click Submit.
D. Pay attention to the tables, especially the "Chain" column (model made by MakeMultimer), vs. the "original" column (original chain names).
E. Click "View in FirstGlance".

Show side-by-side two snapshots comparing the asymmetric unit with the biological unit. The Cartoon representation in FirstGlance is best for these snapshots. Make sure to label which is which.
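
PDB-format files describe the biological unit in REMARK 350 records: a list of chains plus one or more transformations (BIOMT) to apply to them. The sketch below estimates the number of chains in the first biological unit as chains times transformations. It is a simplified reading that assumes every listed chain gets every transformation, the function name is hypothetical, and the sample lines are made up. MakeMultimer remains the tool to use for the assignment.

```python
# Made-up REMARK 350 block: one chain (A) and two transformations.
SAMPLE_REMARK_350 = """\
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000
REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000
REMARK 350   BIOMT1   2 -1.000000  0.000000  0.000000       50.00000
REMARK 350   BIOMT2   2  0.000000 -1.000000  0.000000        0.00000
REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        0.00000
"""


def biological_unit_chain_count(text):
    """Estimate chains in the first biological unit (simplified)."""
    chains = set()
    operators = set()
    seen_biomolecule = False
    for line in text.splitlines():
        if not line.startswith("REMARK 350"):
            continue
        body = line[10:].strip()
        if body.startswith("BIOMOLECULE:"):
            if seen_biomolecule:
                break                       # stop at the second biomolecule
            seen_biomolecule = True
        elif "CHAINS:" in body:
            names = body.split("CHAINS:")[1]
            chains.update(n.strip() for n in names.split(",") if n.strip())
        elif body.startswith("BIOMT1"):
            operators.add(body.split()[1])  # transformation serial number
    return len(chains) * len(operators)


if __name__ == "__main__":
    print(biological_unit_chain_count(SAMPLE_REMARK_350))
```

In this made-up sample the asymmetric unit has one chain and the biological unit has two, so the molecule would be a homodimer.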

Section 7: Animation from Polyview-3D

Minimal steps to make an animation:

  1. Go to Polyview-3D.
  2. Enter your PDB code in the PDB ID slot near the top.
  3. Change "Type of request" from "Single slide" to "Animation". It is under the Image Settings section near the bottom.
  4. Click any "Preview" link.
  5. Optional: If you want to modify the orientation or zoom of the molecule, click on View by Jmol / Set orientation under Quick links at the upper left of the page. Use the mouse to rotate and zoom in Jmol. Then click the Set and close button.
  6. Optional: If you want to change the colors, hide portions of the molecule, emphasize certain residues, etc. feel free to try out these options in the form, using Preview to check your results.
  7. In the "Animation Settings" section at the bottom of the page, set Delay to 10/100.
  8. Change "Angle step" to 5 degrees.
  9. Check "Rocking".
  10. Change "angle range" for rocking to 30 degrees.
  11. Click "Get 3D Image".

The above steps are the minimum for an animation that avoids putting a heavy load on the server. Feel free to try other options, but while the class is in session, please don't make a large (>300 pixel) animation, or increase the angle range, or decrease the angle step size. Otherwise, the server may get overloaded and take a very long time to produce results. After class is over, feel free to submit more demanding jobs. If you highlight specific residues, please explain why.

In Powerpoint, animations move only when the slides are projected (full-screen).

Windows Powerpoint: Simply drag the animation directly from the Polyview-3D web page and drop it into a Powerpoint slide.

Mac Powerpoint: The method below produces a slide that will animate continuously. Other methods we have tried do not.

  1. Control-Click on the animation in the Polyview-3D web page, and select Save Image As ...
  2. Save the image to the Desktop.
  3. Drag the image file (filename ending in .gif) from the Desktop and drop it into a Powerpoint slide.
  4. As stated above, the animation will run only when the saved .pptx file is projected (full screen).

Section 8 - Optional: Contacts/Non-covalent Bonds

  1. Use the Contacts tool in FirstGlance.
  2. Change target selection to Residues/Groups.
  3. Click on something small to select it as a "target", such as a ligand, or a single amino acid.
  4. Click the link to Show atoms contacting target.
  5. Click Center contacts.
  6. Uncheck Backbones.
  7. Zoom in (and click Return to Contacts if necessary).
  8. Uncheck all categories of non-covalent bonds.
  9. Check hydrogen-bonded non-water. (Review hydrogen bonds.)
  10. Double click the hydrogen bond donor and acceptor atoms to insert a distance monitor.

Describe the moiety you selected as a target. Include a snapshot showing a hydrogen bond.
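
The distance monitor in step 10 reports the separation between two atoms. The same number can be computed from the x, y, z coordinates of the atoms, as in the sketch below. The coordinates are made up, the function names are hypothetical, and the 3.5 Å donor-to-acceptor cutoff is a commonly quoted rule of thumb assumed here, not a value taken from FirstGlance.

```python
import math

# Assumed rule-of-thumb cutoff for donor-acceptor distance, in Angstroms.
HBOND_CUTOFF = 3.5


def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(atom_a, atom_b)))


def within_hbond_distance(donor, acceptor, cutoff=HBOND_CUTOFF):
    """True if donor and acceptor are close enough for a hydrogen bond."""
    return distance(donor, acceptor) <= cutoff


if __name__ == "__main__":
    donor = (0.0, 0.0, 0.0)       # made-up coordinates
    acceptor = (1.0, 2.0, 2.0)
    print(distance(donor, acceptor), within_hbond_distance(donor, acceptor))
```

Distance alone does not establish a hydrogen bond, since geometry and the chemical identity of the atoms also matter. Use it only as a check on what the Contacts tool shows.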

VI. See Also

VII. Notes and References

  1. For a brief overview of Anfinsen's protein folding experiments in the 1960s, see the first paragraph at Intrinsically Disordered Protein.
