User:Nikhil Malvankar/MB&B 420a-720a 2019
How to visualize, understand and share/present 3D protein molecular structures
by Nikhil Malvankar, 2019
Adapted from Eric Martz
for
MB&B 420a/720a Macromolecular Structure Fall 2019
Yale University USA
I. Getting Started
- Go to Atlas.MolviZ.Org.
- In the Atlas, choose any molecule deemed Straightforward and click on the link to FirstGlance. After a minute or so to load, you should see a rotating molecule.
- If you have any difficulty or the molecule does not appear, ask for help!
II. Goals
1. Review principles of protein 3D structure.
2. Choose an existing experimentally-determined 3D protein structure model to investigate.
3. Learn how FirstGlance in Jmol makes it easy to see structure-function relationships in the protein you chose.
4. Write a report including snapshots of your protein that illustrate your answers to the questions below. (Your report will be in the form of Slides emailed to TAs. You will not present your report in class.)
III. Protein Structure and Structural Bioinformatics
- 1. Amino acid sequence + protein chain conformation = protein function.
- A. Why do we care about protein 3D structure?
- B. Conformation can be a stable fold or intrinsically unstructured. Both commonly exist in the same protein molecule.
- C. Conformation is specified by sequence.
- Folded domains fold spontaneously (Anfinsen, 1960s[1]), or with the help of chaperonins.
- The denaturation (unfolding) of a folded protein domain destroys its function.
- 2. Backbone Representation.
- A. Backbone Representations
- B. Small Protein in FirstGlance (use the Views Tab: Vines, Cartoon)
- 3. Structure Knowledge.
- A. Although sequence specifies fold, scientists cannot yet predict the fold from the sequence. Therefore, fold must be determined by empirical (experimental) methods. The most common methods for determining the 3D structure of a protein molecule are:
- X-ray crystallography, 88%.
- Result is a single model representing the average of the molecules in the crystal.
- Resolution reflects the degree of order or disorder in the crystal.
- X-ray crystallography gives no models for intrinsically unstructured loops or molecules.
- Nuclear magnetic resonance (NMR) in aqueous solution, 9%.
- NMR is limited to small proteins (30 kD or smaller; the median NMR entry in the PDB is 10 kD; the median X-ray entry is 50 kD).
- Result is an ensemble of models consistent with the data. Examples: 2bbn
- High resolution cryo-electron microscopy, 0.8%.
- B. These methods are difficult and expensive. Less than 10% of proteins have known structure.
- C. All published, empirically determined 3D macromolecular structure models are available from the Protein Data Bank (PDB; pdb.org; About the PDB).
- E. Crystallographers publish the asymmetric unit of the crystal. It may be identical with the biological unit (the functional form of the molecule), or it may be only part of the biological unit, or it may contain multiple copies of the biological unit. See examples.
- Interchain contacts that occur in the asymmetric unit, which are absent in the biological unit, are an artifact of crystallization, termed crystal contacts.
IV. Choose a Molecule to Explore
- Choose a molecule to use for your report.
- Each student should choose a different molecule.
- Be sure to note the 4-character PDB code of the molecule you choose. The PDB code makes it easy to retrieve the molecule and information about it.
- Report the PDB code you chose to the instructor to make sure it is not already taken.
- It must have protein.
- It will be more interesting if it contains some non-protein: ligand, or DNA or RNA.
- X-ray results should have resolution of 3 Å or better.
- Here are some ways to find a protein with known structure:
- Recommended: Atlas of Macromolecules (Atlas.MolviZ.Org). Choose a "Straightforward" or "Challenging" (not "Enormous") molecule that has protein and ligand.
- Molecule of the Month at the PDB. Look for PDB codes in the article, and use FirstGlance to view them.
- Topic Pages in Proteopedia, or its Table of Contents.
- Random PDB Entry in Proteopedia (see Random at top left of this page in the navigation box).
- Search by molecule name or amino acid sequence at www.pdb.org, but remember that less than 10% of proteins have known structure. See also Practical Guide to Homology Modeling which includes instructions for finding empirical 3D models for a protein sequence.
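Because every step that follows starts from the PDB code, it can help to check that what you wrote down really looks like one. The Python sketch below is only an illustration: the function names are hypothetical, the pattern assumes the classic format of one digit followed by three letters or digits (as in 1d66 or 7ahl), and the download address is an assumption about how files are commonly fetched from the RCSB site, so verify it in a browser before relying on it.

```python
import re

# Assumption: a classic PDB code is one digit followed by three letters or digits.
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(text):
    """Return True if text looks like a classic 4-character PDB code."""
    return PDB_CODE.fullmatch(text.strip()) is not None


def download_url(code):
    """Hypothetical helper: build the assumed RCSB download address for a code."""
    if not is_pdb_code(code):
        raise ValueError(f"not a 4-character PDB code: {code!r}")
    return f"https://files.rcsb.org/download/{code.strip().upper()}.pdb"


if __name__ == "__main__":
    for candidate in ["1d66", "7ahl", "gal4"]:
        print(candidate, is_pdb_code(candidate))
```

A molecule name such as gal4 fails the check, which is a reminder that the name and the PDB code are different things.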
V. Explore Your Molecule
FirstGlance in Jmol
The main tool we will use is FirstGlance in Jmol: FirstGlance.Jmol.Org. (To google it later, use the single word (no space) firstglance.)
- Enter your 4-character PDB code at FirstGlance, and you should see the molecule you have chosen.
- Get familiar with what the molecule information tab tells you. Ask about anything you don't understand.
- In the Views tab, there are 10 links at the top that show you different aspects of the molecule. Try them all, as well as any of the other tools in FirstGlance that interest you.
Here are two Views in FirstGlance that will be used in your report:
A. Hydrophobic/Polar
- Water-soluble proteins have polar/charged amino acids nearly everywhere on their surfaces (Examples: small 2hhd, large 1igy). Patches of hydrophobic amino acids on the surfaces of soluble proteins are usually less than ~10 Å in their smaller diameter, and usually recessed.
- Hydrophobic surface patches may be buried in chain-to-chain contacts -- check the biological unit (example: lac repressor homodimer).
- Large, protruding hydrophobic surface areas (>25 Å in their smaller diameter) may indicate transmembrane proteins (insoluble). Examples:
- 7ahl
- showing bilayer boundaries (click on "Jmol"; ligand toggles boundaries).
- 1bl8
- Gramicidin Channel in Lipid Bilayer.
B. Charge
Most proteins have roughly equal numbers of positive and negative charges intermixed on their surfaces. Surface patches of exclusively positive charge often bind nucleic acids (negatively charged because of their phosphates). For example, examine the protein surface charges where the gal4 transcriptional regulator binds DNA (1d66).
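The idea of "roughly equal numbers of positive and negative charges" can be checked crudely from a sequence alone. The sketch below is a minimal illustration, not part of FirstGlance: it counts Lys and Arg as positive and Asp and Glu as negative, and leaves out His on the assumption that it is only partially protonated near pH 7. The function name and the example sequence are made up. It counts charges over the whole chain, so it cannot tell you where the charges sit on the surface; that still requires the Charge view.

```python
# Side-chain charges near neutral pH (His is deliberately left out; see above).
POSITIVE = set("KR")  # Lys, Arg
NEGATIVE = set("DE")  # Asp, Glu


def charge_summary(sequence):
    """Count charged residues in a one-letter amino acid sequence."""
    seq = sequence.upper()
    positive = sum(1 for aa in seq if aa in POSITIVE)
    negative = sum(1 for aa in seq if aa in NEGATIVE)
    return {"positive": positive, "negative": negative, "net": positive - negative}


if __name__ == "__main__":
    example = "MKRKDELLK"  # made-up sequence, for illustration only
    print(charge_summary(example))
```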
VI. Report Slides
Answer the questions below in slides, using Google Slides or PowerPoint.
- Name your report YourLastName-IW2018, for example Malvankar-IW2018. If the name of your report does not begin with your family name, you will lose 2 points. When completed, email a link to your report to TAs for grading.
- (While viewing your report Google slides, click the blue Share button at the upper right, then Get shareable link, and paste the link into the email.)
- You will not be asked to present your report in class.
- Each slide MUST be labeled at the top with its section number, e.g. Section 1.
- Each Section below may be answered in a single slide, or multiple slides. For example, suppose you want to show two snapshots for Section 3, and make separate comments. You may choose to use two slides, labeled Section 3A and Section 3B.
- For full credit, your slides must include at least 2 animations.
- Additional work beyond the minimum required may earn extra credit.
- Due date: 22 October 2018.
This is not a test. It is to help you learn by doing. Ask for help!
Example of a Completed Report (You may import these slides into a new presentation of your own and use them as a template, putting in your own content.)
Section 1: Identity
- The label Section 1 at the top (and so forth for every slide).
- Your name.
- Your major; grad students, give the name of your track (Micro, BQBS, MCGD, etc.) and whose lab you work in.
- Your PDB identification code.
- The name of your molecule.
- The function of your molecule (briefly, one sentence).
- The experimental method and resolution (or number of models for NMR). Available in the molecule information tab in FirstGlance.
- A snapshot of your molecule.
- The molecule information tab is also linked at the bottom left in FirstGlance.
Section 2: Composition
- The number of
- Protein chains (What does "chain" mean?)
- DNA chains
- RNA chains
- Available in the molecule information tab in FirstGlance: Chain Details. You can identify a residue in any chain by touching it with the mouse (spinning off!). DNA residues are DA, DG, DC, DT while RNA residues are A, G, C and U.
- Ligands and Non-Standard Residues: Give the one to three-letter abbreviations and full names for all ligands and non-standard residues. If none, so state. (Standard residues)
- The molecule information tab in FirstGlance lists the 1 to 3-letter abbreviations for each ligand and non-standard residue, and their full names. Click on an abbreviation to locate that entity in the model. See also Composition in FirstGlance's Views tab.
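The residue names mentioned above (DA, DG, DC, DT for DNA; A, G, C, U for RNA) are enough to sort chains into types. The following Python sketch shows the logic under simple assumptions: the function name is hypothetical, only the 20 standard amino acids count as protein, and any chain containing a non-standard residue, ligand, or water is reported as "other or mixed" rather than guessed at.

```python
DNA_RESIDUES = {"DA", "DG", "DC", "DT"}
RNA_RESIDUES = {"A", "G", "C", "U"}
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def classify_chain(residue_names):
    """Label a chain by the residue names it contains."""
    names = {name.strip().upper() for name in residue_names}
    if not names:
        return "other or mixed"
    if names <= DNA_RESIDUES:
        return "DNA"
    if names <= RNA_RESIDUES:
        return "RNA"
    if names <= AMINO_ACIDS:
        return "protein"
    return "other or mixed"


if __name__ == "__main__":
    print(classify_chain(["DA", "DT", "DG", "DC"]))
    print(classify_chain(["A", "U", "G"]))
    print(classify_chain(["MET", "LYS", "ALA"]))
```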
Section 3: Evolutionary Conservation
See Introduction to Evolutionary Conservation. (Example 4d7b: Consurf, Result)
Does your molecule have a highly conserved region? If so, what is its function? If there is no highly conserved region, is there a highly variable region? Show two snapshots illustrating a highly conserved region, and a contrasting region.
See How to see conserved regions.
- If Proteopedia lacks a pre-calculated Evolutionary Conservation for your molecule, and you do your own calculation at the ConSurf Server, be sure to include the address of the ConSurf result in your report slide!
Section 4: Hydrophobic/Polar
Do you think your molecule is water soluble? Support your conclusion with a snapshot. Be sure to use the Hydrophobic/Polar view from FirstGlance in a snapshot. Optionally, you may show other views in other snapshots.
Section 5: Hydrophobic Core
Are there hydrophobic cores in your molecule? For soluble proteins, expect a hydrophobic core in each domain. Support your conclusion with a snapshot. Be sure to use the Hydrophobic/Polar view from FirstGlance and turn on the Slab button.
Section 6: Charge
Are there any areas on the surface of your molecule with only positive (or negative) charges? Show snapshots illustrating your conclusions. Be sure to use the Charge view from FirstGlance in your snapshots.
Section 7: Cation-Pi Interactions
Use the CaPTURE program to identify energetically significant cation-pi interactions within your protein.
Show a snapshot of an energetically significant cation-pi interaction. Include a distance monitor in your snapshot. Also paste in the report from CaPTURE confirming its energetic significance. The cation-pi interaction tool, and instructions for measuring distances, are in the Tools Tab.
Section 8: Biological Unit
In FirstGlance, in the molecule information tab click Biological Unit. (It is also in the Resources Tab.)
How many total polymer chains (protein + DNA + RNA) are in the asymmetric unit? The Biological unit?
Show side-by-side two snapshots comparing the asymmetric unit with the biological unit. The Cartoon representation in FirstGlance is best for these snapshots. Make sure to label which is which.
Section 9: Contacts/Non-covalent Bonds
Example: 4d7b.
- Click Contacts in the Tools Tab in FirstGlance.
- Change target selection to Residues/Groups.
- Click on something small to select it as a "target", such as a ligand, or a single amino acid. Choose an amino acid with an uncharged polar side chain, such as Ser, Thr, Asn, Gln, Tyr, His.
- Click the link to Show atoms contacting target.
- Click Center contacts.
- Uncheck Backbones.
- Click the 4th thumbnail image to display the contacts as balls and sticks colored by element. The element color key is at the bottom of the Contacts help panel in FirstGlance.
- Zoom in (and click Return to Contacts if necessary).
- Uncheck all categories of non-covalent bonds.
- Check hydrogen-bonded non-water. (Review hydrogen bonds.)
- Double click the hydrogen bond donor and acceptor atoms to insert a distance monitor.
Describe the moiety you selected as a target. Include a snapshot showing exactly one hydrogen bond. Be sure to identify the two entities (amino acids, nucleotides, ligand) by name, chain, and sequence number. I need enough detail to be able to reproduce what you are reporting.
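The distance monitor reports the straight-line distance between the two atoms you double-click. As a rough illustration of what that number means, the Python sketch below computes the same quantity from x, y, z coordinates in Å. The coordinates are made up, the function names are hypothetical, and the 3.5 Å donor-to-acceptor cutoff is an assumed rule of thumb, not the criterion FirstGlance itself applies.

```python
import math

# Assumed rule of thumb for donor-to-acceptor distance, in angstroms.
HBOND_CUTOFF = 3.5


def distance(atom_a, atom_b):
    """Euclidean distance between two (x, y, z) positions, in angstroms."""
    return math.dist(atom_a, atom_b)


def plausible_hbond(donor, acceptor, cutoff=HBOND_CUTOFF):
    """Return True if the two atoms are close enough to be hydrogen bonded."""
    return distance(donor, acceptor) <= cutoff


if __name__ == "__main__":
    donor = (12.0, 4.0, 7.0)      # made-up coordinates
    acceptor = (14.0, 5.0, 9.0)   # made-up coordinates
    print(round(distance(donor, acceptor), 2), plausible_hbond(donor, acceptor))
```

Distance alone does not prove a hydrogen bond, since geometry and the chemical identity of the atoms also matter, which is why the report asks you to name both entities.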
Section 10: How Structure Supports Function
Write a brief description in your own words (avoid plagiarism!) of how the structure of this protein supports its function. Doing some online research will strengthen your description.
Include links to supporting references. Wikipedia can be cited, but authoritative sources, such as peer-reviewed scientific journal articles or government websites, will have more weight.
Your description should be at least 75 words. More work, if well done, will earn more credit.
Section 10 in the Sample Report is longer than the minimum required, but illustrates the sort of thing that could be done if you can spend the time.
VII. See Also
- User:Eric Martz/Introduction to Structural Bioinformatics, a list of courses and workshops at various levels, including earlier versions of this segment.
VIII. Notes and References
- For a brief overview of Anfinsen's protein folding experiments in the 1960s, see the first paragraph at Intrinsically Disordered Protein.