AlphaFold2 examples from CASP 14

From Proteopedia

(Difference between revisions)
Line 15: Line 15:
<scene name='87/875686/Chain_a_of_7jx6/1'>Here is one chain of ORF8</scene> from the higher resolution X-ray structure, [[7jx6]]. These chains form [http://firstglance.jmol.org/fg.htm?mol=7jx6 disulfide-linked dimers], and the dimers form higher order multimers<ref name="multimers">PMID: 33361333</ref> (not shown). Notice that the <span class="text-blue"><b>amino</b></span> and <span class="text-red"><b>carboxy</b></span> '''ends of the chain come together''' to form two parallel beta strands of a beta sheet. Also notice that there are '''3 disulfide bonds'''. An accurate prediction would include both of these features.
<scene name='87/875686/Chain_a_of_7jx6/1'>Here is one chain of ORF8</scene> from the higher resolution X-ray structure, [[7jx6]]. These chains form [http://firstglance.jmol.org/fg.htm?mol=7jx6 disulfide-linked dimers], and the dimers form higher order multimers<ref name="multimers">PMID: 33361333</ref> (not shown). Notice that the <span class="text-blue"><b>amino</b></span> and <span class="text-red"><b>carboxy</b></span> '''ends of the chain come together''' to form two parallel beta strands of a beta sheet. Also notice that there are '''3 disulfide bonds'''. An accurate prediction would include both of these features.
-
<scene name='87/875686/Morf_lin_7jx6_imf_7jtl/3'>The two X-ray structures agree very well</scene><ref name="imf">Alignment by Swiss-PdbViewer's ''iterative magic fit''. This starts with a sequence alignment-guided structural alignment, and then selects subsets of the structures to minimize the RMSD. Eight intermediate structures were generated by the [[Morphs#Linear_Morph_Server|Theis Morph Server]] by linear interpolation.</ref>. The only substantial disagreement is for a large surface loop, sequence range 48-57. See the TABLE below for [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] values.
+
<scene name='87/875686/Morf_lin_7jx6_imf_7jtl/3'>The two X-ray structures agree very well</scene><ref name="imf">Alignment by Swiss-PdbViewer's ''iterative magic fit''. This starts with a sequence alignment-guided structural alignment, and then selects subsets of the structures to minimize the RMSD. Eight intermediate structures were generated by the [[Morphs#Linear_Morph_Server|Theis Morph Server]] by linear interpolation.</ref>. The only substantial disagreement is for a large surface loop, sequence range 48-57. See Table I below for [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] values.
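As a rough illustration of what the RMSD values mean, the following minimal Python sketch computes the RMSD between two sets of alpha carbon coordinates that are assumed to be already superposed and paired one to one. It does not reproduce Swiss-PdbViewer's iterative magic fit, which also chooses the superposition and the subset of atoms. The function name <code>ca_rmsd</code> and the coordinates are hypothetical, invented only for this example.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD (in the input units, e.g. Å) between two paired coordinate lists.

    Assumes the structures are already superposed and that coords_a[i]
    and coords_b[i] refer to equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented coordinates: every atom of the model is displaced by 1 Å in z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(ca_rmsd(reference, model))  # 1.0
```

Because RMSD squares each deviation, a single badly placed loop can dominate the value, which is one reason alignment tools report the number of atoms aligned alongside the RMSD.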
===ORF8 is not a novel fold===
===ORF8 is not a novel fold===
-
Less than 2% of new [[empirically-determined structures]] have novel folds; that is, folds not already represented in the [[PDB]]<ref name="cath2011">PMID: 21097779</ref>.
+
Less than 2% of new [[empirically-determined structures]] have novel folds; that is, folds not already represented in the [[PDB]]<ref name="cath2011">PMID: 21097779</ref>. When chain A of [[7jx6]] was submitted to Dali<ref name="dali2020">PMID: 31606894</ref> (February 2021), the top hit was one of the two domains in [[5a2f]], the CD166 human cell surface receptor involved in activation of T lymphocytes. The Z-score was 7.1, and 88 alpha carbons aligned with RMSD 3.2 Å. See Table I below for further analysis.
===AlphaFold2 Prediction for ORF8===
===AlphaFold2 Prediction for ORF8===
-
The quality of a prediction in CASP is judged, in large part, by the [[Theoretical_models#CASP_14_Global_Distance_Test_Results|Global Distance Test Total Score, GDT_TS]]. AlphaFold2's predicted structure<ref>Download AlphaFold2's predicted structure for ORF8 from [https://predictioncenter.org/casp14/MODELS_PDB/T1064-D1/T1064TS427_1-D1.pdb T1064TS427_1-D1.pdb].</ref> has a '''GDT_TS score of 87'''. (A score of 0 is meaningless, and a score of 100 means perfect agreement with an X-ray crystal structure.) 87 means <scene name='87/875686/Af2_vs_7jx6_chain_a/1'>the model is close to the accuracy of an X-ray crystal structure</scene><ref name="imf" />. The structure predicted by AlphaFold2 is '''almost as close to the X-ray crystallographic model''' [[7jx6]] as is the independently-determined X-ray structure [[7jtl]]. AlphaFold2 predicted the positions of 92 amino acids. (CASP 14 excluded residues 48-59, a 12-residue surface loop, from the target residues<ref name="casp14domains" />.) See TABLE below for [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] values. The only intra-chain '''salt bridge''' (4.0 Å cutoff) that occurred in all 4 chains of the two X-ray structures was Arg101:Asp113. It also occurred in the AlphaFold2 prediction (Arg86:Asp98).
+
The quality of a prediction in CASP is judged, in large part, by the [[Theoretical_models#CASP_14_Global_Distance_Test_Results|Global Distance Test Total Score, GDT_TS]]. AlphaFold2's predicted structure<ref>Download AlphaFold2's predicted structure for ORF8 from [https://predictioncenter.org/casp14/MODELS_PDB/T1064-D1/T1064TS427_1-D1.pdb T1064TS427_1-D1.pdb].</ref> has a '''GDT_TS score of 87'''. (A score of 0 is meaningless, and a score of 100 means perfect agreement with an X-ray crystal structure.) 87 means <scene name='87/875686/Af2_vs_7jx6_chain_a/1'>the model is close to the accuracy of an X-ray crystal structure</scene><ref name="imf" />. The structure predicted by AlphaFold2 is '''almost as close to the X-ray crystallographic model''' [[7jx6]] as is the independently-determined X-ray structure [[7jtl]]. AlphaFold2 predicted the positions of 92 amino acids. (CASP 14 excluded residues 48-59, a 12-residue surface loop, from the target residues<ref name="casp14domains" />.) See Table I below for [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] values. The only intra-chain '''salt bridge''' (4.0 Å cutoff) that occurred in all 4 chains of the two X-ray structures was Arg101:Asp113. It also occurred in the AlphaFold2 prediction (Arg86:Asp98).
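To make the GDT_TS scale concrete, here is a minimal Python sketch of the usual definition: the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of alpha carbons in the model that lie within that cutoff of their positions in the reference structure. This is a simplification: the official calculation searches for the superposition that maximizes the count at each cutoff, whereas this sketch assumes a single superposition has already been applied. The function name <code>gdt_ts</code> and the coordinates are hypothetical.

```python
import math


def gdt_ts(reference, model, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0 to 100) for paired, already superposed C-alpha lists.

    Assumption: one fixed superposition is used for all cutoffs, unlike
    the official procedure, which optimizes the superposition per cutoff.
    """
    if len(reference) != len(model) or not reference:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(a, b) for a, b in zip(reference, model)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Invented coordinates: the four atoms deviate by 0.5, 1.5, 3.0 and 10.0 Å.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.5), (3.8, 0.0, 1.5), (7.6, 0.0, 3.0), (11.4, 0.0, 10.0)]
print(gdt_ts(reference, model))  # 56.25
```

Unlike RMSD, this score saturates: an atom that is 10 Å off counts the same as one that is 50 Å off, so a few badly placed residues lower the score without overwhelming it.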
{| style="text-align:center;" class="wikitable"
{| style="text-align:center;" class="wikitable"
-
|+ ORF8 Alignments With Chain A of [[7jx6]]
+
|+ Table I. ORF8 Predictions Aligned With Chain A of [[7jx6]]
|-
|-
! Model || GDT_TS || Disulfide<br>Bonds || C&alpha; [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD], Å || C&alpha; Aligned || [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] Including<br>Sidechains, Å || Atoms Aligned
! Model || GDT_TS || Disulfide<br>Bonds || C&alpha; [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD], Å || C&alpha; Aligned || [https://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions RMSD] Including<br>Sidechains, Å || Atoms Aligned
Line 49: Line 49:
===Second Best Prediction for ORF8===
===Second Best Prediction for ORF8===
-
In CASP 14, 70 research groups and 42 automated servers predicted structures for ORF8. The median GDT_TS score for all 112 predictions was 26. AlphaFold2 made the best prediction (GDT_TS 87). <scene name='87/875686/Second_best_orf8_imf/1'>The second best prediction was by the group of Xian Ming Pan</scene>, with GDT_TS 43 (see TABLE above). The fold and topology were predicted correctly, but the '''details are far less accurate''' than those in AlphaFold2's prediction. The 2nd best prediction has '''no disulfide bonds'''. The '''salt bridge''' Arg86:Asp98 is correctly predicted, along with two incorrectly predicted salt bridges.
+
In CASP 14, 70 research groups and 42 automated servers predicted structures for ORF8. The median GDT_TS score for all 112 predictions was 26. AlphaFold2 made the best prediction (GDT_TS 87). <scene name='87/875686/Second_best_orf8_imf/1'>The second best prediction was by the group of Xian Ming Pan</scene>, with GDT_TS 43 (see Table I above). The fold and topology were predicted correctly, but the '''details are far less accurate''' than those in AlphaFold2's prediction. The 2nd best prediction has '''no disulfide bonds'''. The '''salt bridge''' Arg86:Asp98 is correctly predicted, along with two incorrectly predicted salt bridges.
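The salt bridge comparisons above depend on a distance cutoff. The following minimal Python sketch shows one common way to apply such a cutoff: a pair is reported when any side-chain nitrogen of Arg or Lys lies within 4.0 Å of any side-chain oxygen of Asp or Glu. Which atoms were actually used for the analysis on this page is not stated, so the atom sets here are an assumption, and the function name <code>salt_bridges</code> and the coordinates are invented for illustration.

```python
import math

# Assumed charged side-chain atoms (PDB atom names); histidine is ignored.
CATION_ATOMS = {"ARG": {"NE", "NH1", "NH2"}, "LYS": {"NZ"}}
ANION_ATOMS = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}


def salt_bridges(atoms, cutoff=4.0):
    """Return sorted (cation, anion) residue pairs within the cutoff.

    atoms is a list of (residue_name, residue_number, atom_name, (x, y, z)).
    """
    cations = [a for a in atoms if a[2] in CATION_ATOMS.get(a[0], ())]
    anions = [a for a in atoms if a[2] in ANION_ATOMS.get(a[0], ())]
    pairs = set()
    for cat in cations:
        for ani in anions:
            if math.dist(cat[3], ani[3]) <= cutoff:
                pairs.add((f"{cat[0].title()}{cat[1]}", f"{ani[0].title()}{ani[1]}"))
    return sorted(pairs)


# Invented coordinates, not taken from any structure file.
atoms = [
    ("ARG", 86, "NH1", (0.0, 0.0, 0.0)),
    ("ASP", 98, "OD1", (3.0, 0.0, 0.0)),   # 3.0 Å from the Arg nitrogen
    ("GLU", 50, "OE1", (10.0, 0.0, 0.0)),  # too far from the Arg nitrogen
]
print(salt_bridges(atoms))  # [('Arg86', 'Asp98')]
```

Running the same function on a prediction and on an X-ray chain, after accounting for any difference in residue numbering, gives the sets of pairs whose overlap separates correctly predicted salt bridges from incorrect ones.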
===Third Best Prediction for ORF8===
===Third Best Prediction for ORF8===
-
The third best prediction for ORF8 was by the Perez Lab, with GDT_TS 33 (see TABLE above). It '''correctly predicted the parallel beta strands formed by the amino and carboxy terminal ends of the chain'''. <scene name='87/875686/3rd_best_orf8/1'>When the 2-stranded parallel beta strands formed by the ends of the chains are aligned, the remainder aligns poorly</scene>. This prediction has '''no disulfide bonds'''. The '''salt bridge''' Arg86:Asp98 is correctly predicted, along with two incorrectly predicted salt bridges.
+
The third best prediction for ORF8 was by the Perez Lab, with GDT_TS 33 (see Table I above). It '''correctly predicted the parallel beta strands formed by the amino and carboxy terminal ends of the chain'''. <scene name='87/875686/3rd_best_orf8/1'>When the 2-stranded parallel beta strands formed by the ends of the chains are aligned, the remainder aligns poorly</scene>. This prediction has '''no disulfide bonds'''. The '''salt bridge''' Arg86:Asp98 is correctly predicted, along with two incorrectly predicted salt bridges.
===Top Prediction by an Automated Server===
===Top Prediction by an Automated Server===

Revision as of 21:34, 28 February 2021

This page is under construction. Eric Martz 01:03, 22 February 2021 (UTC)

Prediction of protein structures from amino acid sequences (theoretical modeling) has been extremely challenging. In 2020, breakthrough success was achieved by AlphaFold2[1], a project of DeepMind. For an overview of this breakthrough, documented by the biennial prediction competition CASP, please see 2020: CASP 14. Some examples of predictions from that competition are illustrated below.


References

  1. Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K, Hassabis D. Improved protein structure prediction using potentials from deep learning. Nature. 2020 Jan;577(7792):706-710. doi: 10.1038/s41586-019-1923-7. Epub 2020 Jan 15. PMID:31942072 doi:http://dx.doi.org/10.1038/s41586-019-1923-7
  2. CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics, a blog post by Carlos Outeiral Rubiera, December 3, 2020.
  3. Flower TG, Buffalo CZ, Hooy RM, Allaire M, Ren X, Hurley JH. Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein. Proc Natl Acad Sci U S A. 2021 Jan 12;118(2). pii: 2021785118. doi: 10.1073/pnas.2021785118. PMID:33361333 doi:http://dx.doi.org/10.1073/pnas.2021785118
  4. Summary and Classifications of Domains for CASP 14.
  5. Flower TG, Buffalo CZ, Hooy RM, Allaire M, Ren X, Hurley JH. Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein. Proc Natl Acad Sci U S A. 2021 Jan 12;118(2). pii: 2021785118. doi: 10.1073/pnas.2021785118. PMID:33361333 doi:http://dx.doi.org/10.1073/pnas.2021785118
  6. Alignment by Swiss-PdbViewer's iterative magic fit. This starts with a sequence alignment-guided structural alignment, and then selects subsets of the structures to minimize the RMSD. Eight intermediate structures were generated by the Theis Morph Server by linear interpolation.
  7. Cuff AL, Sillitoe I, Lewis T, Clegg AB, Rentzsch R, Furnham N, Pellegrini-Calace M, Jones D, Thornton J, Orengo CA. Extending CATH: increasing coverage of the protein structure universe and linking structure with function. Nucleic Acids Res. 2011 Jan;39(Database issue):D420-6. doi: 10.1093/nar/gkq1001. Epub 2010 Nov 19. PMID:21097779 doi:http://dx.doi.org/10.1093/nar/gkq1001
  8. Holm L. DALI and the persistence of protein shape. Protein Sci. 2020 Jan;29(1):128-140. doi: 10.1002/pro.3749. Epub 2019 Nov 5. PMID:31606894 doi:http://dx.doi.org/10.1002/pro.3749
  9. Download AlphaFold2's predicted structure for ORF8 from T1064TS427_1-D1.pdb.
  10. Alignment by Swiss-PdbViewer's magic fit. This is a sequence alignment-guided structural alignment. Eight intermediate structures were generated by the Theis Morph Server by linear interpolation.
  11. For all targets in CASP 14, the top two servers were QUARK and Zhang-server (which were not significantly different at a Z-score sum of 62.9), followed by Zhang-CEthreader (55.9) and BAKER-ROSETTASERVER (55.3).

Proteopedia Page Contributors and Editors

Eric Martz
