AlphaFold2 examples from CASP 14
From Proteopedia
(Difference between revisions)
| Line 8: | Line 8: | ||
Following the discussion by Rubiera<ref name="rubiera">[https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/ CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics], a blog post by Carlos Outeiral Rubiera, December 3, 2020.</ref>, | Following the discussion by Rubiera<ref name="rubiera">[https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/ CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics], a blog post by Carlos Outeiral Rubiera, December 3, 2020.</ref>, | ||
our first example will be [[SARS-CoV-2 protein ORF8]], a protein that contributes to virulence in COVID-19<ref name="7jtl">PMID: 33361333</ref>. CASP 14 classified ORF8 as a "free modeling" (FM) target<ref name="casp14domains">[https://predictioncenter.org/casp14/domains_summary.cgi Summary and Classifications of Domains for CASP 14].</ref>, meaning that there were no adequate empirical templates for [[homology modeling]]. This was easily confirmed. When the [https://www.uniprot.org/uniprot/P0DTC8 amino acid sequence of ORF8] is submitted to [https://swissmodel.expasy.org/ Swiss Model], it reports the best templates for homology modeling. When the two [[empirical models]] that were not available during CASP 14 are excluded ([[7jtl]] and [[7jx6]]), the best template offered, chain B of [[3afc]], covers only 36% of the length of ORF8 at 13.2% sequence identity, with a 4-residue untemplated gap in the sequence alignment. This template would be inadequate for constructing a useful model. | our first example will be [[SARS-CoV-2 protein ORF8]], a protein that contributes to virulence in COVID-19<ref name="7jtl">PMID: 33361333</ref>. CASP 14 classified ORF8 as a "free modeling" (FM) target<ref name="casp14domains">[https://predictioncenter.org/casp14/domains_summary.cgi Summary and Classifications of Domains for CASP 14].</ref>, meaning that there were no adequate empirical templates for [[homology modeling]]. This was easily confirmed. When the [https://www.uniprot.org/uniprot/P0DTC8 amino acid sequence of ORF8] is submitted to [https://swissmodel.expasy.org/ Swiss Model], it reports the best templates for homology modeling. When the two [[empirical models]] that were not available during CASP 14 are excluded ([[7jtl]] and [[7jx6]]), the best template offered, chain B of [[3afc]], covers only 36% of the length of ORF8 at 13.2% sequence identity, with a 4-residue untemplated gap in the sequence alignment. This template would be inadequate for constructing a useful model. 
| ||
| + | |||
| + | ===X-Ray Structures for ORF8=== | ||
| + | |||
| + | The quality of predictions for the structure of ORF8 is judged by comparison with X-ray crystallographic [[empirical models]] that were not available to the groups making predictions. Shortly after the CASP 14 competition (summer 2020), two X-ray crystal structures were reported for ORF8: [[7jtl]], released August 26, 2020, and [[7jx6]], released September 23, 2020. The resolutions are 2.0 and 1.6 Å respectively, and both have worse-than-average [[Rfree]] values. | ||
===AlphaFold2 Prediction for ORF8=== | ===AlphaFold2 Prediction for ORF8=== | ||
| Line 13: | Line 17: | ||
The quality of a prediction in CASP is judged, in large part, by the [[Theoretical_models#CASP_14_Global_Distance_Test_Results|Global Distance Test Total Score, GDT_TS]]. AlphaFold2's predicted structure has a '''GDT_TS score of 87'''. (A score of 0 means no meaningful agreement, and a score of 100 means perfect agreement with an X-ray crystal structure.) A score of 87 means the model is close to the accuracy of an X-ray crystal structure. | The quality of a prediction in CASP is judged, in large part, by the [[Theoretical_models#CASP_14_Global_Distance_Test_Results|Global Distance Test Total Score, GDT_TS]]. AlphaFold2's predicted structure has a '''GDT_TS score of 87'''. (A score of 0 means no meaningful agreement, and a score of 100 means perfect agreement with an X-ray crystal structure.) A score of 87 means the model is close to the accuracy of an X-ray crystal structure. | ||
| - | Shortly after the CASP 14 competition (summer 2020), two X-ray crystal structures were reported for ORF8: [[7jtl]] released August 26, 2020, and [[7jx6]], released September 23, 2020. The resolutions are 2.0 and 1.6 Å respectively, and both have worse than average [[Rfree]] values. | ||
</StructureSection> | </StructureSection> | ||
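The GDT_TS score discussed above can be illustrated with a minimal sketch. GDT_TS is commonly described as the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of alpha carbons in the prediction that lie within that cutoff of their positions in the empirical model. The function name <code>gdt_ts</code> and the toy coordinates are hypothetical, and the sketch assumes the two structures have already been superposed and have one alpha carbon per residue in matching order. The evaluation used in CASP searches over many superpositions to maximize each percentage, so this simplified version is not the official calculation; it can only match or underestimate the official score.

```python
import math


def gdt_ts(model_ca, reference_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two pre-superposed lists of C-alpha (x, y, z) coordinates.

    Assumption: both lists have the same length and residue order, and the
    superposition has already been done. No superposition search is performed.
    """
    if len(model_ca) != len(reference_ca) or not reference_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Distance between each predicted C-alpha and its empirical counterpart.
    distances = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]

    # Fraction of residues within each cutoff distance.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions and express the result on a 0 to 100 scale.
    return 100.0 * sum(fractions) / len(fractions)


# Toy example (hypothetical coordinates, not taken from ORF8).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
shifted = [(x, y + 3.0, z) for x, y, z in reference]  # every atom off by 3 Å

print(gdt_ts(reference, reference))
print(gdt_ts(shifted, reference))
```

In the toy example, every atom of the shifted model is 3 Å from its reference position, so it passes only the 4 Å and 8 Å cutoffs. This shows why a high score such as 87 requires most residues to fall within the tighter 1 Å and 2 Å cutoffs.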
Revision as of 23:35, 22 February 2021
This page is under construction. Eric Martz 01:03, 22 February 2021 (UTC)
Prediction of protein structures from amino acid sequences (theoretical modeling) has long been extremely challenging. In 2020, breakthrough success was achieved by AlphaFold2[1], a project of DeepMind. For an overview of this breakthrough, verified by the biennial prediction competition CASP, please see 2020: CASP 14. Some examples of predictions from that competition are illustrated below.
References
- Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Zidek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K, Hassabis D. Improved protein structure prediction using potentials from deep learning. Nature. 2020 Jan;577(7792):706-710. Epub 2020 Jan 15. PMID:31942072 doi:http://dx.doi.org/10.1038/s41586-019-1923-7
- CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics, a blog post by Carlos Outeiral Rubiera, December 3, 2020.
- Flower TG, Buffalo CZ, Hooy RM, Allaire M, Ren X, Hurley JH. Structure of SARS-CoV-2 ORF8, a rapidly evolving immune evasion protein. Proc Natl Acad Sci U S A. 2021 Jan 12;118(2). pii: 2021785118. PMID:33361333 doi:http://dx.doi.org/10.1073/pnas.2021785118
- Summary and Classifications of Domains for CASP 14.
