Homology modeling servers

From Proteopedia

CAUTION: The server issues reported in this article have not been updated since 2011.

There are a number of free servers that create homology models (also called comparative models) for a submitted amino acid sequence, or that offer libraries of 3D models created in advance for protein sequences. The performance of homology modeling methods is evaluated in an international, biennial competition called CASP. A comparison of 10 servers is included in the 2009 description of Phyre by Kelley and Sternberg[1], which also offers guidance on how to use these servers effectively.

Servers

Terminology: The sequence with unknown 3D structure is usually called the target. It is modeled on the template. Because those two terms are similar, sometimes leading to confusion, we shall call the target the query.

The list below is incomplete. It may not include some of the best servers, and it does not include assessments of server performance. Please help by adding servers.

  • SWISS-MODEL provides a free, fully automated homology modeling service. Using the Automated Mode, you submit a protein sequence. When the PDB contains an empirically determined structure with sufficient sequence identity to your query sequence, that structure is used as the template, and the homology model is constructed automatically.
  • I-TASSER, formerly known as 'the Zhang lab-Server', employs comparative protein modeling based on protein threading and has won the last few CASP events.
  • YASARA (Yet Another Scientific Artificial Reality Application) features a complete homology modeling module that automatically takes all the steps from an amino acid sequence to a refined high-resolution model, using a CASP-approved protocol.

MetaServers

MetaServers are servers that submit your modeling job to other servers.

Handling of gaps

There are three kinds of gaps that present challenges when creating a homology model. It is important to know how a given server handles these challenges. Behavior marked Caution seems likely to produce errors in the homology model.

For each server, behavior is described for the three kinds of gaps. The first two are gaps in the sequence alignment[3]; the third concerns template residues lacking 3D coordinates.

Swiss-Model (Automated Mode)

  • Untemplated query residues (gap in template sequence): Untemplated query residues (aligned with a gap in the template sequence) are present in the 3D model, and are indicated with a high temperature value. This appears to be true regardless of the length of the untemplated region. Long untemplated regions may occur in the 3D model as a long hairpin loop extending away from a compact domain, making their lack of template fairly obvious.
  • Gap in query sequence: The 3D model takes a shortcut, skipping the residues in the template aligned with the gap. This causes the 3D template to bulge away from the 3D query model in this region, and permits registration to be maintained.
  • Template residues lacking 3D coordinates: Caution: These residues are omitted from the sequence alignment, yet the omission is not indicated there by a gap. The 3D model lacks a spatial gap between the residues at the gap boundaries, effectively ligating** them. This causes a shift in query-template registration, and produces a 3D model that fails to make apparent the absence of some residues in the template (unless the structural alignment is examined as the downloadable Project in DeepView). The absence of some residues will affect analyses of the 3D model, such as charge distribution and distribution of evolutionary conservation.

Phyre2 (Normal Modeling Mode)

  • Untemplated query residues (gap in template sequence), short regions (e.g. 1-5 untemplated residues): The untemplated query residues are present in the 3D model, and registration is maintained according to the sequence alignment by bunching up the untemplated residues, allowing them to bulge away from the template in the 3D model.
  • Untemplated query residues (gap in template sequence), long regions (e.g. 87 residues): Caution: The untemplated query residues are omitted from the 3D model, effectively ligating** the templated boundary residues together. The omission fails to reveal, in the 3D model, that a large untemplated region exists. The absence of some query residues will affect analyses of the 3D model, such as charge distribution and distribution of evolutionary conservation.
  • Caution: Phyre2's behavior is identical to that of Swiss-Model, above.

*Observation is based on a single model and needs confirmation with additional models.
**Covalent peptide bonds between amino acids are not explicit in PDB files, but all commonly used software places covalent bonds based on interatomic distances. Thus, when a spatial gap is omitted in the 3D model, the two residues abutting the gap are effectively ligated.
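
The distance-based bonding described in the second footnote suggests a simple check that can be run on a downloaded model. The following Python sketch is an illustration only, not part of any server's workflow. It reads fixed-column ATOM records, measures the distance from each residue's backbone C to the next residue's backbone N, and flags junctions that look bonded even though the residue numbers jump. The function names and the 1.5 Å cutoff are assumptions made for this example (a peptide C-N bond is about 1.33 Å), and the check presumes that the model keeps the query's sequence numbering.

```python
import math

# Assumed cutoff in angstroms; a peptide C-N bond is about 1.33 angstroms.
PEPTIDE_BOND_MAX = 1.5


def backbone_atoms(pdb_text, chain_id):
    """Return [((number, insertion_code), {'N': xyz, 'C': xyz}), ...] in file order."""
    residues = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or line[21] != chain_id:
            continue
        name = line[12:16].strip()
        if name not in ("N", "C"):
            continue
        res_id = (int(line[22:26]), line[26:27].strip())
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if not residues or residues[-1][0] != res_id:
            residues.append((res_id, {}))  # first backbone atom of a new residue
        residues[-1][1][name] = xyz
    return residues


def classify_junctions(residues, cutoff=PEPTIDE_BOND_MAX):
    """Label each junction between residues that are adjacent in the file."""
    results = []
    for (id_a, atoms_a), (id_b, atoms_b) in zip(residues, residues[1:]):
        if "C" not in atoms_a or "N" not in atoms_b:
            continue  # cannot measure this junction
        distance = math.dist(atoms_a["C"], atoms_b["N"])
        bonded = distance <= cutoff
        # Same number (insertion code) or the next number counts as consecutive.
        consecutive = (id_b[0] - id_a[0]) in (0, 1)
        if not bonded:
            label = "spatial gap"
        elif consecutive:
            label = "bonded"
        else:
            label = "ligated across numbering gap"
        results.append((id_a, id_b, distance, label))
    return results
```

A junction labeled "ligated across numbering gap" is only a hint that residues may have been omitted without a spatial gap. A jump in numbering can also be a legitimate feature of the numbering scheme, so the sequence alignment should be consulted before drawing conclusions.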

Problems

Sequence Numbering Anomalies

It is common for the sequences of proteins in PDB structures to begin with a number other than 1 (2fsr:A, 1ucy:E, 1nsa), and to include a residue numbered zero (1avq:A, 1bxw:A). Discontinuities in sequential numbering may occur (1igt:B, 2fsr:A, 1nsa, 1iao:B). Residues in the same chain may have the same sequence number, notably in the case of "insertions" relative to a reference sequence (1igt:B, 1ucy). These inserted residues may all have the same number, but are distinguished by insertion codes, typically letters in alphabetical order. However, in rare cases, the letters may not be in alphabetical order, e.g. chain J in 1ucy. An overview of sequence numbering anomalies in the PDB, including further examples, is at Unusual sequence numbering.
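
The anomalies listed above can be detected mechanically before a PDB file is used as a template. The following Python sketch, offered as an illustration under stated assumptions, collects residue numbers and insertion codes from fixed-column ATOM records and reports a first residue not numbered 1, a residue numbered zero, discontinuities, repeated numbers, and insertion codes that are not in alphabetical order. The function names are hypothetical. HETATM records and multi-model files are ignored for simplicity, so modified residues stored as HETATM may show up as apparent discontinuities.

```python
def parse_residue_ids(pdb_text):
    """Collect (number, insertion_code) per chain, in file order, from ATOM records."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]
        res_id = (int(line[22:26]), line[26:27].strip())
        ids = chains.setdefault(chain_id, [])
        if not ids or ids[-1] != res_id:  # first atom of a new residue
            ids.append(res_id)
    return chains


def numbering_anomalies(res_ids):
    """Describe numbering features that break a naive 1..N assumption."""
    notes = []
    if not res_ids:
        return notes
    first = res_ids[0][0]
    if first != 1:
        notes.append(f"first residue is numbered {first}, not 1")
    if any(number == 0 for number, _ in res_ids):
        notes.append("contains a residue numbered 0")
    for (num_a, code_a), (num_b, code_b) in zip(res_ids, res_ids[1:]):
        if num_b == num_a:
            notes.append(f"insertion: {num_b}{code_b} repeats number {num_a}")
            if code_a and code_b and code_b < code_a:
                notes.append(
                    f"insertion codes out of order: {num_a}{code_a} before {num_b}{code_b}"
                )
        elif num_b != num_a + 1:
            notes.append(f"discontinuity: {num_a}{code_a} is followed by {num_b}{code_b}")
    return notes
```

For example, calling numbering_anomalies on each list returned by parse_residue_ids gives a per-chain report that can be reviewed before the chain is trusted as a template.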

  • Phyre2 - Caution: In early April 2011, Phyre2 numbered the aligned portion of the template sequence incorrectly when the template PDB file contained the kinds of numbering anomalies described above. The development team has acknowledged the problem and is working on a fix.

Sequence Alignment

  • Swiss-Model fails to indicate which residues are identical, and which are similar, in its sequence alignment.
  • Phyre2 may number the aligned template sequence incorrectly (see Sequence Numbering Anomalies, above).
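
When a server's alignment does not mark identical and similar residues, the marking can be added locally. The following Python sketch is a minimal illustration, assuming the two aligned sequences have been copied out as equal-length strings with "-" for gaps. The function names are hypothetical, and the similarity groups are an assumption (the "strongly similar" sets used in Clustal-style alignments). A substitution matrix could be used instead. Percent identity is computed here over aligned, ungapped columns only, which is one of several conventions.

```python
# Assumed similarity groups; two residues count as similar if they share a group.
SIMILAR_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def match_line(query, template, gap="-"):
    """Return a marker string: '|' identical, ':' similar, '.' different, ' ' gap."""
    if len(query) != len(template):
        raise ValueError("aligned sequences must have the same length")
    marks = []
    for q, t in zip(query.upper(), template.upper()):
        if gap in (q, t):
            marks.append(" ")
        elif q == t:
            marks.append("|")
        elif any(q in group and t in group for group in SIMILAR_GROUPS):
            marks.append(":")
        else:
            marks.append(".")
    return "".join(marks)


def percent_identity(query, template, gap="-"):
    """Percent of ungapped aligned columns in which the residues are identical."""
    pairs = [
        (q, t)
        for q, t in zip(query.upper(), template.upper())
        if gap not in (q, t)
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(q == t for q, t in pairs) / len(pairs)
```

Printing the query, the marker string, and the template on three consecutive lines in a fixed-width font reproduces the familiar annotated alignment layout.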

See Also

References and Notes

  1. Kelley LA, Sternberg MJ. Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc. 2009;4(3):363-71. PMID:19247286 doi:10.1038/nprot.2009.2
  2. Arnold K, Kiefer F, Kopp J, Battey JN, Podvinec M, Westbrook JD, Berman HM, Bordoli L, Schwede T. The protein model portal. J Struct Funct Genomics. 2009 Mar;10(1):1-8. Epub 2008 Nov 27. PMID:19037750 doi:10.1007/s10969-008-9048-5
  3. Alignment between the query and template sequences. See Homology modeling.

Proteopedia Page Contributors and Editors

Eric Martz, Wayne Decatur
