Getting Unremediated PDB Files

From Proteopedia

Periodically, the Protein Data Bank remediates PDB files in the worldwide archive that it maintains. Two rounds of remediation took place between 2007 and 2009, in order to better standardize and enhance the PDB files deposited before December 2, 2008. The details of the two rounds can be found at the Worldwide Protein Data Bank's Documentation page. The most recent version was released March 17th, 2009 as described on the news page at the Protein Data Bank.

The unremediated PDB archive from before March 17, 2009 is available, as detailed here: a time-stamped snapshot of the PDB archive taken before the March 17th release exists in the directory 20090316.

The unremediated PDB archive from before August 1, 2007 is available, as detailed here.

If a PDB file was released after December 2, 2008, it is not available in unremediated form.

Proteopedia avoids remediation-related problems

Following the experience of the 2009 remediation, Proteopedia automatically saves the version of the PDB file for which each molecular scene is developed, along with the Jmol script for the scene. When a new scene is developed with the Scene Authoring Tools, the current version of the PDB file is used and saved. Thus, subsequent remediations cannot inadvertently corrupt scenes developed on earlier versions of the PDB file. (The 2009 remediation changed the order of atoms in some PDB files, which broke a few scenes until Proteopedia was modified to save the PDB file along with the scene script. These scenes were repaired by obtaining the unremediated PDB files and using them instead.)

Why would you need an unremediated version of a PDB file?

A significant change made in the first round (2007) of remediation was the distinction between ribonucleotides (A, C, G, I, T, U) and deoxyribonucleotides (DA, DC, DG, DI, DT, DU). The main reason for getting unremediated PDB files from before the 2007 remediation is that when the remediated PDB files contain DNA, Chime-based Protein Explorer (and perhaps some other software) does not display the DNA properly. If the PDB file does not contain DNA (protein, RNA, solvent and ligands are OK), you probably don't need the unremediated file. If a PDB file was released after August 1, 2007, it is not available in an unremediated form that suits Chime-based (and perhaps other) software. The second round of remediation (2009; released March 17th, 2009) also mainly affected nucleic acid residues and atoms.
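As a rough way to tell whether a given file carries the remediated DNA names, the check below scans ATOM records for the deoxyribonucleotide names listed above. It is only a sketch: the function name has_remediated_dna is hypothetical, and it assumes the standard fixed-column PDB layout in which the residue name occupies columns 18-20.

```python
# Deoxyribonucleotide residue names introduced by the 2007 remediation.
REMEDIATED_DNA_NAMES = {"DA", "DC", "DG", "DI", "DT", "DU"}


def has_remediated_dna(pdb_lines):
    """Return True if any ATOM record uses a D-prefixed DNA residue name.

    Hypothetical helper; assumes fixed-column PDB records.
    """
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Residue name sits in columns 18-20 (1-based), i.e. slice [17:20].
            if line[17:20].strip() in REMEDIATED_DNA_NAMES:
                return True
    return False
```

It could be called on an open file, for example has_remediated_dna(open("pdb1d66.ent")). A result of True suggests the file has been remediated and contains DNA, which is the case where Chime-based software may need the unremediated version.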

How to get the unremediated version?

Use the simple interface here at Eric Martz's UMASS site to get the July 31, 2007 unremediated PDB files, which are served via ftp from the directory 20070731 at the RCSB Protein Data Bank. This is primarily for obtaining DNA as it was before the residues were renamed DC, DG, DT, DA, for use with Protein Explorer/Chime.

The March 16, 2009 unremediated versions are available via ftp at the RCSB Protein Data Bank in the directory 20090316.

These files come back in a compressed form. The 2007 files are .Z compressed, while the 2009 files are .gz compressed.

Thus, to get the two forms of 1d66:

  • For prior to 2007 remediation: ftp://snapshots.wwpdb.org/20070731/pub/pdb/data/structures/all/pdb/pdb1d66.ent.Z
  • For prior to 2009 remediation: ftp://snapshots.wwpdb.org/20090316/pub/pdb/data/structures/all/pdb/pdb1d66.ent.gz
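The two addresses above follow the same pattern, so the address for another entry can be built from its PDB ID and the snapshot directory. The sketch below only assembles the string; the names snapshot_url and SNAPSHOT_EXTENSIONS are hypothetical, and it assumes every entry in a snapshot follows the same path layout as the 1d66 examples.

```python
SNAPSHOT_HOST = "ftp://snapshots.wwpdb.org"

# Compression suffix used in each snapshot directory, as described above.
SNAPSHOT_EXTENSIONS = {
    "20070731": ".Z",   # before the 2007 remediation
    "20090316": ".gz",  # before the 2009 remediation
}


def snapshot_url(pdb_id, snapshot):
    """Build the address of an unremediated PDB file in a given snapshot.

    Hypothetical helper; assumes the path layout of the 1d66 examples.
    """
    extension = SNAPSHOT_EXTENSIONS[snapshot]
    filename = "pdb" + pdb_id.lower() + ".ent" + extension
    return (SNAPSHOT_HOST + "/" + snapshot
            + "/pub/pdb/data/structures/all/pdb/" + filename)


print(snapshot_url("1d66", "20070731"))
print(snapshot_url("1d66", "20090316"))
```

For 1d66 this prints the two addresses listed above. A snapshot directory other than the two named in this article raises a KeyError, since its compression format is not stated here.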

To uncompress the downloaded files:

  • For uncompressing the 2007 files that are in .Z format:
    • Both .Z and .gz files can be uncompressed in Windows using 7-Zip, which is freeware and open source.
    • Eric Martz's site lists WinZip as an alternative for PC, but this program is not free once the trial period expires.
    • Stuffit Expander failed to uncompress such a file on a PC, although Eric lists it as useful on Macs.
    • One contributor uncompressed the files by uploading them to a Web hosting server and running 'zcat -d [FILE NAME]' there, with logging enabled in the Secure Shell client so that the uncompressed output was saved to a file on the local drive.
    • Others have mentioned using a DOS version of uncompress on a PC, called uncomp.exe.
  • For uncompressing the 2009 files that are in .gz format:
    • Stuffit Expander works to uncompress such a file on a PC and most likely on Macs.
    • Both .Z and .gz files can be uncompressed in Windows using 7-Zip, which is freeware and open source.
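Where Python is available, the uncompression step can also be scripted. The sketch below is one possible approach, not a method described by the contributors: the function name uncompress_pdb is hypothetical, the .gz case uses Python's standard gzip module, and the .Z case assumes that a gzip command-line program is installed and on the PATH, since the Python standard library does not read .Z files itself.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def uncompress_pdb(compressed_file):
    """Uncompress a .gz or .Z PDB file next to the original.

    Hypothetical helper. Returns the path of the uncompressed file,
    e.g. pdb1d66.ent.gz -> pdb1d66.ent.
    """
    source = Path(compressed_file)
    target = source.with_suffix("")  # drop the final .gz or .Z
    if source.suffix == ".gz":
        # Standard-library route for the 2009 snapshot files.
        with gzip.open(source, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    elif source.suffix == ".Z":
        # Assumption: an external gzip program is available; it can
        # decode the older .Z (compress) format used by the 2007 files.
        with open(target, "wb") as fout:
            subprocess.run(["gzip", "-dc", str(source)],
                           stdout=fout, check=True)
    else:
        raise ValueError("expected a .gz or .Z file: " + str(source))
    return target
```

For example, uncompress_pdb("pdb1d66.ent.gz") would write pdb1d66.ent in the same directory. On a system without a gzip program, the .Z branch fails and one of the tools listed above is needed instead.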

Please note that .gz files can be displayed in Proteopedia without being uncompressed, since Jmol can read gzipped files directly.

Proteopedia Page Contributors and Editors

Wayne Decatur, Eric Martz, Angel Herraez
