Jmol/PDB file editing with Jmol

From Proteopedia

The Jmol.jar application can be used to edit the contents of PDB files. For example, you can change atom serial numbers, chain names, sequence numbers, and so forth. To make specific changes, write a command script file with a plain text editor, following the principles outlined below.

Identify yourself and changes made

What changes were made, by whom?

Before you make any edited PDB file public, such as by uploading it to Proteopedia, PLEASE insert REMARK lines that give your name and professional affiliation, and summarize what changes you made. Inserting REMARK lines can most easily be done as a last step, using a plain text editor.
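
If you would rather generate the REMARK lines with a script than type them by hand, this can be done outside Jmol. The following is a minimal Python sketch, not a Jmol feature. The function names make_remarks and insert_after_first_line are hypothetical, and the unnumbered "REMARK " prefix and the 80-column limit are assumptions about what downstream programs accept.

```python
import textwrap


def make_remarks(text, width=80):
    """Wrap free text into lines that each start with 'REMARK '.

    Assumption: an unnumbered 'REMARK ' prefix and a maximum line
    width of 80 columns are acceptable for the intended readers.
    """
    prefix = "REMARK "
    chunks = textwrap.wrap(text, width - len(prefix))
    return [prefix + chunk for chunk in chunks]


def insert_after_first_line(pdb_lines, new_lines):
    """Return a new list with new_lines placed after the first PDB line.

    The first line is left in place so that programs that identify the
    format from it still see the original record.
    """
    return pdb_lines[:1] + new_lines + pdb_lines[1:]
```

Calling make_remarks with your name, affiliation, and a summary of the changes, then passing the result to insert_after_first_line, gives a list of lines that can be joined with newlines and saved under a distinctive file name.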

Use a distinctive PDB file name

Do not give your PDB file a name that is easily confused with the version published at the PDB, such as 1D66.pdb. Use a name that makes it clear that it has been modified, such as 1D66_chains_renamed.pdb.

Put the PDB file in a variable

First, run the Jmol.jar Java application, and load any PDB file. If you have a PDB file saved to your disk (for example, downloaded from RCSB.Org), drag and drop it into the Jmol graphics window. Alternatively, to load 1d66 directly from the PDB:

load =1d66   # No space between = and the 4-character PDB code.
Note: Anything between '#' and the end of a line (or a semicolon) is a comment.

After you have the PDB file loaded and the molecule is displayed:

mypdb = getproperty("filecontents")

mypdb is a variable inside the Jmol scripting context. It now holds the entire PDB file as one long string (including newline characters), unmodified and including the header.

Edit line by line

It is usually easiest to loop through the PDB file line by line. So, let's define a new variable:

mypdblines = mypdb.lines

Now, mypdblines is an array with one PDB line per element. So you can loop line by line:

for (i=1; i<=mypdblines.length; i++)
{
     mypdblines[i] ...  # do something with this
}

Jmol has plentiful commands for finding lines, and editing them. For example, to operate only on lines beginning "ATOM ...",

if (mypdblines[i].find("^ATOM ", "")) ...

The second parameter "" signals that the first parameter should be interpreted as a regular expression, where "^" means "beginning of the line".

Most of Jmol's built in functions for operations on character strings are listed in this section of the Jmol documentation.

Since PDB format has fixed column positions, you can, for example, change the chain name, which is in column 22:

if ((mypdblines[i])[22][23] == "G") {(mypdblines[i])[22][23] = "D";}

(The atom property "chain" is not writable in Jmol, nor are "resno" or "seqcode". So you can't simply assign new values to these properties.)
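
The fixed-column idea can also be checked outside Jmol. The following is a minimal Python sketch of the same chain rename, offered as an illustration and not as the Jmol method. The function name rename_chain is hypothetical, and applying the edit to HETATM records as well as ATOM records is an assumption you may want to change. Python strings are 0-based, so column 22 is index 21.

```python
def rename_chain(line, old, new, records=("ATOM  ", "HETATM")):
    """Return line with the chain ID in column 22 changed from old to new.

    Lines are returned unchanged when they are not one of the listed
    record types, are too short to have a column 22, or carry a
    different chain ID.
    """
    if len(new) != 1:
        raise ValueError("a chain ID occupies exactly one column")
    if line.startswith(records) and len(line) >= 22 and line[21] == old:
        return line[:21] + new + line[22:]
    return line


def rename_chain_in_lines(pdb_lines, old, new):
    """Apply rename_chain to every line and return a new list."""
    return [rename_chain(line, old, new) for line in pdb_lines]
```

Because only one character is replaced, every other column keeps its position, which is what a fixed-column format requires.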

Write a PDB file containing the edited lines

When finished, you write the PDB file like this:

write var mypdblines "someFileName.pdb"

"Var" means you are writing the contents of a variable into a disk file.

It is important to note that another command you might think of using, write "someFileName.pdb", writes a file without the original header (and containing only the currently selected atoms). By using the variable mypdblines, you preserve the header and write all atoms.
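
After writing the file, it can be reassuring to confirm that nothing was lost and that only the intended columns changed. The following is a minimal Python sketch of such a check, run outside Jmol. The function name summarize_edit is hypothetical, and it assumes the edit did not add, remove, or reorder lines.

```python
def summarize_edit(original_lines, edited_lines):
    """Compare two versions of a PDB file line by line.

    Returns a tuple (same_line_count, changed_columns), where
    changed_columns is a sorted list of the 1-based column numbers at
    which any pair of corresponding lines differs.
    """
    changed = set()
    for old, new in zip(original_lines, edited_lines):
        width = max(len(old), len(new))
        # Pad the shorter line so a trailing difference is still reported.
        pairs = zip(old.ljust(width), new.ljust(width))
        for column, (a, b) in enumerate(pairs, start=1):
            if a != b:
                changed.add(column)
    return len(original_lines) == len(edited_lines), sorted(changed)
```

For a chain rename, the expected result is an unchanged line count and a single changed column, 22. Any other column in the list is worth inspecting before the file is made public.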

Saving key information in the header

Custom information can be inserted into the header section of mypdblines.

Jmol uses the first line of a PDB file to recognize PDB format, so it is important not to put your custom lines first. Not only Jmol, but PyMOL and Chimera and likely other popular molecular visualization apps (including FirstGlance) happily ignore lines in a PDB file that do not begin with a recognizable record name such as REMARK or ATOM. (FirstGlance recognizes lines beginning "!" as custom information from the ConSurf server.)

Method #1

For example, if Jmol has calculated things that you would like to have available (without re-calculating) in the output PDB file, you can insert lines between the first and second lines of mypdblines like this:

HEADER    TRANSCRIPTION/DNA                       06-MAR-92   1D66             
@ Custom information in lines beginning "@ ".
TITLE     DNA RECOGNITION BY GAL4: STRUCTURE OF A PROTEIN/DNA COMPLEX
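
The placement shown above can be sketched in a few lines of Python, as an illustration outside Jmol. The function name embed_custom_lines is hypothetical. It keeps the first line first and tags each custom line with the "@ " prefix used in this example.

```python
def embed_custom_lines(pdb_lines, custom_lines, prefix="@ "):
    """Return a new list with custom lines inserted after the first line.

    Each custom line is tagged with prefix so that it can be found
    again later and is not mistaken for a PDB record.
    """
    if not pdb_lines:
        raise ValueError("expected at least the first line of a PDB file")
    tagged = [prefix + text for text in custom_lines]
    return pdb_lines[:1] + tagged + pdb_lines[1:]
```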

It is even possible to embed Jmol scripts for later use, perhaps to define a function or to specify custom variable values (variables are not saved in PDB or PNGJ files). For example, this could be inserted into the header of mypdblines:

@ # Jmol script.
@ myvar = 12.6
@ function f1()
@ {
@   print _arguments
@ }
@ # End Jmol script.

After loading the saved PDB or PNGJ file with this in its header, you can drag and drop in a script file that (i) extracts the @ lines into a variable, (ii) removes the leading "@ " from each line, then (iii) executes the variable using "script inline @variable".
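
Steps (i) and (ii) are plain text processing, so they can be illustrated outside Jmol. The following is a minimal Python sketch with a hypothetical function name, extract_embedded_script. Step (iii), executing the recovered text, still has to happen inside Jmol.

```python
def extract_embedded_script(pdb_lines, prefix="@ "):
    """Collect the lines tagged with prefix and strip the prefix.

    Returns the recovered script as one string, with the original line
    order preserved and lines joined by newline characters.
    """
    body = [line[len(prefix):] for line in pdb_lines if line.startswith(prefix)]
    return "\n".join(body)
```

Lines that do not start with the prefix, such as HEADER, REMARK, and ATOM records, are skipped, so the function can be given the whole file.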

Method #2

If you prefer your files to be closer to PDB format standards (and so prevent potential problems if those files are read into other software), any extra custom lines should always start with the PDB keyword REMARK. In fact, Jmol is designed to read and apply any Jmol scripts embedded in the file, when a line starts with REMARK jmolscript: (as described in this page). You must put your whole script of commands into that single line, but several such lines in a file are also supported. Taking the example above, this would look like:

REMARK jmolscript: myvar = 12.6; function f1() { print _arguments }
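
Because Jmol applies such lines when the file is loaded, it can be useful to list them before opening a file you did not write. The following is a minimal Python sketch with a hypothetical function name, embedded_jmol_scripts. It assumes the marker is matched exactly as written here, at the start of the line.

```python
def embedded_jmol_scripts(pdb_lines, marker="REMARK jmolscript:"):
    """Return the script text of every line that starts with marker.

    One list element is returned per matching line, with the marker
    and the surrounding whitespace removed.
    """
    return [
        line[len(marker):].strip()
        for line in pdb_lines
        if line.startswith(marker)
    ]
```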

Writing a PNGJ file containing the edited lines

PNGJ files contain a PNG (Portable Network Graphics) static image of the scene Jmol was displaying when the PNGJ file was written, and also the complete information to reproduce the scene in Jmol. When you drag a PNGJ file and drop it into Jmol's graphics window, the scene appears in interactive form that can be rotated, zoomed, and further modified with Jmol commands.

Besides a PDB file, you can also save a PNGJ file with customized PDB lines and a customized header, though this is slightly trickier. Unlike a PDB file, a PNGJ file cannot be written from a variable. Here is one scripting method that works, using only 3 commands:

# Customize mypdblines as desired before this line.
zap # Deletes all atoms and defined atom sets. Preserves variables and functions.
load var mypdblines # loads the PDB file data contained in the variable, including header, optionally modified.
# Render, color, center, orient and zoom as desired.
write someFileName.pngj

The PNGJ file will have all the customized PDB lines as well as the view at the time it was saved. It does not include variables or functions that were defined at the time it was saved. If any of these are needed, define them in custom header @ lines and write a script to use them as described above, or use REMARK jmolscript: lines to include the definition of those variables.

A PDB file editing server?

One existing option is PDB-Tools Web, a user interface for the pdb-tools Python package.

For those who might be interested in writing a server to modify PDB files, JmolData.jar is a variant of Jmol that runs without a graphics window. It is perfect for these kinds of operations. Jaime Prilusky used it in Proteopedia.org to generate a series of image files after small rotations. These are then assembled into a multi-GIF movie using other free software (ImageMagick.org) in the web server. See the link "Export Animated Image" under any JSmol box in Proteopedia.Org. In collaboration with Prilusky, Eric Martz adapted these server routines to make such animations within FirstGlance.Jmol.Org (with a simplified user interface). There, under JSmol, click "Save Image or Animation for Powerpoint".

If you know of any PDB file editing servers, please link them here!

Proteopedia Page Contributors and Editors

Eric Martz, Angel Herraez, Jaime Prilusky
