Protein Data Bank (PDB) file format

Import from PDB format

PDB files that comply with the PDB Contents Guide version 2.3 are processed, with some minor limitations. PDB files produced by third-party applications may not comply with the PDB standard. Most of these files are also handled properly, though there may be exceptions.
All covalent bonds in proteins and nucleic acids are properly assigned, but hydrogen bonds, sulphur and water bridges, and coordinate bonds are not recognised yet. Covalent bonds in hetero groups are perceived from geometry, and their bond types are guessed, so some errors can occur. Hydrogen atoms are identified and bonded to the appropriate heavy atom in chains, in hetero groups and in water molecules.
Multiple models, insertions and modified residues are processed properly.

Codename: pdb

Limitations

Standard record types listed below are not recognised by the current version of PDB import:

  • Optional: OBSLTE, CAVEAT, SPRSDE, JRNL, REMARK, SEQADV, FTNOTE, HETSYN, FORMUL, SSBOND, LINK, HYDBND, SLTBRG, CISPEP, SITE, MTRIX1, MTRIX2, MTRIX3, TVECT, SIGATM, ANISOU, SIGUIJ

  • Mandatory: CRYST1, ORIGX1, ORIGX2, ORIGX3, SCALE1, SCALE2, SCALE3, MASTER

The recognition and proper processing of these record types will be implemented in forthcoming releases on demand.
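To see how much of a given file falls under this limitation, the record names can be tallied before import. The following is a minimal sketch, not part of Marvin: the function name count_unrecognised_records is hypothetical, and it assumes only that the record name occupies the first six columns of each line, as in the PDB format.

```python
from collections import Counter

# Record types listed above as not recognised by the importer.
UNRECOGNISED = {
    "OBSLTE", "CAVEAT", "SPRSDE", "JRNL", "REMARK", "SEQADV", "FTNOTE",
    "HETSYN", "FORMUL", "SSBOND", "LINK", "HYDBND", "SLTBRG", "CISPEP",
    "SITE", "MTRIX1", "MTRIX2", "MTRIX3", "TVECT", "SIGATM", "ANISOU",
    "SIGUIJ", "CRYST1", "ORIGX1", "ORIGX2", "ORIGX3", "SCALE1", "SCALE2",
    "SCALE3", "MASTER",
}


def count_unrecognised_records(lines):
    """Count lines whose record name (columns 1-6) is in UNRECOGNISED.

    Hypothetical helper for illustration; it only inspects the text.
    """
    counts = Counter()
    for line in lines:
        record = line[:6].strip()
        if record in UNRECOGNISED:
            counts[record] += 1
    return counts
```

Calling it on the lines of a file, for example count_unrecognised_records(open(path)), gives a per-record tally of the content that the importer would skip.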

Export to PDB format

Marvin exports simplified PDB files containing the record types listed below:

  • Title section:

    • HEADER contains the following fields: classification="PROTEIN" (or imported value), date, idCode="NONE" (or imported value).

    • TITLE, SOURCE, KEYWDS, EXPDTA: The imported value is exported. Default: "NULL".

    • COMPND: The imported value is exported. Default: "MOLECULE: name", where "name" is the molecule name.

    • AUTHOR: The imported value is exported. Default: "Marvin".

    • REVDAT: The following line plus the imported value.
      REVDAT N DD-MMM-YY 3
      (N is the modification number, DD-MMM-YY is the date of the modification.)

  • Coordinate section:

    • ATOM and HETATM: The atom name includes the remoteness indicator and the branch designator character in case of amino acids. For non-standard residues, the atom name and the element symbol field contain the same value. The occupancy and the temperature factor are zero. The residue field contains one of the standard residue symbols.

  • Connectivity section:

    • CONECT: Only the first five fields are used. If the number of bonds is greater than four, a second CONECT line with the same atom serial number (first field) will be used.

    • TER: Indicates the end of a chain. Imported but not exported in the current version.

  • Bookkeeping section:

    • MASTER
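The CONECT rule in the list above (one atom serial number followed by at most four bonded atoms per line, with further lines repeating the same serial number) can be sketched as follows. This is an illustration only, not Marvin's exporter code: the function name conect_lines is hypothetical, and the five-character field width is taken from the PDB format rather than from this document.

```python
def conect_lines(serial, bonded):
    """Format CONECT records for one atom, four bonded atoms per line.

    serial: serial number of the atom (first field).
    bonded: serial numbers of the atoms bonded to it.
    Hypothetical helper; each field is written right-aligned, 5 wide.
    """
    lines = []
    for start in range(0, len(bonded), 4):
        chunk = bonded[start:start + 4]
        fields = "".join(f"{n:5d}" for n in [serial, *chunk])
        lines.append("CONECT" + fields)
    return lines
```

An atom with five bonds therefore produces two lines: the first carries four bonded atoms and the second carries the remaining one, both starting with the same serial number.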

Export options can be specified in the format string. The format descriptor and the options are separated by a colon. The options listed below are available for PDB output.

H or +H

Add explicit hydrogen atoms. Usage: "PDB:H"

-H

Remove explicit hydrogen atoms. Usage: "PDB:-H"
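The structure of such a format string can be illustrated with a short sketch. It is not Marvin's own parser: the function names split_format_string and hydrogen_mode are hypothetical, and the sketch assumes a single option after the colon.

```python
def split_format_string(fmt):
    """Split 'PDB:H' into ('PDB', 'H'); options are '' if there is no colon."""
    descriptor, _, options = fmt.partition(":")
    return descriptor, options


def hydrogen_mode(options):
    """Map the hydrogen options described above to 'add', 'remove' or None."""
    if options in ("H", "+H"):
        return "add"
    if options == "-H":
        return "remove"
    return None
```

For example, "PDB:-H" splits into the descriptor "PDB" and the option "-H", which corresponds to removing explicit hydrogen atoms.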

Limitations

The exporter writes the atoms in the molecule object's internal atom order, which may differ from the order of residues in a chain. As a result, export is not yet reliable for macromolecules with residues.
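Whether an exported file is affected can be checked by testing that the atoms of each residue appear as one contiguous run. The sketch below is a diagnostic only and does not reorder anything. The function name residues_out_of_order is hypothetical, and the column positions (chain ID in column 22, residue number in columns 23-26, insertion code in column 27) are assumed from the PDB format.

```python
def residues_out_of_order(lines):
    """Return residue keys whose ATOM/HETATM records are not contiguous.

    A key is (chain ID, residue number, insertion code). A residue is
    reported once if its atoms appear in more than one separate run.
    """
    seen = set()
    split = []
    previous = None
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        line = line.ljust(27)  # pad short lines so column 27 exists
        key = (line[21], line[22:26].strip(), line[26])
        if key != previous:
            if key in seen and key not in split:
                split.append(key)
            seen.add(key)
            previous = key
    return split
```

An empty result means every residue's atoms are written together. A non-empty result lists the residues whose atoms are interleaved with those of other residues.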