Getting Started JChem PostgreSQL Cartridge

This guide will serve as a basic introduction to installing and using JChem PostgreSQL Cartridge.

See also JChem PostgreSQL Cartridge Manual for detailed description of administration and API use cases.

System requirement

  • CentOS, RedHat or Debian operating system x86_64 version.

  • postgresql-94 relational database (preferred download page)

  • availability of hstore extension of postgresql-94 (e.g., contrib package)

  • Java8 runtime environment

Installation and Setup

Install PostgreSQL Cartridge

Download the latest version of PostgreSQL Cartridge from here. In order to install the latest version, change x.y (version information) in the following statement to the current one:

yum localinstall jchem-psql-x.y.x86_64.rpm

Before the first use

  1. Install your license file

    Copy a valid ChemAxon license to /etc/chemaxon/license.cxl. Required license is 'Postgres Cartridge'.

  2. Initialize index directory

    service jchem-psql init

    If you encounter a problem in your java installation, set your JAVA_HOME in the file /etc/default/jchem-psql.

  3. First start

    service jchem-psql manual-start

Using the extension in a database

  1. Create postgres user and database

    sudo su postgres
    createuser testuser
    createdb testdb -O testuser
  2. Install the extensions

    psql testdb
    testdb> CREATE EXTENSION chemaxon_type;
    testdb> CREATE EXTENSION hstore;
    testdb> CREATE EXTENSION chemaxon_framework;


  3. Checking JChem-psql cartridge installation

    If jchem-psql service is running, the following query can be executed without any problem:

    testdb> SELECT 'C'::Molecule('sample') |<| 'CC'::Molecule;

Service

The service can be started/stopped using

service jchem-psql manual-start
service jchem-psql stop

By default, the service starts on system boot. This can be switched off by setting ENABLED=0 in the /etc/default/jchem-psql file.

BASIC API Usage

CREATE TABLE

CREATE TABLE table_name (structure_column_name MOLECULE('molecule_type_name'));

Example:

CREATE TABLE ttest (mol MOLECULE('sample'));

Import data

  • Import from sdf/mol file (located on your local machine):

    \set variable_name `cat sdf_file_name`
    CREATE TABLE table_name AS
    SELECT molSrc::molecule('molecule_type_name') AS (structure_column_name)[, props ->'sdf_field_name' AS column_name] FROM parse_sdf(:'variable_name');

    Example:

    \set content `cat ~/a.sdf`
    CREATE TABLE mysdftable AS
    SELECT molSrc::molecule('sample') AS mol,
    props -> 'Molformula' AS formula
    FROM parse_sdf(:'content');


  • Import from local smiles/cxsmiles/smarts/cxsmarts files:

    \COPY table_name (structure_column_name) FROM 'file_name' (FORMAT csv);

    Example:

    \COPY ttest(mol) FROM '/home/posgresuser/targetfiles/nci-pubchem_1m_unique.smiles' (FORMAT csv);


CREATE INDEX

Indextype named chemindex has to be used when indexing a column that contains chemical structures.

CREATE INDEX index_name ON table_name USING chemindex(structure_column_name);

Example:

CREATE INDEX ttest_idx ON ttest USING chemindex(mol);

SELECT

Substructure search

SELECT * FROM table_name WHERE 'query_structure' |<| structure_column_name;

Examples:

SELECT * FROM ttest WHERE '<?xml version="1.0" encoding="windows-1250"?>
<cml xmlns="http://www.chemaxon.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chemaxon.com/marvin/help/formats/schema/mrvSchema_6_1_0.xsd" version="ChemAxon file format v6.1, generated by v6.1.3">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1" elementType="C" x2="3.4100000858306885" y2="3.740000009536743"/>
<bondArray/>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>' |<| mol;
 
SELECT * FROM ttest WHERE 'c1ccccc1' |<| mol;
 
SELECT * FROM ttest WHERE '[#6]-[#6]' |<| mol;

Full fragment (exact fragment) search

SELECT * FROM table_name WHERE query_transform('query_structure', 'fullfragment') |<| structure_column_name;

Example:

SELECT * FROM ttest WHERE query_transform('CC', 'fullfragment') |<| mol;

Superstructure search

SELECT * FROM table_name WHERE 'query_structure' |>| structure_column_name;

Example:

SELECT * FROM ttest WHERE 'CCC' |>| mol;

Duplicate search

SELECT * FROM table_name WHERE 'query_structure' |=| structure_column_name;

Example:

SELECT * FROM ttest WHERE 'CCC' |=| mol;