Getting Started JChem PostgreSQL Cartridge
This guide will serve as a basic introduction to installing and using JChem PostgreSQL Cartridge.
See also JChem PostgreSQL Cartridge Manual for detailed description of administration and API use cases.
System requirement
-
CentOS, RedHat or Debian operating system x86_64 version.
-
postgresql-94 relational database (preferred download page)
-
availability of hstore extension of postgresql-94 (e.g., contrib package)
-
Java8 runtime environment
Installation and Setup
Install PostgreSQL Cartridge
Download the latest version of PostgreSQL Cartridge from here. In order to install the latest version, change x.y (version information) in the following statement to the current one:
sudo yum localinstall jchem-psql-x.y.x86_64.rpm
Before the first use
-
Install your license file
Copy a valid ChemAxon license to /etc/chemaxon/license.cxl. Required license is 'Postgres Cartridge'.
-
Initialize index directory
sudo service jchem-psql init
If you encounter a problem in your java installation, set your JAVA_HOME in the file /etc/default/jchem-psql.
-
First start
sudo service jchem-psql manual-start
Using the extension in a database
-
Create postgres user and database
sudo su postgres
createuser testuser
createdb testdb -O testuser
-
Install the extensions
psql testdb
testdb>
CREATE
EXTENSION chemaxon_type;
testdb>
CREATE
EXTENSION hstore;
testdb>
CREATE
EXTENSION chemaxon_framework;
-
Checking JChem-psql cartridge installation
If jchem-psql service is running, the following query can be executed without any problem:
testdb>
SELECT
'C'
::Molecule(
'sample'
) |<|
'CC'
::Molecule;
Service
The service can be started/stopped using
sudo service jchem-psql manual-start
sudo service jchem-psql stop
By default, the service starts on system boot. This can be switched off by setting ENABLED=0 in the /etc/default/jchem-psql file.
BASIC API Usage
CREATE TABLE
CREATE
TABLE
table_name (structure_column_name MOLECULE(
'molecule_type_name'
));
Example:
CREATE
TABLE
ttest (mol MOLECULE(
'sample'
));
Import data
-
Import from sdf/mol file (located on your local machine):
\
set
variable_name `cat sdf_file_name`
CREATE
TABLE
table_name
AS
SELECT
molSrc::molecule(
'molecule_type_name'
)
AS
(structure_column_name)[, props ->
'sdf_field_name'
AS
column_name]
FROM
parse_sdf(:
'variable_name'
);
Example:
\
set
content `cat ~/a.sdf`
CREATE
TABLE
mysdftable
AS
SELECT
molSrc::molecule(
'sample'
)
AS
mol,
props ->
'Molformula'
AS
formula
FROM
parse_sdf(:
'content'
);
-
Import from local smiles/cxsmiles/smarts/cxsmarts files:
\COPY table_name (structure_column_name)
FROM
'file_name'
(FORMAT csv);
Example:
\COPY ttest(mol)
FROM
'/home/posgresuser/targetfiles/nci-pubchem_1m_unique.smiles'
(FORMAT csv);
CREATE INDEX
Indextype named chemindex has to be used when indexing a column that contains chemical structures.
CREATE
INDEX
index_name
ON
table_name USING chemindex(structure_column_name);
Example:
CREATE
INDEX
ttest_idx
ON
ttest USING chemindex(mol);
SELECT
Substructure search
SELECT
*
FROM
table_name
WHERE
'query_structure'
|<| structure_column_name;
Examples:
SELECT
*
FROM
ttest
WHERE
'<?xml version="1.0" encoding="windows-1250"?>
<cml xmlns="http://www.chemaxon.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chemaxon.com/marvin/help/formats/schema/mrvSchema_6_1_0.xsd" version="ChemAxon file format v6.1, generated by v6.1.3">
<MDocument>
<MChemicalStruct>
<molecule molID="m1">
<atomArray atomID="a1" elementType="C" x2="3.4100000858306885" y2="3.740000009536743"/>
<bondArray/>
</molecule>
</MChemicalStruct>
</MDocument>
</cml>'
|<| mol;
SELECT
*
FROM
ttest
WHERE
'c1ccccc1'
|<| mol;
SELECT
*
FROM
ttest
WHERE
'[#6]-[#6]'
|<| mol;
Full fragment (exact fragment) search
SELECT
*
FROM
table_name
WHERE
query_transform(
'query_structure'
,
'fullfragment'
) |<| structure_column_name;
Example:
SELECT
*
FROM
ttest
WHERE
query_transform(
'CC'
,
'fullfragment'
) |<| mol;
Superstructure search
SELECT
*
FROM
table_name
WHERE
'query_structure'
|>| structure_column_name;
Example:
SELECT
*
FROM
ttest
WHERE
'CCC'
|>| mol;
Duplicate search
SELECT
*
FROM
table_name
WHERE
'query_structure'
|=| structure_column_name;
Example:
SELECT
*
FROM
ttest
WHERE
'CCC'
|=| mol;