Workflow tools

Integration API

The Compliance Checker API checks molecules listed in a file against regulations. It provides three endpoints:

1. <host>/cc-bigdata/filecheck/upload

This is a POST method. It uploads a file containing molecules and creates a job which executes the checks for each structure.
It has four parameters:

- file (mandatory): The file that contains the molecules to be checked.
- categories: Array of category ids that define the regulations to check against. Leaving it empty results in checking against all regulations.
- date: Set this when molecules need to be checked against past regulations. Expected date format is 'yyyyMMdd'. Leaving it empty is the same as setting today's date.
- description (optional): A short description of the input.

The output elements of the response:

- jobId: The identifier of the job created to process the input file.
- url: The URL from which information about the created job can be requested.
- description: Constant description message 'Use this ID or the URL to access your job status and information'.
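
A minimal sketch of preparing this upload in Python, using only the standard library. The source does not state how the parameters are transmitted, so sending them as multipart form fields (with array values as repeated fields) is an assumption, as are the helper name build_upload_request, the example host, and the category ids. The request is built but not sent, and authentication is left out.

```python
import uuid
from datetime import date
from urllib.request import Request


def build_upload_request(host, file_name, file_bytes, categories=None,
                         check_date=None, description=None):
    """Build (but do not send) a multipart POST for the upload endpoint."""
    fields = []
    # Assumption: the array is sent as repeated 'categories' form fields.
    for category_id in categories or []:
        fields.append(("categories", str(category_id)))
    if check_date is not None:
        # The endpoint expects the 'yyyyMMdd' format, e.g. 20200131.
        fields.append(("date", check_date.strftime("%Y%m%d")))
    if description:
        fields.append(("description", description))

    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        head = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        )
        parts.append(head.encode("utf-8"))

    # The mandatory 'file' parameter carries the molecules to be checked.
    file_head = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_head.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return Request(
        url=f"{host}/cc-bigdata/filecheck/upload",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical host, file content and category ids, for illustration only.
request = build_upload_request(
    "https://cc.example.com", "molecules.sdf", b"<file content>",
    categories=[1, 2], check_date=date(2020, 1, 31),
)
print(request.full_url)
```

Omitting categories and check_date leaves those fields out of the request, which corresponds to checking against all regulations as of today.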

2. <host>/cc-bigdata/filecheck/job/{id}

This is a POST method. It submits a report generation request for a given job.
It has five parameters:

- id: The id of the job to generate the report for.
- containsErrors: Defines whether the report should contain unsuccessful checks (true/false).
- containsHits: Defines whether the report should contain regulated molecules (true/false).
- containsPasses: Defines whether the report should contain not regulated molecules (true/false).
- formats: Array of requested report formats. Accepted values: PDF, HTML, SDF, MRV, XLSX, JSON.
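
A sketch of how such a report request could be assembled in Python. The source does not say whether the flags and formats travel as form fields, query parameters or a JSON body; form encoding with 'true'/'false' strings and repeated 'formats' fields is assumed here. The helper name build_report_request, the host and the job id are hypothetical, and authentication is left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Report formats listed as accepted values for the 'formats' parameter.
ACCEPTED_FORMATS = {"PDF", "HTML", "SDF", "MRV", "XLSX", "JSON"}


def build_report_request(host, job_id, formats, contains_errors=True,
                         contains_hits=True, contains_passes=False):
    """Build (but do not send) a POST asking for reports of a given job."""
    unknown = set(formats) - ACCEPTED_FORMATS
    if unknown:
        raise ValueError(f"Unsupported report formats: {sorted(unknown)}")

    # Assumption: booleans are sent as 'true'/'false', arrays as repeated fields.
    params = [
        ("containsErrors", str(contains_errors).lower()),
        ("containsHits", str(contains_hits).lower()),
        ("containsPasses", str(contains_passes).lower()),
    ]
    params += [("formats", report_format) for report_format in formats]

    return Request(
        url=f"{host}/cc-bigdata/filecheck/job/{job_id}",
        data=urlencode(params).encode("ascii"),
        method="POST",
    )


# Hypothetical host and job id, for illustration only.
request = build_report_request("https://cc.example.com", "42", ["PDF", "JSON"])
print(request.full_url)
```

Validating the formats locally keeps an obviously invalid request from reaching the server; the server's own validation behavior is not described in the source.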

3. <host>/cc-bigdata/filecheck/job/{id}

This is a GET method. Returns status information about the given job.
It has one parameter:

- id: The id of the job whose status information is requested.

The output elements of the response:

- status: Status of the job. Possible values: FINISHED, PENDING, FAILED, IN_PROGRESS.
- percentage: Percentage of checked molecules.
- inputSize: Number of all molecules.
- hitCount: Number of regulated molecules.
- errorCount: Number of unsuccessful checks.
- passedCount: Number of not regulated molecules.
- reports: An array of reports generated for the given job.

reports has five fields:

- format: Format of the report. Possible values: PDF, HTML, SDF, MRV, XLSX, JSON.
- reportContent: The content selected for the report before generation. Possible values: 'full', 'hit only', 'certificate only', 'error only', 'result only', 'error and hit', 'error and passed', 'summary only'.
- generated: Timestamp of the report generation.
- state: Status of the report. Possible values: FINISHED, PENDING, FAILED, IN_PROGRESS.
- url: URL from which the report can be downloaded.
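
A sketch of interpreting the status response in Python, assuming the body is JSON with the fields listed above. The sample values, the timestamp format and the report URLs are invented for illustration, and summarize_job is a hypothetical helper name.

```python
import json

# Job states after which the status will not change any more.
TERMINAL_STATES = {"FINISHED", "FAILED"}


def summarize_job(job):
    """Return (done, urls) where urls lists the reports ready for download."""
    done = job["status"] in TERMINAL_STATES
    urls = [
        report["url"]
        for report in job.get("reports", [])
        if report["state"] == "FINISHED"
    ]
    return done, urls


# Hypothetical response body, shaped after the fields described above.
sample = json.loads("""
{
  "status": "FINISHED",
  "percentage": 100,
  "inputSize": 3,
  "hitCount": 1,
  "errorCount": 0,
  "passedCount": 2,
  "reports": [
    {"format": "PDF", "reportContent": "hit only",
     "generated": "2020-01-31T12:00:00Z", "state": "FINISHED",
     "url": "https://cc.example.com/reports/1.pdf"},
    {"format": "JSON", "reportContent": "full",
     "generated": "2020-01-31T12:00:05Z", "state": "IN_PROGRESS",
     "url": "https://cc.example.com/reports/2.json"}
  ]
}
""")

done, urls = summarize_job(sample)
print(done, urls)
```

A client would typically repeat the GET request until the job status is FINISHED or FAILED, then download only the reports whose own state is FINISHED.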

Admin privileges are needed to invoke these endpoints.

  • A Swagger UI for these endpoints is also available at <host>/cc-bigdata/file-api.

KNIME Node

This node checks all chemical structures provided in a given column of the input DataTable against ComplianceChecker's rule set.

It can be downloaded from Infocom (ChemAxon's partner in Japan): see 'ChemAxon Node for KNIME JChem Extension_update_en' in their 'Chem & Bio Informatics' section.

Options

Check mode
In 'simple' mode, input rows are directed to different output ports based on the result of the check. In 'detailed' mode, additional info columns are also added to the input rows directed to the output ports.

Structure column
The column of the input DataTable that holds the structures to be checked.

Date of regulation
Setting this date runs the checks against the regulations that were active on the chosen date. Leaving the field unset is equivalent to setting today's date.

Categories
The check runs against the selected categories. If nothing is selected, checks run against all categories.

Connection settings

Host
The host machine for ComplianceChecker.

Timeout
Sets read and connection timeout for the checks.

Username
Name of the user to authenticate as on the service calls.

Password
Password to authenticate on the service calls.

Ports

Input Ports
0 DataTable containing the input structures that will be checked

Output Ports
0 Regulated records
1 Not regulated records
2 Records that could not be checked due to error

How to deploy a KNIME node

1. Download and install Knime64 (2/6GB memory)
2. Create knime-workspace
3. Copy the *.jar to C:\Program Files\KNIME\dropins
4. Start KNIME (the node appears under Node Repository > ComplianceChecker)

The ComplianceChecker-Node extension for KNIME Workbench is open source and provided by ChemAxon.
Details of KNIME Nodes Administration by Chemaxon and the source code are available from here.

Pipeline Pilot Component

Chemaxon's Pipeline Pilot Component Collection

    • Provides access to ChemAxon tools from Pipeline Pilot

    • Developed and directly supported by ChemAxon

    • The component collection itself is free of charge

    • The corresponding ChemAxon licenses are needed for the tools accessed via the components

    • Compatible with the exact same JChem version (weekly release)

    • Pipeline Pilot 9.2 or newer required

    • JChem Oracle Cartridge 6.1 or later is supported

Compliance Checker functionality is available. (Documentation)
