This manual describes the Sphere Exclusion clustering algorithm.

Introduction

Sphere Exclusion is a simple, intuitive selection method. Clustering begins by selecting an initial structure and placing every structure that meets a defined similarity threshold into the first cluster. The process is then repeated on the remaining structures until all structures belong to a cluster. Because cluster formation depends on the initial structure and on the chosen parameters, a subsequent clustering is also performed. The structure selection process can be random or directed by some preprocessing of the structures. Clusters are defined by similarity, and their number is not predetermined. Sphere Exclusion can also be used to select subsets, for example diverse subsets. The result is highly dependent on the first element of the input file.

Fig. 1 Sphere Exclusion clustering
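
The procedure described above can be illustrated with a minimal Python sketch. It is not the jklustor implementation: the set-of-bits fingerprint representation, the function names tanimoto_distance and sphere_exclusion, the rule that the first unassigned structure becomes the next centroid, and the strict "less than radius" membership test are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: a structure is represented by the set of "on" bit positions
# of a binary fingerprint; jklustor's own descriptors may differ.
Fingerprint = FrozenSet[int]


def tanimoto_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto dissimilarity: 1 - |a & b| / |a | b|."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def sphere_exclusion(
    fps: Sequence[Fingerprint], radius: float
) -> List[Tuple[int, List[int]]]:
    """Return (centroid index, member indices) pairs, in order of creation."""
    unassigned = list(range(len(fps)))
    clusters: List[Tuple[int, List[int]]] = []
    while unassigned:
        # Assumption: the first structure still unassigned becomes the centroid.
        center = unassigned[0]
        members = [
            i for i in unassigned
            if i == center or tanimoto_distance(fps[center], fps[i]) < radius
        ]
        clusters.append((center, members))
        taken = set(members)
        unassigned = [i for i in unassigned if i not in taken]
    return clusters


# Three toy fingerprints: a-b and b-c are 0.4 apart, a-c is about 0.67 apart.
a = frozenset({1, 2, 3, 4})
b = frozenset({2, 3, 4, 5})
c = frozenset({3, 4, 5, 6})

print(sphere_exclusion([a, b, c], 0.5))  # [(0, [0, 1]), (2, [2])]
print(sphere_exclusion([b, a, c], 0.5))  # [(0, [0, 1, 2])]
```

The two calls use the same structures in a different input order and produce different clusters, which illustrates why the result depends on the first element of the input. In this sketch every later centroid lies at least the radius away from all earlier centroids, because it was not absorbed into any earlier sphere.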

Usage

You can invoke the Sphere Exclusion algorithm via the jklustor command line tool:

jklustor [<options>] [<input files>]

Prepare the usage of the jklustor script or batch file as described in Preparing and Running Batch Files and Shell Scripts.

Options

   sphex:[Minimal separation between cluster centroids]  Use single level sphere exclusion clustering

  -h, --help                    help message
  -c,                           specify the clustering method
  -o, --output <filepath>       output file path (default: stdout)
  -t, --tag                     name of the SDFile tag to store the
                                Pharmacophore Map (default: PMAP)
  -S, --sdf-output              SDF output (otherwise only PMAP list)
  -g, --ignore-error            continue with next molecule on error
  -v, --verbose                 print calculation warnings to the console
  -l,                           store individual input structures regardless of output actions
  -s, --port                    after performing all output actions launch listening server on given port

Examples

The following examples demonstrate the usage of the Sphere Exclusion algorithm:

Invoke sphere exclusion clustering (using a dissimilarity radius of 0.4) on the given data set, store the input structures, and present the results with the built-in lightweight HTTP server. When the clustering process has finished, connect a browser to http://localhost:81.

jklustor -v -l -s 81 -c sphex:0.4 http://www.chemaxon.com/shared/libMCS/default.sdf

Cluster with a Tanimoto distance of 0.8 between centroids and write each cluster with its members to a separate file:

jklustor -c sphex:0.8 input.sdf -o "wrmols:sdf:cluster_*.sdf"

For the full user guide, type:

jklustor -h