Similarity-specific search options

This section summarizes the search options that concern similarity searching and shows how to use them in the different search interfaces.

Dissimilarity metrics

Sets the metric to be used during similarity search.

MolSearch API

Not applicable.

JChemSearch API

JChemSearchOptions searchOptions = new JChemSearchOptions(SearchConstants.SIMILARITY);
// Accepted values: "tanimoto", "tversky", "dice", "euclidean",
// "normalized_euclidean", "substructure", "superstructure"
searchOptions.setDissimilarityMetrics("tanimoto");
// ...
JChemSearch searcher = new JChemSearch();
searcher.setSearchOptions(searchOptions);

The default value is null, which selects the default metric.

JChem Oracle Cartridge

Use the jc_compare operator with dissimilarityMetric:<metric>.

The values accepted as <metric> depend on the structureType.

  1. For table types anyStructures and molecules, <metric> may be one of the following:

    • tanimoto

    • tversky

    • euclidean

    • ask for others

  2. For table type reactions, <metric> may be one of the following:

    • ReactantTanimoto

    • ProductTanimoto

    • CoarseReactionTanimoto

    • MediumReactionTanimoto (default)

    • FineReactionTanimoto

Example:

select count(*) from nci_1k where jc_compare(structure, 'C[C@H](CS)C(=O)N1CCC[C@H]1C(O)=O |r|',
't:i dissimilarityThreshold:0.61 dissimilarityMetric:tversky;0.3;0.7') = 1;
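
To make the metric names more concrete, the following is a minimal, self-contained sketch of the Tanimoto and Tversky dissimilarity computed on bit-set fingerprints with the textbook formulas. It does not use the JChem API and is not ChemAxon's implementation: the class DissimilaritySketch, its methods, and the toy fingerprints are hypothetical, and the mapping of the two Tversky weights (0.3 and 0.7 in the query above) to the query and target sides is an assumption.

```java
import java.util.BitSet;

public class DissimilaritySketch {

    /** Builds a toy fingerprint with the given bit positions set (illustrative helper). */
    public static BitSet fingerprint(int... bits) {
        BitSet fp = new BitSet();
        for (int bit : bits) {
            fp.set(bit);
        }
        return fp;
    }

    /** Tanimoto dissimilarity: 1 - |A and B| / |A or B|. */
    public static double tanimotoDissimilarity(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        if (union.isEmpty()) {
            return 0.0; // two empty fingerprints are treated as identical
        }
        return 1.0 - (double) common.cardinality() / union.cardinality();
    }

    /**
     * Tversky dissimilarity: 1 - c / (alpha * onlyQuery + beta * onlyTarget + c),
     * where c is the number of bits set in both fingerprints.
     * Assumption: alpha weights the query-only bits, beta the target-only bits.
     */
    public static double tverskyDissimilarity(BitSet query, BitSet target,
                                              double alpha, double beta) {
        BitSet common = (BitSet) query.clone();
        common.and(target);
        int c = common.cardinality();
        int onlyQuery = query.cardinality() - c;
        int onlyTarget = target.cardinality() - c;
        double denominator = alpha * onlyQuery + beta * onlyTarget + c;
        if (denominator == 0.0) {
            return 0.0;
        }
        return 1.0 - c / denominator;
    }

    public static void main(String[] args) {
        BitSet query = fingerprint(1, 2, 3, 4);
        BitSet target = fingerprint(3, 4, 5, 6, 7, 8);
        System.out.println("Tanimoto dissimilarity: " + tanimotoDissimilarity(query, target));
        System.out.println("Tversky dissimilarity:  " + tverskyDissimilarity(query, target, 0.3, 0.7));
    }
}
```

With both weights set to 1, the Tversky formula reduces to Tanimoto, which is a quick sanity check for the sketch.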

jcsearch command line tool

Not applicable.


Dissimilarity threshold for similarity search

Sets the dissimilarity threshold for similarity searches. A lower threshold yields fewer hits, and those hits are more similar to the query structure.
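
The effect of the threshold can be illustrated with a small, self-contained sketch that filters precomputed dissimilarity values. It is not JChem code: the class ThresholdSketch, the method hits, and the sample values are hypothetical, and treating the threshold as a strict upper bound is an assumption about boundary handling.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ThresholdSketch {

    /**
     * Returns the names of the targets whose dissimilarity to the query
     * is below the threshold (strict comparison is an assumption).
     */
    public static List<String> hits(Map<String, Double> dissimilarities, double threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> entry : dissimilarities.entrySet()) {
            if (entry.getValue() < threshold) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up dissimilarity values for three hypothetical target structures.
        Map<String, Double> dissimilarities = new LinkedHashMap<>();
        dissimilarities.put("mol1", 0.10);
        dissimilarities.put("mol2", 0.25);
        dissimilarities.put("mol3", 0.50);

        System.out.println(hits(dissimilarities, 0.3)); // prints [mol1, mol2]
        System.out.println(hits(dissimilarities, 0.2)); // prints [mol1]
    }
}
```

Lowering the threshold from 0.3 to 0.2 drops mol2 from the hit list, leaving only the structure closest to the query.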

MolSearch API

Not applicable.

JChemSearch API

JChemSearchOptions searchOptions = new JChemSearchOptions(SearchConstants.SIMILARITY);
searchOptions.setDissimilarityThreshold(threshold);
// ...
JChemSearch searcher = new JChemSearch();
searcher.setSearchOptions(searchOptions);

Default value is 0.3.

JChem Oracle Cartridge

Use the jc_compare operator with simThreshold:float.

The following SQL query returns the number of structures in nci_250k whose similarity to Brc1ccccc1 is greater than 0.9:

SELECT count(*) FROM nci_250k WHERE jc_compare(smiles, 'Brc1ccccc1', 't:i simThreshold:0.9') = 1;

jcsearch command line tool

It can be set when the search type (similarity) is specified:

-t:i[:dissimilarity_threshold]
