Evaluator and JChem Cartridge Examples

Chemical Terms Evaluator and Chemical Terms - JChem Cartridge examples (molecule context)

Structure based calculations (plugin calculations)

Plugin references provide access to ChemAxon's calculator plugins. These calculations equip our expressions with chemical meaning.

  1. The physiological microspecies at pH 7.4 of the input molecule:

    microspecies("7.4")
  2. The partial charges on atoms 0, 2 and 3 (0-based) of the input molecule:

    charge(0, 2, 3)
  3. The same, taking the physiological microspecies at pH 7.4:

    charge(0, 2, 3, "7.4")
  4. Checking whether the partial charge on atom 0 in the input molecule is greater than or equal to this charge value in the physiological microspecies at pH 7.4:

    charge(0) >= charge(0, "7.4")
  5. The significant pKa value (acidic or basic) on atom 9 (0-based) of the input molecule:

    pka(9)
  6. The acidic pKa value on atom 9 (0-based) of the input molecule:

    pka("acidic", 9)

    Note that if the pKa type "acidic" or "basic" is omitted (as in the previous example), then the more significant value is returned, while specifically the "acidic" (or "basic") pKa value is returned if the type is specified.

  7. The strongest acidic pKa value of the input molecule:

    pka("acidic", "1")

    Note the difference in the last two examples: in pKa calculation a number denotes the atom index while a number in quotation marks denotes the strength order: 9 in the previous example refers to atom 9 while "1" in the above example refers to the strongest acidic pKa value ("2" refers to the second strongest value, etc.).

  8. The logP value of the input molecule:

    logp()
  9. The logD value at pH=7.4 of the input molecule:

    logd("7.4")

    Note that in logD calculation the pH value should be enclosed in quotation marks.

  10. Checking whether the difference between the logD values at two different pH values exceeds 0.5:

    logd("7.4") - logd("3.8") > 0.5
  11. The mass of the input molecule:

    mass()
  12. The number of H bond acceptor atoms in the input molecule:

    acceptorCount()
  13. The same, taking the physiological microspecies at pH 7.4:

    acceptorCount("7.4")
  14. Checking whether the two counts above differ by more than 1:

    acceptorCount("7.4") - acceptorCount() > 1

Functions

ChemAxon provides four types of functions:

  1. general purpose functions: simple array utility functions, such as minimum, maximum, sum or number of array elements and an array element sorter function

  2. atomic functions: functions referring to an input atom, such as the atom property query function, which queries atom properties (e.g. hydrogen count), or the containment function, which checks whether an atom index is contained in an atom index array

  3. molecular functions: functions that calculate molecular properties, but do not fit into the structure based calculations section (e.g. isQuery function)

  4. evaluator functions: functions that take an inner expression string as a parameter and evaluate it for each atom in an atom context. Examples include a filtering function, which takes a boolean expression and returns the atoms satisfying it, and min-max functions, which evaluate the inner expression for all atoms in the context and return the minimum or maximum value or the corresponding atom index
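
The evaluator functions can be pictured with a small Python model. This is only a sketch of the semantics described above, not ChemAxon code: the per-atom charges and hydrogen counts are invented, and the helper names filter_atoms and eval_atoms are hypothetical stand-ins for the filter and eval functions.

```python
# Illustrative model only: invented per-atom values for a 5-atom molecule.
charges = [-0.35, 0.12, 0.48, -0.05, 0.21]   # hypothetical partial charges
hcounts = [1, 2, 0, 3, 0]                    # hypothetical hydrogen counts


def filter_atoms(predicate, n_atoms):
    """Model of filter("..."): indices of atoms for which the inner expression is true."""
    return [i for i in range(n_atoms) if predicate(i)]


def eval_atoms(expression, n_atoms):
    """Model of eval("..."): the inner expression evaluated for every atom."""
    return [expression(i) for i in range(n_atoms)]


n = len(charges)
positive = filter_atoms(lambda i: charges[i] > 0, n)      # like filter("charge() > 0")
positive_count = len(positive)                            # like count(filter("charge() > 0"))
positive_charges = sorted(charges[i] for i in positive)   # like sortAsc(charge(filter("charge() > 0")))
total_h = sum(eval_atoms(lambda i: hcounts[i], n))        # like sum(eval("hcount()"))

print(positive, positive_count, positive_charges, total_h)
```

In this model the inner expression is a Python callable, whereas in Chemical Terms it is passed as a quoted string and evaluated in the atom context.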

1. The minimum of the partial charge values on atoms 7, 8 and 9 (0-based) of the input molecule:

min(charge(7), charge(8), charge(9))

2. The hydrogen count on atom 2 (0-based) of the input molecule:

hcount(2)

3. The valence of atom 2 of the input molecule:

valence(2)

4. The atom indices corresponding to positive partial charges in the input molecule:

filter("charge() > 0")

5. The number of atoms with positive partial charge in the input molecule:

count(filter("charge() > 0"))

6. The positive partial charges in the input molecule:

charge(filter("charge() > 0"))

7. The same but sorted in ascending order:

sortAsc(charge(filter("charge() > 0")))

8. Indices of atoms with a partial charge of at least 0.4 in the major microspecies at pH=7.4:

filter("charge('7.4') >= 0.4")

9. The partial charge values on these atoms in the input molecule:

charge(filter("charge('7.4') >= 0.4"))

10. The minimum acidic pKa value on hetero atoms with a single hydrogen:

min(pka(filter("match('[!#6!#1;H1]')"), "acidic"))

11. Checking whether there is a hetero atom with acidic pKa value less than 0.75:

min(pka(filter("match('[!#6!#1;H1]')"), "acidic")) < 0.75

12. Indices of atoms with the two strongest basic pKa values:

maxAtom("pka('basic')", 2)

Note that expression strings can be enclosed in either double or single quotes; for nested strings, the two kinds can be alternated.

However, some UNIX shells interpret single quotes, which makes them hard to use in command-line input. File input avoids this problem; alternatively, the inner single quotes can be replaced by escaped double quotes:

maxAtom("pka(\"basic\")", 2)
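
When the command line is assembled by a script, the quoting can be delegated to a library instead of being written by hand. The following Python sketch uses the standard shlex module to quote an expression with nested quotes for a POSIX shell. The command name evaluate, the -e option and the file name input.sdf are assumptions made for illustration; check them against your own installation.

```python
import shlex

# Expression with nested quotes, exactly as it should reach the evaluator.
expression = "maxAtom(\"pka('basic')\", 2)"

# Hypothetical command line: tool name, option and input file are assumptions.
command = ["evaluate", "-e", expression, "input.sdf"]

# shlex.join quotes every argument so that a POSIX shell passes it on unchanged.
shell_line = shlex.join(command)
print(shell_line)

# Round trip: parsing the line the way a POSIX shell would gives back the arguments.
recovered = shlex.split(shell_line)
```

Passing the argument list directly to subprocess.run, without a shell, sidesteps the quoting problem altogether.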

13. The corresponding pKa values:

maxValue("pka('basic')", 2)

14. Testing whether the partial charge on the atom with the strongest basic pKa value exceeds the partial charge on the atom with the second strongest basic pKa value:

x = maxAtom("pka('basic')", 2);
charge(x[0]) > charge(x[1])

Note that in the current version the above expression cannot be evaluated if there are fewer than two basic pKa values in the input molecule.

15. The basic pKa values for atoms with positive charge, sorted in descending order:

sortDesc(pka("basic", filter("charge() > 0")))

Note that in the current version NaN values (meaning that there is no valid pKa for the given atom) are placed at the end of the array after sorting.

16. Checking whether there is a sufficiently large difference between the two strongest basic pKa values of the previous example:

x = sortDesc(pka("basic", filter("charge() > 0")));
x[0] - x[1] > 1.5
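
The NaN-last ordering noted for example 15 and the difference check of example 16 can be modeled in a few lines of Python. This is a sketch with invented pKa values, not ChemAxon code; the helper sort_desc_nan_last is a hypothetical name, and the guard on the number of valid values is an illustration of why the check needs two valid entries, not a description of the evaluator's behavior.

```python
import math


def sort_desc_nan_last(values):
    """Sort in descending order, placing NaN entries after all valid numbers."""
    valid = sorted((v for v in values if not math.isnan(v)), reverse=True)
    missing = [v for v in values if math.isnan(v)]
    return valid + missing


# Invented basic pKa values; NaN stands for "no valid pKa on this atom".
pka_values = [8.2, float("nan"), 10.1, 4.7]
ordered = sort_desc_nan_last(pka_values)
print(ordered)

# Compare the two strongest values only when both are valid numbers.
valid_count = sum(1 for v in ordered if not math.isnan(v))
large_gap = valid_count >= 2 and (ordered[0] - ordered[1] > 1.5)
print(large_gap)
```

Because the NaN entries come last, the first two elements are valid numbers whenever at least two valid pKa values exist.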

17. The hydrogen count for each atom in the input molecule:

eval("hcount()")

18. The number of hydrogens in the input molecule:

sum(eval("hcount()"))

19. Dissimilarity between the benzene ring and the input molecule using pharmacophore fingerprint as molecular descriptor with Tanimoto (default) metric:

refmol = "c1ccccc1";
dissimilarity("PF", refmol)

Note: the dissimilarity function is not available in Marvin; it can be used only if the JChem software package is installed.

20. The same using Euclidean metric:

refmol = "c1ccccc1";
dissimilarity("PF:Euclidean", refmol)

21. The partial charges on the two atoms among atoms 1, 6 and 8 (0-based atom indices) that have the highest and second highest hydrogen counts (molecule context):

x = array(1, 6, 8);
y = maxAtom(x, "hcount()", 2);
charge(y)

22. Checking whether atom 6 (0-based atom index) has the first or second smallest partial charge among atoms 1, 6, 8, 10, 12 (molecule context):

x = array(1, 6, 8, 10, 12);
y = minAtom(x, "charge()", 2);
in(6, y)
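
Examples 21 and 22 restrict the min-max functions to a given set of atoms. The Python sketch below models that behavior with invented charges and hydrogen counts. The helper names min_atoms and max_atoms are hypothetical, and the order in which the selected indices are returned (best value first) is an assumption of this model.

```python
def min_atoms(atom_indices, value_of, n):
    """Model of minAtom(x, "...", n): the n indices in x with the smallest values."""
    return sorted(atom_indices, key=value_of)[:n]


def max_atoms(atom_indices, value_of, n):
    """Model of maxAtom(x, "...", n): the n indices in x with the largest values."""
    return sorted(atom_indices, key=value_of, reverse=True)[:n]


# Invented per-atom data, keyed by 0-based atom index.
charges = {1: 0.30, 6: -0.22, 8: 0.05, 10: -0.41, 12: 0.18}
hcounts = {1: 2, 6: 0, 8: 3}

# Example 21: charges on the two atoms among 1, 6, 8 with the most hydrogens.
top_h = max_atoms([1, 6, 8], lambda i: hcounts[i], 2)
top_h_charges = [charges[i] for i in top_h]

# Example 22: is atom 6 among the two atoms with the smallest charge?
lowest = min_atoms([1, 6, 8, 10, 12], lambda i: charges[i], 2)
is_member = 6 in lowest

print(top_h, top_h_charges, lowest, is_member)
```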

Matching conditions

There are three ways to reference substructure search from our expressions: the match function returns a true / false answer, while the matchCount and disjointMatchCount functions return the number of search hits.

Note: the match, matchCount and disjointMatchCount functions are not available in Marvin; they can be used only if the JChem software package is installed.

1. A simple molecule matching test taking the input molecule as target:

match("C1CCOCC1")

2. Atom matching with target atom being atom 2 (0-based) of the input molecule and query atom set being all query atoms:

match(2, "C1CCOCC1")

3. Atom matching with target atom being atom 2 (0-based) of the input molecule, and query atom set being both query carbon atoms attached to the oxygen:

match(2, "C1C[C:1]O[C:2]C1", 1, 2)

4. The same, referencing the query by molecule file path:

match(2, "mols/query.mol", 1, 2)

5. The same, referencing the query by the molecule ID nitro, a predefined molecule constant:

match(2, nitro, 1, 2)

6. The total number of "C=O" and "CO" groups in the input molecule:

matchCount("C=O") + matchCount("CO")

7. A more complex condition checking whether the input molecule contains sulfur and whether there are at least 6 "C=O" and "CO" groups in the input molecule altogether:

match("S") && (matchCount("C=O") + matchCount("CO") >= 6)