Reaction Rules for Chemical Intelligence

We have already described how to define the transformation performed on the reactant molecules and shown tips and tricks for improving this definition. Here, we present a special kind of supplement to the reaction scheme.

Reactions taking place in nature are usually influenced by various factors. The reactivity of the molecules may depend on chemical features such as partial charges, pKa, logP and logD values, as well as on substructure matching conditions of the reactants and/or the created products. The reactants may also contain multiple functional groups, so they can be transformed in more than one way; some of these possibilities are more frequent than others because of differences in the spatial or chemical stability of the products. Reaction rules were designed to make the prediction of reaction outcomes more reliable.

This section introduces the three types of reaction rules:

  • Reactivity rule

  • Selectivity rule

  • Exclude rule

All three rules are presented using the Friedel-Crafts acylation reaction as a model. This reaction is an electrophilic aromatic substitution that acylates aromatic rings with the corresponding acid halides in the presence of a Lewis acid.

The transformation can be described by the generic reaction scheme:

images/download/attachments/45987093/FC_generic_reaction_scheme.png

Accordingly, the hydrogen of an aromatic carbon atom is substituted by an acyl group. Acid chlorides, bromides and iodides may equally be used in this reaction.

This equation was designed to be simple and applicable to various reactants.

Reaction rules in general

Reaction rules are evaluated by the Chemical Terms Evaluator. Before evaluating the expression, Reactor sets the currently processed reactants and products in the reaction context.

In this way, the reactants and products can be accessed from the reaction scheme by the following expressions:

  • reactant(int i): refers to the i-th reactant (0-based indexing)

  • product(int i): refers to the i-th product (0-based indexing)

Accordingly, in the Friedel-Crafts example the aromatic system can be referred to as reactant(0), while the acid halide is reactant(1). The acyl-substituted aromatic compound is product(0) and the hydrogen halide is product(1).

Similarly, mapped atoms can be referred to as:

  • ratom(int m): refers to the reactant atom corresponding to reactant atom map m according to the reaction equation

  • patom(int m): refers to the product atom corresponding to product atom map m according to the reaction equation

Hence, ratom(3) denotes the halogen on the reactant side while patom(3) is the halogen on the product side in the example above.
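The difference between index-based and map-based access can be illustrated with a small Python sketch. This is only a toy model of the reaction context, not Reactor's implementation: the classes Atom, Molecule and ReactionContext are hypothetical, and the molecules are reduced to the few atoms needed to show the lookups.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    symbol: str
    atom_map: int = 0  # assumption of this toy model: 0 means "unmapped"


@dataclass
class Molecule:
    name: str
    atoms: list


@dataclass
class ReactionContext:
    """Hypothetical stand-in for the context that holds reactants and products."""
    reactants: list
    products: list

    def reactant(self, i):
        return self.reactants[i]  # 0-based index, as in reactant(int i)

    def product(self, i):
        return self.products[i]  # 0-based index, as in product(int i)

    def ratom(self, m):
        return self._find(self.reactants, m)  # lookup by atom map number

    def patom(self, m):
        return self._find(self.products, m)  # lookup by atom map number

    @staticmethod
    def _find(molecules, m):
        for mol in molecules:
            for atom in mol.atoms:
                if atom.atom_map == m:
                    return atom
        raise KeyError(f"no atom with map number {m}")


# Map numbers follow the text: 1 = aromatic carbon, 2 = acyl carbon, 3 = halogen.
ctx = ReactionContext(
    reactants=[
        Molecule("benzene", [Atom("C", 1), Atom("H")]),
        Molecule("acetyl chloride", [Atom("C", 2), Atom("Cl", 3)]),
    ],
    products=[
        Molecule("acetophenone", [Atom("C", 1), Atom("C", 2)]),
        Molecule("hydrogen chloride", [Atom("H"), Atom("Cl", 3)]),
    ],
)
```

In this sketch, ctx.reactant(1) returns the acid halide by position, whereas ctx.ratom(3) and ctx.patom(3) find the halogen by its map number regardless of which molecule carries it.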

Apart from these reaction specific functions, expression strings can also reference built-in functions, plugins as well as user-defined functions and plugins.

Reactivity rule

Reactivity rules are Boolean expressions describing natural conditions that must be satisfied; otherwise the reaction does not take place. If a reactivity rule is specified, Reactor returns only the product lists that satisfy the rule.

In case of the Friedel-Crafts acylation, the reactivity rule can be formulated as:

charge(ratom(1), "aromaticsystem") <= -0.2

meaning that the aromatic system should be at least as activated as dihalobenzenes.
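The filtering effect of a Boolean reactivity rule can be sketched in Python as follows. The charge values are invented placeholders rather than computed results, and is_reactive is a hypothetical helper that only mirrors the threshold comparison of the rule above.

```python
CHARGE_THRESHOLD = -0.2  # threshold taken from the reactivity rule above


def is_reactive(aromatic_system_charge):
    """Mirror of the rule: the aromatic system charge must be <= -0.2."""
    return aromatic_system_charge <= CHARGE_THRESHOLD


# Placeholder candidates; the charges are illustrative, not calculated values.
candidates = [
    {"site": "A", "charge": -0.35},
    {"site": "B", "charge": -0.20},
    {"site": "C", "charge": -0.05},
]

# Only candidates satisfying the rule are kept, as with Reactor's product lists.
accepted = [c["site"] for c in candidates if is_reactive(c["charge"])]
```

Because the comparison is inclusive, a candidate lying exactly on the threshold is kept, while a less activated one is dropped.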

Selectivity rule

Selectivity rules are real-valued chemical expressions that order products according to their occurrence. If a selectivity rule is specified, then Reactor sorts the product lists by the evaluation result of the rule (decreasing order) and returns product lists in this order.

  • -energyE(ratom(1))

    This is a specific directional rule saying that the electrophilic substitution takes place on the aromatic carbon atom with the lowest localization energy having an attached electrophile in the transition state.

  • selectivity tolerance:

    0.02

    This setting raises the tolerance from its default value of 0.0001, so that other aromatic carbons whose localization energy differs from the lowest value by at most 0.02 are also accepted. Results are sorted by localization energy in ascending order, lowest first.
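The interplay of the negated energy, the decreasing sort order and the tolerance can be sketched in Python. This is one reading of the description above, not Reactor's actual algorithm; rank_with_tolerance is a hypothetical function and the localization energies are invented placeholders.

```python
def rank_with_tolerance(sites, tolerance=0.0001):
    """Rank (label, localization_energy) pairs by the selectivity value.

    The selectivity value is the negated energy, so sorting the values in
    decreasing order puts the lowest localization energy first. Sites whose
    value lies within `tolerance` of the best one are kept (assumed reading
    of the tolerance setting).
    """
    scored = sorted(
        ((-energy, label) for label, energy in sites),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not scored:
        return []
    best = scored[0][0]
    return [label for value, label in scored if best - value <= tolerance]


# Placeholder energies for three candidate ring positions.
sites = [("ortho", 2.31), ("meta", 2.45), ("para", 2.30)]

strict = rank_with_tolerance(sites)         # default tolerance 0.0001
relaxed = rank_with_tolerance(sites, 0.02)  # tolerance from the example
```

With the default tolerance only the single best site survives; with 0.02 the second site, whose energy differs by 0.01, is accepted as well and listed after the best one.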

Exclude rule

An exclude rule can be specified to exclude product lists even if they satisfy the reactivity rule.

match(ratom(2), "[C:1]C=C", 1) ||
match(reactant(0), "[O,S]C=[O,S]") ||
match(reactant(0), "P[H]") ||
(max(pka(reactant(0), filter(reactant(0), "match('[O,S;H]')"), "acidic")) > 14.5) ||
(max(pka(reactant(0), filter(reactant(0), "match('[#7][H]')"), "basic")) > 0)

The first reactant may not contain a carboxylic acid group or its thio analogue. Acryloyl halides are excluded as acylating agents. PH compounds are excluded as well, together with aromatic compounds containing nucleophilic groups that can themselves be acylated under these conditions (OH and SH compounds with an acidic pKa higher than 14.5, and NH compounds with a basic pKa higher than 0). This is a more complicated, somewhat heuristic rule. You may want to look at a similar condition in the basic examples section, which refers to a molecule context. Note that here the condition must explicitly refer to the input molecule as reactant(0), meaning the first reactant, whereas in a molecule context the expression implicitly refers to the input molecule.

We exclude reactant pairs satisfying any of the following subexpressions (|| means logical OR):

  1. the second reactant is an acryloyl halide, which is tested with an atom-matching condition on carbon atoms:

    match(ratom(2), "[C:1]C=C", 1)

    acryloic-halide.smiles

    images/www.chemaxon.com/marvin/examples/evaluator/img/acryloic-halide.png

  2. the first reactant contains a carboxylic acid group or its thio analogue:

    match(reactant(0), "[O,S]C=[O,S]")

  3. the first reactant contains a phosphorus with an attached hydrogen:

    match(reactant(0), "P[H]")

  4. the first reactant contains an OH or SH with acidic pKa greater than 14.5:

    max(pka(reactant(0), filter(reactant(0), "match('[O,S;H]')"), "acidic")) > 14.5

  5. the first reactant contains a nitrogen with an attached hydrogen and a positive basic pKa:

    max(pka(reactant(0), filter(reactant(0), "match('[#7][H]')"), "basic")) > 0
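The way the subexpressions combine can be sketched in Python. The function is_excluded and the dictionary keys are hypothetical; the substructure matches are represented by precomputed Boolean flags and the pKa lists by placeholder numbers, so nothing here is calculated by Chemical Terms. Treating an empty pKa list as "condition not met" is an assumption of this sketch.

```python
NEG_INF = float("-inf")


def is_excluded(d):
    """Return True if any subexpression of the exclude rule holds.

    `d` holds precomputed descriptors for one reactant pair (hypothetical keys).
    """
    return (
        d["acyl_carbon_is_vinylic"]                # acryloyl halide match
        or d["has_carboxylic_or_thio_analogue"]    # [O,S]C=[O,S] match
        or d["has_p_h"]                            # P[H] match
        # max over the filtered atoms; an empty list never exceeds the limit
        or max(d["acidic_pka_of_oh_sh"], default=NEG_INF) > 14.5
        or max(d["basic_pka_of_nh"], default=NEG_INF) > 0
    )


# Illustrative placeholder descriptors, not computed values.
plain_arene = {
    "acyl_carbon_is_vinylic": False,
    "has_carboxylic_or_thio_analogue": False,
    "has_p_h": False,
    "acidic_pka_of_oh_sh": [],
    "basic_pka_of_nh": [],
}
with_nh_group = dict(plain_arene, basic_pka_of_nh=[4.6])
with_weakly_acidic_oh = dict(plain_arene, acidic_pka_of_oh_sh=[15.5])
with_acidic_oh = dict(plain_arene, acidic_pka_of_oh_sh=[10.0])
```

A single satisfied subexpression is enough to exclude the pair, which is why the conditions are joined with a logical OR rather than AND. Note that an OH group with an acidic pKa below 14.5 does not trigger the fourth condition.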