Special search types: Polymer search
JChem supports searching polymers with polymer queries. Polymers can also be searched by a substructure of the polymer's structural repeating unit. (In addition to the polymer search features described in this section, all query features of non-polymer search can be used.) Substructures of the represented polymer that contain two or more structural repeating units are not found. Polymer search is designed to compare polymers using their source-based or structure-based representation. Polymers can be stored in tables of molecule type. Markush structures with polymers are not yet supported.
Polymer representation
Polymers are represented either by their repeating unit (structure-based representation) or by the original monomer (source-based representation). Both representations require enclosing the structure in the appropriate bracket (see Table 1).
Table 1. SRU and monomer brackets
Structural repeating unit (structure-based representation) | Monomer bracket (source-based representation) |
[bracket images not available]
Repeat patterns
Structural repeating units (SRUs) with two bracket-crossing bonds can have the following three repeat patterns:
- ht: head-to-tail repeat pattern.
- hh: head-to-head repeat pattern.
- eu: either-unknown repeat pattern. Both repetitions are possible or the sequence is unknown.
The polymer structure represented by a monomer bracket is considered to have head-to-tail repeat pattern.
In duplicate search, repeat patterns must match exactly. In other search types, the either-unknown repeat pattern can match all types, while the "hh" and "ht" repeat patterns can each match only themselves.
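The rule for SRUs with two bracket-crossing bonds can be summarized in a few lines. The following Python sketch only illustrates the rule as stated above and is not JChem code: the function name `repeat_patterns_match` and the search-type labels are hypothetical, and it assumes that the either-unknown pattern acts as a wildcard on either the query or the target side, which the text does not state explicitly.

```python
# Illustrative model of the repeat-pattern rule; not part of the JChem API.
REPEAT_PATTERNS = {"ht", "hh", "eu"}


def repeat_patterns_match(query: str, target: str,
                          search_type: str = "substructure") -> bool:
    """Return True if two repeat patterns are compatible for the search type."""
    if query not in REPEAT_PATTERNS or target not in REPEAT_PATTERNS:
        raise ValueError("repeat pattern must be one of: ht, hh, eu")
    if search_type == "duplicate":
        # Duplicate search: the patterns must be identical.
        return query == target
    # Other search types: "eu" matches everything (assumed on either side),
    # while "ht" and "hh" match only themselves.
    return query == target or "eu" in (query, target)
```

With this model, `repeat_patterns_match("eu", "ht")` is true in substructure search but false when `search_type="duplicate"`.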
Ladder-type polymers, that is, structures with four bracket-crossing bonds, additionally have a flip option for the specific repeat patterns. Hence there are five repeat patterns for these polymers:
- ht,f: head-to-tail with flip
- ht: head-to-tail without flip
- hh,f: head-to-head with flip
- hh: head-to-head without flip
- eu: either-unknown repeat pattern. All repetitions are possible or the sequence is unknown.
As in the case of two crossing bonds, the either-unknown repeat pattern can match all other types in non-duplicate search types. For the other repeat patterns, the flip setting must match as well. Examples of ladder-type structures with different repeat patterns are shown in Table 2.
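For ladder-type polymers the flip flag becomes part of the comparison. The sketch below parses the five labels listed above and compares them. It is a simplified illustration, not the JChem implementation: `LadderPattern`, `parse_ladder_pattern` and `ladder_patterns_match` are hypothetical names, and the either-unknown pattern is again assumed to be a wildcard on either side.

```python
from typing import NamedTuple


class LadderPattern(NamedTuple):
    kind: str           # "ht", "hh" or "eu"
    flip: bool = False  # only meaningful for "ht" and "hh"


def parse_ladder_pattern(label: str) -> LadderPattern:
    """Parse one of the labels "ht", "ht,f", "hh", "hh,f" or "eu"."""
    kind, _, flag = label.partition(",")
    if kind not in ("ht", "hh", "eu") or flag not in ("", "f"):
        raise ValueError(f"unknown ladder repeat pattern: {label!r}")
    if kind == "eu" and flag:
        raise ValueError("the eu pattern has no flip variant")
    return LadderPattern(kind, flag == "f")


def ladder_patterns_match(query: str, target: str,
                          duplicate: bool = False) -> bool:
    """Compare two ladder repeat patterns, including the flip flag."""
    q, t = parse_ladder_pattern(query), parse_ladder_pattern(target)
    if duplicate:
        # Duplicate search: kind and flip must both be identical.
        return q == t
    if "eu" in (q.kind, t.kind):
        # Either-unknown matches all other patterns (assumed on either side).
        return True
    # Specific patterns: kind and flip flag must both match.
    return q == t
```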
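Table 2. Ladder-type structures
polymer SRU | repeat pattern | example structure |
[image not available] | ht | [image not available] |
[image not available] | ht,f | [image not available] |
[image not available] | hh | [image not available] |
[image not available] | hh,f | [image not available] |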
Monomer-SRU matching
Monomer and structural repeating unit representations of the same polymer match each other in all search types except duplicate search. In duplicate search, the source-based and the structure-based representations, as well as any alternative source-based representations, are considered non-equivalent. This enables the database registration of different monomers of the same polymer.
Matching of monomers on SRUs and vice versa is achieved by transforming monomer representations into SRU representations. This transformation is based on polymerization rules, of which the system currently supports the following types:
- Addition to a double bond, e.g. polystyrene.
- Polymerization through elimination of water or HCl, e.g. polyester, polyamide.
Monomer transformation can be switched off by the appropriate search option. Table 3 contains examples of matching between SRU type polymers and monomers.
Table 3. Monomer and SRU matching
query | target | hit (monomer transformation) | hit (NO transformation) |
[structure images not available]
Cyclization and phase-shifting
For some polymers with head-to-tail repeat pattern the brackets can be shifted along the polymer chain. This "phase-shifted" SRU represents the same polymer. Structural repeating units with head-to-tail repeat pattern can find their phase-shifted variants. Although the structural repeating unit is different in these cases, the represented polymer is the same. Table 4 illustrates this behavior.
Table 4. Phase shifting for SRU type polymers
query | target |
[structure images not available]
For SRUs with head-to-head or either-unknown repeat pattern, the phase-shifted version does not match the original.
Phase shifting can be switched off in order to maintain compatibility with MDL search types (see compatibility notes).
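Phase shifting can be pictured as a cyclic rotation of the repeating unit. For example, poly(ethylene oxide) can be drawn with the bracket around C-C-O, C-O-C or O-C-C. The toy model below treats a linear SRU backbone as a list of atom symbols and checks whether one unit is a rotation of another. It is only an illustration under strong assumptions: bond orders, branches, substituents and end groups are ignored, the function names are hypothetical, and the `phase_shift` parameter merely mirrors the idea of the search option.

```python
def phase_shifts(backbone):
    """Return all cyclic rotations of a linear SRU backbone."""
    atoms = list(backbone)
    return [tuple(atoms[i:] + atoms[:i]) for i in range(len(atoms))]


def is_phase_shifted_variant(query, target, repeat_pattern="ht",
                             phase_shift=True):
    """Check whether two toy SRU backbones describe the same polymer."""
    if list(query) == list(target):
        return True
    # Only head-to-tail units can match a shifted variant, and only
    # while phase shifting is switched on.
    if repeat_pattern != "ht" or not phase_shift:
        return False
    return tuple(target) in phase_shifts(query)
```

In this model, `["C", "C", "O"]` matches `["C", "O", "C"]` for a head-to-tail unit, but not when the repeat pattern is `"hh"` or when `phase_shift=False`.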
End group matching
Polymers in structure-based representation can have specific or undefined end groups. The latter are denoted by star atoms.
If the end group matching option is chosen, the end groups have to match exactly. In this case an undefined end group can still match specific end groups. Otherwise the end groups are ignored.
When end group matching is on and the structures have specific end groups, there is no "phase-shifting" behavior.
Which groups count as end groups? Only groups that have exactly one bond crossing the bracket(s) are considered end groups. See the examples of end groups (highlighted in green) and of groups that are not end groups (highlighted in red).
[example images not available]
Matching of end-groups is illustrated in Table 5.
Table 5. Matching of end groups
query | target | hit (end group matching) | hit (NO end group matching) |
[structure images not available]
Further polymer types
Copolymers
Copolymers are formed of several polymers. They have the following subtypes:
- co: unspecified
- alt: alternating. The components alternate without repetition (e.g. ABABABAB...).
- rnd: random. The components are randomly distributed (e.g. AAABABBBAB...).
- blk: block. The components are arranged in blocks (e.g. ...AAAAABBBB...).
The unspecified subtype matches all other subtypes; the other subtypes have to match exactly, as shown in Table 6.
Table 6. Matching of copolymer subtypes
query | target |
[structure images not available]
Copolymers are differentiated by whether the bracket-crossing bonds cross the copolymer bracket and by whether the polymer components are connected. A connection between polymer components specifies their order. Crossing of the copolymer bracket means that the unit inside must repeat. If no bracket-crossing bonds cross the copolymer bracket, the copolymer should have the either-unknown repeat pattern. In duplicate search exact matching is required; in substructure search the matching behavior is shown in Table 7.
Table 7. Copolymer matching
query | target |
[structure images not available]
Copolymers can be matched by simple SRU-type polymers and by copolymers. If the "copolymerMatching" option is chosen, only copolymers can match a copolymer.
Other types
- grf: Grafted copolymer.
- xl: Cross-linked copolymer.
- mer: Mer bracket. It represents a structural unit that does not repeat with itself.
- mod: Modification of another structure.
Attached data search
Data sgroups attached to atoms of polymers or to polymer brackets are considered during searching. See the attached data matching documentation for details.
Attached data matching can be switched off with the attachedDataMatch search option, in which case all attached data is ignored.
Polymer Mixtures
Mixtures are built of different components. Depending on whether the order of the components is relevant, we distinguish ordered and unordered mixtures. Ordered mixtures, or formulations (sign "f"), can only match ordered mixtures with the same order, though different numbering is possible. Unordered mixtures (sign "mix") can match both types of mixtures. See the mixture documentation.
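The bracket-type part of this rule is asymmetric, which a short sketch makes explicit. The function below is a hypothetical illustration, not JChem code. It checks only whether the query bracket sign is compatible with the target bracket sign, reading "can only match" as a constraint on the target for a given query. The component order check and the structure matching itself are not modeled.

```python
def mixture_types_compatible(query_sign: str, target_sign: str) -> bool:
    """Check whether a query mixture bracket type can match a target type."""
    signs = ("f", "mix")
    if query_sign not in signs or target_sign not in signs:
        raise ValueError('mixture sign must be "f" or "mix"')
    if query_sign == "f":
        # Ordered (formulation) queries match only ordered targets.
        return target_sign == "f"
    # Unordered queries match both ordered and unordered targets.
    return True
```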
Polymers can be part of mixture-type brackets. Arbitrary depth of nesting is allowed. The criterion for matching is that the polymer/mixture brackets that include a given query structure must have corresponding target-side brackets with the same order of nesting.
Examples of polymer mixture matching are shown in Table 8.
Table 8. Polymer mixture matching
query | target |
[structure images not available]
Relation to MDL polymer search types
Chemaxon polymer searching can be configured to correspond to MDL's polymer search types. Table 9 shows the settings that can be used for the different MDL polymer search types.
Table 9. MDL and Chemaxon polymer search
MDL search type | Chemaxon search type | Additional options | Remarks |
Polymer exact | duplicate | polymer:y phaseShift:n | - |
Find monomer or sru | duplicate | polymer:y transformMonomer:y | - |
Polymer substructure | substructure search | polymer:y transformMonomer:n | - |
Copolymer search | substructure search | polymer:y copolymerMatching:y | - |
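When these settings are needed programmatically, the table can be kept as a small lookup structure. The Python sketch below transcribes Table 9 into a dictionary and formats the additional options in the same `key:value` form used in the table. The names `MDL_TO_CHEMAXON` and `options_string` are hypothetical, and how the resulting string is passed to JChem is not covered by this section and is not assumed here.

```python
# Settings transcribed from Table 9; the dictionary layout is illustrative.
MDL_TO_CHEMAXON = {
    "Polymer exact": (
        "duplicate", {"polymer": "y", "phaseShift": "n"}),
    "Find monomer or sru": (
        "duplicate", {"polymer": "y", "transformMonomer": "y"}),
    "Polymer substructure": (
        "substructure search", {"polymer": "y", "transformMonomer": "n"}),
    "Copolymer search": (
        "substructure search", {"polymer": "y", "copolymerMatching": "y"}),
}


def options_string(mdl_search_type: str) -> str:
    """Format the additional options as space-separated key:value pairs."""
    _, options = MDL_TO_CHEMAXON[mdl_search_type]
    return " ".join(f"{key}:{value}" for key, value in options.items())
```

For example, `options_string("Polymer exact")` yields `polymer:y phaseShift:n`, matching the first row of the table.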