Special search types: Polymer search

JChem supports searching polymers with polymer queries. Polymers can also be searched by a substructure of the polymer's structural repeating unit (SRU). In addition to the polymer search features described in this section, all query features of non-polymer search can be used. A substructure of the represented polymer that spans two or more structural repeating units is not found in the polymer. Polymer search is designed to compare polymers using their source-based or structure-based representation. Polymers can be stored in tables of molecule type. Markush structures containing polymers are not yet supported.

Polymer representation

Polymers are represented either by their repeating unit (structure-based representation) or by the original monomer (source-based representation). Both representations require enclosing the structure in the appropriate bracket (see Table 1).

Table 1. SRU and monomer brackets

| Structural repeating unit (structure-based representation) | Monomer bracket (source-based representation) |
| --- | --- |
| images/download/attachments/41128972/poly1.jpg | images/download/attachments/41128972/poly2.jpg |

Repeat patterns

Structural repeating units (SRU) with two bracket-crossing bonds can have the following three repeat patterns:

  • ht: head-to-tail repeat pattern.

  • hh: head-to-head repeat pattern.

  • eu: either-unknown repeat pattern. Both repetitions are possible or the sequence is unknown.

The polymer structure represented by a monomer bracket is considered to have a head-to-tail repeat pattern.

For the duplicate search type, repeat patterns must match exactly. In other search types, the either-unknown repeat pattern can match all types, while the "hh" and "ht" repeat patterns can match only themselves.

Ladder-type polymers (structures with four bracket-crossing bonds) additionally have a flip option for the specific repeat patterns. Hence there are five repeat patterns for these polymers:

  • ht,f: head-to-tail with flip

  • ht: head-to-tail without flip

  • hh,f: head-to-head with flip

  • hh: head-to-head without flip

  • eu: either-unknown repeat pattern. All repetitions are possible or the sequence is unknown.

As in the case of two crossing bonds, the either-unknown repeat pattern can match all other types in non-duplicate search types. For the other repeat patterns, the flip option must match as well. Examples of ladder-type structures with different repeat patterns are shown in Table 2.

Table 2. Ladder type structures

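| polymer SRU | repeat pattern | example structure |
| --- | --- | --- |
| images/download/attachments/41128972/polyladder.jpg | ht | images/download/attachments/41128972/polyladderht.jpg |
| images/download/attachments/41128972/polyladder.jpg | ht,f | images/download/attachments/41128972/polyladderhtf.jpg |
| images/download/attachments/41128972/polyladder.jpg | hh | images/download/attachments/41128972/polyladderhh.jpg |
| images/download/attachments/41128972/polyladder.jpg | hh,f | images/download/attachments/41128972/polyladderhhf.jpg |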
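
The repeat pattern rules for both the two-bond and the ladder-type case can be summarized in a few lines of Python. This is only an illustrative sketch, not JChem code: the names `RepeatPattern`, `parse_pattern` and `repeat_patterns_match` are hypothetical, and the sketch assumes that the either-unknown pattern acts as a wildcard on the query side only, which the text above does not state explicitly.

```python
from typing import NamedTuple


class RepeatPattern(NamedTuple):
    kind: str            # "ht", "hh" or "eu"
    flip: bool = False   # only meaningful for ladder-type SRUs


def parse_pattern(label: str) -> RepeatPattern:
    """Parse a label such as 'ht', 'hh,f' or 'eu' (hypothetical helper)."""
    parts = [p.strip() for p in label.split(",")]
    if parts[0] not in ("ht", "hh", "eu"):
        raise ValueError(f"unknown repeat pattern: {label!r}")
    flip = len(parts) > 1 and parts[1] == "f"
    return RepeatPattern(parts[0], flip)


def repeat_patterns_match(query: str, target: str,
                          duplicate_search: bool = False) -> bool:
    q, t = parse_pattern(query), parse_pattern(target)
    if duplicate_search:
        # Duplicate search: patterns must match exactly.
        return q == t
    if q.kind == "eu":
        # Assumption: an either-unknown query matches any target pattern.
        return True
    # "ht" and "hh" match only themselves, including the flip option.
    return q == t


print(repeat_patterns_match("eu", "hh,f"))   # either-unknown query
print(repeat_patterns_match("ht,f", "ht"))   # flip option differs
```

If the either-unknown pattern is in fact symmetric, the wildcard check would also have to be applied to the target pattern.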

Monomer-SRU matching

Monomer and structural repeating unit representations of the same polymer match each other in all search types except duplicate search. In duplicate search, the source-based and the structure-based representations, as well as any alternative source-based representations, are considered non-equivalent. This makes it possible to register different monomers of the same polymer in the database.

Matching of monomers on SRUs and vice versa is achieved by transforming monomer representations to SRU representations. This transformation is based on polymerization rules, of which the system currently supports the following types:

  • Addition to a double bond. E.g. polystyrene.

  • Polymerization through elimination of water or HCl. E.g. polyester, polyamide.

Monomer transformation can be switched off with the appropriate search option (listed as transformMonomer in Table 9). Table 3 contains examples of matching between SRU-type polymers and monomers.

Table 3. Monomer and SRU matching

| query | target | hit with monomer transformation | hit with NO transformation |
| --- | --- | --- | --- |
| images/download/attachments/41128972/poly2.jpg | images/download/attachments/41128972/poly2.jpg | yes | yes |
| images/download/attachments/41128972/poly2.jpg | images/download/attachments/41128972/poly3.jpg | yes | no |
| images/download/attachments/41128972/poly4.jpg | images/download/attachments/41128972/poly4.jpg | yes | yes |
| images/download/attachments/41128972/poly4.jpg | images/download/attachments/41128972/poly5.jpg | yes | no |
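
The hit pattern in Table 3 can be restated as a small decision function. The sketch below is a simplified model for non-duplicate search types only. It assumes that query and target describe the same polymer and that two structures of the same representation kind are drawn identically. The function name and the string labels are hypothetical and are not part of JChem.

```python
def representations_match(query_kind: str, target_kind: str,
                          transform_monomer: bool = True) -> bool:
    """Model of monomer/SRU matching for non-duplicate search types.

    query_kind and target_kind are "monomer" or "sru" and are assumed
    to describe the same polymer.
    """
    for kind in (query_kind, target_kind):
        if kind not in ("monomer", "sru"):
            raise ValueError(f"unknown representation: {kind!r}")
    if query_kind == target_kind:
        # Same representation on both sides: no transformation needed.
        return True
    # Different representations match only if the monomer is
    # transformed to its SRU form first.
    return transform_monomer


print(representations_match("monomer", "sru"))
print(representations_match("monomer", "sru", transform_monomer=False))
```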

Cyclization and phase-shifting

For some polymers with a head-to-tail repeat pattern, the brackets can be shifted along the polymer chain. The resulting "phase-shifted" SRU differs from the original, but it represents the same polymer. Structural repeating units with a head-to-tail repeat pattern therefore match their phase-shifted variants. Table 4 illustrates this behavior.

Table 4. Phase shifting for SRU type polymers.

| query \ target | images/download/attachments/41128972/poly5.jpg | images/download/attachments/41128972/poly6.jpg |
| --- | --- | --- |
| images/download/attachments/41128972/poly5.jpg | yes | yes |
| images/download/attachments/41128972/poly6.jpg | yes | yes |

For SRUs with a head-to-head or either-unknown repeat pattern, the phase-shifted version does not match the original.

Phase shifting can be switched off in order to maintain compatibility with MDL search types (see compatibility notes).
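
One way to picture phase shifting is as a cyclic rotation of the backbone inside the brackets. The sketch below models an SRU as a flat list of backbone tokens. This is a strong simplification that ignores branches, bond orders, chain direction and substructure matching, and assumes both SRUs carry the same repeat pattern. The function names are hypothetical. It only illustrates why shifted brackets describe the same head-to-tail polymer.

```python
from typing import Sequence


def is_phase_shift(query: Sequence[str], target: Sequence[str]) -> bool:
    """True if target is a cyclic rotation of query (same repeating chain)."""
    q, t = list(query), list(target)
    if len(q) != len(t):
        return False
    if not q:
        return True
    return any(q[i:] + q[:i] == t for i in range(len(q)))


def sru_matches(query: Sequence[str], target: Sequence[str],
                repeat_pattern: str = "ht", phase_shift: bool = True) -> bool:
    if list(query) == list(target):
        return True
    # Phase-shifted variants match only for head-to-tail SRUs and only
    # while phase shifting is switched on.
    return phase_shift and repeat_pattern == "ht" and is_phase_shift(query, target)


# Two bracket placements on a poly(ethylene oxide)-like chain.
a = ["O", "CH2", "CH2"]
b = ["CH2", "O", "CH2"]
print(sru_matches(a, b))                       # head-to-tail, shifting on
print(sru_matches(a, b, repeat_pattern="hh"))  # head-to-head
print(sru_matches(a, b, phase_shift=False))    # shifting switched off
```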

End group matching

Polymers in structure-based representation can have specific or undefined end groups. The latter are denoted by star atoms.

If the end group matching option is chosen, the end groups have to match exactly. In this case an undefined end group can still match specific end groups. Otherwise the end groups are ignored.

When end group matching is on and the structures have specific end groups, there is no "phase-shifting" behavior.

Which groups count as end groups? Only groups that have exactly one bracket-crossing bond are considered end groups. See the examples of end groups (highlighted in green) and of groups that are not end groups (highlighted in red).

images/download/attachments/41128972/poly8_endgroup01.png

images/download/attachments/41128972/poly_noendgroup01.png

images/download/attachments/41128972/poly_noendgroup03.png

images/download/attachments/41128972/poly_noendgroup04.png

Matching of end-groups is illustrated in Table 5.

Table 5. Matching of end groups

| query | target | hit with end group matching | hit with NO end group matching |
| --- | --- | --- | --- |
| images/download/attachments/41128972/poly7.jpg | images/download/attachments/41128972/poly7.jpg | yes | yes |
| images/download/attachments/41128972/poly7.jpg | images/download/attachments/41128972/poly8.jpg | no | yes |
| images/download/attachments/41128972/poly7.jpg | images/download/attachments/41128972/poly5.jpg | no | yes |
| images/download/attachments/41128972/poly7.jpg | images/download/attachments/41128972/poly9.jpg | no | yes |
| images/download/attachments/41128972/poly5.jpg | images/download/attachments/41128972/poly7.jpg | yes | yes |
| images/download/attachments/41128972/poly5.jpg | images/download/attachments/41128972/poly8.jpg | yes | yes |
| images/download/attachments/41128972/poly5.jpg | images/download/attachments/41128972/poly5.jpg | yes | yes |
| images/download/attachments/41128972/poly5.jpg | images/download/attachments/41128972/poly9.jpg | yes | yes |
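
The end group rule can be sketched as a comparison of two end group labels per structure, with the star atom acting as a wildcard. The sketch makes two assumptions that the text does not confirm: the wildcard applies on the query side only, and the two ends may be compared in either orientation. End groups are plain strings here, and the function name is hypothetical.

```python
from typing import Tuple

UNDEFINED = "*"  # star atom: undefined end group


def end_groups_match(query_ends: Tuple[str, str],
                     target_ends: Tuple[str, str],
                     end_group_matching: bool = True) -> bool:
    if not end_group_matching:
        # Option off: end groups are ignored.
        return True

    def one(q: str, t: str) -> bool:
        # Assumption: an undefined query end group matches anything;
        # a specific query end group must match exactly.
        return q == UNDEFINED or q == t

    (q1, q2), (t1, t2) = query_ends, target_ends
    # Assumption: the chain may be read in either direction.
    return (one(q1, t1) and one(q2, t2)) or (one(q1, t2) and one(q2, t1))


print(end_groups_match(("*", "*"), ("H", "OH")))   # undefined query ends
print(end_groups_match(("H", "OH"), ("H", "Cl")))  # specific ends differ
print(end_groups_match(("H", "OH"), ("H", "Cl"), end_group_matching=False))
```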

Further polymer types

Copolymers

Copolymers are formed of several polymers. They have the following subtypes:

  • co - unspecified

  • alt - alternating - the components are alternating without repetition (e.g. ABABABAB...)

  • rnd - random - the components are randomly distributed. (e.g. AAABABBBAB...)

  • blk - block - the components are arranged in blocks. (e.g. ...AAAAABBBB...)

Unspecified matches all other subtypes; the other subtypes have to match exactly, as shown in Table 6.

Table 6. Matching of copolymer subtypes.

| query \ target | images/download/attachments/41128972/poly_co.jpg | images/download/attachments/41128972/poly_alt.jpg | images/download/attachments/41128972/poly_blk.jpg |
| --- | --- | --- | --- |
| images/download/attachments/41128972/poly_co.jpg | yes | yes | yes |
| images/download/attachments/41128972/poly_blk.jpg | no | no | yes |
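
The subtype rule behind Table 6 is compact enough to express directly. The following sketch uses the subtype signs from the list above; the function name is hypothetical, and the query-side reading of "unspecified matches all" is taken from the hit pattern in Table 6.

```python
COPOLYMER_SUBTYPES = ("co", "alt", "rnd", "blk")


def copolymer_subtype_matches(query: str, target: str) -> bool:
    for subtype in (query, target):
        if subtype not in COPOLYMER_SUBTYPES:
            raise ValueError(f"unknown copolymer subtype: {subtype!r}")
    # An unspecified ("co") query matches every subtype;
    # any other query subtype must match the target exactly.
    return query == "co" or query == target


# Reproduce the rows of Table 6 for the targets co, alt and blk.
for query in ("co", "blk"):
    hits = [copolymer_subtype_matches(query, t) for t in ("co", "alt", "blk")]
    print(query, hits)
```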

Copolymers are differentiated by whether bracket-crossing bonds cross the copolymer bracket and by whether the polymer components are connected. A connection between polymer components specifies their order. Bonds crossing the copolymer bracket mean that the unit inside must repeat. If no bracket-crossing bonds cross the copolymer bracket, the copolymer should have the either-unknown repeat pattern. Duplicate search requires exact matching; the matching behavior in substructure search is shown in Table 7.

Table 7. Copolymer matching.

| query \ target | images/download/attachments/41128972/poly_co.jpg | images/download/attachments/41128972/poly_co3.jpg | images/download/attachments/41128972/poly_co4.jpg |
| --- | --- | --- | --- |
| images/download/attachments/41128972/poly_co.jpg | yes | yes | yes |
| images/download/attachments/41128972/poly_co3.jpg | no | yes | yes |
| images/download/attachments/41128972/poly_co4.jpg | no | no | yes |

Copolymers can be matched by simple SRU-type polymers and by copolymers. If the "copolymerMatching" option is chosen, only copolymers can match a copolymer.

Other types

  • grf: grafted copolymer.

  • xl: cross-linked copolymer.

  • mer: the mer bracket represents a structural unit that does not repeat with itself.

  • mod: modification of another structure.

Attached data search

Data sgroups attached to atoms of polymers or to polymer brackets are considered during searching. For details, see the documentation on attached data matching.

Attached data matching can be switched off with the attachedDataMatch search option, in which case all attached data are ignored.

Polymer Mixtures

Mixtures are built of different components. Depending on whether the order of the components is relevant, ordered and unordered mixtures are distinguished. Ordered mixtures, or formulations (sign "f"), can match only ordered mixtures with the same order, though different numbering is possible. Unordered mixtures (sign "mix") can match both types of mixtures. See the mixture documentation.

Polymers can be part of mixture-type brackets. Arbitrary depth of nesting is allowed. The criterion for matching is that the polymer or mixture brackets that enclose a given query structure must have corresponding target-side brackets with the same order of nesting.

Examples of polymer mixture matching are shown in Table 8.

Table 8. Polymer mixture matching.

| query \ target | images/download/attachments/41128972/polymer_mix1.jpg | images/download/attachments/41128972/polymer_mix2.jpg | images/download/attachments/41128972/polymer_mix4.jpg |
| --- | --- | --- | --- |
| images/download/attachments/41128972/polymer_mix1.jpg | yes | yes | yes |
| images/download/attachments/41128972/polymer_mix2.jpg | no | yes | no |
| images/download/attachments/41128972/polymer_mix4.jpg | no | no | yes |
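
The difference between ordered and unordered mixture queries can be illustrated with a toy model in which each component is a plain label and components match by equality. The sketch is hedged in several ways: real matching compares structures, nesting is not modeled, and "same order, though different numbering is possible" is interpreted here as "the query components appear in the target in the same relative order". The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Mixture:
    kind: str                    # "f" (ordered) or "mix" (unordered)
    components: Tuple[str, ...]  # component labels standing in for structures


def _is_subsequence(needle, haystack) -> bool:
    # Consumes the haystack iterator, so relative order is enforced.
    it = iter(haystack)
    return all(item in it for item in needle)


def mixture_matches(query: Mixture, target: Mixture) -> bool:
    if query.kind == "f":
        # An ordered query matches only ordered targets with the same order.
        return target.kind == "f" and _is_subsequence(query.components,
                                                      target.components)
    # An unordered query matches both kinds; every query component needs
    # its own target component, in any position.
    remaining = list(target.components)
    for component in query.components:
        if component not in remaining:
            return False
        remaining.remove(component)
    return True


print(mixture_matches(Mixture("mix", ("A", "B")), Mixture("f", ("B", "X", "A"))))
print(mixture_matches(Mixture("f", ("A", "B")), Mixture("f", ("B", "A"))))
```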

Relation to MDL polymer search types

Chemaxon polymer searching can be configured to correspond to MDL's polymer search types. Table 9 shows the settings that can be used for the different MDL polymer search types.

Table 9. MDL and Chemaxon polymer search

| MDL search type | Chemaxon search type | Additional options | Remarks |
| --- | --- | --- | --- |
| Polymer exact | duplicate | polymer:y phaseShift:n | - |
| Find monomer or sru | duplicate | polymer:y transformMonomer:y | - |
| Polymer substructure | substructure search | polymer:y transformMonomer:n | - |
| Copolymer search | substructure search | polymer:y copolymerMatching:y | - |
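
For scripts that need to switch between MDL-style search modes, the mapping in Table 9 can be kept in a small lookup table. The sketch below only restates the table as data and builds the option string in the `name:value` form shown there. How the search type and the option string are passed to JChem is not covered here, and the names `MDL_TO_CHEMAXON` and `settings_for` are hypothetical.

```python
# Search type and additional options per MDL polymer search type (Table 9).
MDL_TO_CHEMAXON = {
    "Polymer exact": ("duplicate", {"polymer": "y", "phaseShift": "n"}),
    "Find monomer or sru": ("duplicate", {"polymer": "y", "transformMonomer": "y"}),
    "Polymer substructure": ("substructure", {"polymer": "y", "transformMonomer": "n"}),
    "Copolymer search": ("substructure", {"polymer": "y", "copolymerMatching": "y"}),
}


def settings_for(mdl_search_type: str):
    """Return (Chemaxon search type, option string) for an MDL search type."""
    try:
        search_type, options = MDL_TO_CHEMAXON[mdl_search_type]
    except KeyError:
        raise ValueError(f"unknown MDL search type: {mdl_search_type!r}") from None
    # Options are joined in the "name:value" form used in Table 9.
    option_string = " ".join(f"{name}:{value}" for name, value in options.items())
    return search_type, option_string


for mdl_type in MDL_TO_CHEMAXON:
    print(mdl_type, "->", settings_for(mdl_type))
```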