ChemCurator is a standalone desktop application from ChemAxon for computer-aided chemical information extraction. To run the application, download and run the ChemCurator installer. Short video tutorials demonstrating the main functionality are also available.
Main Menu and Toolbar
The main menu contains "File", "View", "Window" and "Help" elements.
New Project... (Ctrl+Shift+N)
Open Project... (Ctrl+Shift+O)
Open Recent Project
Import Project from ZIP...
Export Project to ZIP...
Link Project View
Show Only Editor (Ctrl+Shift+Enter)
Full Screen (Alt+Shift+Enter)
Close All Documents
Close Other Documents
Check for Updates...
Panels and Views
Most panels and views in ChemCurator can be resized or moved to a different location or screen, depending on your preferences. The default layout can be restored with the Reset Windows function.
Project explorer panel
The Project explorer panel displays the opened projects and shows each project's structure as a tree-like hierarchy. Every project represents one document, and you can add as many Markush structures and compound lists to it as you want. Every Markush structure automatically has an Exemplified structures list.
Document view is the viewer component for the annotated documents and the related selections. The recognized chemical entities are highlighted in gray. In structure selection mode, you can select recognized chemical structures by clicking on any highlighted component, or select a larger part of the document by pressing the left mouse button and dragging over the targeted part of the document. The selected structures are highlighted in red and displayed under the document in the selection panel. In text selection mode, you can select the document text directly. Document linking turns on automatic scrolling of the document based on the structure selections in the editor views. The document's zoom level can also be changed.
Compounds view is the display component for compound lists. It handles not only chemical structures but also the related additional information columns. Data can be edited by double-clicking on any of the cells.
Markush Editor view
Markush Editor view is the display component for Markush structures and their related exemplified structures. It is based on the same component as the Markush Editor desktop application, so the details of editing Markush structures are available in the Markush Editor documentation. Compared to the desktop application, the Markush Editor view contains an additional bottom row with the exemplified structures related to the Markush structure. Exemplified structures are continuously validated against the Markush structure: examples that match the Markush structure are highlighted in green, and non-matching structures are highlighted in red.
Structure checker panel
The Structure checker panel displays the structure drawing errors and warnings related to the active editor component. An error is marked with an exclamation mark in a red circle, and a warning is marked with a yellow triangle. By clicking on a checker item, you can choose between the available automatic fixer options. You can fix the issues one by one with the Fix Selected button or all together with the Fix All button.
Create new project
In ChemCurator, every project represents a document, and the extracted chemical information belongs to this document. ChemCurator offers multiple project creation options based on different document sources. Regardless of the original format, every document is converted to annotated HTML that preserves the structure and layout of the original document. The duration of the annotation process depends strongly on the format, size, and content of the original document. The new project wizard is available from File>New Project... or from the main toolbar.
Import document from file
A project can be created from a file stored on your local machine. ChemCurator can process PDF, HTML, XML and TXT documents.
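When preparing a batch of documents for import, it can help to check file types up front. The following is a minimal sketch that only tests the file extension against the four formats listed above. The names SUPPORTED_EXTENSIONS and is_supported_document are hypothetical helpers and not part of ChemCurator, and the sketch assumes that only the literal extensions .pdf, .html, .xml and .txt count.

```python
from pathlib import Path

# Formats named in this guide; the constant name is our own (hypothetical).
SUPPORTED_EXTENSIONS = {".pdf", ".html", ".xml", ".txt"}


def is_supported_document(path: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    Only the extension is inspected; the file content is not validated.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive, so a file named patent.PDF passes, while a file such as patent.docx does not.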
Import document from Google Patents
Patent documents can be imported directly from Google Patents by using the publication number of the document. The import wizard tries to find the corresponding document in Google Patents and downloads the HTML version of the patent automatically. For most non-English patents, a machine-translated English version is available in Google Patents. If you want to download the original version, select Original in the language preferences.
Import document from IFI Claims
If you have IFI Claims access, you can also import documents directly from IFI Claims. The import wizard tries to find the corresponding document in IFI Claims and downloads the HTML version of the patent automatically.
Create demo project
The Create demo project function creates an example project containing the annotated version of the US6756383B2 patent document from Google Patents and some curated data, including a Markush structure and a compound list.
With the annotation configuration, you can fine-tune the annotation parameters according to your needs. The settings panel is available from File>Options... or from the main toolbar.
Chemical data extraction
ChemCurator offers several functions to help with the recognition and extraction of the relevant chemical information from documents.
Create new Markush or Compounds list
ChemCurator supports two types of chemical information: Markush structures and compound lists. A Markush structure object is always created together with a linked special compound list, the Examples.
Manual structure extraction
Any annotated structure can be selected from the document. After selection, it can be moved using drag and drop from the selected structures view to editor components.
Compounds extraction wizard
The Compounds extraction wizard is available in the Compounds and Markush views. It helps to find and extract a large number of chemical structures from the document automatically.
The first panel of the wizard offers some basic filter options for parametrizing the extraction process.
Filter duplicates: Ignore duplicates by extracting only the first occurrence of each compound from the document.
Minimum mass: Set a minimum molecular mass as a filter criterion.
Maximum mass: Set a maximum molecular mass as a filter criterion.
Structure filtering options:
None: No structure filter is applied.
Substructure: A substructure filter criterion can be set after clicking the Next button.
Similarity with threshold: A similarity filter criterion can be set after clicking the Next button. An MCS-based similarity calculation is executed in the background, and the structures are filtered by their Tanimoto similarity.
If Substructure or Similarity with threshold is selected, clicking the Next button takes you to the second tab of the extraction wizard. With Similarity with threshold, only exact compounds without any variability features can be used as a filter. With Substructure, any atom lists, bond lists, and query properties can be used.
After clicking the Finish button, the extraction starts. With Similarity with threshold, an additional column containing the similarity value of each compound is added to the extracted compounds.
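The way the filter options combine can be illustrated with a small sketch. This is not ChemCurator's implementation: the Compound class, its fields, and filter_compounds are hypothetical, and the similarity here is a plain Tanimoto coefficient over abstract feature sets rather than the MCS-based calculation the wizard runs. The sketch assumes that each compound carries a canonical key for duplicate detection and a precomputed molecular mass.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Compound:
    key: str                           # hypothetical canonical identifier
    mass: float                        # molecular mass
    features: frozenset = frozenset()  # stand-in for structural features


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def filter_compounds(
    compounds: Iterable[Compound],
    filter_duplicates: bool = True,
    min_mass: Optional[float] = None,
    max_mass: Optional[float] = None,
    query: Optional[Compound] = None,
    threshold: float = 0.0,
) -> List[Tuple[Compound, Optional[float]]]:
    """Return (compound, similarity) pairs that pass all active filters.

    similarity is None when no query compound is given.
    """
    seen = set()
    kept = []
    for compound in compounds:
        # Keep only the first occurrence of each compound.
        if filter_duplicates and compound.key in seen:
            continue
        seen.add(compound.key)
        # Apply the optional mass window.
        if min_mass is not None and compound.mass < min_mass:
            continue
        if max_mass is not None and compound.mass > max_mass:
            continue
        # Apply the optional similarity threshold.
        similarity = None
        if query is not None:
            similarity = tanimoto(compound.features, query.features)
            if similarity < threshold:
                continue
        kept.append((compound, similarity))
    return kept
```

Returning the similarity next to each compound mirrors the additional similarity column described above. A real substructure or MCS-based filter would need a cheminformatics toolkit in place of the feature-set comparison.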
Additional data extraction
Compounds view can handle not only chemical structures but also the related assay data, properties, comments, etc. You can add this information to the compound lists manually with the Create new column function.
A simple dialog opens in which the name and type of the new column can be selected. The newly created column can be edited by double-clicking on it.
Add structures manually
Markush fragments and compounds can be added manually from the context menu of the fragment and compound lists, and with the Add new row menu item of the Compounds view.
A manually added compound can be linked to the corresponding part of the document. After right-clicking on any structure, select the Add reference to document... function to specify the corresponding part of the document. The document view then enters reverse linking mode, in which any part of the document can be selected. After you select the corresponding part of the document and click OK, the selected text is marked as a chemical entity and linked to the manually added compound. If the Add to local dictionary check box is selected, the selected text and the linked compound are added to the ChemCurator dictionary and will be recognized during the next annotation.
The Import compounds wizard can add a compounds file with molecule properties to the selected project as a new compound list and automatically associate the imported compounds with their first occurrence in the document.
Share and Export functions
ChemCurator offers multiple options for project sharing and export of the annotated data in various formats.
Share projects with ChemCurator integration server
ChemCurator Integration Server is the standard way to share your projects with your colleagues and store them in a central database. For server installation details, see the Integration Server Administrator Guide. You also need to configure the server connection details in the ChemCurator desktop application, following the corresponding section of the Installation Guide.
After successful sharing, a new indicator icon appears next to the project, and you are able to upload your modifications or download the newer version of the project.
Export from Compounds and Markush view
The structure export function is available in the Compounds and Markush views. The structures and related information from the view can be exported in various file formats.
Export project to ZIP file and import from ZIP file
A project can be exported to a ZIP file with File>Export Project to ZIP... In this way, the project can be easily shared by e-mail or any file sharing method.
The zipped project can be imported in a similar way with the File>Import Project from ZIP... function.
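For scripted backups outside the application, a project folder can also be archived with standard tools. The sketch below uses Python's shutil.make_archive and keeps the project folder name as the top-level entry of the archive. The function name archive_project is hypothetical. Whether an archive produced this way has the same layout as one written by Export Project to ZIP..., and whether Import Project from ZIP... accepts it, is not stated in this guide, so treat it as a backup aid only.

```python
import shutil
from pathlib import Path


def archive_project(project_dir: str, output_dir: str) -> Path:
    """Zip a project folder into output_dir and return the archive path.

    The archive is named after the project folder, for example Demo.zip.
    output_dir should lie outside project_dir.
    """
    project = Path(project_dir).resolve()
    if not project.is_dir():
        raise NotADirectoryError(project_dir)
    base_name = Path(output_dir).resolve() / project.name
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=str(project.parent),  # archive paths are relative to this
        base_dir=project.name,         # only this folder is included
    )
    return Path(archive)
```

Keeping the project folder as the top-level entry means that unpacking the archive restores a directory with the project's name.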
Using the project folder directly
All projects are stored in project directories. The default location of the projects is the C:\Users\<user name>\Documents\ChemCurator directory. The name of each project directory is the project name. Every project contains a project file (an XML file with some meta-information), the document HTML with its connected resources, and the extracted chemical information in SDF (compound lists) and MRV (Markush structures) formats.
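Because the project folder holds ordinary files, its content can be inspected with a short script. The sketch below groups the files of a project directory by extension and counts the records in an SD file, relying on the SDF convention that each record ends with a line containing only $$$$. The function names are hypothetical, and the exact file names inside a project are not specified in this guide, so nothing here depends on them.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def inventory_project(project_dir: str) -> Dict[str, List[Path]]:
    """Group all files under project_dir by lower-cased file extension."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(Path(project_dir).rglob("*")):
        if path.is_file():
            groups[path.suffix.lower()].append(path)
    return dict(groups)


def count_sdf_records(sdf_path: str) -> int:
    """Count records in an SD file; each record ends with a '$$$$' line."""
    count = 0
    with open(sdf_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip() == "$$$$":
                count += 1
    return count
```

For a project in the default location, inventory_project could be called on a path such as Path.home() / "Documents" / "ChemCurator" / "<project name>", and count_sdf_records on each file listed under the ".sdf" key to see how many compounds each compound list holds.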