Code examples
Document to Structure is a toolkit for extracting chemical structures from text, HTML, and PDF documents. It currently recognizes chemical names, SMILES, and InChI. Its API class is chemaxon.naming.DocumentExtractor. The real-life use cases and code examples below show the various ways to use it:
-     Finding structures in text: 
 Uses DocumentExtractor's processPlainText() method to process a string.
-     Finding structures in a live webpage: 
 Downloads a live webpage and processes it using DocumentExtractor's processHTML() method.
-     Finding structures in a PDF document: 
 Creates a DocumentExtractor instance that reads the text from the PDF document.
-     Highlighting recognized structures in a webpage: 
 Finds the recognized names in the HTML code and wraps them with a special element for highlighting.
-     Saving results in an SDF or MRV file: 
 Saves the results and related information into a multi-molecule file for use in chemical editors.
-     Storing results in a JChem structure table: 
 Sets up a database connection and stores the hits in a chemical structure database for searching.
-     Increasing processing speed by multithreading: 
 Breaks HTML pages into fragments and processes the fragments in multiple threads.
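
The highlighting use case in the list above (wrapping recognized names in a special element) can be sketched without the toolkit itself. The following is a minimal illustration, not the code of the linked example: the NameHit class is a hypothetical stand-in for a recognized name with its character offset, the chem-hit class name is an arbitrary choice, and the sketch assumes that hits do not overlap and that their offsets refer to the unmodified input.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class HighlightSketch {

    /** Hypothetical stand-in for a recognized name: its start offset and its text. */
    static final class NameHit {
        final int position;
        final String text;

        NameHit(int position, String text) {
            this.position = position;
            this.text = text;
        }
    }

    /**
     * Wraps every hit in a span element. Hits are applied from the last one
     * to the first, so inserting markup never shifts an offset that is still
     * to be processed. Assumes the hits do not overlap.
     */
    static String highlight(String html, List<NameHit> hits) {
        List<NameHit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((NameHit h) -> h.position).reversed());

        StringBuilder out = new StringBuilder(html);
        for (NameHit hit : sorted) {
            int end = hit.position + hit.text.length();
            // Skip hits whose offset does not match the original input.
            if (hit.position < 0 || end > html.length()
                    || !html.startsWith(hit.text, hit.position)) {
                continue;
            }
            out.insert(end, "</span>");
            out.insert(hit.position, "<span class=\"chem-hit\">");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p>Take aspirin with caffeine.</p>";
        List<NameHit> hits = new ArrayList<>();
        hits.add(new NameHit(html.indexOf("aspirin"), "aspirin"));
        hits.add(new NameHit(html.indexOf("caffeine"), "caffeine"));
        System.out.println(highlight(html, hits));
    }
}
```

In a real application the offsets and texts would come from the hits reported by DocumentExtractor rather than from indexOf(), and a stylesheet rule for the chosen class would provide the visual highlighting.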