Recent Changes - Quixote project on QC databaseshttp://quixote.wikispot.org/Recent_ChangesRecent Changes on Quixote project on QC databases.en-us Front Pagehttp://quixote.wikispot.org/Front_Page2013-04-28 10:32:57(quick edit) <div id="content" class="wikipage content"> Differences for Front Page<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 5: </td> <td> Line 5: </td> </tr> <tr> <td> <span>- An example of the type of architecture that we are developing is shown below.</span> </td> <td> <span>+ Quixote is an international collaboration heavily relying on web connectivity and voluntary work by motivated researchers. The main objective/vision of the Quixote project is to design, test and deploy a modular, open source system of tools that allow computational chemistry data (now sitting in the darkness of individual hard-disks) to be organized, shared, and queried. This is to be achieved by using lightweight interdependent applications, semantic analysis of the data and linkability.</span> </td> </tr> </table> </div> Front Pagehttp://quixote.wikispot.org/Front_Page2013-04-28 10:32:31(quick edit) <div id="content" class="wikipage content"> Differences for Front Page<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 3: </td> <td> Line 3: </td> </tr> <tr> <td> <span>- Quixote is an international collaboration heavily relying on web connectivity and voluntary work by motivated researchers. The main objective/vision of the Quixote project is to design, test and deploy a modular, open source system of tools that allow computational chemistry data (now sitting in the darkness of individual hard-disks) to be organized, shared, and queried. 
This is to be achieved by using lightweight interdependent applications, semantic analysis of the data and linkability.</span> </td> <td> <span>+ </span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-05 14:36:15tcnscRevert to version 61 (Removed spam.). <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 86: </td> <td> Line 86: </td> </tr> <tr> <td> <span>-</span> This will download the java jar files required by the converters into your local maven repository (~/.m2/repository on unix), compile, test and install the code into your local maven repository. Everything will then be ready for using the JUMBO-Converters software.<span>&nbsp;[https://www.whiteknightcasino.com/ Online Casino No Download]</span> </td> <td> <span>+</span> This will download the java jar files required by the converters into your local maven repository (~/.m2/repository on unix), compile, test and install the code into your local maven repository. Everything will then be ready for using the JUMBO-Converters software. </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-05 09:19:47 <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 86: </td> <td> Line 86: </td> </tr> <tr> <td> <span>-</span> This will download the java jar files required by the converters into your local maven repository (~/.m2/repository on unix), compile, test and install the code into your local maven repository. Everything will then be ready for using the JUMBO-Converters software. 
</td> <td> <span>+</span> This will download the java jar files required by the converters into your local maven repository (~/.m2/repository on unix), compile, test and install the code into your local maven repository. Everything will then be ready for using the JUMBO-Converters software.<span>&nbsp;[https://www.whiteknightcasino.com/ Online Casino No Download]</span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-04 12:25:15tcnscRevert to version 59 (Removed spam.). <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 19: </td> <td> Line 19: </td> </tr> <tr> <td> <span>-</span> * [http://mercurial.selenic.com/ Mercurial] for interacting with software repositories. Check ["Mercurial"] for a basic tutorial and known problems.<span>&nbsp;&nbsp;[https://www.whiteknightcasino.com/ Online Casino No Download]</span> </td> <td> <span>+</span> * [http://mercurial.selenic.com/ Mercurial] for interacting with software repositories. Check ["Mercurial"] for a basic tutorial and known problems. </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-04 12:16:38 <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 19: </td> <td> Line 19: </td> </tr> <tr> <td> <span>-</span> * [http://mercurial.selenic.com/ Mercurial] for interacting with software repositories. Check ["Mercurial"] for a basic tutorial and known problems. </td> <td> <span>+</span> * [http://mercurial.selenic.com/ Mercurial] for interacting with software repositories. 
Check ["Mercurial"] for a basic tutorial and known problems.<span>&nbsp;&nbsp;[https://www.whiteknightcasino.com/ Online Casino No Download]</span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-01 14:42:35tcnscRevert to version 57 (Removed spam.). <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 17: </td> <td> Line 17: </td> </tr> <tr> <td> <span>-</span> If that gives a '''command not found''' type error, you need to install the Java Development Kit.<span>&nbsp;* [http://www.telsgroup.com.de/ Internationale Spedition]</span> </td> <td> <span>+</span> If that gives a '''command not found''' type error, you need to install the Java Development Kit. </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-03-01 08:17:26 <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 17: </td> <td> Line 17: </td> </tr> <tr> <td> <span>-</span> If that gives a '''command not found''' type error, you need to install the Java Development Kit. </td> <td> <span>+</span> If that gives a '''command not found''' type error, you need to install the Java Development Kit.<span>&nbsp;* [http://www.telsgroup.com.de/ Internationale Spedition]</span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-02-20 13:57:27tcnscRevert to version 54 (Removed spam.). <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . 
</span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 38: </td> <td> Line 38: </td> </tr> <tr> <td> <span>- [http://www.imcredo.com/services/ppc/ PPC Agency]</span> </td> <td> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-02-20 13:14:26 <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p>No differences found!</div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2013-02-20 13:14:07 <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 38: </td> <td> Line 38: </td> </tr> <tr> <td> </td> <td> <span>+ [http://www.imcredo.com/services/ppc/ PPC Agency]</span> </td> </tr> </table> </div> Resources and technologyhttp://quixote.wikispot.org/Resources_and_technology2012-09-20 17:00:34JensThomas <div id="content" class="wikipage content"> Differences for Resources and technology<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 49: </td> <td> Line 49: </td> </tr> <tr> <td> <span>- ==Chunking==<br> - <br> - The idea is implemented in a particular code (["GAMESS-UK"]) and it uses ["JUMBO-Converters"] but the approach is rather general.<br> - <br> - * ["Chunkers" First divide the file in chunks or blocks]<br> - * ["Block" What is a block?]<br> - * ["Parsing Blocks"]<br> - <br> - ==Converting to CML using ["JUMBO-Converters"]==<br> - <br> - The idea is implemented in a particular code (["GAMESS-UK"]) but the approach is rather general.<br> - <br> - * ["How converters work"]<br> - * ["creating rawCML" Creating raw CML]<br> - * ["creating completeCML" Creating complete CML]<br> - <br> - See also ["Creating a converter"]</span> </td> <td> <span>+ <br> + The current approach adopted by the Quixote Project is to use the ["JUMBO-Converters"].</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-08-14 17:59:59JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 699: </td> <td> Line 699: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + === Debugging Chempound running under Jetty ===<br> + <br> + A simple way to debug Chempound when running under Jetty is to add the following lines to the Jetty '''start.ini''' file, which is used to prepend command-line arguments to Jetty (the arguments can also be added to the command line when starting Jetty, or indeed to any Java program that supports log4j):<br> + <br> + {{{<br> + -Dlog4j.debug<br> + -Dlog4j.configuration=file:/Users/jmht/Documents/quixote/jetty-hightide-8.1.4.v20120524/quixote/qc-log4j.properties<br> + }}}<br> + <br> + The first line turns on debugging for log4j itself; this is useful because it causes log4j to print which configuration file it is using. 
The second line gives the path to a log4j configuration file, which should contain the directives as described above.</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-08-07 09:46:49JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 642: </td> <td> Line 642: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + == Debugging Chempound ==<br> + <br> + If Chempound is not working as expected, the logging facility can be used to increase the amount of information printed, which is useful for tracking down the causes of problems.<br> + <br> + The logging subsystem consists of the interface [http://slf4j.org/ SLF4J] 1.6.1 (Simple Logging Facade for Java) and the implementation [http://logging.apache.org/log4j/1.2/index.html LOG4J] 1.2.<br> + <br> + The included configuration for DepositNWChem is:<br> + <br> + {{{<br> + log4j.rootLogger = WARN, A<br> + <br> + log4j.appender.A = org.apache.log4j.ConsoleAppender<br> + log4j.appender.A.layout = org.apache.log4j.PatternLayout<br> + log4j.appender.A.layout.ConversionPattern = %-4r [%t] %-5p %c %x - %m%n<br> + log4j.appender.A.target = System.err<br> + }}}<br> + <br> + However, you can change the logging behavior of the application by adding your own log4j.properties file to the classpath. 
For instance, the following configuration file sets the general log level to INFO and the level for the class uk.ac.cam.ch.wwmm.chempound.compchem.CmlComp2RdfConverter to DEBUG.<br> + <br> + {{{<br> + log4j.rootLogger = INFO, A<br> + <br> + log4j.appender.A = org.apache.log4j.ConsoleAppender<br> + log4j.appender.A.layout = org.apache.log4j.PatternLayout<br> + log4j.appender.A.layout.ConversionPattern = %-4r [%t] %-5p %c %x - %m%n<br> + log4j.appender.A.target = System.err<br> + <br> + log4j.logger.uk.ac.cam.ch.wwmm.chempound.compchem.CmlComp2RdfConverter=DEBUG<br> + }}}<br> + <br> + For example, if you place the log4j.properties configuration file in your current working directory and run DepositNWChem from there, you can add the current directory to the classpath as follows (this assumes you have the jar file of DepositNWChem with its dependencies in a target subdirectory):<br> + <br> + {{{<br> + $ java -cp .:target/quixote-utils-0.1-SNAPSHOT-jar-with-dependencies.jar net.quixote.utils.DepositNWChem http://localhost:8080/sword/collection/ n2.out<br> + }}}<br> + <br> + You can use logging anywhere in the code. You will need a Logger object to which the logging messages are passed. Simply import the Logger and LoggerFactory classes, and call LoggerFactory.getLogger to obtain a Logger object. Then call any of the debug, info, warn or error methods to log your message at the appropriate log level.<br> + <br> + The following code snippet shows how to get the root Logger as well as another child Logger (identified with the uk.ac.cam.ch.wwmm.chempound.compchem.CmlComp2RdfConverter class name) and how to emit an INFO-level message.<br> + <br> + {{{<br> + import org.slf4j.Logger;<br> + import org.slf4j.LoggerFactory;<br> + <br> + class ... {<br> + ... method (...) 
{<br> + Logger rootL = LoggerFactory.getLogger(Logger.ROOT_LOGGER_NAME);<br> + rootL.info("Using root logger.");<br> + <br> + Logger otherL = LoggerFactory.getLogger(uk.ac.cam.ch.wwmm.chempound.compchem.CmlComp2RdfConverter.class);<br> + otherL.info("Using CmlComp2RdfConverter logger.");<br> + <br> + <br> + }<br> + }<br> + }}}</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-21 09:53:46JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 5: </td> <td> Line 5: </td> </tr> <tr> <td> <span>-</span> Primary site: [https://bitbucket.org/wwmm/jumbo<span>-</span>converters<span>&nbsp;https://bitbucket.org/wwmm/jumbo</span>-co<span>nv</span>e<span>rters</span>] </td> <td> <span>+</span> Primary site: [https://bitbucket.org/wwmm/jumboconverters-co<span>mpch</span>e<span>m</span>] </td> </tr> <tr> <td> Line 35: </td> <td> Line 35: </td> </tr> <tr> <td> </td> <td> <span>+ Jumbo converters are written in Java, although the template parsing technology is described entirely in XML, so that once a new parser module has been created, only XML files need to be edited in order to extend and develop the parser.<br> + <br> + The reference parser for computational chemistry is the NWChem parser, so any examples will refer to it.<br> + <br> + The java class that controls the two-stage parsing for the NWChem is [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java NWChemLog2CompchemConverter.java].<br> + <br> + The first stage (controlled by the 
[https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2XMLConverter.java NWChemLog2XMLConverter.java] class), uses the [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/templates/topTemplate.xml topTemplate.xml] file to include the various XML templates that parse the different chunks of the logfile.<br> + <br> + The second stage (controlled by the [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLogXML2CompchemConverter.java NWChemLogXML2CompchemConverter.java] class), uses the transforms in the [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/nwchem2compchem.xml nwchem2compchem.xml] to manipulate the raw XML into a [http://www.xml-cml.org/convention/compchem convention]-compliant form.<br> + </span> </td> </tr> <tr> <td> Line 36: </td> <td> Line 46: </td> </tr> <tr> <td> <span>- <br> - == Top-level declarations and parsing hierarchy ==<br> - <br> - A '''topTemplate.xml''' file, is a top-level file that creates the parent template and then defines all the templates which it itself contains.<br> - <br> - The definitions of the sub-templates are kept in separate files that are '''''included''''' from a templates subdirectory to aid modularisation.<br> - <br> - Each template has one or more '''pattern'''s, which is a regular expression defining the text in the logfile where the module starts, and one or more '''endPattern'''s, which is a regular expression defining where the module ends. 
The parser will read through the log file checking each line against the patterns in the list of templates included in the topTemplate.<br> - <br> - If a pattern is matched, all subsequent text is "gobbled" and added to the module until the '''endPattern''' is reached, at which point the module is closed, and the next line of text is searched to see if it matches a pattern.<br> - <br> - Each module is then parsed in turn, either by another template within this template, or by the '''records''' in the template.</span> </td> <td> </td> </tr> <tr> <td> Line 107: </td> <td> Line 105: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 152: </td> <td> Line 149: </td> </tr> <tr> <td> <span>-</span> This result of th<span>e</span> parsing is as follows: </td> <td> <span>+</span> This result of th<span>is</span> parsing is as follows: </td> </tr> <tr> <td> Line 234: </td> <td> Line 231: </td> </tr> <tr> <td> <span>- For the Gaussian logfile templates, the code that runs these tests lives in the file:<br> - <br> - [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/test/java/org/xmlcml/cml/converters/compchem/gaussian/log/TemplateTest.java jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/test/java/org/xmlcml/cml/converters/compchem/gaussian/log/TemplateTest.java]<br> - <br> - To test and develop an individual template (using the l601.fermi template as an example), the following line needs to be added to the TemplateTest.java file.<br> - <br> - {{{<br> - @Test public void testl601Fermi() {runTemplateTest("l601/", "l601.fermi");}<br> - }}}<br> - <br> - The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run all of the TemplateTests (see not below), using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory:<br> - <br> - {{{<br> 
- mvn -Dtest="log.TemplateTest" test</span> </td> <td> <span>+ For the NWChem logfile templates, the code that runs these tests lives in the file:<br> + <br> + [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/test/java/org/xmlcml/cml/converters/compchem/nwchem/log/TemplateUnitTests.java TemplateUnitTests.java]<br> + <br> + To test and develop an individual template (using the xyz template as an example), the following line needs to be added to the TemplateTest.java file.<br> + <br> + {{{<br> + @Test public void testXyz() {runTemplateTest("xyz");}<br> + }}}<br> + <br> + The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run all of the TemplateTests (see note below), using the following command, whilst sat in the '''jumboconverters-compchem/jc-compchem-nwchem''' directory:<br> + <br> + {{{<br> + mvn -Dtest="log.TemplateUnitTests" test</span> </td> </tr> <tr> <td> Line 258: </td> <td> Line 255: </td> </tr> <tr> <td> <span>-</span> &lt;module cmlx:templateRef="<span>l601.fermi</span>" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt; </td> <td> <span>+</span> &lt;module cmlx:templateRef="<span>xyz</span>" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt; </td> </tr> <tr> <td> Line 263: </td> <td> Line 260: </td> </tr> <tr> <td> <span>-</span> The chunk of test after the '''------------test---------------------''' line, and excluding the '''&lt;?xml version="1.0" encoding="UTF-8"?&gt;''' line is the output of the test. This should be checked, and if correct, placed in the '''&lt;comment class="example.output" id="<span>l601.fermi</span>"&gt;''' tag in the template. Re-running the test should then lead to a successful result. 
</td> <td> <span>+</span> The chunk of test after the '''------------test---------------------''' line, and excluding the '''&lt;?xml version="1.0" encoding="UTF-8"?&gt;''' line is the output of the test. This should be checked, and if correct, placed in the '''&lt;comment class="example.output" id="<span>xyz</span>"&gt;''' tag in the template. Re-running the test should then lead to a successful result. </td> </tr> <tr> <td> Line 268: </td> <td> Line 265: </td> </tr> <tr> <td> <span>-</span> mvn -Dtest="log.TemplateTest#test<span>l601Fermi</span>" test </td> <td> <span>+</span> mvn -Dtest="log.TemplateTest#test<span>Xyz</span>" test </td> </tr> <tr> <td> Line 279: </td> <td> Line 276: </td> </tr> <tr> <td> <span>- Taking the current implementation of the Gaussian log parser as an example, the code for the [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/java/org/xmlcml/cml/converters/compchem/gaussian/log/GaussianLog2CompchemConverter.java GaussianLog2CompchemConverter], reads in two files:<br> - <br> - * [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml topTemplate.xml] - as mentioned above, this processes all of the templates.<br> - * [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/gaussian2compchem.xml gaussian2compchem.xml] - this is run subsequent to the file being parsed and runs transforms on the XML. 
As with the templates, there are a number of transforms that are carried out, the code for which either resides in this file, or files which are '''included''' by it from the templates directory.<br> - </span> </td> <td> </td> </tr> <tr> <td> Line 298: </td> <td> Line 290: </td> </tr> <tr> <td> <span>- A brief overview of the key transformations follows below, however, for those with a strong constitution, a more comprehensive documentation can be found by examining the code in the file:<br> - <br> - * [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-core/src/main/java/org/xmlcml/cml/converters/text/TransformElement.java jumbo-converters-core/src/main/java/org/xmlcml/cml/converters/text/TransformElement.java]<br> - <br> - The text from ~ line 156, starting with the comment '''// process values''' lists the processes that are available.</span> </td> <td> <span>+ A brief overview of the key transformations follows below, however, for those with a strong constitution, a more comprehensive documentation can be found by examining the code in the file [https://bitbucket.org/wwmm/jumboconverters-template/src/96f9aa6e4426/src/main/java/org/xmlcml/cml/converters/templates/output/TransformElement.java TransformElement.java]<br> + <br> + The text from ~ line 160, starting with the comment '''// process values''' lists the processes that are available.</span> </td> </tr> <tr> <td> Line 743: </td> <td> Line 733: </td> </tr> <tr> <td> <span>- Under<br> - <br> - {{{<br> - jumbo-converters/<br> - }}}<br> - <br> - we have<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - }}}<br> - <br> - which is where most of the important stuff for Quixote lies if we do not want to get into too many details.</span> </td> <td> <span>+ The main folder is<br> + <br> + {{{<br> + jumboconverters-compchem/<br> + }}}<br> + <br> + Under this is:<br> + <br> + {{{<br> + jumboconverters-compchem/<br> + jc-compchem-nwchem/<br> + }}}</span> </td> 
</tr> <tr> <td> Line 761: </td> <td> Line 749: </td> </tr> <tr> <td> <span>-</span> jumbo<span>-converters/<br> - jumbo-</span>converters-compchem/<br> - src/<br> <span>-</span> target/ </td> <td> <span>+</span> jumboconverters-compchem/<br> <span>+ jc</span>-<span>compchem-nwchem/<br> + </span> src/<br> <span>+ </span> target/ </td> </tr> <tr> <td> Line 770: </td> <td> Line 758: </td> </tr> <tr> <td> <span>-</span> jumbo<span>-converters/<br> - jumbo-</span>converters-compchem/ </td> <td> <span>+</span> jumboconverters-compchem/<span><br> + jc-compchem-nwchem/</span> </td> </tr> </table> </div> Declarative parsing syntaxhttp://quixote.wikispot.org/Declarative_parsing_syntax2012-06-21 09:25:23JensThomas <div id="content" class="wikipage content"> Differences for Declarative parsing syntax<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 70: </td> <td> Line 70: </td> </tr> <tr> <td> <span>- For example, in the case of the list in the previous section, the list includes the file {{{environment.xml}}}, which is located in<br> - <br> - [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601 jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601/]<br> - <br> - together with several other template files related to the environment link.</span> </td> <td> <span>+ For example, in the case of the list in the previous section, the list includes the file [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/templates/environment.xml environment.xml], together with several other template files related to the environment 
link.</span> </td> </tr> <tr> <td> Line 157: </td> <td> Line 153: </td> </tr> <tr> <td> <span>- When no further match is found, the parser will proceed to the next item in the list, in this case {{{nwchem.job.xml}}}, which will do the same process but '''without having access''' to the already captured modules.</span> </td> <td> <span>+ When no further match is found, the parser will proceed to the next item in the list, in this case [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/templates/nwchem.job.xml nwchem.job.xml], which will do the same process but '''without having access''' to the already captured modules.</span> </td> </tr> <tr> <td> Line 159: </td> <td> Line 155: </td> </tr> <tr> <td> <span>- The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of {{{environment.xml}}}, where a deeper list of templates appear. Of course, its items will only be matched against the chunk that was matched in the parent module.</span> </td> <td> <span>+ The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of [https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/templates/environment.xml environment.xml], where a deeper list of templates appear. Of course, its items will only be matched against the chunk that was matched in the parent module.</span> </td> </tr> </table> </div> Declarative parsing syntaxhttp://quixote.wikispot.org/Declarative_parsing_syntax2012-06-21 09:22:12JensThomas <div id="content" class="wikipage content"> Differences for Declarative parsing syntax<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . 
</span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 2: </td> <td> Line 2: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 8: </td> <td> Line 7: </td> </tr> <tr> <td> <span>-</span> The file called {{{topTemplate.xml}}}, which is located at the top folder for each filetype, e.g., </td> <td> <span>+</span> The file called {{{topTemplate.xml}}}, which is located at the top folder for each filetype, e.g.,<span>[https://bitbucket.org/wwmm/jumboconverters-compchem/src/693ba1b572b8/jc-compchem-nwchem/src/main/resources/org/xmlcml/cml/converters/compchem/nwchem/log/templates/topTemplate.xml topTemplate.xml]</span> </td> </tr> <tr> <td> Line 10: </td> <td> Line 9: </td> </tr> <tr> <td> <span>- [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml]<br> - <br> - is the first one which is read by ["JUMBO-Converters"] when the file is to be parsed; a Gaussian logfile in this case.</span> </td> <td> <span>+ is the first one which is read by ["JUMBO-Converters"] when the file is to be parsed; an NWChem logfile in this case.</span> </td> </tr> <tr> <td> Line 18: </td> <td> Line 15: </td> </tr> <tr> <td> <span>- &lt;template id='gaussian.log' output="VERBOSE"&gt;<br> - &lt;templateList id='main' xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> - &lt;xi:include href="l1.temp.xml"/&gt;<br> - &lt;xi:include href="l101.temp.xml"/&gt;<br> - &lt;xi:include href="l202.temp.xml"/&gt;<br> - &lt;xi:include href="l301.temp.xml"/&gt;<br> - &lt;xi:include href="l601.temp.xml"/&gt;</span> </td> <td> <span>+ &lt;!-- The ORDER of the subtemplates may be important. 
Templates are processed in the order<br> + in this files and the subtemplates. Some of the files are marked with comments<br> + --&gt;<br> + &lt;template id='nwchem.log'<br> + output="VERBOSE"<br> + convention='conventions:compchem'<br> + xmlns:conventions="http://www.xml-cml.org/convention/"<br> + <br> + xmlns:compchem="http://www.xml-cml.org/dictionary/compchem/"<br> + xmlns:cc="http://www.xml-cml.org/dictionary/compchem/"<br> + xmlns:n="http://www.xml-cml.org/dictionary/nwchem/"<br> + xmlns:x="http://www.xml-cml.org/dictionary/cmlx/"<br> + xmlns:h="http://www.w3.org/1999/xhtml"<br> + <br> + xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"<br> + xmlns:cml="http://www.xml-cml.org/schema"<br> + xmlns:xi="http://www.w3.org/2001/XInclude"<br> + &gt;<br> + <br> + &lt;dictionary uri="http://www.xml-cml.org/dictionary/nwchem/"<br> + href="org/xmlcml/cml/converters/compchem/nwchem/nwchemDict.xml"/&gt;<br> + <br> + &lt;templateList id='main'&gt;<br> + &lt;xi:include href="input.user.xml"/&gt; &lt;!-- do this early on --&gt;<br> + &lt;xi:include href="argument.xml"/&gt;<br> + <br> + &lt;!-- errors --&gt;<br> + &lt;xi:include href="error.file.xml"/&gt;<br> + &lt;xi:include href="error.current.xml"/&gt;<br> + &lt;xi:include href="error.mult.xml"/&gt;<br> + &lt;xi:include href="error.nocat.xml"/&gt;<br> + &lt;xi:include href="error.lastsys.xml"/&gt;<br> + <br> + &lt;!-- Environment same for all jobs --&gt;<br> + &lt;xi:include href="environment.xml"/&gt;<br> + <br> + &lt;!-- Parse each job in turn --&gt;<br> + &lt;xi:include href="nwchem.job.xml"/&gt;<br> + <br> + &lt;!-- Below come after job finished --&gt;<br> + &lt;xi:include href="ga.summary.xml"/&gt;<br> + &lt;xi:include href="ga.stats.xml"/&gt;<br> + &lt;xi:include href="citation.xml"/&gt;<br> + &lt;xi:include href="authors.xml"/&gt;<br> + &lt;xi:include href="times.xml"/&gt;<br> + </span> </td> </tr> <tr> <td> Line 34: </td> <td> Line 70: </td> </tr> <tr> <td> <span>-</span> For example, in the case of the list in the 
previous section, the list includes the file {{{<span>l601.p</span>o<span>pa</span>n<span>al</span>.xml}}}, which is located in </td> <td> <span>+</span> For example, in the case of the list in the previous section, the list includes the file {{{<span>envir</span>on<span>ment</span>.xml}}}, which is located in </td> </tr> <tr> <td> Line 38: </td> <td> Line 74: </td> </tr> <tr> <td> <span>-</span> together with several other template files related to the <span>l601</span> link. </td> <td> <span>+</span> together with several other template files related to the <span>environment</span> link. </td> </tr> <tr> <td> Line 43: </td> <td> Line 79: </td> </tr> <tr> <td> <span>- &lt;template id="l601.popanal" name="Population analysis using the SCF density"<br> - repeat="*"<br> - pattern="\s*\*+\s*$\s*$\s*Population analysis using the SCF density.*"<br> - endPattern="\sN\-N\=.*" endOffset="1"<br> - xmlns:xi="http://www.w3.org/2001/XInclude"<br> - &gt;<br> - &lt;comment class="example.input" id="l601.popanal"&gt;<br> - ...<br> - &lt;/comment&gt;</span> </td> <td> <span>+ &lt;template<br> + id="environment"<br> + name="Environment"<br> + repeat="*"<br> + pattern="\s*Northwest Computational Chemistry Package .*$\s+\-+.*"<br> + newline="$" endPattern="\s*NWChem Input Module\s*$\s+\-+\s*"<br> + offset="0"<br> + endOffset="0"&gt;</span> </td> </tr> <tr> <td> Line 53: </td> <td> Line 88: </td> </tr> <tr> <td> <span>- &lt;record repeat="5"/&gt;<br> - &lt;record repeat="2"/&gt;<br> - &lt;templateList&gt;<br> - &lt;xi:include href="l601.condensed.xml"/&gt;<br> - &lt;xi:include href="../l401/l4601.virtual.xml"/&gt;<br> - ...</span> </td> <td> <span>+ &lt;comment class="example.input" id="environment"&gt;<br> + ...<br> + &lt;/comment&gt;<br> + <br> + &lt;templateList xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> + &lt;xi:include href="nccp.xml"/&gt;<br> + &lt;xi:include href="acknow.xml"/&gt;<br> + &lt;xi:include href="job.info.xml"/&gt;<br> + &lt;xi:include 
href="memory.xml"/&gt;<br> + &lt;xi:include href="dirinfo.xml"/&gt;</span> </td> </tr> <tr> <td> Line 61: </td> <td> Line 100: </td> </tr> <tr> <td> <span>- </span> &lt;comment class="example.output" id="<span>l601.p</span>o<span>pa</span>n<span>al</span>"&gt;<br> <span>- </span> ...<br> <span>- </span> &lt;/comment&gt; </td> <td> <span>+</span> &lt;comment class="example.output" id="<span>envir</span>on<span>ment</span>"&gt;<br> <span>+</span> ...<br> <span>+</span> &lt;/comment&gt; </td> </tr> <tr> <td> Line 71: </td> <td> Line 110: </td> </tr> <tr> <td> <span>- pattern"\s*\*+\s*$\s*$\s*Population analysis using the SCF density.*"</span> </td> <td> <span>+ pattern="\s*Northwest Computational Chemistry Package .*$\s+\-+.*"</span> </td> </tr> <tr> <td> Line 77: </td> <td> Line 116: </td> </tr> <tr> <td> <span>- **********************************************************************<br> - <br> - Population analysis using the SCF density.</span> </td> <td> <span>+ Northwest Computational Chemistry Package (NWChem) 6.1<br> + ------------------------------------------------------</span> </td> </tr> <tr> <td> Line 87: </td> <td> Line 125: </td> </tr> <tr> <td> <span>-</span> endPattern="\sN\-<span>N</span>\<span>=.</span>*" </td> <td> <span>+</span> endPattern="\s<span>*</span>N<span>WChem Input Module\s*$\s+</span>\-<span>+</span>\<span>s</span>*" </td> </tr> <tr> <td> Line 93: </td> <td> Line 131: </td> </tr> <tr> <td> <span>- N-N= 8.247004252289D+02 E-N=-3.066552593713D+03 KE= 6.044489055531D+02</span> </td> <td> <span>+ NWChem Input Module<br> + -------------------</span> </td> </tr> <tr> <td> Line 96: </td> <td> Line 135: </td> </tr> <tr> <td> <span>-</span> Everything between these two lines, '''excluding the last one''', is eaten up by this template and included into the module identified by </td> <td> <span>+</span> Everything between these two lines, '''excluding the last one''', is eaten up by this template and included into the module identified 
by<span>&nbsp;the '''id''':</span> </td> </tr> <tr> <td> Line 99: </td> <td> Line 138: </td> </tr> <tr> <td> <span>- l601.popanal</span> </td> <td> <span>+ environment</span> </td> </tr> <tr> <td> Line 105: </td> <td> Line 144: </td> </tr> <tr> <td> <span>- <br> - &lt;module cmlx:templateRef="l601.popanal" xmlns="http...<br> - <br> - &lt;/module</span> </td> <td> <span>+ &lt;module xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx" cmlx:templateRef="environment"&gt;<br> + ...<br> + &lt;/module&gt;</span> </td> </tr> <tr> <td> Line 119: </td> <td> Line 157: </td> </tr> <tr> <td> <span>-</span> When no further match is found, the parser will proceed to the next item in the list, in this case {{{<span>l601/l601</span>.<span>polariz</span>.xml}}}, which will do the same process but '''without having access''' to the already captured modules. </td> <td> <span>+</span> When no further match is found, the parser will proceed to the next item in the list, in this case {{{<span>nwchem</span>.<span>job</span>.xml}}}, which will do the same process but '''without having access''' to the already captured modules. </td> </tr> <tr> <td> Line 121: </td> <td> Line 159: </td> </tr> <tr> <td> <span>-</span> The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of {{{<span>l1.temp</span>.xml}}}, where a deeper list of templates appears. Of course, its items will only be matched against the chunk that was matched in the parent module. </td> <td> <span>+</span> The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of {{{<span>environment</span>.xml}}}, where a deeper list of templates appears. Of course, its items will only be matched against the chunk that was matched in the parent module. 
</td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 21:03:11JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 43: </td> <td> Line 43: </td> </tr> <tr> <td> <span>-</span> Each template has <span>a '''pattern'''</span>, which is a regular expression defining the text in the logfile where the module starts, and <span>an</span> '''endPattern''', which is a regular expression defining where the module ends<span>, and a '''repeatCount''',</span> w<span>hich describes how many times the module can occur within the fil</span>e.<br> <span>-</span> <br> <span>- The parser will read through the log file checking each line against the patterns in the list of templates that define the current parser.<br> - <br> -</span> If a pattern is matched, all subsequent text is "gobbled" and added to the module until the '''endPattern''' is reached, at which point the module is closed, and the next line of text is searched to see if it matches a pattern<span>&nbsp;(if there is no end pattern, the template will swallow all the remaining text in its enclosing template)</span>.<span><br> - <br> - If no pattern is matched, the text is added to the parent module.</span> </td> <td> <span>+</span> Each template has <span>one or more '''pattern'''s</span>, which is a regular expression defining the text in the logfile where the module starts, and <span>one or more</span> '''endPattern'''<span>s</span>, which is a regular expression defining where the module ends<span>. 
The parser</span> w<span>ill read through the log file checking each line against the patterns in the list of templates included in the topTemplat</span>e.<br> <span>+</span> <br> <span>+</span> If a pattern is matched, all subsequent text is "gobbled" and added to the module until the '''endPattern''' is reached, at which point the module is closed, and the next line of text is searched to see if it matches a pattern. </td> </tr> <tr> <td> Line 53: </td> <td> Line 49: </td> </tr> <tr> <td> <span>- For example, the Gaussian topTemplate.xml file:<br> - <br> - [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml]<br> - <br> - contains:<br> - <br> - {{{<br> - &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> - &lt;template id='gaussian.log'<br> - xmlns:xi="http://www.w3.org/2001/XInclude"<br> - &gt;<br> - &lt;comment&gt;<br> - Entering Gaussian System, Link 0=/usr/local/gaussian/g03/g03<br> - Initial command:<br> - /usr/local/gaussian/g03/l1.exe /tmp/webmo/1/Gau-28330.inp -scrdir=/tmp/webmo/1/<br> - Entering Link 1 = /usr/local/gaussian/g03/l1.exe PID= 28333.<br> - ...<br> - Job cpu time: 0 days 0 hours 0 minutes 12.7 seconds.<br> - File lengths (MBytes): RWF= 12 Int= 0 D2E= 0 Chk= 7 Scr= 1<br> - Normal termination of Gaussian 03 at Mon Nov 20 14:40:36 2006.<br> - &lt;/comment&gt;<br> - <br> - &lt;templateList&gt;<br> - &lt;template id="job" pattern="\s*((Link1\:\s+Proceeding to internal job step number)|(Entering Gaussian System)).*"<br> - endPattern="\s*Normal termination of.*" endOffset="1" repeat="*"&gt;<br> - &lt;templateList id='main'&gt;<br> - &lt;xi:include href="l0.entering.xml"/&gt;<br> - &lt;xi:include href="l601/l601.anisospin.xml"/&gt;<br> - 
&lt;xi:include href="l301.basis.xml"/&gt;<br> - <br> - &lt;!-- Many more templates --&gt;<br> - <br> - &lt;/templateList&gt;<br> - &lt;/template&gt;<br> - &lt;/templateList&gt;<br> - &lt;/template&gt;<br> - }}}<br> - <br> - <br> - The first few lines just declare this as a template and declare the include namespaces, so that we can use this to include other templates.<br> - <br> - There is then a comment containing some example text to show what this template parses.<br> - <br> - There is then a template list, which is the container that holds the templates. This first templateList only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template then contains a further template list, which includes the other templates that will process the text found by the parent template.<br> - <br> - One of the templates the parent references is the file:<br> - <br> - [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l301.basis.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l301.basis.xml]<br> - <br> - which is described below.<br> - </span> </td> <td> </td> </tr> <tr> <td> Line 105: </td> <td> Line 50: </td> </tr> <tr> <td> <span>- <br> - EDIT IN PROGRESS...</span> </td> <td> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 20:56:59JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 167: </td> <td> Line 167: </td> </tr> <tr> <td> <span>- Records are the machinery used to extract text from a file and mark it up into XML. 
A record is an XML element, which can have a number of attributes (see below)<br> - <br> - Each record is processed in turn until it fails, at which point the next record is processed until all records in the module have been processed.<br> - <br> - A record can be used to "gobble" lines (which are discarded), or to capture data that will be marked up with XML.<br> - <br> - <br> - <br> - <br> - Now we get to where the file is actually processed.<br> - <br> - {{{</span> </td> <td> <span>+ Records are the machinery used to extract text from a file and mark it up into XML.<br> + <br> + A record is an XML element, which can have a number of attributes (see below) and which may contain a string, which is a simple regular expression-type language for determining what will be extracted and how it will be marked up.<br> + <br> + Unlike the templates, where each template is tried in turn against each line of the file, records are processed sequentially. Each record is processed in turn until it fails, at which point the next record is processed until all records in the module have been processed.<br> + <br> + An empty record (such as &lt;record repeat="2"/&gt;) can be used to "gobble" lines (which are discarded).<br> + <br> + If the record has content, then the text of the line is parsed into a CML list with a templateRef as specified by the '''id''' of the record.<br> + <br> + A simple example to read the XYZ format geometry printed in an NWChem output is shown below. The text that is to be parsed is:<br> + <br> + {{{<br> + XYZ format geometry<br> + -------------------<br> + 11<br> + geometry<br> + fe 0.00000000 0.00000000 0.00000000<br> + c 0.00000000 0.00000000 1.80680057<br> + o 0.77109980 -2.87778364 0.00000000<br> + }}}<br> + <br> + The records to parse this are:<br> + <br> + {{{<br> + &lt;!-- Read 2 lines. The record has no content, so the lines are discarded. 
--&gt;</span> </td> </tr> <tr> <td> Line 180: </td> <td> Line 194: </td> </tr> <tr> <td> <span>- &lt;record id="fermi.atom" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}{A,x:elementType}\({I,x:isotopeNumber}\)\s{F,cc:coupling,u:au}\s{F,cc:coupling,u:mhz}\s{F,cc:coupling,u:gauss}\s{F,cc:coupling,u:ten4cm-1}\s*&lt;/record&gt;<br> - &lt;record repeat="4"/&gt;<br> - &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xx}{F,g:spindipole.yy}{F,g:spindipole.zz}&lt;/record&gt;<br> - &lt;record repeat="3"/&gt;<br> - &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xy}{F,g:spindipole.xz}{F,g:spindipole.yz}&lt;/record&gt;<br> - &lt;record repeat="1"/&gt;}}}<br> - <br> - The first line just reads 2 lines from the file. As there is no id on the record, the lines are discarded.<br> - <br> - The next record reads as many lines (repeat="*") as match the expression contained within the record into an entity called "fermi.atom". The expression that is expected to be matched is an integer (I), followed by a character string (A), followed by another integer (I) and then four floats (F).<br> - <br> - The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemistry dictionary, and the "g" stands for an entry in the Gaussian dictionary.<br> - <br> - The makeArray="true" attribute will create an array for each of the integers, floats etc.<br> - <br> - Once this pattern stops being matched, the parser will skip 4 lines (&lt;record repeat="4"/&gt;) and then the next record will process a matching block of Integers and Floats into arrays as described earlier.<br> - <br> - Next comes a comment:<br> - <br> - <br> - <br> - The class of "example.input" means that this block of text will be used in the tests for the parser (see below). 
We can see how the first line is that which is matched by the template's '''pattern''' and the last one is that matched by '''endPattern'''.<br> - <br> - <br> - The output has been parsed into a module with the templateRef "l601.fermi", which is the id of the template. We then have the 7 arrays (I,A,I,F,F,F,F) parsed by the "fermi.atom" record as a list, followed by that for the "fermi.spindipole" record etc.</span> </td> <td> <span>+ <br> + &lt;!-- Read a line with a single integer. The integer will be placed in a CML scalar with the dictRef "compchem:numAtoms".<br> + The scalar will itself be within a CML list with the templateRef of "atoms". --&gt;<br> + &lt;record id="atoms"&gt;\s*{I,compchem:numAtoms}\s*&lt;/record&gt;<br> + <br> + &lt;!-- Read a line with a single character string. The string will be placed in a CML scalar with the dictRef "n:geomtype".<br> + The scalar will itself be within a CML list with the templateRef of "geo". --&gt;<br> + &lt;record id="geo"&gt;\s*{A,n:geomtype}\s*&lt;/record&gt;<br> + <br> + &lt;!-- Keep reading lines while they contain a character string, followed by 3 floats. Make an array of all matching variables.<br> + The arrays will be held in a CML list with the templateRef of "mol". 
--&gt;<br> + &lt;record makeArray="true" repeat="*"<br> + id="mol"&gt;\s*{A,compchem:elementType}\s*{F,compchem:x3}\s*{F,compchem:y3}\s*{F,compchem:z3}\s*&lt;/record&gt;<br> + }}}<br> + <br> + The result of the parsing is as follows:<br> + {{{<br> + &lt;list cmlx:templateRef="atoms"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="compchem:numAtoms"&gt;11&lt;/scalar&gt;<br> + &lt;/list&gt;<br> + &lt;list cmlx:templateRef="geo"&gt;<br> + &lt;scalar dataType="xsd:string" dictRef="n:geomtype"&gt;geometry&lt;/scalar&gt;<br> + &lt;/list&gt;<br> + &lt;list cmlx:lineCount="3" cmlx:templateRef="mol"&gt;<br> + &lt;array dataType="xsd:string" dictRef="compchem:elementType" size="3"&gt;fe c o&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="compchem:x3" size="3"&gt;0.0 0.0 0.7710998&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="compchem:y3" size="3"&gt;0.0 0.0 -2.87778364&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="compchem:z3" size="3"&gt;0.0 1.80680057 0.0&lt;/array&gt;<br> + &lt;/list&gt;<br> + }}}</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 20:09:57JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 115: </td> <td> Line 115: </td> </tr> <tr> <td> <span>-</span> &lt;!-- The templates contain their own unit-testing framework in the comments. See the 'Unit Testing' section below </td> <td> <span>+</span> &lt;!-- The templates contain their own unit-testing framework in the comments. See the 'Unit Testing<span>&nbsp;Framework</span>' section below </td> </tr> <tr> <td> Line 127: </td> <td> Line 127: </td> </tr> <tr> <td> <span>-</span> &lt;!-- The record is the mechanism to extract text into XML. 
See the '<span>r</span>ecords' section below for further details --&gt; </td> <td> <span>+</span> &lt;!-- The record is the mechanism to extract text into XML. See the '<span>R</span>ecords' section below for further details --&gt; </td> </tr> <tr> <td> Line 132: </td> <td> Line 132: </td> </tr> <tr> <td> <span>-</span> &lt;!-- The XML elements created with the records can be manipulated with transforms. See the '<span>t</span>ransform<span>s</span>' </td> <td> <span>+</span> &lt;!-- The XML elements created with the records can be manipulated with transforms. See the '<span>T</span>ransform<span>ing the raw XML</span>' </td> </tr> <tr> <td> Line 140: </td> <td> Line 140: </td> </tr> <tr> <td> <span>-</span> EXAMPLE LOGFILE TEXT above. See the 'Unit Testing' section below for more information --&gt; </td> <td> <span>+</span> EXAMPLE LOGFILE TEXT above. See the 'Unit Testing<span>&nbsp;Framework</span>' section below for more information --&gt; </td> </tr> <tr> <td> Line 150: </td> <td> Line 150: </td> </tr> <tr> <td> <span>-</span> The possible ATTRIBUTES on a <span>&nbsp;</span>template are: </td> <td> <span>+</span> The possible ATTRIBUTES on a template are: </td> </tr> <tr> <td> Line 165: </td> <td> Line 165: </td> </tr> <tr> <td> <span>- <br> - A template (aside from possibly containing sub-templates) contains a list of '''records''', which are used to parse the text constituting the module.</span> </td> <td> <span>+ === Records ===<br> + <br> + Records are the machinery used to extract text from a file and mark it up into XML. 
A record is an XML element, which can have a number of attributes (see below)</span> </td> </tr> <tr> <td> Line 205: </td> <td> Line 206: </td> </tr> <tr> <td> <span>-</span> == Testing a<span>nd d</span>e<span>vel</span>o<span>ping templates</span> == </td> <td> <span>+</span> ==<span>= Unit</span> Testing <span>Fr</span>a<span>m</span>e<span>w</span>o<span>rk</span> ==<span>=</span> </td> </tr> <tr> <td> Line 281: </td> <td> Line 282: </td> </tr> <tr> <td> <span>-</span> The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run <span>the entir</span>e TemplateTests (see note below), using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory: </td> <td> <span>+</span> The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run <span>all of th</span>e TemplateTests (see note below), using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory: </td> </tr> <tr> <td> Line 287: </td> <td> Line 288: </td> </tr> <tr> <td> <span>-</span> <span>T</span>he first time this is run, it will fail. However, it will print out the output of running the test, and something like the following: </td> <td> <span>+</span> <span>If you are developing a template, t</span>he first time this is run, it will fail. 
However, it will print out the output of running the test, and something like the following: </td> </tr> <tr> <td> Line 310: </td> <td> Line 311: </td> </tr> <tr> <td> <span>-</span> == Transforming the raw XML == </td> <td> <span>+</span> <span>=</span>== Transforming the raw XML ==<span>=</span> </td> </tr> <tr> <td> Line 762: </td> <td> Line 763: </td> </tr> <tr> <td> </td> <td> <span>+ </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 19:16:42JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 108: </td> <td> Line 108: </td> </tr> <tr> <td> <span>-</span> The structure of a typical template is <span>a</span>s <span>fol</span>low<span>s,</span> with comments to explain the various sections. </td> <td> <span>+</span> The structure of a typical template is s<span>hown</span> <span>be</span>low with comments to explain the various sections. </td> </tr> <tr> <td> Line 148: </td> <td> Line 148: </td> </tr> <tr> <td> <span>-</span> Template Attributes </td> <td> <span>+ ===</span> Template Attributes<span>&nbsp;===</span> </td> </tr> <tr> <td> Line 172: </td> <td> Line 172: </td> </tr> <tr> <td> <span>- Using '''l601.fermi.xml''' as an example, we will go through the template and describe what each bit does.<br> - <br> - {{{<br> - &lt;template id="l601.fermi" name="Isotropic Fermi Contact Couplings"<br> - pattern="\s*Isotropic Fermi Contact Couplings.*"<br> - repeat="*"<br> - endPattern="\s*"<br> - offset="0"<br> - endOffset="1"<br> - &gt;<br> - }}}<br> - <br> - The template starts with the template tag. 
The name is a descriptive name for the template, and the id serves to define the module that the text will be parsed into, so the module for this template will start with:<br> - <br> - {{{&lt;module cmlx:templateRef="l601.fermi" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt;}}}<br> - <br> - repeat="*" says that this template can appear multiple times in the file.<br> - <br> - The '''pattern''' is a regular expression stating that this template will start when it matches the line:<br> - <br> - {{{<br> - Isotropic Fermi Contact Couplings<br> - }}}<br> - <br> - The template will swallow all text until the end pattern is encountered, which is a line containing nothing but spaces.<br> - <br> - '''offset''' indicates where the text made available to the records inside the template starts. The default (0) means that the line that is matched by the '''pattern''' is part of the text within the template. An offset of 1, would mean that the line would not be available to the template, but would become part of the parent template.<br> - <br> - '''endOffset''' indicates where the text made available to the records inside the template stops. The default (0) means that the line that is matched by the '''endPattern''' is not part of the text within the template, but is pushed into the parent template. 
An offset of 1, would include the line in the template, 2 would mean that the line following the endPattern would also be included in this template<br> - </span> </td> <td> <span>+ <br> + <br> + <br> + Now we get to where the file is actually processed.<br> + <br> + {{{<br> + &lt;record repeat="2"/&gt;<br> + &lt;record id="fermi.atom" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}{A,x:elementType}\({I,x:isotopeNumber}\)\s{F,cc:coupling,u:au}\s{F,cc:coupling,u:mhz}\s{F,cc:coupling,u:gauss}\s{F,cc:coupling,u:ten4cm-1}\s*&lt;/record&gt;<br> + &lt;record repeat="4"/&gt;<br> + &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xx}{F,g:spindipole.yy}{F,g:spindipole.zz}&lt;/record&gt;<br> + &lt;record repeat="3"/&gt;<br> + &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xy}{F,g:spindipole.xz}{F,g:spindipole.yz}&lt;/record&gt;<br> + &lt;record repeat="1"/&gt;}}}<br> + <br> + The first line just reads 2 lines from the file. As there is no id on the record, the lines are discarded.<br> + <br> + The next record reads as many lines (repeat="*") as match the expression contained within the record into an entity called "fermi.atom". The expression that is expected to be matched is an integer (I), followed by a character string (A), followed by another integer (I) and then four floats (F).<br> + <br> + The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. 
The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemistry dictionary, and the "g" stands for an entry in the Gaussian dictionary.<br> + <br> + The makeArray="true" attribute will create an array for each of the integers, floats etc.<br> + <br> + Once this pattern stops being matched, the parser will skip 4 lines (&lt;record repeat="4"/&gt;) and then the next record will process a matching block of Integers and Floats into arrays as described earlier.</span> </td> </tr> <tr> <td> Line 204: </td> <td> Line 197: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + <br> + <br> + The class of "example.input" means that this block of text will be used in the tests for the parser (see below). We can see how the first line is that which is matched by the template's '''pattern''' and the last one is that matched by '''endPattern'''.<br> + <br> + <br> + The output has been parsed into a module with the templateRef "l601.fermi", which is the id of the template. We then have the 7 arrays (I,A,I,F,F,F,F) parsed by the "fermi.atom" record as a list, followed by that for the "fermi.spindipole" record etc.<br> + <br> + == Testing and developing templates ==<br> + <br> + The templates contain their own internal testing framework, in the form of one or more pairs of comment blocks within them.<br> + <br> + A comment block with the '''class''' attribute "example.input" should contain a small representative chunk of text that the parsers can be tested with. The '''id''' attribute is used to match the example input with the representative output that should be produced when the template acts on the sample text.<br> + <br> + An input comment is shown below:</span> </td> </tr> <tr> <td> Line 230: </td> <td> Line 238: </td> </tr> <tr> <td> <span>- The class of "example.input" means that this block of text will be used in the tests for the parser (see below). 
We can see how the first line is that which is matched by the template's '''pattern''' and the last one that matched by '''endPattern'''.<br> - <br> - Now we get to where the file is actually processed.<br> - <br> - {{{<br> - &lt;record repeat="2"/&gt;<br> - &lt;record id="fermi.atom" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}{A,x:elementType}\({I,x:isotopeNumber}\)\s{F,cc:coupling,u:au}\s{F,cc:coupling,u:mhz}\s{F,cc:coupling,u:gauss}\s{F,cc:coupling,u:ten4cm-1}\s*&lt;/record&gt;<br> - &lt;record repeat="4"/&gt;<br> - &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xx}{F,g:spindipole.yy}{F,g:spindipole.zz}&lt;/record&gt;<br> - &lt;record repeat="3"/&gt;<br> - &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xy}{F,g:spindipole.xz}{F,g:spindipole.yz}&lt;/record&gt;<br> - &lt;record repeat="1"/&gt;}}}<br> - <br> - The first line just reads 2 lines from the file. As there is no id on the record, the lines are discarded.<br> - <br> - The next record reads as many lines (repeat="*") as match the expression contained within the record into an entity called "fermi.atom". The expression that is expected to be matched is an integer (I), followed by a character string (A), followed by another integer (I) and then four floats (F).<br> - <br> - The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. 
The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemistry dictionary, and the "g" stands for an entry in the Gaussian dictionary.<br> - <br> - The makeArray="true" attribute will create an array for each of the integers, floats etc.<br> - <br> - Once this pattern stops being matched, the parser will skip 4 lines (&lt;record repeat="4"/&gt;) and then the next record will process a matching block of Integers and Floats into arrays as described earlier.<br> - <br> - Finally we get to the last block in the template, which is a comment of class "example.output" showing what this template produces when fed the text that was in the "example.input" comment.</span> </td> <td> <span>+ <br> + The matching example.output comment is below:</span> </td> </tr> <tr> <td> Line 283: </td> <td> Line 269: </td> </tr> <tr> <td> <span>- The output has been parsed into a module with the templateRef "l601.fermi", which is the id of the template. We then have the 7 arrays (I,A,I,F,F,F,F) parsed by the "fermi.atom" record as a list, followed by that for the "fermi.spindipole" record etc.<br> - <br> - == Testing and developing templates ==<br> - <br> - The templates contain their own internal testing framework, in the form of two of the comment blocks within them.<br> - <br> - The comment block with the class "example.input" should contain a small representative chunk of text that the parsers can be tested with.<br> - <br> - For the Gaussian logfile templates we have been dealing with here, the code that runs these tests lives in the file:</span> </td> <td> <span>+ It is possible for the templates to contain multiple examples, provided that each pair has matching '''id''' attributes. 
In this case, each matching pair will be tested in turn and all must pass for the unit test to be successful.<br> + <br> + For the Gaussian logfile templates, the code that runs these tests lives in the file:</span> </td> </tr> <tr> <td> Line 295: </td> <td> Line 275: </td> </tr> <tr> <td> <span>-</span> To test and develop an individual template (<span>we will continue </span>using the l601.fermi template as an example), the following line needs to be added to the TemplateTest.java file. </td> <td> <span>+</span> To test and develop an individual template (using the l601.fermi template as an example), the following line needs to be added to the TemplateTest.java file. </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 15:02:29JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 105: </td> <td> Line 105: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + EDIT IN PROGRESS...<br> + <br> + The structure of a typical template is as follows, with comments to explain the various sections.<br> + <br> + {{{<br> + &lt;!-- The template is contained in an XML element, with the behaviour controlled by various attributes<br> + of the form ATTRIBUTE="VALUE". See the 'Template Attributes' section below for more information --&gt;<br> + &lt;template id="foo" pattern="…"&gt;<br> + <br> + &lt;!-- The templates contain their own unit-testing framework in the comments. See the 'Unit Testing' section below<br> + for more information --&gt;<br> + &lt;comment class="example.input" id="foo"&gt;<br> + EXAMPLE LOGFILE TEXT<br> + &lt;/comment&gt;<br> + <br> + &lt;!-- Templates can themselves include other templates using a templateList. 
Only templates,<br> + or include directives to include other templates should be in a templateList --&gt;<br> + &lt;templateList xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> + &lt;xi:include href="basis.summary.xml"/&gt;<br> + &lt;/templateList&gt;<br> + <br> + &lt;!-- The record is the mechanism to extract text into XML. See the 'records' section below for further details --&gt;<br> + &lt;record<br> + id="iter"<br> + repeat="*"&gt;\s*{I,compchem:iterationIndex}\s+{F,compchem:totalEnergy}\s+{E,n:gnorm}\s+{E,n:gmax}\s+{F,compchem:wallTime}&lt;/record&gt;<br> + <br> + &lt;!-- The XML elements created with the records can be manipulated with transforms. See the 'transforms'<br> + section below for more information --&gt;<br> + &lt;transform process="addUnits"<br> + xpath=".//cml:scalar[@dictRef='compchem:totalEnergy']"<br> + value="nonsi:hartree"<br> + /&gt;<br> + <br> + &lt;!-- This is part of the unit testing framework, and contains the marked-up text that should be created from the<br> + EXAMPLE LOGFILE TEXT above. See the 'Unit Testing' section below for more information --&gt;<br> + &lt;comment class="example.output" id="foo"&gt;<br> + PARSED OUTPUT<br> + &lt;/comment&gt;<br> + <br> + &lt;/template&gt;<br> + }}}<br> + <br> + Template Attributes<br> + <br> + The possible ATTRIBUTES on a template are:<br> + <br> + * '''id''' - this should be a unique identifier. The text that is parsed by the template will be extracted into a cml module with a templateRef (in the cmlx namespace) of the id. In other words, the text parsed by the template with id="foo" will end up in a module as shown below:<br> + {{{&lt;module xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx" cmlx:templateRef="foo"&gt;<br> + PARSED TEXT<br> + &lt;/module&gt;}}}<br> + * '''name''' - used to give a name to the template and is currently unused.<br> + * '''repeat''' - the number of times that a template will be matched within the file. 
If repeat="1" then the template will only be matched once, regardless of how many times the '''pattern''' is matched in the file. repeat="*" means the template will be matched as many times as the '''pattern''' is matched.<br> + * '''newline''' - the character that is used to indicate a new line in the regular expression used in the pattern or endPattern. The default is the dollar character, i.e. newline="$".<br> + * '''pattern''', '''pattern2''', '''pattern3'''… - the regular expression used to trigger this template to start parsing text. The pattern may extend over more than one line if the '''newline''' character (see above) features in the expression. For example, pattern="\Number One$\s*The Larch\s*" would only match the line "Number One" if it were followed by "The Larch". Multiple patterns can be specified using the attributes pattern2, pattern3...<br> + * '''endPattern''', '''endPattern2''', '''endPattern3'''… - as for pattern (see above), but this matches where the template stops parsing. If endPattern="~", then if the end of the text that this template is parsing is reached, the entire text will be included in the template. If the endPattern is anything other than "~", then no text will be included in the template and the entire text will be available for matching by another template within the parent.<br> + * '''offset''', '''offset2''', '''offset3'''… - the number of lines either side of the match to include within the template. With the default of "0", all the text from (and including) the first matched line is included in the template. An offset of "-2" includes the two lines before the match; an offset of "3" excludes the first line of the match and the two lines following it. If no offset is specified (or only offset is specified) the offset will apply to all matches (i.e. pattern, pattern2, pattern3…). 
If, for example, offset2 is specified, then this is the offset that will be applied when pattern2 is matched.<br> + * '''endOffset''', '''endOffset2''', '''endOffset3'''… - the number of lines to include in the template after the endPattern match. With the default of "0", the line matched by endPattern is NOT included, and this line is pushed into the containing template, where it may be matched by the pattern of another template. An endOffset of "2" includes the endPattern line and the one after. An endOffset of "-1" excludes the line preceding the match as well.<br> + <br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-06-20 11:33:47JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 13: </td> <td> Line 13: </td> </tr> <tr> <td> <span>- Most log files can be logically separated into high level units called '''chunks''', which in turn are made up of smaller units called '''blocks'''.<br> - <br> - There may be many repeated chunks within a log file (e.g. if a chunk is an SCF cycle, there will be as many SCF chunks as there were SCF iterations), but a block is only likely to appear once within each chunk. Continuing the SCF example, a block within the SCF chunk could be the piece of text describing the current energy and energy change from the previous cycle.</span> </td> <td> <span>+ The approach that has been adopted by the parsers is to break the monolithic text block of the logfile into a series of separate '''chunks''' that encapsulate a coherent piece of data.<br> + <br> + There may be many repeated chunks within a log file. 
For example, if a chunk is an SCF calculation, for a single-point energy calculation there would just be a single ''SCF chunk'', whereas for a geometry optimisation calculation, there would be as many SCF chunks as there were SCF calculations.<br> + <br> + Chunks are often nested, so using the geometry optimisation example, a single geometry optimisation step would itself be a chunk, and this in turn would contain (one or more) SCF chunks. There would then be as many geometry optimisation step chunks as there were geometry optimisation steps.</span> </td> </tr> <tr> <td> Line 30: </td> <td> Line 32: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-18 11:31:00JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 612: </td> <td> Line 612: </td> </tr> <tr> <td> </td> <td> <span>+ '''NB:''' For general information and examples of contexts file, please see the [http://wiki.eclipse.org/Jetty_Expanded_Webapp_Deploy jetty wiki].<br> + </span> </td> </tr> <tr> <td> Line 624: </td> <td> Line 626: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + === Security Considerations ===<br> + <br> + In order to make chempound available on a standard URL (such as [http://cdsora4.dl.ac.uk/chempound]), the server needs to listen for TCP requests on port 80.<br> + <br> + On unix systems, only processes started by root are permitted to bind to ports numbered less than 1024, which would entail a requirement to run the chempound jetty server as root. 
However, this is not considered good security practice, and there is no other reason why the server needs to run as root.<br> + <br> + On debian-based systems, a way around this is the '''authbind''' package, which allows users to bind non-root servers to a low-numbered port.<br> + <br> + Another approach is to start the server under a non-root user, bind it to a high-numbered port, and use a firewall to redirect requests from port 80 to the port the server is listening on. If the server was started on port 8080, then the iptables rule to accomplish this would be:<br> + <br> + {{{<br> + iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080<br> + }}}<br> + <br> + '''NB:''' if using this method, it is important to remember that the '''chempound.uri''' variable will need to be set to point at the url as visible externally, and should not include the port number, as otherwise the CSS files will not be found.</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-17 21:31:29JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 408: </td> <td> Line 408: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + The latest version of the war file for chempound can be downloaded [https://hudson.ch.cam.ac.uk/job/quixote-repository/lastStableBuild/uk.ac.cam.ch.wwmm.quixote.repository$quixote-repository-webapp/ here].</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-17 21:12:37 <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 407: </td> <td> Line 407: </td> </tr> <tr> <td> <span>-</span> Chempound is a pure java program, so can be run in any java container<span>,</span> <span>but t</span>hese instructions are specific to installing it into [http://jetty.codehaus.org/jetty/ jetty] <span>and </span>for use on unix systems. </td> <td> <span>+</span> Chempound is a pure java program, so can be run in any java container<span>.</span> <span>T</span>hese instructions are specific to installing it into [http://jetty.codehaus.org/jetty/ jetty] for use on unix systems. </td> </tr> <tr> <td> Line 443: </td> <td> Line 443: </td> </tr> <tr> <td> <span>-</span> The file '''start.jar''' is the java file used to start jetty<span>, and the server is started</span> with the command: </td> <td> <span>+</span> The file '''start.jar''' is the java file used to start jetty with the command: </td> </tr> <tr> <td> Line 578: </td> <td> Line 578: </td> </tr> <tr> <td> <span>-</span> There are two ways that jetty is usually configured to serve<span>r</span> applications:<br> <span>-</span> <br> <span>-</span> * jetty can monitor a directory (by default the '''webapps''' directory) and any '''.war''' files placed there, will be served at a URL determined <span>by</span> the name of the war file (i.e. '''quixote-repository-webapp-0.1-SNAPSHOT.war''' would be served at the URL '''/quixote-repository-webapp-0.1-SNAPSHOT''' relative to the base server url. </td> <td> <span>+</span> There are two ways that jetty is usually configured to serve applications:<br> <span>+</span> <br> <span>+</span> * jetty can monitor a directory (by default the '''webapps''' directory) and any '''.war''' files placed there, will be served at a URL determined <span>from</span> the name of the war file (i.e. '''quixote-repository-webapp-0.1-SNAPSHOT.war''' would be served at the URL '''/quixote-repository-webapp-0.1-SNAPSHOT''' relative to the base server url. 
</td> </tr> <tr> <td> Line 583: </td> <td> Line 583: </td> </tr> <tr> <td> <span>-</span> This example uses the second approach, he final block of XML in the '''jetty.xml''' above, configur<span>es</span> jetty to monitor the context directory. The contexts directory<span>&nbsp;</span> contains one file, '''quixote.xml''' the contents of which are shown below (with comments to explain relevant bits<span>&nbsp;of the file</span>): </td> <td> <span>+</span> This example uses the second approach, <span>t</span>he final block of XML in the '''jetty.xml''' above, configur<span>ing</span> jetty to monitor the context directory. The contexts directory contains one file, '''quixote.xml'''<span>,</span> the contents of which are shown below (with comments to explain relevant bits): </td> </tr> <tr> <td> Line 612: </td> <td> Line 612: </td> </tr> <tr> <td> <span>-</span> The next block sets two important variables that are needed by chempound<span>, as described below</span>:<br> - <span><br> - * '''chempound.uri''' - </span>this is a string included<span>&nbsp;(via a template mechanism)</span> in the html pages served by chempound and is used to set the url where various files (such as the CSS files) are expected to be found. It should be the full url where the base chempound server will be found, such as '''http://cdsora4.dl.ac.uk/chempound'''.<br> <span>-</span> * '''chempound.workspace''' - this is the path to a locally accessible directory on the server where all the files needed by chempound will be stored. The actual files held by chempound (such as the logfiles, CML file etc, are stored in this directory in the '''content''' folder<span>.</span> </td> <td> <span>+</span> The next block sets two important variables that are needed by chempound:<br> <span>+ <br> + * '''chempound.uri''' </span>- this is a string included in the html pages served by chempound and is used to set the url where various files (such as the CSS files) are expected to be found. 
It should be the full url where the base chempound server will be found, such as '''http://cdsora4.dl.ac.uk/chempound'''.<br> <span>+</span> * '''chempound.workspace''' - this is the path to a locally accessible directory on the server where all the files needed by chempound will be stored. The actual files held by chempound (such as the logfiles, CML file etc, are stored in this directory in the '''content''' folder<span>).</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-17 21:07:26 <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 578: </td> <td> Line 578: </td> </tr> <tr> <td> <span>- There are two ways that jetty is usually configured to server applications.<br> - * jetty can monitor a directory (by default the '''webapps''' directory) and any '''.war''' files placed there, will be served at a URL determined by the name of the war file (i.e. '''quixote-repository-webapp-0.1-SNAPSHOT.war''' would be served at the URL '''/quixote-repository-webapp-0.1-SNAPSHOT''' relative to the base server url.<br> - * jetty can monitor a directory (by default the '''contexts''' directory)</span> </td> <td> <span>+ There are two ways that jetty is usually configured to server applications:<br> + <br> + * jetty can monitor a directory (by default the '''webapps''' directory) and any '''.war''' files placed there, will be served at a URL determined by the name of the war file (i.e. 
'''quixote-repository-webapp-0.1-SNAPSHOT.war''' would be served at the URL '''/quixote-repository-webapp-0.1-SNAPSHOT''' relative to the base server url.<br> + * jetty can monitor a directory (by default the '''contexts''' directory) for XML files, and these will then be parsed to determine the location of the application's war file and the options required for serving the application.<br> + <br> + This example uses the second approach, he final block of XML in the '''jetty.xml''' above, configures jetty to monitor the context directory. The contexts directory contains one file, '''quixote.xml''' the contents of which are shown below (with comments to explain relevant bits of the file):<br> + <br> + {{{<br> + &lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;<br> + &lt;!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd"&gt;<br> + <br> + &lt;Configure class="org.eclipse.jetty.webapp.WebAppContext"&gt;<br> + &lt;!-- This is the URL where the application will be served from --&gt;<br> + &lt;Set name="contextPath"&gt;/chempound&lt;/Set&gt;<br> + <br> + &lt;!-- The absolute location of the war file for chempound on the server's filesystem --&gt;<br> + &lt;Set name="war"&gt;/home/jens/jetty-hightide-8.1.4.v20120524/quixote/quixote-repository-webapp-0.1-SNAPSHOT.war&lt;/Set&gt;<br> + <br> + &lt;!-- Set startup parameters --&gt;<br> + &lt;Get name="ServletContext"&gt;<br> + &lt;Call name="setAttribute"&gt;<br> + &lt;Arg&gt;chempound.uri&lt;/Arg&gt;<br> + &lt;Arg&gt;http://cdsora4.dl.ac.uk/chempound&lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + &lt;Call name="setAttribute"&gt;<br> + &lt;Arg&gt;chempound.workspace&lt;/Arg&gt;<br> + &lt;Arg&gt;/home/jens/jetty-hightide-8.1.4.v20120524/quixote/workspace&lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + &lt;/Get&gt;<br> + &lt;/Configure&gt;<br> + }}}<br> + <br> + The first two '''Set''' commands should be self-explanatory.<br> + <br> + The next block sets two important variables that are needed by 
chempound, as described below:<br> + <br> + * '''chempound.uri''' - this is a string included (via a template mechanism) in the html pages served by chempound and is used to set the url where various files (such as the CSS files) are expected to be found. It should be the full url where the base chempound server will be found, such as '''http://cdsora4.dl.ac.uk/chempound'''.<br> + * '''chempound.workspace''' - this is the path to a locally accessible directory on the server where all the files needed by chempound will be stored. The actual files held by chempound (such as the logfiles, CML file etc, are stored in this directory in the '''content''' folder.<br> + <br> + These two variables can also be set as environment variables before the server is started, or on the command-line when the server is started, as shown below:<br> + <br> + {{{<br> + java -Dchempound.uri="http://cdsora4.dl.ac.uk/chempound" -Dchempound.workspace="/home/jens/jetty-hightide-8.1.4.v20120524/quixote/workspace" -jar start.jar<br> + }}}</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-17 20:49:19 <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 409: </td> <td> Line 409: </td> </tr> <tr> <td> </td> <td> <span>+ There are any number of ways to configure jetty, so this just describes one way, with some pointers to the other possibilities.<br> + </span> </td> </tr> <tr> <td> Line 416: </td> <td> Line 418: </td> </tr> <tr> <td> </td> <td> <span>+ +-start.ini<br> + |</span> </td> </tr> <tr> <td> Line 417: </td> <td> Line 421: </td> </tr> <tr> <td> <span>- |<br> - +-start.ini</span> </td> <td> </td> </tr> <tr> <td> Line 427: </td> <td> Line 429: </td> </tr> <tr> <td> <span>- |<br> - +-lib/</span> </td> <td> </td> </tr> <tr> <td> Line 442: </td> <td> Line 442: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + The file '''start.jar''' is the java file used to start jetty, and the server is started with the command:<br> + <br> + {{{<br> + java -jar start.jar<br> + }}}<br> + <br> + By default, on startup, jetty will parse the file '''start.ini''', which contains command-line options for the server, including the list of modules to include, and a list of XML configuration files that determine various options (these are listed one per line in the start.ini files and can be removed by commenting the line out with the '''#''' character). By default, the XML files reside in the '''etc''' directory. 
In this example, only one configuration file is used, the '''jetty.xml''' file in the '''etc''' directory.<br> + <br> + This file contains the following:<br> + <br> + {{{<br> + &lt;?xml version="1.0"?&gt;<br> + &lt;!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd"&gt;<br> + <br> + &lt;!-- =============================================================== --&gt;<br> + &lt;!-- Configure the Jetty Server --&gt;<br> + &lt;!-- --&gt;<br> + &lt;!-- Documentation of this file format can be found at: --&gt;<br> + &lt;!-- http://wiki.eclipse.org/Jetty/Reference/jetty.xml_syntax --&gt;<br> + &lt;!-- --&gt;<br> + &lt;!-- Additional configuration files are available in $JETTY_HOME/etc --&gt;<br> + &lt;!-- and can be mixed in. For example: --&gt;<br> + &lt;!-- java -jar start.jar etc/jetty-ssl.xml --&gt;<br> + &lt;!-- --&gt;<br> + &lt;!-- See start.ini file for the default configuraton files --&gt;<br> + &lt;!-- =============================================================== --&gt;<br> + <br> + <br> + &lt;Configure id="Server" class="org.eclipse.jetty.server.Server"&gt;<br> + <br> + &lt;!-- =========================================================== --&gt;<br> + &lt;!-- Server Thread Pool --&gt;<br> + &lt;!-- =========================================================== --&gt;<br> + &lt;Set name="ThreadPool"&gt;<br> + &lt;!-- Default queued blocking threadpool --&gt;<br> + &lt;New class="org.eclipse.jetty.util.thread.QueuedThreadPool"&gt;<br> + &lt;Set name="minThreads"&gt;10&lt;/Set&gt;<br> + &lt;Set name="maxThreads"&gt;200&lt;/Set&gt;<br> + &lt;Set name="detailedDump"&gt;false&lt;/Set&gt;<br> + &lt;/New&gt;<br> + &lt;/Set&gt;<br> + <br> + &lt;!-- =========================================================== --&gt;<br> + &lt;!-- Set connectors --&gt;<br> + &lt;!-- =========================================================== --&gt;<br> + <br> + &lt;Call name="addConnector"&gt;<br> + &lt;Arg&gt;<br> + &lt;New 
class="org.eclipse.jetty.server.nio.SelectChannelConnector"&gt;<br> + &lt;Set name="host"&gt;&lt;Property name="jetty.host" /&gt;&lt;/Set&gt;<br> + &lt;Set name="port"&gt;&lt;Property name="jetty.port" default="8181"/&gt;&lt;/Set&gt;<br> + &lt;Set name="maxIdleTime"&gt;300000&lt;/Set&gt;<br> + &lt;Set name="Acceptors"&gt;2&lt;/Set&gt;<br> + &lt;Set name="statsOn"&gt;false&lt;/Set&gt;<br> + &lt;Set name="confidentialPort"&gt;8443&lt;/Set&gt;<br> + &lt;Set name="lowResourcesConnections"&gt;20000&lt;/Set&gt;<br> + &lt;Set name="lowResourcesMaxIdleTime"&gt;5000&lt;/Set&gt;<br> + &lt;/New&gt;<br> + &lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + <br> + &lt;!-- =========================================================== --&gt;<br> + &lt;!-- Set handler Collection Structure --&gt;<br> + &lt;!-- =========================================================== --&gt;<br> + &lt;Set name="handler"&gt;<br> + &lt;New id="Handlers" class="org.eclipse.jetty.server.handler.HandlerCollection"&gt;<br> + &lt;Set name="handlers"&gt;<br> + &lt;Array type="org.eclipse.jetty.server.Handler"&gt;<br> + &lt;Item&gt;<br> + &lt;New id="Contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection"/&gt;<br> + &lt;/Item&gt;<br> + &lt;Item&gt;<br> + &lt;New id="DefaultHandler" class="org.eclipse.jetty.server.handler.DefaultHandler"/&gt;<br> + &lt;/Item&gt;<br> + &lt;/Array&gt;<br> + &lt;/Set&gt;<br> + &lt;/New&gt;<br> + &lt;/Set&gt;<br> + <br> + &lt;!-- =========================================================== --&gt;<br> + &lt;!-- extra options --&gt;<br> + &lt;!-- =========================================================== --&gt;<br> + &lt;Set name="stopAtShutdown"&gt;true&lt;/Set&gt;<br> + &lt;Set name="sendServerVersion"&gt;true&lt;/Set&gt;<br> + &lt;Set name="sendDateHeader"&gt;true&lt;/Set&gt;<br> + &lt;Set name="gracefulShutdown"&gt;1000&lt;/Set&gt;<br> + &lt;Set name="dumpAfterStart"&gt;false&lt;/Set&gt;<br> + &lt;Set name="dumpBeforeStop"&gt;false&lt;/Set&gt;<br> + <br> + &lt;!-- 
=============================================================== --&gt;<br> + &lt;!-- Create the deployment manager --&gt;<br> + &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;<br> + &lt;!-- The deplyment manager handles the lifecycle of deploying web --&gt;<br> + &lt;!-- applications. Apps are provided by instances of the --&gt;<br> + &lt;!-- AppProvider interface. Typically these are provided by --&gt;<br> + &lt;!-- one or more of: --&gt;<br> + &lt;!-- jetty-webapps.xml - monitors webapps for wars and dirs --&gt;<br> + &lt;!-- jetty-contexts.xml - monitors contexts for context xml --&gt;<br> + &lt;!-- jetty-templates.xml - monitors contexts and templates --&gt;<br> + &lt;!-- =============================================================== --&gt;<br> + &lt;Call name="addBean"&gt;<br> + &lt;Arg&gt;<br> + &lt;New id="DeploymentManager" class="org.eclipse.jetty.deploy.DeploymentManager"&gt;<br> + &lt;Set name="contexts"&gt;<br> + &lt;Ref id="Contexts" /&gt;<br> + &lt;/Set&gt;<br> + &lt;!--<br> + &lt;Call name="setContextAttribute"&gt;<br> + &lt;Arg&gt;org.eclipse.jetty.server.webapp.ContainerIncludeJarPattern&lt;/Arg&gt;<br> + &lt;Arg&gt;.*/servlet-api-[^/]*\.jar$&lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + --&gt;<br> + &lt;/New&gt;<br> + &lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + <br> + &lt;!-- =============================================================== --&gt;<br> + &lt;!-- Add a ContextProvider to the deployment manager --&gt;<br> + &lt;!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --&gt;<br> + &lt;!-- This scans the contexts directory for xml files descrbing an app --&gt;<br> + &lt;!-- =============================================================== --&gt;<br> + &lt;Ref id="DeploymentManager"&gt;<br> + &lt;Call name="addAppProvider"&gt;<br> + &lt;Arg&gt;<br> + &lt;New class="org.eclipse.jetty.deploy.providers.ContextProvider"&gt;<br> + &lt;Set name="monitoredDirName"&gt;&lt;Property name="jetty.home" default="." 
/&gt;/contexts&lt;/Set&gt;<br> + &lt;Set name="scanInterval"&gt;1&lt;/Set&gt;<br> + &lt;/New&gt;<br> + &lt;/Arg&gt;<br> + &lt;/Call&gt;<br> + &lt;/Ref&gt;<br> + <br> + &lt;/Configure&gt;<br> + }}}<br> + <br> + There are two ways that jetty is usually configured to server applications.<br> + * jetty can monitor a directory (by default the '''webapps''' directory) and any '''.war''' files placed there, will be served at a URL determined by the name of the war file (i.e. '''quixote-repository-webapp-0.1-SNAPSHOT.war''' would be served at the URL '''/quixote-repository-webapp-0.1-SNAPSHOT''' relative to the base server url.<br> + * jetty can monitor a directory (by default the '''contexts''' directory)</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-16 11:51:29JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 424: </td> <td> Line 424: </td> </tr> <tr> <td> <span>- +-contexts/<br> - |<br> - +-lib/<br> - |<br> - +-chempound/</span> </td> <td> <span>+ +-etc/</span> </td> </tr> <tr> <td> Line 431: </td> <td> Line 427: </td> </tr> <tr> <td> </td> <td> <span>+ |<br> + +-lib/<br> + |<br> + +-webapps/<br> + |<br> + +-chempound/</span> </td> </tr> <tr> <td> Line 439: </td> <td> Line 441: </td> </tr> <tr> <td> <span>- |<br> - +-webapps/<br> - }}}</span> </td> <td> <span>+ }}}</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-06-16 10:13:01JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 398: </td> <td> Line 398: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + == Installing Chempound ==<br> + <br> + For installing chempound for personal use on a local machine, the [http://www.chempound.net/getting-started.html getting-started] notes should be sufficient.<br> + <br> + The following instructions apply for installing Chempound on an existing server, for use by an institution or group.<br> + <br> + === Installing Chempound into an existing Jetty server ===<br> + <br> + Chempound is a pure java program, so can be run in any java container, but these instructions are specific to installing it into [http://jetty.codehaus.org/jetty/ jetty] and for use on unix systems.<br> + <br> + If you do not already have jetty installed on your server, and it is not available within the package management software for your distribution, a jetty hightide distribution, can be downloaded from [http://dist.codehaus.org/jetty/ codehaus].<br> + <br> + The following instructions assume that you have a jetty server, with a directory structure similar to the following (only the relevant files and directories are listed).<br> + <br> + {{{<br> + +-jetty-hightide-8.1.4.v20120524/<br> + |<br> + +-start.jar<br> + |<br> + +-start.ini<br> + |<br> + +-contexts/<br> + | |<br> + | +-quixote.xml<br> + |<br> + +-contexts/<br> + |<br> + +-lib/<br> + |<br> + +-chempound/<br> + | |<br> + | +-jetty.xml<br> + | |<br> + | +-workspace/<br> + | | |<br> + | | +-cache/<br> + | | |<br> + | | +-content/<br> + | | |<br> + | | +-tdb/<br> + |<br> + +-webapps/<br> + }}}</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-06-16 09:01:25JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 587: </td> <td> Line 587: </td> </tr> <tr> <td> <span>- This is just a temporary place to dump stuff related to the above before it is put somewhere sensible.<br> - <br> - == Conventions ==<br> - <br> - The conventions are [http://www.xml-cml.org/convention/ here].<br> - <br> - The text for the conventions currently resides in the [https://bitbucket.org/cml/xml-cml.org/src/b08186d90a19/convention bitbucket repository] for the xml-cml website. This is where the "human-readable" text lives.<br> - <br> - The rules that the convention text describes are actually implemented in xslt within the [https://bitbucket.org/cml/cmllite-validator-code/overview cml validator]. So, for example, the rules defining the compchem convention are implemented in the [https://bitbucket.org/cml/cmllite-validator-code/src/2eaa18f959bb/src/main/resources/org/xmlcml/www/compchem-rules.xsl compchem-rules.xsl] file.<br> - <br> - Any changes or updates to the convention generally require editing BOTH files.<br> - <br> - === Updates and Extensions to the CompChem convention ===<br> - <br> - The [http://www.xml-cml.org/convention/compchem compchem convention] is currently rather loose and abstract, so this is an attempt to attach some definitions and examples to clarify its usage.<br> - <br> - The job and calculation modules are effectively functionally identical, as many of the calculations within a job could be run as separate jobs themselves (e.g. a single-point calculation in an optimisation run could be submitted separately). A job therefore serves to group one or more calculations into a logical unit of work - it can be thought of as the unit of work that would be submitted to a computational resource.<br> - <br> - Calculations can themselves contain calculations and be nested to any degree. 
Aside from having an initialization module, there is no requirement on a calculation to have any particular attributes (and indeed calculations are optional within jobs, as the job may just define an input, or the workings of the calculations may not be of any interest to the user).<br> - <br> - Calculations can also inherit attributes from their parents, so for example, a basis_set need not be contained within each SCF iteration calculation, as the parent's basis set can be linked to the child.<br> - <br> - However, both calculations and jobs must have an initialization module, which at least contains a task field to identify the role of the calculation, and any important results must go in the finalization module. This allows code developers complete freedom to structure the modules as they see fit, but any software that uses the convention knows to query the initialization module for information on the calculation, and the finalization module for any results.<br> - <br> - === Energies and Results ===<br> - <br> - Several energies are output by computational jobs through the course of a run. These may be a guess energy, iteration energy, two-electron energy, correlation energy etc. Labelling each of these separately would lead to a large number of energy terms and difficulty searching for them (e.g. if you search for an MP2 energy, do you want RI-MP2 results too? Is a DFT energy labelled with the functional?). To prevent this confusion, the energies are split into categories (e.g. one-electron energy, correlation energy, total energy etc) and the meaning is determined by the context the energy is found in, i.e. the parameters of the initialization module of the calculation that contains the energy. For this reason, for an SCF calculation, the guess and individual iteration energies, each need to be held within their own calculation module. This leads to some additional verbosity and complexity, but is largely mitigated because the (e.g.) 
iteration calculations need only contain the information in their initialization modules that cannot be inherited from their parent SCF modules.<br> - <br> - The advantage of this approach is that simple generic XPath queries can be used to extract all the energies from a calculation (e.g. a geometry optimisation or MCSCF calculation) by any program that wishes to plot the energy trajectory of a calculation, without needing to know anything of the structure or logic of the calculation it is querying.<br> - <br> - === Data structures ===<br> - <br> - Scalar, array or matrix quantities are already supported in CML and can be added to the initialization/finalization modules as parameters or properties.<br> - <br> - Any complex data structures should be held in cml lists (aside from molecules, which are an existing fundamental part of CML), with a dictRef that defines what the data contained within the module is.<br> - <br> - The current complex data structures are:<br> - <br> - [http://www.xml-cml.org/convention/molecular molecules]<br> - [http://www.xml-cml.org/dictionary/compchem/#basis_set basis sets]<br> - [http://www.xml-cml.org/dictionary/compchem/#dft_functional DFT functional]<br> - [http://www.xml-cml.org/dictionary/compchem/#molecular_orbital molecular orbital]<br> - <br> - <br> - == Dictionaries ==<br> - <br> - The dictionaries are [http://www.xml-cml.org/dictionary/ here].<br> - <br> - <br> - == Basis sets ==<br> - <br> - NB: Most of this information is now in the [http://www.xml-cml.org/dictionary/compchem/ compchem dictionary]<br> - <br> - The [https://bse.pnl.gov/bse EMSL basis set exchange] has already done most of the work to describe basis sets.<br> - <br> - The paper: [http://pubs.acs.org/doi/abs/10.1021/ci600510j Basis Set Exchange: A Community Database for Computational Sciences] describes the infrastructure and the considerations that went into the design.<br> - <br> - The schemas they have developed are available here: https://bse.pnl.gov/bse/docs/schemas/<br> - 
<br> - This should serve as the basis for anything that is done with basis sets in CML. They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. The following snippet of NWChem output:<br> - <br> - {{{<br> - <br> - <br> - Basis "ao basis" -&gt; "" (cartesian)<br> - -----<br> - O (Oxygen)<br> - ----------<br> - Exponent Coefficients<br> - -------------- ---------------------------------------------------------<br> - 1 S 5.48467170E+03 0.001831<br> - 1 S 8.25234950E+02 0.013950<br> - 1 S 1.88046960E+02 0.068445<br> - 1 S 5.29645000E+01 0.232714<br> - 1 S 1.68975700E+01 0.470193<br> - 1 S 5.79963530E+00 0.358521<br> - <br> - 2 S 1.55396160E+01 -0.110778<br> - 2 S 3.59993360E+00 -0.148026<br> - 2 S 1.01376180E+00 1.130767<br> - <br> - 3 P 1.55396160E+01 0.070874<br> - 3 P 3.59993360E+00 0.339753<br> - 3 P 1.01376180E+00 0.727159<br> - <br> - 4 S 2.70005800E-01 1.000000<br> - <br> - 5 P 2.70005800E-01 1.000000<br> - <br> - 6 D 8.00000000E-01 1.000000<br> - <br> - H (Hydrogen)<br> - ------------<br> - Exponent Coefficients<br> - -------------- ---------------------------------------------------------<br> - 1 S 1.87311370E+01 0.033495<br> - 1 S 2.82539370E+00 0.234727<br> - 1 S 6.40121700E-01 0.813757<br> - <br> - 2 S 1.61277800E-01 1.000000<br> - <br> - <br> - <br> - Summary of "ao basis" -&gt; "" (cartesian)<br> - ------------------------------------------------------------------------------<br> - Tag Description Shells Functions and Types<br> - ---------------- ------------------------------ ------ ---------------------<br> - O 6-31g* 6 15 3s2p1d<br> - H 6-31g* 2 2 2s<br> - }}}<br> - <br> - <br> - Would be marked up as follows:<br> - <br> - {{{<br> - &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> - <br> - &lt;module xmlns:bse="http://purl.oclc.org/NET/EMSL/BSE"<br> - xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;<br> - &lt;bse:basisSet&gt;<br> - <br> - 
&lt;!-- Need to look into MIME type --&gt;<br> - &lt;dc:format&gt;???&lt;/dc:format&gt;<br> - <br> - &lt;dc:title&gt;6-31g*&lt;/dc:title&gt;<br> - <br> - &lt;bse:basisSetType&gt;orbital&lt;/bse:basisSetType&gt;<br> - <br> - &lt;bse:harmonicType&gt;cartesian&lt;/bse:harmonicType&gt;<br> - <br> - &lt;bse:contractionType&gt;general&lt;/bse:contractionType&gt;<br> - <br> - &lt;dc:description&gt;???&lt;/dc:description&gt;<br> - <br> - &lt;bse:contractions elementType="O" harmonicType="cartesian"&gt;<br> - <br> - &lt;dc:description&gt;???&lt;/dc:description&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:matrix rows="6" columns="2" dataType="xsd:double"&gt;<br> - 5.48467170E+03 0.001831<br> - 8.25234950E+02 0.013950<br> - 1.88046960E+02 0.068445<br> - 5.29645000E+01 0.232714<br> - 1.68975700E+01 0.470193<br> - 5.79963530E+00 0.358521<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> - 1.55396160E+01 -0.110778<br> - 3.59993360E+00 -0.148026<br> - 1.01376180E+00 1.130767<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="P"&gt;<br> - &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> - 1.55396160E+01 0.070874<br> - 3.59993360E+00 0.339753<br> - 1.01376180E+00 0.727159<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> - 2.70005800E-01 1.000000<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="P"&gt;<br> - &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> - 2.70005800E-01 1.000000<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="D"&gt;<br> - &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> - 8.00000000E-01 1.000000<br> - 
&lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;/bse:contractions&gt;<br> - <br> - &lt;bse:contractions elementType="H" harmonicType="cartesian"&gt;<br> - <br> - &lt;dc:description&gt;???&lt;/dc:description&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> - 1.87311370E+01 0.033495<br> - 2.82539370E+00 0.234727<br> - 6.40121700E-01 0.813757<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> - 1.61277800E-01 1.000000<br> - &lt;/cml:matrix&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;/bse:contractions&gt;<br> - <br> - &lt;/bse:basisSet&gt;<br> - <br> - &lt;/module&gt;<br> - }}}<br> - <br> - Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Attributes (e.g. elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. We need to add a compchem:atom_label entry so that the basis sets can be mapped onto the associated atoms - the "O" printed in the NWChem output is a tag that - in theory - could be anything; it is just used to label the atom and can be used for assigning different basis sets to different atoms of the same element type. 
Following on from the convention used in the dictionaries, names will be separated with an underscore.<br> - <br> - {{{<br> - &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> - <br> - &lt;cml:module<br> - xmlns:compchem="http://www.xml-cml.org/dictionary/compchem/"<br> - xmlns:convention="http://www.xml-cml.org/convention/"<br> - convention="convention:compchem"<br> - &gt;<br> - &lt;cml:list dictRef="compchem:basis_set"&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_title" dataType="xsd:string"&gt;<br> - 6-31g*<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_type" dataType="xsd:string"&gt;<br> - orbital<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> - cartesian<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_contraction_type" dataType="xsd:string"&gt;<br> - general<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_description" dataType="xsd:string"&gt;<br> - ???<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contractions"&gt;<br> - <br> - &lt;!-- added as need to link contractions to atom --&gt;<br> - &lt;cml:scalar dictRef="compchem:atom_label" dataType="xsd:string"&gt;<br> - O<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;!-- was elementType attribute --&gt;<br> - &lt;cml:scalar dictRef="compchem:element_type" dataType="xsd:string"&gt;<br> - O<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;!-- was harmonicType attribute --&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> - cartesian<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt;<br> - ???<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;!-- was shell attribute --&gt;<br> - &lt;cml:scalar 
dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - S<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:array size="6" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 5.48467170E+03 8.25234950E+02 1.88046960E+02 5.29645000E+01 1.68975700E+01 5.79963530E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="6" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 0.001831 0.013950 0.068445 0.232714 0.470193 0.358521<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - S<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - -0.110778 -0.148026 1.130767<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - P<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 0.070874 0.339753 0.727159<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - S<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 2.70005800E-01<br> - &lt;/cml:array&gt;<br> - <br> - 
&lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction" shell="P"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - P<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 2.70005800E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - D<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;8.00000000E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;1.000000<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contractions"&gt;<br> - <br> - &lt;!-- added as need to link contractions to atom --&gt;<br> - &lt;cml:scalar dictRef="compchem:atom_label" dataType="xsd:string"&gt;<br> - H<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;!-- was elementType attribute --&gt;<br> - &lt;cml:scalar dictRef="compchem:element_type" dataType="xsd:string"&gt;<br> - H<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;!-- was harmonicType attribute --&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> - cartesian<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:scalar dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt;<br> - ???<br> - &lt;/cml:scalar&gt;<br> - <br> - &lt;cml:list 
dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - S<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 1.87311370E+01 2.82539370E+00 6.40121700E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 0.033495 0.234727 0.813757<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> - &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> - S<br> - &lt;/cml:scalar&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_exponents"&gt;<br> - 1.61277800E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:basis_set_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;<br> - &lt;/cml:list&gt;<br> - <br> - &lt;/cml:list&gt;<br> - <br> - &lt;/cml:list&gt;<br> - <br> - &lt;/cml:module&gt;<br> - }}}</span> </td> <td> <span>+ The information that was previously here concerning conventions and dictionaries has now been integrated into the [http://www.xml-cml.org/convention/ convention] and [http://www.xml-cml.org/dictionary/compchem/ dictionary].</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-23 09:18:30JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 191: </td> <td> Line 191: </td> </tr> <tr> <td> <span>-</span> === Discovering <span>what</span> terms <span>can be searched on </span>===<br> <span>-</span> <br> <span>-</span> The data that is extracted int RDF and therefore available for searching in Chempound is determined by the convention and dictionaries that apply to the files in question. </td> <td> <span>+</span> === Discovering <span>the available search</span> terms ===<br> <span>+</span> <br> <span>+</span> The data that is extracted int<span>o</span> RDF and therefore available for searching in Chempound is determined by the convention and dictionaries that apply to the files in question. </td> </tr> <tr> <td> Line 219: </td> <td> Line 219: </td> </tr> <tr> <td> <span>-</span> SELECT ?<span>mol</span>e<span>cule</span> ?value<br> <span>-</span> {<br> <span>-</span> ?<span>mol</span>e<span>cule</span> cif:cell_measurement_temperature ?temp . </td> <td> <span>+</span> SELECT ?e<span>ntry</span> ?value<br> <span>+</span> {<br> <span>+</span> ?e<span>ntry</span> cif:cell_measurement_temperature ?temp . </td> </tr> <tr> <td> Line 228: </td> <td> Line 228: </td> </tr> <tr> <td> <span>- </span> </td> <td> <span>+ {{{</span> </td> </tr> <tr> <td> Line 238: </td> <td> Line 238: </td> </tr> <tr> <td> </td> <td> <span>+ }}}</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-23 09:15:00JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 34: </td> <td> Line 34: </td> </tr> <tr> <td> <span>-</span> Chempound uses a [http://en.wikipedia.org/wiki/Representational_state_transfer RESTful] interface, which means that, by going to the url for a particular calculation, depending on how we make the request to the server, we can receive the request in a variety of formats. </td> <td> <span>+</span> Chempound uses a [http://en.wikipedia.org/wiki/Representational_state_transfer RESTful] interface, which means that, by going to the url for a particular calculation, depending on how we make the request to the server, we can receive the request<span>ed data</span> in a variety of formats. </td> </tr> <tr> <td> Line 191: </td> <td> Line 191: </td> </tr> <tr> <td> <span>- EXPLAIN HOW TO USE THE DICTIONARIES TO DISCOVER WHAT TERMS ARE AVAILABLE.</span> </td> <td> <span>+ === Discovering what terms can be searched on ===<br> + <br> + The data that is extracted int RDF and therefore available for searching in Chempound is determined by the convention and dictionaries that apply to the files in question.<br> + <br> + Please follow these links for more information on [http://www.xml-cml.org/convention/ conventions] and [http://www.xml-cml.org/dictionary/ dictionaries].<br> + <br> + For CIF files, the CIF [http://www.xml-cml.org/dictionary/cif/ dictionary] lists all the terms that are available.<br> + <br> + For Computational Chemistry outputs, the CompChem [http://www.xml-cml.org/dictionary/compchem/ dictionary] lists the indexed terms.<br> + <br> + To determine how best to search for data, it is usually useful to go to the splash page for a representative structure in chempound and download the RDF file. 
This will show the form of the RDF and how a structure needs to be constructed.<br> + <br> + For example, if we wish to search on the cell_measurement_temperature, looking at the RDF for a CIF file, we see it is structured as shown below:<br> + <br> + {{{<br> + &lt;iucr:cell_measurement_temperature rdf:parseType="Resource"&gt;<br> + &lt;rdf:value rdf:datatype="http://www.w3.org/2001/XMLSchema#double"&gt;173.0&lt;/rdf:value&gt;<br> + &lt;cml:units rdf:resource="http://www.xml-cml.org/unit/sik"/&gt;<br> + &lt;cml:errorValue rdf:datatype="http://www.w3.org/2001/XMLSchema#double"&gt;2.0&lt;/cml:errorValue&gt;<br> + &lt;/iucr:cell_measurement_temperature&gt;<br> + }}}<br> + <br> + If we just search for the cell_measurement_temperature, the query returns the RDF resource itself, so we also need to extract the value, which is done with the following query:<br> + <br> + {{{<br> + PREFIX rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;<br> + PREFIX cif: &lt;http://www.xml-cml.org/dictionary/cif/&gt;<br> + <br> + SELECT ?molecule ?value<br> + {<br> + ?molecule cif:cell_measurement_temperature ?temp .<br> + ?temp rdf:value ?value .<br> + }<br> + }}}<br> + <br> + A similar example for a CompChem file is shown below. This searches on a term in the compchem dictionary, and then filters the results to only those structures with a charge of 0.<br> + <br> + <br> + PREFIX rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt;<br> + PREFIX compchem: &lt;http://www.xml-cml.org/dictionary/compchem/&gt;<br> + <br> + SELECT ?molecule ?charge<br> + {<br> + ?molecule compchem:charge ?chargeR .<br> + ?chargeR rdf:value ?charge<br> + FILTER ( ?charge = 0 )<br> + }</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-04 17:10:35JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 350: </td> <td> Line 350: </td> </tr> <tr> <td> <span>-</span> If the new terms are to be added to the chemistry search page, then the [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-handler/src/main/java/net/chempound/compchem/search/CompChemSearchProvider.java CompChemSearchProvider.java] file will need to be edited, and suitable tests added to the file [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-test-harness/src/test/java/net/chempound/compchem/CompChemSearchIntegrationTest.java ompChemSearchIntegrationTest.java] </td> <td> <span>+</span> If the new terms are to be added to the chemistry search page, then the [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-handler/src/main/java/net/chempound/compchem/search/CompChemSearchProvider.java CompChemSearchProvider.java] file will need to be edited, and suitable tests added to the file [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-test-harness/src/test/java/net/chempound/compchem/CompChemSearchIntegrationTest.java <span>C</span>ompChemSearchIntegrationTest.java] </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-04-03 14:02:46JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 599: </td> <td> Line 599: </td> </tr> <tr> <td> </td> <td> <span>+ === Updates and Extensions to the CompChem convention ===<br> + <br> + The [http://www.xml-cml.org/convention/compchem compchem convention] is currently rather loose and abstract, so this is an attempt to attach some definitions and examples to clarify its usage.<br> + <br> + The job and calculation modules are effectively functionally identical, as many of the calculations within a job could be run as separate jobs themselves (e.g. a single-point calculation in an optimisation run could be submitted separately). A job therefore serves to group one or more calculations into a logical unit of work - it can be thought of as the unit of work that would be submitted to a computational resource.<br> + <br> + Calculations can themselves contain calculations and be nested to any degree. Aside from having an initialization module, there is no requirement on a calculation to have any particular attributes (and indeed calculations are optional within jobs, as the job may just define an input, or the workings of the calculations may not be of any interest to the user).<br> + <br> + Calculations can also inherit attributes from their parents, so for example, a basis_set need not be contained within each SCF iteration calculation, as the parent's basis set can be linked to the child.<br> + <br> + However, both calculations and jobs must have an initialization module, which at least contains a task field to identify the role of the calculation, and any important results must go in the finalization module. 
This allows code developers complete freedom to structure the modules as they see fit, but any software that uses the convention knows to query the initialization module for information on the calculation, and the finalization module for any results.<br> + <br> + === Energies and Results ===<br> + <br> + Several energies are output by computational jobs through the course of a run. These may be a guess energy, iteration energy, two-electron energy, correlation energy etc. Labelling each of these separately would lead to a large number of energy terms and difficulty searching for them (e.g. if you search for an MP2 energy, do you want RI-MP2 results too? Is a DFT energy labelled with the functional?). To prevent this confusion, the energies are split into categories (e.g. one-electron energy, correlation energy, total energy etc) and the meaning is determined by the context the energy is found in, i.e. the parameters of the initialization module of the calculation that contains the energy. For this reason, for an SCF calculation, the guess and individual iteration energies, each need to be held within their own calculation module. This leads to some additional verbosity and complexity, but is largely mitigated because the (e.g.) iteration calculations need only contain the information in their initialization modules that cannot be inherited from their parent SCF modules.<br> + <br> + The advantage of this approach is that simple generic xpath queries can be used to extract all the energies from (e.g. 
a geometry optimisation or MCSCF calculation) by any program that wishes to plot the energy trajectory of a calculation, without needing to know anything of the structure or logic of the calculation it is querying.<br> + <br> + === Data structures ===<br> + <br> + Scalar, array or matrix quantities are already supported in CML and can be added to the initialization/finalization modules as parameters or properties.<br> + <br> + Any complex data structures should be held in cml lists (aside from molecules, which are an existing fundamental part of CML), with a dictRef that defines what the data contained within the module is.<br> + <br> + The current complex data structures are:<br> + <br> + [http://www.xml-cml.org/convention/molecular molecules]<br> + [http://www.xml-cml.org/dictionary/compchem/#basis_set basis sets]<br> + [http://www.xml-cml.org/dictionary/compchem/#dft_functional DFT functional]<br> + [http://www.xml-cml.org/dictionary/compchem/#molecular_orbital molecular orbital]<br> + </span> </td> </tr> </table> </div> Front Pagehttp://quixote.wikispot.org/Front_Page2012-04-03 09:17:28JensThomas <div id="content" class="wikipage content"> Differences for Front Page<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 10: </td> <td> Line 10: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + To see the Quixote architecture in operation, visit the Cambridge [http://quixote.ch.cam.ac.uk/ Chempound] server.</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-02 15:18:11JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 297: </td> <td> Line 297: </td> </tr> <tr> <td> <span>-</span> Chempound is actually a very general tool for managing collections of objects and their associated data and metadata, using RDF for the data model. As such, almost all of the chemistry functionality is implemented using plugins, so the code that needs to be modified to change the chemistry behaviour is very localised. </td> <td> <span>+</span> Chempound is actually a very general tool for managing collections of objects <span>(collected as [http://www.openarchives.org/ore/1.0/datamodel ORE aggregates]) </span>and their associated data and metadata, using<span>&nbsp;[http://www.w3.org/TR/rdf-primer/</span> RDF<span>] </span> for the data model. As such, almost all of the chemistry functionality is implemented using plugins, so the code that needs to be modified to change the chemistry behaviour is very localised. </td> </tr> <tr> <td> Line 324: </td> <td> Line 324: </td> </tr> <tr> <td> </td> <td> <span>+ ||||||||</span> </td> </tr> <tr> <td> Line 328: </td> <td> Line 329: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + <br> + === Adding New Data and editing the Splash page ===<br> + <br> + Chempound extracts data from [http://www.xml-cml.org/schema/ CML] in accordance with the [http://www.xml-cml.org/convention/compchem compchem] convention. 
Provided that the data is a CML scalar, and is in the job's environment, initialization or finalization modules, with a dictRef (ideally) in the [http://www.xml-cml.org/dictionary/compchem/ compchem dictionary], then the data will already be extracted into RDF.<br> + <br> + If additional data needs to be extracted (such as is currently done for basis sets and dft functionals), then all that may be necessary is to edit the file [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-importer/src/main/java/net/chempound/compchem/CmlComp2RdfConverter.java CmlComp2RdfConverter.java] to add the additional data to the RDF.<br> + <br> + The html pages in chempound are generated using the [http://freemarker.sourceforge.net/ freemarker] template engine. The freemarker template that is used to generate the html page for each individual structure is the file: [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-handler/src/main/resources/net/chempound/compchem/templates/comp.ftl comp.ftl] (other template and css files are in the parent directory).<br> + <br> + In order to facilitate extracting key RDF data for use with the freemarker templates, several classes are used. 
To add new terms, the following files need to be edited:<br> + <br> + * [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-common/src/main/java/net/chempound/compchem/rdf/CompChemCalculation.java CompChemCalculation.java] - this defines the interface that will be used by the freemarker template to access the data.<br> + <br> + * [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-common/src/main/java/net/chempound/compchem/rdf/ont/CompChem.java CompChem.java] - this creates the RDF terms that are used.<br> + <br> + * [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-common/src/main/java/net/chempound/compchem/rdf/impl/CompChemCalculationImpl.java CompChemCalculationImpl.java] - this actually implements the functions to get the data.<br> + <br> + <br> + When the new terms have been added, the tests should be updated, or a new test added in the directory [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-importer/src/test/java/net/chempound/compchem]<br> + <br> + If the new terms are to be added to the chemistry search page, then the [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-handler/src/main/java/net/chempound/compchem/search/CompChemSearchProvider.java CompChemSearchProvider.java] file will need to be edited, and suitable tests added to the file [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-test-harness/src/test/java/net/chempound/compchem/CompChemSearchIntegrationTest.java ompChemSearchIntegrationTest.java]</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-02 13:54:19JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 324: </td> <td> Line 324: </td> </tr> <tr> <td> </td> <td> <span>+ || [https://bitbucket.org/chempound/compchem/ compchem ] || [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-common compchem-common] || General code related to the compchem RDF data structures. The utility functions used by the freemarker templates to access the compchem data live here. ||<br> + |||| [https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-handler compchem-handler] || Code to handle the processing of chemical data on the server, such as display of the html pages and the freemarker templates. ||<br> + ||||[https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-importer compchem-importer] || The classes to handle importing code-specific logfiles (NWChem, Gaussian etc) using the jumbo-classes. These classes are used by the client, not chempound itself. The test cases for checking the imports also live here. ||<br> + ||||[https://bitbucket.org/chempound/compchem/src/ef32d64ba51b/compchem-test-harness compchem-test-harness] || Code to test the various compchem-specific modules, as most do not contain any test code themselves. ||</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-02 13:32:30JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 293: </td> <td> Line 293: </td> </tr> <tr> <td> <span>-</span> This section is for those who may be interested in altering or extending Chempound. It isn't intended to be a programmer's manual, more a brief overview of chempound's current structure and a walk-though on how to add additional CML data to the repository, which is expected to be the reason why most people would want to extend Chempound. 
</td> <td> <span>+</span> This section is for those who may be interested in altering or extending Chempound. It isn't intended to be a programmer's manual, more a brief overview of chempound's current structure and a walk-through on how to add additional CML data to the repository, which is expected to be the reason why most people would <span>currently </span>want to extend Chempound. </td> </tr> <tr> <td> Line 314: </td> <td> Line 314: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + A slightly more detailed view of the chemistry-specific repositories and their modules follows below.<br> + <br> + ||'''Repository''' || '''Modules''' || '''Description and important classes''' ||<br> + || [https://bitbucket.org/chempound/chemistry chemistry ] || [https://bitbucket.org/chempound/chemistry/src/208dcf643265/chemistry-common chemistry-common] || Classes to handle the generic processing of CML datatypes and the conversion to RDF ||<br> + || || || '''* net.chempound.chemistry.cmlChemicalMine.java''' - mime types ||<br> + || || || '''* net.chempound.chemistry.Cml2RdfConverter.java''' - code to handle the conversion of generic, simple cml datatypes into RDF. ||<br> + || ||[https://bitbucket.org/chempound/chemistry/src/208dcf643265/chemistry-importer chemistry-importer] || Base classes for the client-side conversion of files and the generation of images ||<br> + || ||[https://bitbucket.org/chempound/chemistry/src/208dcf643265/chemistry-jmol-plugin chemistry-jmol-plugin] || Classes to drive jmol to generate the images, and also the jmol code itself ||<br> + || ||[https://bitbucket.org/chempound/chemistry/src/208dcf643265/chemistry-search-structure chemistry-search-structure] || Classes to handle the chemistry-specific search page - if you want to add more chemistry search boxes, then you'll need to edit things here. 
||</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-04-02 13:05:11JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 291: </td> <td> Line 291: </td> </tr> <tr> <td> <span>- <br> - <br> - ##==Download==<br> - ##The source code for chempound is available from [https://bitbucket.org/chempound].<br> - ##<br> - ##Pre-built binaries are available from the Cambridge hudson [https://hudson.ch.cam.ac.uk server].<br> - ##<br> - ##To run any of the binaries, you will need to have [http://www.java.com/ java] installed.<br> - ##<br> - ##The latest version of the client (which is used for loading files into chempound), is available [https://hudson.ch.cam.ac.uk/job/quixote-client/lastStableBuild/ here]. The latest standalone version (which includes its own java server), is available [https://hudson.ch.cam.ac.uk/job/quixote-repository-standalone/lastStableBuild/ here]. In both cases, download either the zip or tar.gz file from the '''Build Artifacts''' section towards the top of the page.<br> - ##<br> - ##==Running the server==<br> - ##If you are familiar with running your own java servers, you can run chempound from within tomcat or jetty using the archive available from [https://hudson.ch.cam.ac.uk/job/quixote-repository/lastStableBuild/ here]<br> - ##<br> - ##To run the standalone server (which doesn't require you to already have installed a sever), download the zip or tar.gz file for the [https://hudson.ch.cam.ac.uk/job/quixote-repository-standalone/lastStableBuild/ "standalone server"]. Under Windows use the '''run.bat''' file to start the server, and under unix/osx, use the '''run.sh''' script. 
These will both start the server on port 8181, which can be accessed by pointing a browser at: [http://localhost:8181]<br> - ##<br> - ##At this point, you will have chempound running on your machine, but it will have no data in it.<br> - ##<br> - ##==Adding files==<br> - ##Files are added using the client, which can be downloaded as a zip or tar.gz file from [https://hudson.ch.cam.ac.uk/job/quixote-client/lastStableBuild/ here]. Once the zip file is unpacked, the client is run from the command-line using the scripts in the '''bin''' directory: '''qc.bat''' file under windows or '''qc.sh file''' under unix/osx. These will need the '''JAVA_HOME''' environment variable to be set and the path to the '''bin''' directory holding the scripts will need to be in your PATH.<br> - ##<br> - ##The first step in using the client is to create a collection within chempound to hold your files. This just serves to group the files together. Create the collection with the following command (replace '''qc.sh''' with '''qc.bat''' under windows):<br> - ##<br> - ##{{{<br> - ##qc.sh -R http://localhost:8181/content/ create-collection &lt;collection name&gt;<br> - ##}}}<br> - ##<br> - ##where '''&lt;collection name&gt;''' is the desired name of the collection.<br> - ##<br> - ##If you now point your browser at [http://localhost:8181], you should be able to browse to your collection, although it obviously has no files in it.<br> - ##<br> - ##To add a gaussian file, use the command:<br> - ##<br> - ##{{{<br> - ##qc.sh -R http://localhost:8181/content/&lt;collection name&gt;/ deposit-gaussian &lt;path to gaussian file&gt;<br> - ##}}}<br> - ##<br> - ##The file types that are currently supported are:<br> - ##[http://www.gaussian.com/ Gaussian]: '''deposit-gaussian'''<br> - ##[http://www.nwchem-sw.org/ NWChem]: '''deposit-nwchem'''<br> - ##[http://www.iucr.org/resources/cif CIF]: '''deposit-cif'''<br> - ##</span> </td> <td> <span>+ == Hacking Chempound ==<br> + <br> + This section is for those who 
may be interested in altering or extending Chempound. It isn't intended to be a programmer's manual, more a brief overview of chempound's current structure and a walk-through on how to add additional CML data to the repository, which is expected to be the reason why most people would want to extend Chempound.<br> + <br> + '''NB:''' additional information can be found in Jorge Estrada's [https://bytebucket.org/jestrada/quixote-docs/wiki/developersGuide/quixote-developersGuide.html repository]<br> + <br> + Chempound is actually a very general tool for managing collections of objects and their associated data and metadata, using RDF for the data model. As such, almost all of the chemistry functionality is implemented using plugins, so the code that needs to be modified to change the chemistry behaviour is very localised.<br> + <br> + === Overview of the repositories ===<br> + <br> + The repositories for the chempound packages are hosted on bitbucket: [https://bitbucket.org/chempound]<br> + <br> + Currently, there are 8 repositories as detailed below:<br> + <br> + * [https://bitbucket.org/chempound/] - this contains the main server code. 
There is almost no chemistry-specific code here, apart from in the chempound-rdf-cml directory, which has a very small class to add some CML data to the RDF model.<br> + * [https://bitbucket.org/chempound/chemistry] - this is where the most general chemistry code lives, and where the general functions to handle the conversion of data from CML are.<br> + * [https://bitbucket.org/chempound/chempound-client] - the base classes for the command-line client (it is the client that actually handles the conversion of logfiles into CML and the generation of the jmol pictures etc) are here, although there is no chemistry-specific code here.<br> + * [https://bitbucket.org/chempound/chempound-parent] - this just contains the central maven pom.xml that is used to configure maven for chempound.<br> + * [https://bitbucket.org/chempound/compchem] - all the code to handle the data associated with computational chemistry calculations (both server and client) lives here.<br> + * [https://bitbucket.org/chempound/crystallography] - all the code to handle the crystallography-specific aspects of the data.<br> + * [https://bitbucket.org/chempound/deposit-client] - TODO - not had to look at this yet.<br> + * [https://bitbucket.org/chempound/quixote-client] - the code to drive the code-specific imports of compchem logfiles.<br> + * [https://bitbucket.org/chempound/quixote-repository] - this is more code to package chempound for use by the quixote project and create the stand-alone chempound server war file.</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-30 08:52:19JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 371: </td> <td> Line 371: </td> </tr> <tr> <td> <span>- * '''addUnits''' - TODO</span> </td> <td> <span>+ * '''addUnits''' - this will add a '''units''' attribute to the element with the value specified in the '''value''' argument. The value should be of the form '''namespace:id''', where namespace refers to one of the [http://www.xml-cml.org/dictionary/ units dictionaries] and the id points to the actual unit. In the example below, the namespace refers to the [http://www.xml-cml.org/unit/nonSi/ non-si unit dictionary] and the id links to the entry for the [http://www.xml-cml.org/unit/nonSi/#hartree hartree].</span> </td> </tr> <tr> <td> Line 375: </td> <td> Line 375: </td> </tr> <tr> <td> <span>- xpath=".//cml:scalar"<br> - value="u:jmol-1"/&gt;</span> </td> <td> <span>+ xpath=".//cml:scalar[@dictRef='compchem:total_energy']"<br> + value="nonsi:hartree"<br> + /&gt;</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-22 18:55:17JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 663: </td> <td> Line 663: </td> </tr> <tr> <td> <span>-</span> * '''move''' - this takes one or more nodes, and moves them into the node defined by the '''to''' argument. The '''to''' argument is an xpath that must just return a single element. </td> <td> <span>+</span> * '''move''' - this takes one or more nodes, and moves them into the node defined by the '''to''' argument. The '''to''' argument is an xpath that must just return a single element.<span>&nbsp;The '''position''' argument indicates where among the children of the target the element will be moved to. 
"1" makes the element the first child, "2" makes it the second, etc.</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-14 20:56:26JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 589: </td> <td> Line 589: </td> </tr> <tr> <td> </td> <td> <span>+ == Conventions ==<br> + <br> + The conventions are [http://www.xml-cml.org/convention/ here]<br> + <br> + The text for the conventions currently resides in the [https://bitbucket.org/cml/xml-cml.org/src/b08186d90a19/convention bitbucket repository] for the xml-cml website. This is where the "human-readable" text lives.<br> + <br> + The rules that the convention text describes are actually implemented in xslt within the [https://bitbucket.org/cml/cmllite-validator-code/overview cml validator]. So, for example, the rules defining the compchem convention are implemented in the [https://bitbucket.org/cml/cmllite-validator-code/src/2eaa18f959bb/src/main/resources/org/xmlcml/www/compchem-rules.xsl compchem-rules.xsl] file.<br> + <br> + Any changes or updates to the convention generally require editing BOTH files.<br> + <br> + <br> + == Dictionaries ==<br> + <br> + The dictionaries are [http://www.xml-cml.org/dictionary/ here].<br> + </span> </td> </tr> <tr> <td> Line 591: </td> <td> Line 606: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + NB: Most of this information is now in the [http://www.xml-cml.org/dictionary/compchem/ compchem dictionary]</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-14 13:21:13JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 585: </td> <td> Line 585: </td> </tr> <tr> <td> <span>-</span> = C<span>h</span>a<span>ng</span>es <span>to</span> CML =<br> <span>-</span> <br> <span>-</span> This is just a temporary place to dump stuff <span>that </span>relate<span>s</span> to <span>c</span>ha<span>ng</span>es to <span>CML</span>. </td> <td> <span>+</span> = C<span>onventions, Diction</span>a<span>ri</span>es <span>and</span> CML =<br> <span>+</span> <br> <span>+</span> This is just a temporary place to dump stuff relate<span>d</span> to <span>t</span>h<span>e </span>a<span>bov</span>e<span>&nbsp;before it i</span>s <span>pu</span>t<span>&nbsp;s</span>o<span>mewhere</span> <span>sensible</span>.<span><br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-13 22:06:27 <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 313: </td> <td> Line 313: </td> </tr> <tr> <td> <span>-</span> * '''addChild''' - this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. Other supported arguments are: '''id''', '''dictRef''' and '''value'''. The '''position''' argument specifies where the child will be created in the list of children. position="0" creates it as the first child, "1", the second etc. With no position argument, the child is added as the last child. <span>'''NB:''' the</span> value <span>ar</span>g<span>ument currently doesn't support the </span>$string()<span>&nbsp;s</span>y<span>nta</span>x. </td> <td> <span>+</span> * '''addChild''' - this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. 
Other supported arguments are: '''id''', '''dictRef''' and '''value'''. The '''position''' argument specifies where the child will be created in the list of children. position="0" creates it as the first child, "1", the second etc. With no position argument, the child is added as the last child. <span>If</span> value <span>is a strin</span>g<span>&nbsp;of the form "</span>$string(<span>XPATH</span>)<span>" or $number(XPATH), where XPATH is a valid XPATH, then the value will be the result of evaluating the XPATH relative to the current node in the nodeset evaluated b</span>y<span>&nbsp;'''</span>x<span>path''', in string or number form</span>. </td> </tr> <tr> <td> Line 361: </td> <td> Line 361: </td> </tr> <tr> <td> <span>-</span> * '''addSibling''' - this will add a sibling element to each node in the '''xpath''' nodeset, with the type of element being that specified in '''elementName''' and the elements id attribute as specified by '''id''' argument. The '''position''' argument indicates where the element will be created, "0" creates the element before node, "1" creates it after the current node. If there are multiple siblings to the current node, "-2" would create it 2 nodes down from the current node, "2", one up from it etc. </td> <td> <span>+</span> * '''addSibling''' - this will add a sibling element to each node in the '''xpath''' nodeset, with the type of element being that specified in '''elementName''' and the elements id attribute as specified by '''id''' argument. The '''position''' argument indicates where the element will be created, "0" creates the element before node, "1" creates it after the current node. 
If there are multiple siblings to the current node, "-2" would create it 2 nodes down from the current node, "2", one up from it etc.<span>&nbsp;If value is a string of the form "$string(XPATH)" or $number(XPATH), where XPATH is a valid XPATH, then the value will be the result of evaluating the XPATH relative to the current node in the nodeset evaluated by '''xpath''', in string or number form.</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-12 13:21:44JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 749: </td> <td> Line 749: </td> </tr> <tr> <td> <span>-</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Attributes (e.g. elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. We need to add a compchem:atom_<span>type</span> entry so that the basis sets can be mapped onto the associated atoms - the "O" printed in the NWChem output is a tag that - in theory - could be anything; it is just used to label the atom and can be used for assigning different basis sets to different atoms of the same element type. Following on from the convention used in the dictionaries, names will be separated with an underscore. 
</td> <td> <span>+</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Attributes (e.g. elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. We need to add a compchem:atom_<span>label</span> entry so that the basis sets can be mapped onto the associated atoms - the "O" printed in the NWChem output is a tag that - in theory - could be anything; it is just used to label the atom and can be used for assigning different basis sets to different atoms of the same element type. Following on from the convention used in the dictionaries, names will be separated with an underscore. 
</td> </tr> <tr> <td> Line 784: </td> <td> Line 784: </td> </tr> <tr> <td> <span>-</span> &lt;cml:scalar dictRef="compchem:atom_<span>typ</span>e" dataType="xsd:string"&gt; </td> <td> <span>+</span> &lt;cml:scalar dictRef="compchem:atom_<span>lab</span>e<span>l</span>" dataType="xsd:string"&gt; </td> </tr> <tr> <td> Line 897: </td> <td> Line 897: </td> </tr> <tr> <td> <span>-</span> &lt;cml:scalar dictRef="compchem:atom_<span>typ</span>e" dataType="xsd:string"&gt; </td> <td> <span>+</span> &lt;cml:scalar dictRef="compchem:atom_<span>lab</span>e<span>l</span>" dataType="xsd:string"&gt; </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-12 13:13:10JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 749: </td> <td> Line 749: </td> </tr> <tr> <td> <span>-</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Attributes (e.g. elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. <span>Entries that are to be indexed and made searchable within chempound will be wrapped with a parameter element</span>. Following on from the convention used in the dictionaries, names will be separated with an underscore. 
</td> <td> <span>+</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Attributes (e.g. elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. <span>We need to add a compchem:atom_type entry so that the basis sets can be mapped onto the associated atoms - the "O" printed in the NWChem output is a tag that - in theory - could be anything; it is just used to label the atom and can be used for assigning different basis sets to different atoms of the same element type</span>. Following on from the convention used in the dictionaries, names will be separated with an underscore. </td> </tr> <tr> <td> Line 761: </td> <td> Line 761: </td> </tr> <tr> <td> <span>-</span> &lt;<span>!-- </span>c<span>urrent</span>l<span>y on</span>l<span>y need b</span>a<span>sis set title - should everything be a parameter? 
--&gt;<br> - &lt;paramete</span>r dictRef="compchem:basis_set_title" <span>&gt;<br> - &lt;cml:scalar </span>dataType="xsd:string"&gt; </td> <td> <span>+</span> &lt;c<span>m</span>l<span>:sca</span>lar dictRef="compchem:basis_set_title" dataType="xsd:string"&gt; </td> </tr> <tr> <td> Line 765: </td> <td> Line 763: </td> </tr> <tr> <td> <span>- </span> &lt;/cml:scalar&gt;<span><br> - &lt;/parameter&gt;</span> </td> <td> <span>+</span> &lt;/cml:scalar&gt; </td> </tr> <tr> <td> Line 786: </td> <td> Line 783: </td> </tr> <tr> <td> </td> <td> <span>+ &lt;!-- added as need to link contractions to atom --&gt;<br> + &lt;cml:scalar dictRef="compchem:atom_type" dataType="xsd:string"&gt;<br> + O<br> + &lt;/cml:scalar&gt;<br> + </span> </td> </tr> <tr> <td> Line 789: </td> <td> Line 791: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt; </td> <td> <span>+ </span> &lt;/cml:scalar&gt; </td> </tr> <tr> <td> Line 794: </td> <td> Line 796: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt; </td> <td> <span>+ </span> &lt;/cml:scalar&gt; </td> </tr> <tr> <td> Line 800: </td> <td> Line 802: </td> </tr> <tr> <td> <span>-</span> &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> <span>- </span> &lt;!-- was shell attribute --&gt;<br> <span>- </span> &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt; </td> <td> <span>+ </span> &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> <span>+</span> &lt;!-- was shell attribute --&gt;<br> <span>+</span> &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt; </td> </tr> <tr> <td> Line 804: </td> <td> Line 806: </td> </tr> <tr> <td> <span>- </span> &lt;/cml:scalar&gt; </td> <td> <span>+</span> &lt;/cml:scalar&gt;<span><br> + </span> </td> </tr> <tr> <td> Line 814: </td> <td> Line 817: </td> </tr> <tr> <td> <span>- </span> &lt;/cml:list&gt; </td> <td> <span>+</span> &lt;/cml:list&gt; </td> </tr> <tr> <td> Line 819: </td> <td> Line 822: </td> </tr> <tr> <td> <span>-</span> 
&lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 823: </td> <td> Line 826: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 828: </td> <td> Line 831: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 834: </td> <td> Line 837: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 838: </td> <td> Line 841: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 843: </td> <td> Line 846: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 849: </td> <td> Line 852: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 853: </td> <td> Line 856: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ 
</span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 858: </td> <td> Line 861: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 864: </td> <td> Line 867: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 868: </td> <td> Line 871: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 873: </td> <td> Line 876: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 879: </td> <td> Line 882: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 882: </td> <td> Line 885: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 886: </td> <td> Line 889: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 892: </td> <td> Line 895: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + &lt;!-- added as need to link contractions to atom --&gt;<br> + &lt;cml:scalar dictRef="compchem:atom_type" dataType="xsd:string"&gt;<br> + H<br> + 
&lt;/cml:scalar&gt;</span> </td> </tr> <tr> <td> Line 896: </td> <td> Line 904: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt; </td> <td> <span>+ </span> &lt;/cml:scalar&gt; </td> </tr> <tr> <td> Line 901: </td> <td> Line 909: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:scalar dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt; </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:scalar dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt; </td> </tr> <tr> <td> Line 905: </td> <td> Line 913: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt; </td> <td> <span>+ </span> &lt;/cml:scalar&gt; </td> </tr> <tr> <td> Line 910: </td> <td> Line 918: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 914: </td> <td> Line 922: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="3" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="3" dataType="xsd:double" </td> </tr> <tr> <td> Line 919: </td> <td> Line 927: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> <tr> <td> Line 925: </td> <td> Line 933: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:scalar&gt;<br> <span>-</span> &lt;cml:array size="1" dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:scalar&gt;<br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 929: </td> <td> Line 937: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt;<br> <span>-</span> <br> <span>-</span> &lt;cml:array size="1" 
dataType="xsd:double" </td> <td> <span>+ </span> &lt;/cml:array&gt;<br> <span>+</span> <br> <span>+ </span> &lt;cml:array size="1" dataType="xsd:double" </td> </tr> <tr> <td> Line 934: </td> <td> Line 942: </td> </tr> <tr> <td> <span>-</span> &lt;/cml:array&gt; </td> <td> <span>+ </span> &lt;/cml:array&gt; </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-11 19:56:55JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 749: </td> <td> Line 749: </td> </tr> <tr> <td> <span>-</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Entries that are to be indexed and made searchable within chempound will be wrapped with a parameter element. Following on from the convention used in the dictionaries, names will be separated with an underscore. </td> <td> <span>+</span> Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. <span>Attributes (e.g. 
elementType on contractions) will be moved into separate elements, with the convention used to enforce their appearance. This also means that the attributes will appear as dictionary entries with an associated explanation. </span>Entries that are to be indexed and made searchable within chempound will be wrapped with a parameter element. Following on from the convention used in the dictionaries, names will be separated with an underscore. </td> </tr> <tr> <td> Line 761: </td> <td> Line 761: </td> </tr> <tr> <td> <span>- &lt;cml:scalar dictRef="compchem:basis_set_title" dataType="xsd:string"&gt;<br> - 6-31g*</span> </td> <td> <span>+ &lt;!-- currently only need basis set title - should everything be a parameter? --&gt;<br> + &lt;parameter dictRef="compchem:basis_set_title" &gt;<br> + &lt;cml:scalar dataType="xsd:string"&gt;<br> + 6-31g*<br> + &lt;/cml:scalar&gt;<br> + &lt;/parameter&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_type" dataType="xsd:string"&gt;<br> + orbital</span> </td> </tr> <tr> <td> Line 765: </td> <td> Line 772: </td> </tr> <tr> <td> <span>- &lt;cml:scalar dictRef="compchem:basis_set_type" dataType="xsd:string"&gt;<br> - orbital<br> - &lt;/cml:scalar&gt;<br> - </span> </td> <td> </td> </tr> <tr> <td> Line 770: </td> <td> Line 773: </td> </tr> <tr> <td> <span>-</span> cartesian </td> <td> <span>+ </span> cartesian </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-11 19:51:28JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 748: </td> <td> Line 748: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + Within the context of conventions and dictionaries as detailed in the [http://www.jcheminf.com/content/3/1/43 J. Chem. Inf. 
paper (doi:10.1186/1758-2946-3-43)], this is best dealt with using existing CML datatypes and dictionaries. For the time being, lists will be used to hold all complex data structures, with dictionary entries describing the element. Rather than using matrices to hold the exponents and coefficients, arrays will be used. Entries that are to be indexed and made searchable within chempound will be wrapped with a parameter element. Following on from the convention used in the dictionaries, names will be separated with an underscore.<br> + <br> + {{{<br> + &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> + <br> + &lt;cml:module<br> + xmlns:compchem="http://www.xml-cml.org/dictionary/compchem/"<br> + xmlns:convention="http://www.xml-cml.org/convention/"<br> + convention="convention:compchem"<br> + &gt;<br> + &lt;cml:list dictRef="compchem:basis_set"&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_title" dataType="xsd:string"&gt;<br> + 6-31g*<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_type" dataType="xsd:string"&gt;<br> + orbital<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> + cartesian<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_contraction_type" dataType="xsd:string"&gt;<br> + general<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_description" dataType="xsd:string"&gt;<br> + ???<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contractions"&gt;<br> + <br> + &lt;!-- was elementType attribute --&gt;<br> + &lt;cml:scalar dictRef="compchem:element_type" dataType="xsd:string"&gt;<br> + O<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;!-- was harmonicType attribute --&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> + cartesian<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar 
dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt;<br> + ???<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;!-- was shell attribute --&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + S<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="6" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 5.48467170E+03 8.25234950E+02 1.88046960E+02 5.29645000E+01 1.68975700E+01 5.79963530E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="6" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 0.001831 0.013950 0.068445 0.232714 0.470193 0.358521<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + S<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + -0.110778 -0.148026 1.130767<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + P<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 0.070874 0.339753 0.727159<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar 
dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + S<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 2.70005800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction" shell="P"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + P<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 2.70005800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + D<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;8.00000000E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contractions"&gt;<br> + <br> + &lt;!-- was elementType attribute --&gt;<br> + &lt;cml:scalar dictRef="compchem:element_type" dataType="xsd:string"&gt;<br> + H<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;!-- was harmonicType attribute --&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_harmonic_type" dataType="xsd:string"&gt;<br> + cartesian<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:scalar dictRef="compchem:basis_set_contractions_description" dataType="xsd:string"&gt;<br> + 
???<br> + &lt;/cml:scalar&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + S<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 1.87311370E+01 2.82539370E+00 6.40121700E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 0.033495 0.234727 0.813757<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;cml:list dictRef="compchem:basis_set_contraction"&gt;<br> + &lt;cml:scalar dictRef="compchem:basis_set_shell" dataType="xsd:string"&gt;<br> + S<br> + &lt;/cml:scalar&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_exponents"&gt;<br> + 1.61277800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:basis_set_coefficients"&gt;<br> + 1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/cml:list&gt;<br> + <br> + &lt;/cml:list&gt;<br> + <br> + &lt;/cml:list&gt;<br> + <br> + &lt;/cml:module&gt;<br> + }}}</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-11 12:50:24JensThomas <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 597: </td> <td> Line 597: </td> </tr> <tr> <td> <span>-</span> This should serve as the basis for anything that is done with basis sets in CML. They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. 
The <span>cml:matrix element is used to hold the coefficients and exponents, although it would appear that a cml:array would be more suitable for this. With this single change taken into consideration, the </span>following snippet of NWChem output: </td> <td> <span>+</span> This should serve as the basis for anything that is done with basis sets in CML. They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. The following snippet of NWChem output: </td> </tr> <tr> <td> Line 676: </td> <td> Line 676: </td> </tr> <tr> <td> </td> <td> <span>+ &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:matrix rows="6" columns="2" dataType="xsd:double"&gt;<br> + 5.48467170E+03 0.001831<br> + 8.25234950E+02 0.013950<br> + 1.88046960E+02 0.068445<br> + 5.29645000E+01 0.232714<br> + 1.68975700E+01 0.470193<br> + 5.79963530E+00 0.358521<br> + &lt;/cml:matrix&gt;<br> + &lt;/bse:contraction&gt;<br> + </span> </td> </tr> <tr> <td> Line 677: </td> <td> Line 688: </td> </tr> <tr> <td> <span>- &lt;cml:array size="6" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 5.48467170E+03 8.25234950E+02 1.88046960E+02<br> - 5.29645000E+01 1.68975700E+01<br> - 5.79963530E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="6" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;<br> - 0.001831 0.013950 0.068445 0.232714 0.470193<br> - 0.358521<br> - &lt;/cml:array&gt;</span> </td> <td> <span>+ &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> + 1.55396160E+01 -0.110778<br> + 3.59993360E+00 -0.148026<br> + 1.01376180E+00 1.130767<br> + &lt;/cml:matrix&gt;</span> </td> </tr> <tr> <td> Line 691: </td> <td> Line 695: </td> </tr> <tr> <td> </td> <td> <span>+ &lt;bse:contraction shell="P"&gt;<br> + &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> + 1.55396160E+01 0.070874<br> + 3.59993360E+00 0.339753<br> + 
1.01376180E+00 0.727159<br> + &lt;/cml:matrix&gt;<br> + &lt;/bse:contraction&gt;<br> + </span> </td> </tr> <tr> <td> Line 692: </td> <td> Line 704: </td> </tr> <tr> <td> <span>- &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;<br> - -0.110778 -0.148026 1.130767<br> - &lt;/cml:array&gt;</span> </td> <td> <span>+ &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> + 2.70005800E-01 1.000000<br> + &lt;/cml:matrix&gt;</span> </td> </tr> <tr> <td> Line 704: </td> <td> Line 710: </td> </tr> <tr> <td> <span>- &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;<br> - 0.070874 0.339753 0.727159<br> - &lt;/cml:array&gt;</span> </td> <td> <span>+ &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> + 2.70005800E-01 1.000000<br> + &lt;/cml:matrix&gt;</span> </td> </tr> <tr> <td> Line 715: </td> <td> Line 715: </td> </tr> <tr> <td> </td> <td> <span>+ &lt;bse:contraction shell="D"&gt;<br> + &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> + 8.00000000E-01 1.000000<br> + &lt;/cml:matrix&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;/bse:contractions&gt;<br> + <br> + &lt;bse:contractions elementType="H" harmonicType="cartesian"&gt;<br> + <br> + &lt;dc:description&gt;???&lt;/dc:description&gt;<br> + </span> </td> </tr> <tr> <td> Line 716: </td> <td> Line 728: </td> </tr> <tr> <td> <span>- &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 2.70005800E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - 
dictRef="compchem:gbasis_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;</span> </td> <td> <span>+ &lt;cml:matrix rows="3" columns="2" dataType="xsd:double"&gt;<br> + 1.87311370E+01 0.033495<br> + 2.82539370E+00 0.234727<br> + 6.40121700E-01 0.813757<br> + &lt;/cml:matrix&gt;</span> </td> </tr> <tr> <td> Line 727: </td> <td> Line 735: </td> </tr> <tr> <td> <span>- &lt;bse:contraction shell="P"&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 2.70005800E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="D"&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;8.00000000E-01&lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;1.000000&lt;/cml:array&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;/bse:contractions&gt;<br> - <br> - &lt;bse:contractions elementType="H" harmonicType="cartesian"&gt;<br> - <br> - &lt;dc:description&gt;???&lt;/dc:description&gt;<br> - </span> </td> <td> </td> </tr> <tr> <td> Line 754: </td> <td> Line 736: </td> </tr> <tr> <td> <span>- &lt;cml:array size="3" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 1.87311370E+01 2.82539370E+00 6.40121700E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="6" dataType="xsd:double"<br> - dictRef="compchem:gbasis_coefficients"&gt;<br> - 0.033495 0.234727 0.813757<br> - &lt;/cml:array&gt;<br> - &lt;/bse:contraction&gt;<br> - <br> - &lt;bse:contraction shell="S"&gt;<br> - &lt;cml:array size="1" dataType="xsd:double"<br> - dictRef="compchem:gbasis_exponents"&gt;<br> - 1.61277800E-01<br> - &lt;/cml:array&gt;<br> - <br> - &lt;cml:array size="1" dataType="xsd:double"<br> - 
dictRef="compchem:gbasis_coefficients"&gt;<br> - 1.000000<br> - &lt;/cml:array&gt;</span> </td> <td> <span>+ &lt;cml:matrix rows="1" columns="2" dataType="xsd:double"&gt;<br> + 1.61277800E-01 1.000000<br> + &lt;/cml:matrix&gt;</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-09 18:44:02JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 581: </td> <td> Line 581: </td> </tr> <tr> <td> <span>- * '''createWrapper''' - TODO</span> </td> <td> <span>+ * '''createWrapper''' - for each node in the '''xpath''' nodeset, this will create an enveloping element of type '''elementName''' that becomes a child of the node's parent and holds the node and all of its children. The '''id''' and '''dictRef''' arguments are supported.</span> </td> </tr> <tr> <td> Line 589: </td> <td> Line 589: </td> </tr> <tr> <td> <span>- * '''createWrapperMetadata''' - TODO</span> </td> <td> <span>+ * '''createWrapperMetadata''' - for each node in the '''xpath''' nodeset, if the node is a cml:scalar, cml:array, cml:list, cml:table or cml:matrix and has a "dictRef" attribute, this will remove the dictRef attribute and instead wrap the element in a cml:metadata element, so that, e.g. 
'''{{{&lt;scalar dataType="xsd:string" dictRef="n:basis_type"&gt;ao basis&lt;/scalar&gt;}}}''' becomes:<br> + <br> + {{{<br> + &lt;metadata name="n:basis_type"&gt;<br> + &lt;scalar dataType="xsd:string"&gt;ao basis&lt;/scalar&gt;<br> + &lt;/metadata&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 598: </td> <td> Line 604: </td> </tr> <tr> <td> <span>- * '''createWrapperParameter''' - TODO</span> </td> <td> <span>+ * '''createWrapperParameter''' - this performs the same operation as createWrapperMetadata, but wraps the element in a cml:parameter with the dictRef of the target element.</span> </td> </tr> <tr> <td> Line 608: </td> <td> Line 614: </td> </tr> <tr> <td> <span>- * '''createWrapperProperty''' - TODO</span> </td> <td> <span>+ * '''createWrapperProperty''' - this performs the same operation as createWrapperMetadata, but wraps the element in a cml:property with the dictRef of the target element.</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-08 14:33:24JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 313: </td> <td> Line 313: </td> </tr> <tr> <td> <span>-</span> * '''addChild''' - this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. Other supported arguments are: '''id''', '''dictRef''' and '''value'''. '''NB:''' the value argument currently doesn't support the $string() syntax. </td> <td> <span>+</span> * '''addChild''' - this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. Other supported arguments are: '''id''', '''dictRef''' and '''value'''<span>. 
The '''position''' argument specifies where the child will be created in the list of children: position="0" creates it as the first child, position="1" as the second, and so on. With no position argument, the child is added as the last child</span>. '''NB:''' the value argument currently doesn't support the $string() syntax. </td> </tr> <tr> <td> Line 320: </td> <td> Line 320: </td> </tr> <tr> <td> </td> <td> <span>+ position="0"</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-08 14:28:46JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 313: </td> <td> Line 313: </td> </tr> <tr> <td> - <span>&nbsp;* '''addChild''' - </span>this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. Other supported arguments are: '''id''', '''dictRef''' and '''value'''. </td> <td> <span>+ * '''addChild''' </span>- this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. Other supported arguments are: '''id''', '''dictRef''' and '''value'''.<span>&nbsp;'''NB:''' the value argument currently doesn't support the $string() syntax.</span> </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-08 12:05:41 <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 597: </td> <td> Line 597: </td> </tr> <tr> <td> <span>-</span> This should serve as the basis for anything that is done with basis sets in CML. 
They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. The cml:matrix element is used to hold the coefficients and exponents, although it would appear that a cml:array would be more suitable for this. With this single change tak<span>ing</span> into consideration, the following snippet of NWChem output: </td> <td> <span>+</span> This should serve as the basis for anything that is done with basis sets in CML. They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. The cml:matrix element is used to hold the coefficients and exponents, although it would appear that a cml:array would be more suitable for this. With this single change tak<span>en</span> into consideration, the following snippet of NWChem output: </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-08 12:04:50 <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 655: </td> <td> Line 655: </td> </tr> <tr> <td> <span>-</span> &lt;module xmlns:bse="http://purl.oclc.org/NET/EMSL/BSE"&gt; </td> <td> <span>+</span> &lt;module xmlns:bse="http://purl.oclc.org/NET/EMSL/BSE"<span><br> + xmlns:dc="http://purl.org/dc/elements/1.1/"</span>&gt; </td> </tr> <tr> <td> Line 659: </td> <td> Line 660: </td> </tr> <tr> <td> <span>-</span> &lt;<span>bse</span>:format&gt;???&lt;/<span>bse</span>:format&gt;<br> <span>-</span> <br> <span>-</span> &lt;<span>bse</span>:title&gt;6-31g*&lt;/<span>bse</span>:title&gt; </td> <td> <span>+</span> &lt;<span>dc</span>:format&gt;???&lt;/<span>dc</span>:format&gt;<br> <span>+</span> <br> <span>+</span> &lt;<span>dc</span>:title&gt;6-31g*&lt;/<span>dc</span>:title&gt; </td> </tr> <tr> <td> Line 669: </td> <td> Line 670: </td> </tr> <tr> <td> <span>-</span> &lt;<span>bse</span>:description&gt;???&lt;/<span>bse</span>:description&gt; </td> <td> <span>+</span> &lt;<span>dc</span>:description&gt;???&lt;/<span>dc</span>:description&gt; </td> </tr> <tr> <td> Line 673: </td> <td> Line 674: </td> </tr> <tr> <td> <span>-</span> &lt;<span>bse</span>:description&gt;???&lt;/<span>bse</span>:description&gt; </td> <td> <span>+</span> &lt;<span>dc</span>:description&gt;???&lt;/<span>dc</span>:description&gt; </td> </tr> <tr> <td> Line 750: </td> <td> Line 751: </td> </tr> <tr> <td> <span>-</span> &lt;<span>bse</span>:description&gt;???&lt;/<span>bse</span>:description&gt; </td> <td> <span>+</span> &lt;<span>dc</span>:description&gt;???&lt;/<span>dc</span>:description&gt; </td> </tr> </table> </div> Prototype datahttp://quixote.wikispot.org/Prototype_data2012-03-08 12:01:05 <div id="content" class="wikipage content"> Differences for Prototype data<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 584: </td> <td> Line 584: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + = Changes to CML =<br> + <br> + This is just a temporary place to dump stuff that relates to changes to CML.<br> + <br> + == Basis sets ==<br> + <br> + The [https://bse.pnl.gov/bse EMSL basis set exchange] has already done most of the work to describe basis sets.<br> + <br> + The paper: [http://pubs.acs.org/doi/abs/10.1021/ci600510j Basis Set Exchange: A Community Database for Computational Sciences] describes the infrastructure and the considerations that went into the design.<br> + <br> + The schemas they have developed are available here: https://bse.pnl.gov/bse/docs/schemas/<br> + <br> + This should serve as the basis for anything that is done with basis sets in CML. They currently use version 2 of the CML schema (http://www.xml-cml.org/schema/cml2/core), but the only things they use are '''cml:matrix''' and '''cml:elementTypeType'''. The cml:matrix element is used to hold the coefficients and exponents, although it would appear that a cml:array would be more suitable for this. 
With this single change taking into consideration, the following snippet of NWChem output:<br> + <br> + {{{<br> + <br> + <br> + Basis "ao basis" -&gt; "" (cartesian)<br> + -----<br> + O (Oxygen)<br> + ----------<br> + Exponent Coefficients<br> + -------------- ---------------------------------------------------------<br> + 1 S 5.48467170E+03 0.001831<br> + 1 S 8.25234950E+02 0.013950<br> + 1 S 1.88046960E+02 0.068445<br> + 1 S 5.29645000E+01 0.232714<br> + 1 S 1.68975700E+01 0.470193<br> + 1 S 5.79963530E+00 0.358521<br> + <br> + 2 S 1.55396160E+01 -0.110778<br> + 2 S 3.59993360E+00 -0.148026<br> + 2 S 1.01376180E+00 1.130767<br> + <br> + 3 P 1.55396160E+01 0.070874<br> + 3 P 3.59993360E+00 0.339753<br> + 3 P 1.01376180E+00 0.727159<br> + <br> + 4 S 2.70005800E-01 1.000000<br> + <br> + 5 P 2.70005800E-01 1.000000<br> + <br> + 6 D 8.00000000E-01 1.000000<br> + <br> + H (Hydrogen)<br> + ------------<br> + Exponent Coefficients<br> + -------------- ---------------------------------------------------------<br> + 1 S 1.87311370E+01 0.033495<br> + 1 S 2.82539370E+00 0.234727<br> + 1 S 6.40121700E-01 0.813757<br> + <br> + 2 S 1.61277800E-01 1.000000<br> + <br> + <br> + <br> + Summary of "ao basis" -&gt; "" (cartesian)<br> + ------------------------------------------------------------------------------<br> + Tag Description Shells Functions and Types<br> + ---------------- ------------------------------ ------ ---------------------<br> + O 6-31g* 6 15 3s2p1d<br> + H 6-31g* 2 2 2s<br> + }}}<br> + <br> + <br> + Would be marked up as follows:<br> + <br> + {{{<br> + &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> + <br> + &lt;module xmlns:bse="http://purl.oclc.org/NET/EMSL/BSE"&gt;<br> + &lt;bse:basisSet&gt;<br> + <br> + &lt;!-- Need to look into MIME type --&gt;<br> + &lt;bse:format&gt;???&lt;/bse:format&gt;<br> + <br> + &lt;bse:title&gt;6-31g*&lt;/bse:title&gt;<br> + <br> + &lt;bse:basisSetType&gt;orbital&lt;/bse:basisSetType&gt;<br> + <br> + 
&lt;bse:harmonicType&gt;cartesian&lt;/bse:harmonicType&gt;<br> + <br> + &lt;bse:contractionType&gt;general&lt;/bse:contractionType&gt;<br> + <br> + &lt;bse:description&gt;???&lt;/bse:description&gt;<br> + <br> + &lt;bse:contractions elementType="O" harmonicType="cartesian"&gt;<br> + <br> + &lt;bse:description&gt;???&lt;/bse:description&gt;<br> + <br> + &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:array size="6" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 5.48467170E+03 8.25234950E+02 1.88046960E+02<br> + 5.29645000E+01 1.68975700E+01<br> + 5.79963530E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="6" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 0.001831 0.013950 0.068445 0.232714 0.470193<br> + 0.358521<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + -0.110778 -0.148026 1.130767<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="P"&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 1.55396160E+01 3.59993360E+00 1.01376180E+00<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 0.070874 0.339753 0.727159<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 2.70005800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 1.000000<br> + 
&lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="P"&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 2.70005800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="D"&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;8.00000000E-01&lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;1.000000&lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;/bse:contractions&gt;<br> + <br> + &lt;bse:contractions elementType="H" harmonicType="cartesian"&gt;<br> + <br> + &lt;bse:description&gt;???&lt;/bse:description&gt;<br> + <br> + &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:array size="3" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 1.87311370E+01 2.82539370E+00 6.40121700E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="6" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 0.033495 0.234727 0.813757<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;bse:contraction shell="S"&gt;<br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_exponents"&gt;<br> + 1.61277800E-01<br> + &lt;/cml:array&gt;<br> + <br> + &lt;cml:array size="1" dataType="xsd:double"<br> + dictRef="compchem:gbasis_coefficients"&gt;<br> + 1.000000<br> + &lt;/cml:array&gt;<br> + &lt;/bse:contraction&gt;<br> + <br> + &lt;/bse:contractions&gt;<br> + <br> + &lt;/bse:basisSet&gt;<br> + <br> + &lt;/module&gt;<br> + &lt;/xml&gt;<br> + }}}</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-07 21:17:56JensThomas <div id="content" 
class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 304: </td> <td> Line 304: </td> </tr> <tr> <td> <span>- * '''addAttribute''' - TODO</span> </td> <td> <span>+ * '''addAttribute''' - add an attribute of type '''name''' and value '''value''' to all nodes in the '''xpath''' nodeset. If value is a string of the form "$string(XPATH)" or "$number(XPATH)", where XPATH is a valid XPath expression, then the value will be the result of evaluating that expression relative to the current node in the nodeset selected by '''xpath''', in string or number form. If '''name''' consists of two strings separated by a colon, SOMETHING EXCITING HAPPENS...</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-07 20:14:43JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 549: </td> <td> Line 549: </td> </tr> <tr> <td> <span>-</span> * '''createString''' - if '''xpath''' returns a list of arrays, then each array will be converted into a cml:scalar with a dataType xsd:string, with the value of the scalar being the values in the array as strings and separated by whitespace. If '''xpath''' returns a list of cml:scalars, then the first scalar will be converted to type xsd:string, the value of which will be the concatenation of all the values in the remaining scalar nodes. The remaining scalar nodes will then be deleted. If '''xpath''' returns a single node and it is a text node, then a new cml:scalar node will be created in its place, with an optional id attribute as specified in the '''id''' argument. 
</td> <td> <span>+</span> * '''createString''' - if '''xpath''' returns a list of arrays, then each array will be converted into a cml:scalar with a dataType xsd:string, with the value of the scalar being the values in the array<span>, concatenated</span> as strings and separated by whitespace. If '''xpath''' returns a list of cml:scalars, the first scalar will be converted to type xsd:string, the value of which will be the concatenation of all the values in the remaining scalar nodes. The remaining scalar nodes will then be deleted. If a single node is returned by '''xpath''' and it is a text node, then a new cml:scalar node will be created in its place with an optional id attribute as specified in the '''id''' argument. </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-07 18:31:06JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 549: </td> <td> Line 549: </td> </tr> <tr> <td> <span>- * '''createString''' - TODO</span> </td> <td> <span>+ * '''createString''' - if '''xpath''' returns a list of arrays, then each array will be converted into a cml:scalar with a dataType xsd:string, with the value of the scalar being the values in the array as strings and separated by whitespace. If '''xpath''' returns a list of cml:scalars, the first scalar will be converted to type xsd:string, the value of which will be the concatenation of all the values in the remaining scalar nodes. The remaining scalar nodes will then be deleted. 
If a single node is returned by '''xpath''' and it is a text node, then a new cml:scalar node will be created in its place with an optional id attribute as specified in the '''id''' argument.</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-07 18:13:58JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 360: </td> <td> Line 360: </td> </tr> <tr> <td> <span>- * '''addSibling''' - TODO</span> </td> <td> <span>+ * '''addSibling''' - this will add a sibling element to each node in the '''xpath''' nodeset, with the type of element being that specified in '''elementName''' and the element's id attribute as specified by the '''id''' argument. The '''position''' argument indicates where the element will be created: "0" creates the element before the current node, "1" creates it after the current node. If there are multiple siblings to the current node, "-2" would create it 2 nodes down from the current node, "2", one up from it etc.</span> </td> </tr> <tr> <td> Line 365: </td> <td> Line 365: </td> </tr> <tr> <td> <span>-</span> elementName="cml:module" id="l202.group" position="1" /&gt; </td> <td> <span>+</span> elementName="cml:module"<span><br> + </span> id="l202.group"<span><br> + </span> position="1" /&gt; </td> </tr> <tr> <td> Line 692: </td> <td> Line 694: </td> </tr> <tr> <td> <span>- * '''setValue''' - TODO</span> </td> <td> <span>+ * '''setValue''' - with just a simple string given to the '''value''' argument (e.g. value="foo"), this will set the value of all nodes in the '''xpath''' nodeset to be equal to this string. 
If value is a string of the form "$string(XPATH)" or $number(XPATH), where XPATH is a valid XPATH, then the value will be the result of evaluating the XPATH relative to the current node in the nodeset evaluated by '''xpath''', in string or number form. With a map argument ...TODO</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-07 16:56:40 <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 647: </td> <td> Line 647: </td> </tr> <tr> <td> <span>- * '''joinArrays''' - TODO</span> </td> <td> <span>+ * '''joinArrays''' - with a single '''xpath''' argument, this will take the first array in the nodelist and join all the others to it, deleting the other arrays and leaving a single array with the dictRef of the original array. With an additional '''key''' argument, SOMETHING ELSE HAPPENS. With an additional '''from''' argument, SOMETHING ELSE HAPPENS.</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-06 22:32:41JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 332: </td> <td> Line 332: </td> </tr> <tr> <td> <span>- * '''addId''' - TODO</span> </td> <td> <span>+ * '''addId''' - this adds an '''id''' attribute with the value specified by the '''value''' argument to each node in the nodeset specified by '''xpath'''.</span> </td> </tr> <tr> <td> Line 340: </td> <td> Line 340: </td> </tr> <tr> <td> <span>- * '''addMap''' - TODO<br> - <br> - {{{<br> - &lt;transform process="addMap" xpath="."</span> </td> <td> <span>+ * '''addMap''' - for every node in the nodeset specified by '''xpath''', this creates a '''cml:map''' with the specified '''id''' that links the '''values''' of the nodes in the '''from''' nodeset to those in the '''to''' nodeset.<br> + <br> + {{{<br> + &lt;transform process="addMap"<br> + xpath="."</span> </td> </tr> <tr> <td> Line 349: </td> <td> Line 350: </td> </tr> <tr> <td> <span>- * '''addNamespace''' - TODO</span> </td> <td> <span>+ * '''addNamespace''' - this will add a namespace declaration of the form xmlns:'''name'''="'''value'''" to every element in the nodeset returned by '''xpath'''.</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-06 19:09:59JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 282: </td> <td> Line 282: </td> </tr> <tr> <td> </td> <td> <span>+ The philosophy of the transforms is very similar to the idea of templates in [http://en.wikipedia.org/wiki/XSLT xslt], using the idea of a "nodeset" to which operations are applied.<br> + </span> </td> </tr> <tr> <td> Line 290: </td> <td> Line 292: </td> </tr> <tr> <td> <span>-</span> The transforms have a '''process''' which defines the operation that will be carried out, almost all have an '''xpath''' that is an xpath expression indicating the elements the process will be applied to, and a variable number of arguments, depending on the process being carried out. </td> <td> <span>+</span> The transforms have a '''process''' which defines the operation that will be carried out, almost all have an '''xpath''' that is an xpath expression indicating the elements the process will be applied to<span>&nbsp;(the nodeset)</span>, and a variable number of arguments, depending on the process being carried out. </td> </tr> <tr> <td> Line 297: </td> <td> Line 299: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + Various miscellaneous notes will be added in the section below, which will be merged into the documentation in due course.</span> </td> </tr> <tr> <td> Line 702: </td> <td> Line 706: </td> </tr> <tr> <td> </td> <td> <span>+ }}}<br> + <br> + <br> + === Notes on Transforms ===<br> + * where possible, '''id''' attributes should always be added to nodesets to facilitate later operations.<br> + * the &lt; and &gt; symbols should not be used in xpath comparisons, however &amp;lt; and &amp;gt; can be used, as shown below:<br> + {{{<br> + &lt;transform process="debugNodes" xpath=".//cml:array[position() &amp;gt; 1 and position() &amp;lt; 4]"/&gt;<br> + }}}<br> + <br> + * use "..." for quoting attribute values<br> + * Rather than use relative namespaces (e.g. 
g:charge), the more reliable namespace-uri syntax can be used:<br> + {{{<br> + foo[@dictRef[namespace-uri()='http://www.xml-cml.org/dict/gaussian' and .='charge']]</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-06 10:20:03JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 302: </td> <td> Line 302: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="addAttribute"<br> + xpath=".//cml:molecule"<br> + name="formalCharge"<br> + value="$number(.//cml:scalar[@dictRef='g:charge'])" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 309: </td> <td> Line 316: </td> </tr> <tr> <td> <span>-</span> dictRef="cc:jobList"/&gt; </td> <td> <span>+</span> dictRef="cc:jobList"<span>&nbsp;</span>/&gt; </td> </tr> <tr> <td> Line 318: </td> <td> Line 325: </td> </tr> <tr> <td> <span>-</span> value="cc:popanal"/&gt; </td> <td> <span>+</span> value="cc:popanal<span>&nbsp;</span>"/&gt; </td> </tr> <tr> <td> Line 323: </td> <td> Line 330: </td> </tr> <tr> <td> <span>- * '''addInteger''' - TODO</span> </td> <td> <span>+ {{{<br> + &lt;transform process="addId"<br> + value="mol9999"<br> + xpath=".//cml:molecule[starts-with(@id,'a')]" /&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 327: </td> <td> Line 338: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="addMap" xpath="."<br> + id="variableConstantMap"<br> + from=".//cml:scalar[@dictRef='g:variable' or @dictRef='g:const']"<br> + to=".//cml:scalar[@dictRef='g:value']" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 329: </td> <td> Line 347: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="addNamespace"<br> + xpath="."<br> + name="convention"<br> + value="http://www.xml-cml.org/convention/" /&gt;<br> + 
}}}<br> + <br> + </span> </td> </tr> <tr> <td> Line 331: </td> <td> Line 357: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="addSibling"<br> + xpath="./cml:module[@id='calculation']/cml:module[@cmlx:templateRef='l202.rotconst']"<br> + elementName="cml:module" id="l202.group" position="1" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 332: </td> <td> Line 364: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="addUnits"<br> + xpath=".//cml:scalar"<br> + value="u:jmol-1"/&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 343: </td> <td> Line 381: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createAngle"<br> + xpath=".//cml:list/cml:list[cml:atom]"<br> + atomRefs="$string(cml:scalar[3]) $string(cml:scalar[1]) $string(cml:atom/@id)" value="$string(cml:scalar[4])" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 353: </td> <td> Line 397: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createAtom"<br> + xpath=".//cml:scalar[@dictRef='cc:elementType']" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 355: </td> <td> Line 404: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createDate"<br> + xpath=".//cml:list[@dictRef='g:archive1']/cml:scalar[9]"<br> + format="dd-MMM-yyyy"<br> + dictRef="cc:date"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 357: </td> <td> Line 413: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createDouble"<br> + xpath=".//cml:list[@dictRef='g:archive.namevalue']/cml:scalar[@dictRef='x:HF']"<br> + dictRef="cc:hfenergy" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 359: </td> <td> Line 421: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createFormula"<br> + xpath=".//cml:list[@dictRef='g:archive1']/cml:scalar[7]"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 360: </td> <td> Line 427: </td> </tr> <tr> <td> </td> 
<td> <span>+ <br> + {{{<br> + &lt;transform process="createLength"<br> + xpath=".//cml:list/cml:list[cml:atom]"<br> + atomRefs="$string(cml:scalar[1]) $string(cml:atom/@id)" value="$string(cml:scalar[2])"/&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 370: </td> <td> Line 443: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createMatrix33"<br> + xpath="." dictRef="g:axis"<br> + from=".//cml:scalar[contains(@dictRef,':x.') or contains(@dictRef,':y.') or contains(@dictRef,':z.')]" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 371: </td> <td> Line 450: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="createMatrix33" xpath="."<br> + dictRef="g:axis"<br> + from=".//cml:scalar[contains(@dictRef,':x.') or contains(@dictRef,':y.') or contains(@dictRef,':z.')]" /&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 450: </td> <td> Line 535: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createNameValue"<br> + xpath="./cml:list/cml:list"<br> + name=".//cml:scalar[@dictRef='x:name']"<br> + value=".//cml:scalar[@dictRef='x:value']" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 452: </td> <td> Line 544: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createString"<br> + xpath="./cml:list/cml:scalar"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 454: </td> <td> Line 551: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createTable"<br> + xpath=".//cml:list[@cmlx:templateRef='symmadapt']" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 455: </td> <td> Line 557: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="createTorsion"<br> + xpath=".//cml:list/cml:list[cml:atom]"<br> + atomRefs="$string(cml:scalar[5]) $string(cml:scalar[3]) $string(cml:scalar[1]) $string(cml:atom/@id)" value="$string(cml:scalar[6])" /&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 467: </td> <td> Line 
575: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createWrapper"<br> + xpath=".//cml:module/text()"<br> + elementName="UNPARSED"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 469: </td> <td> Line 583: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createWrapperMetadata"<br> + xpath=".//cml:scalar[@dictRef='cc:version' or<br> + @dictRef='cc:date' or<br> + @dictRef='cc:title']"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 471: </td> <td> Line 592: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createWrapperParameter"<br> + xpath=".//cml:scalar[@dictRef='cc:hostname' or<br> + @dictRef='cc:jobname' or<br> + @dictRef='cc:method' or<br> + @dictRef='cc:basis' ]"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 473: </td> <td> Line 602: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="createWrapperProperty"<br> + xpath=".//*[@dictRef='cc:electronicstate' or<br> + @dictRef='cc:hfenergy' or<br> + @dictRef='cc:dipole' or<br> + @dictRef='cc:dipolederiv' or<br> + @dictRef='cc:polarizability' or<br> + @dictRef='cc:pointgroup' or<br> + @dictRef='cc:rmsd' or<br> + @dictRef='cc:rmsf']"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 474: </td> <td> Line 615: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="createZMatrix"<br> + xpath="." 
id="zinitial"/&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 491: </td> <td> Line 637: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="groupSiblings"<br> + xpath=".//cml:module[@id='l202.group']" /&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 492: </td> <td> Line 643: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="joinArrays"<br> + xpath=".//cml:list[@cmlx:templateRef='atom']/cml:array" /&gt;<br> + }}}</span> </td> </tr> <tr> <td> Line 525: </td> <td> Line 681: </td> </tr> <tr> <td> </td> <td> <span>+ {{{<br> + &lt;transform process="reparse"<br> + xpath=".//cml:scalar[@id='scraped']"<br> + regexPath=".//record[@id='natoms']"/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 526: </td> <td> Line 688: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + {{{<br> + &lt;transform process="setValue"<br> + xpath=".//cml:list/cml:scalar[2] |<br> + .//cml:list/cml:scalar[4] |<br> + .//cml:list/cml:scalar[6]"<br> + map="//cml:map[@id='variableMap']"<br> + value="$string(.)"/&gt;<br> + }}}</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-06 09:29:58JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 300: </td> <td> Line 300: </td> </tr> <tr> <td> </td> <td> <span>+ * '''addAttribute''' - TODO<br> + </span> </td> </tr> <tr> <td> Line 319: </td> <td> Line 321: </td> </tr> <tr> <td> </td> <td> <span>+ * '''addId''' - TODO<br> + <br> + * '''addInteger''' - TODO<br> + <br> + * '''addMap''' - TODO<br> + <br> + * '''addNamespace''' - TODO<br> + <br> + * '''addSibling''' - TODO<br> + <br> + * '''addUnits''' - TODO<br> + </span> </td> </tr> <tr> <td> Line 327: </td> <td> Line 341: </td> </tr> <tr> <td> </td> <td> <span>+ * '''createAngle''' - TODO<br> + </span> </td> </tr> <tr> <td> Line 335: </td> <td> Line 351: </td> </tr> <tr> <td> </td> <td> <span>+ * '''createAtom''' - TODO<br> + <br> + * '''createDate''' - TODO<br> + <br> + * '''createDouble''' - TODO<br> + <br> + * '''createFormula''' - TODO<br> + <br> + * '''createLength''' - TODO<br> + </span> </td> </tr> <tr> <td> Line 341: </td> <td> Line 367: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + * '''createMatrix''' - TODO<br> + <br> + * '''createMatrix33''' - TODO</span> </td> </tr> <tr> <td> Line 418: </td> <td> Line 448: </td> </tr> <tr> <td> </td> <td> <span>+ * '''createNameValue''' - TODO<br> + <br> + * '''createString''' - TODO<br> + <br> + * '''createTable''' - TODO<br> + <br> + * '''createTorsion''' - TODO</span> </td> </tr> <tr> <td> Line 428: </td> <td> Line 465: </td> </tr> <tr> <td> </td> <td> <span>+ * '''createWrapper''' - TODO<br> + <br> + * '''createWrapperMetadata''' - TODO<br> + <br> + * '''createWrapperParameter''' - TODO<br> + <br> + * '''createWrapperProperty''' - TODO<br> + <br> + * '''createZMatrix''' - TODO</span> </td> </tr> <tr> <td> Line 442: </td> <td> Line 488: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + * '''groupSiblings''' - TODO<br> + <br> + * '''joinArrays''' - TODO</span> </td> </tr> <tr> <td> Line 472: </td> <td> Line 522: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + * '''reparse''' - TODO<br> + <br> + * '''setValue''' - TODO</span> 
</td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 21:53:40JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 325: </td> <td> Line 325: </td> </tr> <tr> <td> </td> <td> <span>+ }}}<br> + <br> + * '''createArray''' - this will create a cml:array at each of the nodes in the '''xpath''' query from the cml:scalar nodes generated by the '''from''' xpath query. If only one node is supplied, the contents of the node will be split on whitespace and the array created from these. Arrays can only be created for integer or double data types. The scalar nodes will then be discarded.<br> + <br> + {{{<br> + &lt;transform process="createArray"<br> + xpath="."<br> + from="./cml:list[@cmlx:templateRef='length']/cml:scalar[@dictRef='g:symbol']"/&gt;</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 21:37:56JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 330: </td> <td> Line 330: </td> </tr> <tr> <td> <span>- &lt;transform process="createList" xpath=".//cml:module[@cmlx:templateRef='multipole']"/&gt;</span> </td> <td> <span>+ &lt;transform process="createList"<br> + xpath=".//cml:module[@cmlx:templateRef='multipole']"/&gt;<br> + }}}<br> + <br> + * '''createMolecule''' - this will create a molecule from the list of cml:arrays generated by the '''xpath''' query. The length of the arrays indicates the number of atoms, and the '''dictRef''' attribute of each array determines the property of the atom it will be used for. 
Supported types are: '''x3''', '''y3''', '''z3''' for the coordinates, '''id''', '''elementType''', '''label''' and '''atomTypeRef'''. The molecule will be created as a child of the parent of the first array, and the arrays will then be discarded. The gaussian template [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l202/l202.orient.xml l202.orient.xml] is shown below as an example.<br> + <br> + {{{<br> + &lt;template id="l202.orient" name="input or standard orientation" repeat="*"<br> + pattern="\s*(Input|Standard)\s*orientation:\s*$\s*\-+\s*"<br> + endPattern="\s*\d.*$\s*\-+\s*" endOffset="2"&gt;<br> + <br> + &lt;comment class="example.input" id="l202.orient"&gt;<br> + Input orientation:<br> + ---------------------------------------------------------------------<br> + Center Atomic Atomic Coordinates (Angstroms)<br> + Number Number Type X Y Z<br> + ---------------------------------------------------------------------<br> + 1 6 0 0.000000 0.000000 0.000000<br> + 2 1 0 0.000000 0.000000 1.093266<br> + 3 1 0 1.030741 0.000000 -0.364422<br> + 4 1 0 -0.515370 -0.892648 -0.364422<br> + 5 1 0 -0.515371 0.892648 -0.364422<br> + ---------------------------------------------------------------------<br> + &lt;/comment&gt;<br> + <br> + &lt;record repeat="5"/&gt;<br> + &lt;record repeat="*" makeArray="true" id="atom"&gt;{I,cc:serial}{I,cc:elementType}{I,g:atomicType}{F,cc:x3}{F,cc:y3}{F,cc:z3}&lt;/record&gt;<br> + &lt;record/&gt;<br> + <br> + &lt;transform process="createMolecule" xpath="./cml:list[@cmlx:templateRef='atom']/cml:array" id="mol.l202.orient"/&gt;<br> + &lt;transform process="pullupSingleton" xpath="./cml:list"/&gt;<br> + <br> + &lt;comment class="example.output" id="l202.orient"&gt;<br> + &lt;module cmlx:templateRef="l202.orient" xmlns="http://www.xml-cml.org/schema" 
xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt;<br> + &lt;molecule id="mol.l202.orient" cmlx:templateRef="atom"&gt;<br> + &lt;atomArray&gt;<br> + &lt;atom id="a1" elementType="C" x3="0.0" y3="0.0" z3="0.0"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:serial"&gt;1&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="g:atomicType"&gt;0&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:atomicNumber"&gt;6&lt;/scalar&gt;<br> + &lt;/atom&gt;<br> + &lt;atom id="a2" elementType="H" x3="0.0" y3="0.0" z3="1.093266"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:serial"&gt;2&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="g:atomicType"&gt;0&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:atomicNumber"&gt;1&lt;/scalar&gt;<br> + &lt;/atom&gt;<br> + &lt;atom id="a3" elementType="H" x3="1.030741" y3="0.0" z3="-0.364422"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:serial"&gt;3&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="g:atomicType"&gt;0&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:atomicNumber"&gt;1&lt;/scalar&gt;<br> + &lt;/atom&gt;<br> + &lt;atom id="a4" elementType="H" x3="-0.51537" y3="-0.892648" z3="-0.364422"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:serial"&gt;4&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="g:atomicType"&gt;0&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:atomicNumber"&gt;1&lt;/scalar&gt;<br> + &lt;/atom&gt;<br> + &lt;atom id="a5" elementType="H" x3="-0.515371" y3="0.892648" z3="-0.364422"&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:serial"&gt;5&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="g:atomicType"&gt;0&lt;/scalar&gt;<br> + &lt;scalar dataType="xsd:integer" dictRef="cc:atomicNumber"&gt;1&lt;/scalar&gt;<br> + &lt;/atom&gt;<br> + &lt;/atomArray&gt;<br> + &lt;formula formalCharge="0" concise="C 1 H 4"&gt;<br> + &lt;atomArray 
elementType="C H" count="1.0 4.0"/&gt;<br> + &lt;/formula&gt;<br> + &lt;bondArray&gt;<br> + &lt;bond atomRefs2="a1 a2" id="a1_a2" order="S"/&gt;<br> + &lt;bond atomRefs2="a1 a3" id="a1_a3" order="S"/&gt;<br> + &lt;bond atomRefs2="a1 a4" id="a1_a4" order="S"/&gt;<br> + &lt;bond atomRefs2="a1 a5" id="a1_a5" order="S"/&gt;<br> + &lt;/bondArray&gt;<br> + &lt;property dictRef="cml:molmass"&gt;<br> + &lt;scalar dataType="xsd:double" units="unit:dalton" xmlns:unit="http://www.xml-cml.org/unit/si/"&gt;16.04246&lt;/scalar&gt;<br> + &lt;/property&gt;<br> + &lt;/molecule&gt;<br> + &lt;/module&gt;<br> + &lt;/comment&gt;<br> + &lt;/template&gt;<br> + }}}<br> + <br> + <br> + * '''createVector3''' - for each node specified in the '''xpath''' this will take the nodes listed in the '''to''' argument and create a cml:vector from them, and give it the specified '''dictRef'''. The '''to''' argument must return 3 cml:scalar nodes for this to work.<br> + <br> + {{{<br> + &lt;transform process="createVector3"<br> + xpath="."<br> + dictRef="g:coupling.ten"<br> + from="./cml:list/cml:list/cml:scalar[contains(@dictRef,'.a.t') or contains(@dictRef,'.b.t') or contains(@dictRef,'.c.t')]" /&gt;</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 18:59:13JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 327: </td> <td> Line 327: </td> </tr> <tr> <td> </td> <td> <span>+ * '''createList''' - this will take a list of nodes from the '''xpath''' and, if they are cml modules, it will convert them to cml lists.<br> + <br> + {{{<br> + &lt;transform process="createList" xpath=".//cml:module[@cmlx:templateRef='multipole']"/&gt;<br> + }}}<br> + <br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 18:46:31JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 300: </td> <td> Line 300: </td> </tr> <tr> <td> </td> <td> <span>+ * '''addChild''' - this will create a child element of the nodes specified by the '''xpath'''. The only required argument is '''elementName''', which specifies the type of element to create. 
Other supported arguments are: '''id''', '''dictRef''' and '''value'''.<br> + <br> + {{{<br> + &lt;transform process="addChild"<br> + xpath="."<br> + elementName="cml:module"<br> + id="jobList1"<br> + dictRef="cc:jobList"/&gt;<br> + }}}<br> + <br> + </span> </td> </tr> <tr> <td> Line 359: </td> <td> Line 370: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + * '''split''' - this will take the nodes in the '''xpath''' and split them (in places) according to their type - a scalar will be split by whitespace and turned into a list, a 1D array will be split into a cml list, and 2D arrays will be split into a list of separate arrays.<br> + <br> + {{{<br> + &lt;transform process="split" xpath=".//cml:array[@dictRef='cc:mulliken']"/&gt;<br> + }}}<br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 17:31:19JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 300: </td> <td> Line 300: </td> </tr> <tr> <td> <span>-</span> * '''addDictRef''' - this will add a '''dictRef''' attribute with the specified value to the nodes defined by <span>the </span>xpath<span>&nbsp;expression</span>. </td> <td> <span>+</span> * '''addDictRef''' - this will add a '''dictRef''' attribute with the specified value to the nodes defined by <span>'''</span>xpath<span>'''</span>. </td> </tr> <tr> <td> Line 308: </td> <td> Line 308: </td> </tr> <tr> <td> </td> <td> <span>+ * '''copy''' - this copies the nodes defined by '''xpath''' to the xpath defined by the '''to''' argument, which is relative to the element being copied. e.g. if '''to''' is ".", then the element and its children will be copied to become children of itself. 
If the element has an '''id''' attribute, this will have the string ".copy" appended to it, if not, an id of "n.copy" will be created, where n is the index of the node in the original xpath.<br> + <br> + {{{<br> + &lt;transform process="copy"<br> + xpath="(//cml:list[@cmlx:templateRef='l914_excit2'])[1]"<br> + to="."/&gt;<br> + }}}<br> + </span> </td> </tr> <tr> <td> Line 312: </td> <td> Line 320: </td> </tr> <tr> <td> </td> <td> <span>+ xpath="(//cml:list[@cmlx:templateRef='l914_excit2'])[1]"/&gt;<br> + }}}<br> + <br> + * '''debugNodes''' - this just prints out the nodes selected by the xpath and is only useful for developing and debugging the transforms.<br> + <br> + {{{<br> + &lt;transform process="debugNodes"</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 16:51:30JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 298: </td> <td> Line 298: </td> </tr> <tr> <td> </td> <td> <span>+ === Key Transforms ===<br> + <br> + * '''addDictRef''' - this will add a '''dictRef''' attribute with the specified value to the nodes defined by the xpath expression.<br> + <br> + {{{<br> + &lt;transform process="addDictRef"<br> + xpath="//cml:property[cml:module[@cmlx:templateRef='l601.popanal']]"<br> + value="cc:popanal"/&gt;<br> + }}}<br> + <br> + * '''delete''' - this will delete the list of nodes defined by the xpath, along with all of their child nodes.<br> + <br> + {{{<br> + &lt;transform process="delete"<br> + xpath=".//cml:module[not(cml:array)]"/&gt;<br> + }}}<br> + <br> + * '''move''' - this takes one or more nodes, and moves them into the node defined by the '''to''' argument. 
The '''to''' argument is an xpath that must just return a single element.<br> + <br> + {{{<br> + &lt;transform process="move"<br> + to="."<br> + xpath=".//*[contains(@dictRef,':serial') or contains(@dictRef,':elementType') or contains(@dictRef,':isotop') or contains(@dictRef,':coupling')]" /&gt;<br> + }}}<br> + <br> + * '''moveRelative''' - this is similar to move, but the '''to''' argument is a xpath that is relative to the element being moved, so that if the '''xpath''' returns a list of elements scattered from throughout the document, each will be moved to the '''to''' relative to itself.<br> + <br> + {{{<br> + &lt;transform process="moveRelative"<br> + xpath="//cml:module[@cmlx:templateRef='l4601.virtual']"<br> + to="parent::*/parent::*/parent::*"/&gt;<br> + }}}<br> + <br> + * '''pullup''' - this takes one or more elements defined by an xpath and moves them up out of their current containing element, so that they become children of their current grandparent. The only argument required is the xpath of the nodes to be pulled up.<br> + <br> + {{{<br> + &lt;transform process="pullup"<br> + xpath=".//cml:module[@cmlx:templateRef='l1.version']/cml:*"/&gt;<br> + }}}<br> + <br> + * '''pullupSingleton''' - this takes one or more elements defined by an xpath and, if the element only has one child, replaces the element with the child, thereby deleting the original element and "pulling up" the child.<br> + <br> + {{{<br> + &lt;transform process="pullupSingleton"<br> + xpath=".//cml:list"/&gt;<br> + }}}<br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 16:08:55JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 269: </td> <td> Line 269: </td> </tr> <tr> <td> </td> <td> <span>+ == Transforming the raw XML ==<br> + <br> + As has been mentioned, the parsing is a two-stage process, consisting of marking up the file with XML and then converting the raw XML to valid CML. In some cases, the raw XML may already be valid CML, but in most cases transforms will need to be applied.<br> + <br> + The transforms can either be applied within the template, after the text has been parsed and marked up, or as an entirely separate step, once the whole file has been parsed.<br> + <br> + Taking the current implementation of the Gaussian log parser as an example, the code for the [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/java/org/xmlcml/cml/converters/compchem/gaussian/log/GaussianLog2CompchemConverter.java GaussianLog2CompchemConverter] reads in two files:<br> + <br> + * [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml topTemplate.xml] - as mentioned above, this processes all of the templates.<br> + * [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/gaussian2compchem.xml gaussian2compchem.xml] - this is run after the file has been parsed and runs transforms on the XML. As with the templates, there are a number of transforms that are carried out, the code for which resides either in this file or in files that are '''included''' by it from the templates directory.<br> + <br> + The transformation process relies heavily on the powerful [http://en.wikipedia.org/wiki/XPath XPath] language. 
A short tutorial on xpath can be found [http://www.w3schools.com/xpath/ here].<br> + <br> + The transforms are carried out by elements like the following:<br> + <br> + {{{<br> + &lt;transform process="addAttribute" xpath="./cml:module[@cmlx:templateRef='job']" name="id" value="job" /&gt;<br> + }}}<br> + <br> + In this case, the attribute '''id="job"''' will be added to all cml modules that are direct children of the document and have the '''templateRef''' "job".<br> + <br> + Each transform has a '''process''', which defines the operation that will be carried out. Almost all also have an '''xpath''', an xpath expression indicating the elements the process will be applied to, and a variable number of further arguments, depending on the process being carried out.<br> + <br> + A brief overview of the key transformations follows below; for those with a strong constitution, more comprehensive documentation can be found by examining the code in the file:<br> + <br> + * [https://bitbucket.org/wwmm/jumbo-converters/src/9aa183a356dd/jumbo-converters-core/src/main/java/org/xmlcml/cml/converters/text/TransformElement.java jumbo-converters-core/src/main/java/org/xmlcml/cml/converters/text/TransformElement.java]<br> + <br> + The text from around line 156, starting with the comment '''// process values''', lists the processes that are available.<br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-03-05 14:50:48JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 46: </td> <td> Line 46: </td> </tr> <tr> <td> <span>-</span> If a pattern is matched, all subsequent text is "gobbled" and added to the module until the '''endPattern''' is reached, at which point the module is closed, and the next line of text is searched to see if it matches a pattern.<span><br> - <br> - WHAT HAPPENS IF THERE IS NO endPattern?</span> </td> <td> <span>+</span> If a pattern is matched, all subsequent text is "gobbled" and added to the module until the '''endPattern''' is reached, at which point the module is closed, and the next line of text is searched to see if it matches a pattern<span>&nbsp;(if there is no end pattern, the template will swallow all the remaining text in its enclosing template)</span>. </td> </tr> <tr> <td> Line 117: </td> <td> Line 115: </td> </tr> <tr> <td> <span>- pattern="\s*Isotropic Fermi Contact Couplings.*" repeat="*"<br> - endPattern="\s*"&gt;<br> - }}}<br> - <br> - The template starts with the template tag. The id and name define the template and the cml module that this text will end up in. repeat="*" says that this template can appear multiple times in the file.</span> </td> <td> <span>+ pattern="\s*Isotropic Fermi Contact Couplings.*"<br> + repeat="*"<br> + endPattern="\s*"<br> + offset="0"<br> + endOffset="1"<br> + &gt;<br> + }}}<br> + <br> + The template starts with the template tag. 
The name is a descriptive name for the template, and the id serves to define the module that the text will be parsed into, so the module for this template will start with:<br> + <br> + {{{&lt;module cmlx:templateRef="l601.fermi" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt;}}}<br> + <br> + repeat="*" says that this template can appear multiple times in the file.</span> </td> </tr> <tr> <td> Line 129: </td> <td> Line 135: </td> </tr> <tr> <td> <span>- The template will swallow all text until the end pattern is encountered, which is a containing nothing but spaces.<br> - <br> - This examples does not include the "endOffest=N", which indicates what happens to the file pointer within the template i.e. whether it rolls back so that the line that was matched by the template is available for processing within the template or not.</span> </td> <td> <span>+ The template will swallow all text until the end pattern is encountered, which is a line containing nothing but spaces.<br> + <br> + '''offset''' indicates where the text made available to the records inside the template starts. The default (0) means that the line that is matched by the '''pattern''' is part of the text within the template. An offset of 1 would mean that the line would not be available to the template, but would become part of the parent template.<br> + <br> + '''endOffset''' indicates where the text made available to the records inside the template stops. The default (0) means that the line that is matched by the '''endPattern''' is not part of the text within the template, but is pushed into the parent template. An endOffset of 1 would include the line in the template; 2 would mean that the line following the endPattern would also be included in this template.<br> + </span> </td> </tr> <tr> <td> Line 177: </td> <td> Line 186: </td> </tr> <tr> <td> <span>-</span> The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. 
The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemisty dictionary, and the "g" stands for an entry in the Gaussian dictionary. </td> <td> <span>+</span> The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemist<span>r</span>y dictionary, and the "g" stands for an entry in the Gaussian dictionary. </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2012-03-01 11:30:19JensThomas <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 91: </td> <td> Line 91: </td> </tr> <tr> <td> <span>- <br> - Now cd into the ''&lt;project-root&gt;/jumbo-converters-compchem'' directory.<br> - <br> - {{{<br> - cd jumbo-converters-compchem<br> - }}}</span> </td> <td> <span>+ == Gaussian ==<br> + <br> + To use the Gaussian parser, cd into the ''&lt;project-root&gt;/jumbo-converters-compchem/jumbo-converters-compchem-gaussian'' directory.<br> + <br> + {{{<br> + cd jumbo-converters-compchem/jumbo-converters-compchem-gaussian<br> + }}}<br> + <br> + <br> + and run the converter with the command:<br> + <br> + {{{<br> + mvn -e exec:java -Dexec.mainClass="org.xmlcml.cml.converters.compchem.gaussian.log.GaussianLog2CompchemConverter" -Dexec.args="logfile.out"<br> + }}}<br> + <br> + where '''logfile.out''' is the path to the Gaussian logfile you wish to convert. This will then create a cml file named '''logfile.cml''' in the directory in which the command was executed.<br> + <br> + '''mvn exec:java''' takes care of setting up the Java environment (the classpath) so as to locate all compiled files needed for the execution. 
For this to happen, the previous command must be executed at the jumbo-converters-compchem-gaussian folder.<br> + <br> + The argument '''-Dexec.mainClass''' states the Java class to execute (in this case, the Gaussian converter), and the '''-Dexec.args''' argument passes the space-separated strings in the argument as arguments to the main class.<br> + <br> + == NWChem ==</span> </td> </tr> <tr> <td> Line 110: </td> <td> Line 126: </td> </tr> <tr> <td> <span>- mvn exec:java takes care of setting up the Java environment (the classpath) so as to locate all compiled files needed for the execution. For this to happen,<br> - the previous command must be executed at the jumbo-converters-compchem-nwchem folder.<br> - <br> - The argument '''-Dexec.mainClass''' states the Java class to execute (in this case, the NWChem converter). If you want to pass command-line arguments to the main class,<br> - use the '''-Dexec.args''' argument.<br> - <br> - This will take one of the nwchem output files in '''jumbo-converters-compchem-nwchem/src/test/resources/compchem/nwchem/log/in/''' and create cml file in the directory '''jumbo-converters-compchem-nwchem/test'' folder (for details of what happens, see the file [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java].<br> - <br> - Similarly, to use the Gaussian parser:<br> - <br> - Go into the ''&lt;project-root&gt;/jumbo-converters-compchem/jumbo-converters-compchem-gaussian'' directory.<br> - <br> - {{{<br> - cd jumbo-converters-compchem/jumbo-converters-compchem-gaussian<br> - }}}<br> - <br> - <br> - and run the converter. 
With the command:<br> - <br> - {{{<br> - mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.gaussian.log.GaussianLog2CompchemConverter<br> - }}}<br> - <br> - Once changed the source code, recompile (mvn clean install), and run the converter:<br> - <br> - {{{<br> - mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.gaussian.log.GaussianLog2XMLConverter -Dexec.args="./src/test/resources/echeniquep/inputQC/g03_threeSPs/RHF_6-31Gdp_sp_HCO-L-Ala-NH2_hashP.log output.xml"<br> - }}}<br> - <br> - This will parse the input file '''./src/test/resources/echeniquep/inputQC/g03_threeSPs/RHF_6-31Gdp_sp_HCO-L-Ala-NH2_hashP.log''' and it will create the parsed file '''output.xml'''.<br> - </span> </td> <td> <span>+ This will take one of the nwchem output files in '''jumbo-converters-compchem-nwchem/src/test/resources/compchem/nwchem/log/in/''' and create cml file in the directory '''jumbo-converters-compchem-nwchem/test''' folder (for details of what happens, see the file [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java].</span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2012-03-01 09:40:22JensThomas <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 107: </td> <td> Line 107: </td> </tr> <tr> <td> <span>-</span> mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.nwchem.log.NWChemLog<span>2XML</span>Converter </td> <td> <span>+</span> mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.nwchem.log.NWChemLog<span>Compchem</span>Converter </td> </tr> <tr> <td> Line 116: </td> <td> Line 116: </td> </tr> <tr> <td> <span>- Similarly, for the Gaussian parser:<br> - <br> - Go into the ''&lt;project-root&gt;/jumbo-converters-compchem'' directory.<br> - <br> - {{{<br> - cd jumbo-converters-compchem<br> - }}}<br> - <br> - To use the Gaussian converter, go into the jumbo-converters-compchem-gaussian folder:<br> - <br> - {{{<br> - cd jumbo-converters-compchem-gaussian<br> - }}}<br> - <br> - and run the converter. This currently does not work. Until it is fixed, you have to change the source code at ./src/main/java/org/xmlcml/cml/converters/compchem/gaussian/log/GaussianLog2XMLConverter.java<br> - to read the name of the input file and the output file from the command-line. 
Comment out the main method, and write the following instead:<br> - <br> - {{{<br> - public static void main(String[] args) throws IOException {<br> - CompchemTemplateConverter converter = new GaussianLog2XMLConverter();<br> - <br> - File in = new File(args[0]);<br> - File out = new File(args[1]);<br> - converter.convert(in, out);<br> - }</span> </td> <td> <span>+ This will take one of the nwchem output files in '''jumbo-converters-compchem-nwchem/src/test/resources/compchem/nwchem/log/in/''' and create cml file in the directory '''jumbo-converters-compchem-nwchem/test'' folder (for details of what happens, see the file [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-nwchem/src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java src/main/java/org/xmlcml/cml/converters/compchem/nwchem/log/NWChemLog2CompchemConverter.java].<br> + <br> + Similarly, to use the Gaussian parser:<br> + <br> + Go into the ''&lt;project-root&gt;/jumbo-converters-compchem/jumbo-converters-compchem-gaussian'' directory.<br> + <br> + {{{<br> + cd jumbo-converters-compchem/jumbo-converters-compchem-gaussian<br> + }}}<br> + <br> + <br> + and run the converter. With the command:<br> + <br> + {{{<br> + mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.gaussian.log.GaussianLog2CompchemConverter</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 18:33:19JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 259: </td> <td> Line 259: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + = JUMBO-Converters filesystem structure =<br> + <br> + (I guess some of this is standard for Maven projects but my ignorance forces me to document everything. The bright side is that other newbies like me will feel happy!)<br> + <br> + Under<br> + <br> + {{{<br> + jumbo-converters/<br> + }}}<br> + <br> + we have<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + }}}<br> + <br> + which is where most of the important stuff for Quixote lies if we do not want to get into too many details.<br> + <br> + The two most important subfolders of this are<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + src/<br> + target/<br> + }}}<br> + <br> + The second one is where the final compiled Java classes are located ('''any more stuff?''') and we will not care about it for the moment. The {{{src}}} subfolder, as its name indicates, contains the source code associated to the compchem part of JUMBOconverters (i.e., the one most related to the Quixote project). Inside the {{{src}}} subfolder, we have the following chain of folders, at the bottom of which all Java source code is located:<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + src/<br> + main/<br> + java/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + }}}<br> + <br> + Inside {{{converters}}}, we have two main subfolders:<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + src/<br> + main/<br> + java/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + marker/<br> + }}}<br> + <br> + The most specific compchem code is in {{{compchem}}} (as you might have guessed!) 
ordered by the name of the compchem package ({{{gamessus}}}, {{{gaussian}}}, {{{nwchem}}}, etc.), and {{{marker}}} contains more general source code to support the former.<br> + <br> + If you are Java-savvy, you might want to check these folders and read the code, but one of the great things about the declarative approach that [http://www-pmr.ch.cam.ac.uk/wiki/Main_Page PMR] has built into JUMBOconverters, and which we describe on this page, is that you don't need to! If you know [http://en.wikipedia.org/wiki/Regular_expression regular expressions] and some very basic [http://www.w3.org/TR/xpath/ XPath] (both of which you could even infer from already made examples), that should be sufficient.<br> + <br> + One important thing to remember though, even if you don't plan to read the Java source code, is that the above folder structure translates into the names of the classes that do all the magic stuff, so, if you want to call these classes from the command line, like in<br> + <br> + {{{<br> + mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.nwchem.log.NWChemLog2XMLConverter -Dexec.args="./src/test/resources/compchem/nwchem/log/in/test1.out ./test.cml"<br> + }}}<br> + <br> + you need to have this structure in mind.<br> + <br> + The declarative bits of the parsing infrastructure (i.e., what you, as a parser developer, will have to check, understand and probably adapt for your favourite compchem code) are inside a similar folder tree under {{{src/main}}}:<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + src/<br> + main/<br> + resources/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + nwchem/<br> + ...<br> + }}}<br> + <br> + Inside each code folder, one can find subfolders for the different types of file, and inside each one of them a {{{templates}}} subfolder, e.g.,<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + 
src/<br> + main/<br> + resources/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + gaussian/<br> + in/<br> + templates/<br> + log/<br> + templates/<br> + ...<br> + }}}<br> + <br> + In the rest of the sections and in some of the tutorials, we explain in detail how the different bits of declarative parsing are related and how everything works, but let us mention at this point that, at the filetype folders (i.e., at {{{in}}} or {{{log}}}) the top level parsing template list file {{{templateList.xml}}} can be found, while each one of the smaller templates included in this list are located in {{{templates}}}.<br> + <br> + Now, branching out at the same level as {{{main}}}, still inside {{{src}}}, we have a {{{test}}} subfolder, which contains, on the one hand (under {{{java}}}), the Java source code for performing automatic tests, and, on the other hand (under {{{resources}}}), a number of example files produced by the compchem codes that Quixote wants to tackle. The scheme of the folder tree is as follows:<br> + <br> + {{{<br> + jumbo-converters/<br> + jumbo-converters-compchem/<br> + src/<br> + test/<br> + java/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + nwchem/<br> + ...<br> + resources/<br> + compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + in/<br> + log/<br> + ...<br> + nwchem/<br> + ...<br> + }}}<br> + <br> + A '''general scheme''' summarizing all the details commented above is the following:<br> + <br> + <br> + {{{<br> + jumbo-converters/ *** Main JUMBOconverters folder<br> + jumbo-converters-compchem/ *** Compchem JUMBOconverters<br> + src/ *** Source code and test files<br> + main/ *** Source code for the parsing machinery<br> + java/ *** Java source code<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + marker/<br> + resources/ *** Declarative parsing source code<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> 
+ compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + in/ *** Top level parsing directives<br> + templates/ *** Subparsers templates<br> + log/<br> + templates/<br> + ...<br> + nwchem/<br> + ...<br> + test/ *** Source code for the automatic testing<br> + java/<br> + org/<br> + xmlcml/<br> + cml/<br> + converters/<br> + compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + nwchem/<br> + ...<br> + resources/ *** Example test files<br> + compchem/<br> + amber/<br> + gamessus/<br> + gaussian/<br> + in/<br> + log/<br> + ...<br> + nwchem/<br> + ...<br> + target/ *** Compiled classes<br> + }}}</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 18:31:38JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 231: </td> <td> Line 231: </td> </tr> <tr> <td> <span>-</span> The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run the entire TemplateTests, using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory: </td> <td> <span>+</span> The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run the entire TemplateTests<span>&nbsp;(see note below)</span>, using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory: </td> </tr> <tr> <td> Line 239: </td> <td> Line 239: </td> </tr> <tr> <td> <span>-</span> {{{<span><br> - </span>==============template=================== </td> <td> <span>+</span> {{{==============template=================== </td> </tr> <tr> <td> Line 242: </td> <td> Line 241: </td> </tr> <tr> <td> <span>-</span> 
<span>==========</span>XMLDIFF reference<span>=========</span> </td> <td> <span>+</span> <span>&nbsp;&nbsp;&nbsp;&nbsp;</span>XMLDIFF reference </td> </tr> <tr> <td> Line 251: </td> <td> Line 250: </td> </tr> <tr> <td> <span>-</span> The chunk of test after the '''<span>'------------test---------------------'</span>''' line, and excluding the '''&lt;?xml version="1.0" encoding="UTF-8"?&gt;''' line is the output of the test. This should be checked, and if correct, placed in the '''&lt;comment class="example.output" id="l601.fermi"&gt;''' tag in the template. Re-running the test should then lead to a successful result. </td> <td> <span>+</span> The chunk of test after the '''<span>------------test---------------------</span>''' line, and excluding the '''&lt;?xml version="1.0" encoding="UTF-8"?&gt;''' line is the output of the test. This should be checked, and if correct, placed in the '''&lt;comment class="example.output" id="l601.fermi"&gt;''' tag in the template. Re-running the test should then lead to a successful result. </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 18:26:15JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 234: </td> <td> Line 234: </td> </tr> <tr> <td> <span>- mvn -Dtest="log.TemplateTes" test<br> - }}}</span> </td> <td> <span>+ mvn -Dtest="log.TemplateTest" test<br> + }}}<br> + <br> + The first time this is run, it will fail. 
However, it will print out the output of running the test, and something like the following:<br> + <br> + {{{<br> + ==============template===================<br> + Error: template expected:&lt;3&gt; but was:&lt;4&gt;<br> + ==========XMLDIFF reference=========<br> + <br> + ------------test---------------------<br> + &lt;?xml version="1.0" encoding="UTF-8"?&gt;<br> + &lt;module cmlx:templateRef="l601.fermi" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt;<br> + &lt;list cmlx:lineCount="3" cmlx:templateRef="fermi.atom"&gt;<br> + &lt;array dataType="xsd:integer" dictRef="cc:serial" size="3"&gt;1 2 13&lt;/array&gt;<br> + }}}<br> + <br> + The chunk of test after the ''''------------test---------------------'''' line, and excluding the '''&lt;?xml version="1.0" encoding="UTF-8"?&gt;''' line is the output of the test. This should be checked, and if correct, placed in the '''&lt;comment class="example.output" id="l601.fermi"&gt;''' tag in the template. Re-running the test should then lead to a successful result.</span> </td> </tr> <tr> <td> Line 244: </td> <td> Line 260: </td> </tr> <tr> <td> <span>- <br> - <br> - = JUMBO-Converters filesystem structure =<br> - <br> - (I guess some of this is standard for Maven projects but my ignorance forces me to document everything. 
The bright side is that other newbies like me will feel happy!)<br> - <br> - Under<br> - <br> - {{{<br> - jumbo-converters/<br> - }}}<br> - <br> - we have<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - }}}<br> - <br> - which is where most of the important stuff for Quixote lies if we do not want to get into too many details.<br> - <br> - The two most important subfolders of this are<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - target/<br> - }}}<br> - <br> - The second one is where the final compiled Java classes are located ('''any more stuff?''') and we will not care about it for the moment. The {{{src}}} subfolder, as its name indicates, contains the source code associated to the compchem part of JUMBOconverters (i.e., the one most related to the Quixote project). Inside the {{{src}}} subfolder, we have the following chain of folders, at the bottom of which all Java source code is located:<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - main/<br> - java/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - }}}<br> - <br> - Inside {{{converters}}}, we have two main subfolders:<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - main/<br> - java/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - marker/<br> - }}}<br> - <br> - The most specific compchem code is in {{{compchem}}} (as you might have guessed!) ordered by the name of the compchem package ({{{gamessus}}}, {{{gaussian}}}, {{{nwchem}}}, etc.), and {{{marker}}} contains more general source code to support the former.<br> - <br> - If you are Java-savy, you might want to check these folders and read the code, but one of the great things about the declarative approach that [http://www-pmr.ch.cam.ac.uk/wiki/Main_Page PMR] has created into JUMBOconverters and we describe in this page is that you don't need to! 
If you know [http://en.wikipedia.org/wiki/Regular_expression regular expressions] and some very basic [http://www.w3.org/TR/xpath/ XPath] (both of which you could even infer from already made examples), that should be sufficient.<br> - <br> - One important thing to remember though, even if you don't plan to read the Java source code, is that the above folders structure translates into the names of the classes that do all the magic stuff, so, if you want to call these classes in the command line, like in<br> - <br> - {{{<br> - mvn -e exec:java -Dexec.mainClass=org.xmlcml.cml.converters.compchem.nwchem.log.NWChemLog2XMLConverter -Dexec.args="./src/test/resources/compchem/nwchem/log/in/test1.out ./test.cml"<br> - }}}<br> - <br> - you need to have this structure in mind.<br> - <br> - The declarative bits of the parsing infrastructure (i.e., what you, parsers developer, will have to check, understand and probably make a version for your favourite compchem code) are inside a similar folder tree under {{{src/main}}}:<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - main/<br> - resources/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - amber/<br> - gamessus/<br> - gaussian/<br> - nwchem/<br> - ...<br> - }}}<br> - <br> - Inside each code folder, one can find subfolders for the different types of file, and inside each one of them a {{{templates}}} subfolder, e.g.,<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - main/<br> - resources/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - gaussian/<br> - in/<br> - templates/<br> - log/<br> - templates/<br> - ...<br> - }}}<br> - <br> - In the rest of the sections and in some of the tutorials, we explain in detail how the different bits of declarative parsing are related and how everything works, but let us mention at this point that, at the filetype folders (i.e., at {{{in}}} or {{{log}}}) the 
top level parsing template list file {{{templateList.xml}}} can be found, while each one of the smaller templates included in this list are located in {{{templates}}}.<br> - <br> - Now, branching out at the same level as {{{main}}}, still inside {{{src}}}, we have a {{{test}}} subfolder, which contains, on the one hand (under {{{java}}}), the Java source code for performing automatic tests, and, on the other hand (under {{{resources}}}), a number of example files produced by the compchem codes that Quixote wants to tackle. The scheme of the folder tree is as follows:<br> - <br> - {{{<br> - jumbo-converters/<br> - jumbo-converters-compchem/<br> - src/<br> - test/<br> - java/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - amber/<br> - gamessus/<br> - gaussian/<br> - nwchem/<br> - ...<br> - resources/<br> - compchem/<br> - amber/<br> - gamessus/<br> - gaussian/<br> - in/<br> - log/<br> - ...<br> - nwchem/<br> - ...<br> - }}}<br> - <br> - A '''general scheme''' summarizing all the details commented above is the following:<br> - <br> - <br> - {{{<br> - jumbo-converters/ *** Main JUMBOconverters folder<br> - jumbo-converters-compchem/ *** Compchem JUMBOconverters<br> - src/ *** Source code and test files<br> - main/ *** Source code for the parsing machinery<br> - java/ *** Java source code<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - marker/<br> - resources/ *** Declarative parsing source code<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - amber/<br> - gamessus/<br> - gaussian/<br> - in/ *** Top level parsing directives<br> - templates/ *** Subparsers templates<br> - log/<br> - templates/<br> - ...<br> - nwchem/<br> - ...<br> - test/ *** Source code for the automatic testing<br> - java/<br> - org/<br> - xmlcml/<br> - cml/<br> - converters/<br> - compchem/<br> - amber/<br> - gamessus/<br> - gaussian/<br> - nwchem/<br> - ...<br> - resources/ *** Example test files<br> - compchem/<br> - 
amber/<br> - gamessus/<br> - gaussian/<br> - in/<br> - log/<br> - ...<br> - nwchem/<br> - ...<br> - target/ *** Compiled classes<br> - }}}</span> </td> <td> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 18:19:23JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 231: </td> <td> Line 231: </td> </tr> <tr> <td> </td> <td> <span>+ The individual test can be run from within Eclipse, but from the command-line, it only appears possible to run the entire TemplateTests, using the following command, whilst sat in the '''jumbo-converters/jumbo-converters-compchem/jumbo-converters-compchem-gaussian''' directory:<br> + <br> + {{{<br> + mvn -Dtest="log.TemplateTes" test<br> + }}}<br> + <br> + '''Note:''' The discussion on [http://stackoverflow.com/questions/1873995/run-a-single-test-method-with-maven stackoverflow] and the [http://maven.apache.org/plugins/maven-surefire-plugin/examples/single-test.html maven documentation] suggest that the following syntax should work:<br> + <br> + {{{<br> + mvn -Dtest="log.TemplateTest#testl601Fermi" test<br> + }}}<br> + <br> + But this appears not to be the case. Are we using a junit version &lt; 4.7?</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 16:45:33JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 2: </td> <td> Line 2: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 214: </td> <td> Line 213: </td> </tr> <tr> <td> </td> <td> <span>+ The output has been parsed into a module with the templateRef "l601.fermi", which is the id of the template. We then have the 7 arrays (I,A,I,F,F,F,F) parsed by the "fermi.atom" record as a list, followed by the lists for the "fermi.spindipole" records.<br> + <br> + == Testing and developing templates ==<br> + <br> + The templates contain their own internal testing framework, in the form of two of the comment blocks within them.<br> + <br> + The comment block with the class "example.input" should contain a small representative chunk of text that the parsers can be tested with.<br> + <br> + For the Gaussian logfile templates we have been dealing with here, the code that runs these tests lives in the file:<br> + <br> + [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/test/java/org/xmlcml/cml/converters/compchem/gaussian/log/TemplateTest.java jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/test/java/org/xmlcml/cml/converters/compchem/gaussian/log/TemplateTest.java]<br> + <br> + To test and develop an individual template (we will continue using the l601.fermi template as an example), the following line needs to be added to the TemplateTest.java file.<br> + <br> + {{{<br> + @Test public void testl601Fermi() {runTemplateTest("l601/", "l601.fermi");}<br> + }}}<br> + <br> + </span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 16:27:50JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 37: </td> <td> Line 37: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 59: </td> <td> Line 58: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 73: </td> <td> Line 71: </td> </tr> <tr> <td> <span>- ...<br> - *********************************************<br> - Gaussian 03: x86-Linux-G03RevB.04 2-Jun-2003<br> - 20-Nov-2006<br> - *********************************************<br> - --------------------------<br> - #N B3LYP/6-31G(d) OPT FREQ<br> - --------------------------<br> - 1/14=-1,18=20,26=3,38=1/1,3;<br> - ...<br> - ...<br> - Job cpu time: 0 days 0 hours 0 minutes 16.2 seconds.<br> - File lengths (MBytes): RWF= 12 Int= 0 D2E= 0 Chk= 7 Scr= 1<br> - Normal termination of Gaussian 03 at Mon Nov 20 14:40:23 2006.<br> - Link1: Proceeding to internal job step number 2.<br> - ------------------------------------------------------------------<br> - #N Geom=AllCheck Guess=Read SCRF=Check GenChk RB3LYP/6-31G(d) Freq<br> - ------------------------------------------------------------------<br> - 1/10=4,29=7,30=1,38=1,40=1,46=1/1,3;<br> - ...</span> </td> <td> <span>+ ...</span> </td> </tr> <tr> <td> Line 96: </td> <td> Line 75: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 104: </td> <td> Line 82: </td> </tr> <tr> <td> <span>- &lt;xi:include href="l1/l1.legal.xml"/&gt;<br> - &lt;xi:include href="l1/l1.citation.xml"/&gt;<br> - &lt;xi:include href="l1/l1.end.xml"/&gt;<br> - &lt;xi:include href="l716.xml"/&gt;</span> </td> <td> </td> </tr> <tr> <td> Line 110: </td> <td> Line 84: </td> </tr> <tr> <td> <span>-</span> &lt;!-- Many more templates --&gt; </td> <td> <span>+ <br> + </span> &lt;!-- Many more templates --&gt;<span><br> + </span> </td> </tr> <tr> <td> Line 114: </td> <td> Line 90: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 121: </td> <td> Line 96: </td> </tr> <tr> <td> <span>-</span> There is then a comment containing 
some example text to show what this template parses<span>&nbsp;- as will be explained later, this is also used for testing the template</span>.<br> <span>-</span> <br> <span>-</span> There is then a template list, which is the container that holds the templates. Th<span>e</span> first template<span>l</span>ist only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template <span>itsel</span>f<span>&nbsp;then contains a</span> template list, which includes the other templates that will process the text found by the parent template. </td> <td> <span>+</span> There is then a comment containing some example text to show what this template parses.<br> <span>+</span> <br> <span>+</span> There is then a template list, which is the container that holds the templates. Th<span>is</span> first template<span>L</span>ist only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template <span>then contains a </span>f<span>urther</span> template list, which includes the other templates that will process the text found by the parent template. </td> </tr> <tr> <td> Line 129: </td> <td> Line 104: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 140: </td> <td> Line 114: </td> </tr> <tr> <td> <span>- Using '''l601.popanal.xml''' as an example, we will go through the template and describe what each bit does.<br> - <br> - {{{<br> - &lt;template id="l301.basis" name="basis" repeat="*"<br> - pattern="\s*Standard basis.*" endPattern="\s*NAtoms=.*"<br> - endOffset="1"<br> - &gt;<br> - }}}<br> - <br> - The template starts with the template tag. The id and name define the template, repeat="*" says that it can occur multiple times in the file. 
The '''pattern''' is a regular expression stating that this template will start when it matches a line that starts with spaces followed by the phrase "Standard basis" followed by any text.<br> - <br> - The template will swallow all text until the end pattern is encountered, which is a line with spaces followed by '''NAtoms=''' followed by some arbitrary text.<br> - <br> - The endOffset indicates what happens to the file pointer within the template, i.e. whether it rolls back so that the line that was matched by the template is available for processing within the template or not.<br> - </span> </td> <td> <span>+ Using '''l601.fermi.xml''' as an example, we will go through the template and describe what each bit does.<br> + <br> + {{{<br> + &lt;template id="l601.fermi" name="Isotropic Fermi Contact Couplings"<br> + pattern="\s*Isotropic Fermi Contact Couplings.*" repeat="*"<br> + endPattern="\s*"&gt;<br> + }}}<br> + <br> + The template starts with the template tag. The id and name define the template and the cml module that this text will end up in. repeat="*" says that this template can appear multiple times in the file.<br> + <br> + The '''pattern''' is a regular expression stating that this template will start when it matches the line:<br> + <br> + {{{<br> + Isotropic Fermi Contact Couplings<br> + }}}<br> + <br> + The template will swallow all text until the end pattern is encountered, which is a line containing nothing but spaces.<br> + <br> + This example does not include "endOffset=N", which indicates what happens to the file pointer within the template, i.e. 
whether it rolls back so that the line that was matched by the template is available for processing within the template or not.</span> </td> </tr> <tr> <td> Line 159: </td> <td> Line 137: </td> </tr> <tr> <td> <span>- &lt;comment class="example.input" id="l301.basis.09"&gt;<br> - Standard basis: 3-21G (6D, 7F)<br> - Ernie: Thresh= 0.10000D-02 Tol= 0.10000D-05 Strict=F.<br> - There are 7 symmetry adapted basis functions of AG symmetry.<br> - There are 0 symmetry adapted basis functions of B1G symmetry.<br> - There are 2 symmetry adapted basis functions of B2G symmetry.<br> - There are 4 symmetry adapted basis functions of B3G symmetry.<br> - There are 0 symmetry adapted basis functions of AU symmetry.<br> - There are 7 symmetry adapted basis functions of B1U symmetry.<br> - There are 4 symmetry adapted basis functions of B2U symmetry.<br> - There are 2 symmetry adapted basis functions of B3U symmetry.<br> - Integral buffers will be 131072 words long.<br> - Raffenetti 1 integral format.<br> - Two-electron integral symmetry is turned on.<br> - 26 basis functions, 42 primitive gaussians, 26 cartesian basis functions<br> - 8 alpha electrons 8 beta electrons<br> - nuclear repulsion energy 33.7515964544 Hartrees.<br> - IExCor= 0 DFT=F Ex=HF Corr=None ExCW=0 ScaHFX= 1.000000<br> - ScaDFX= 1.000000 1.000000 1.000000 1.000000 ScalE2= 1.000000 1.000000<br> - IRadAn= 0 IRanWt= -1 IRanGd= 0 ICorTp=0<br> - NAtoms= 6 NActive= 6 NUniq= 2 SFac= 4.00D+00 NAtFMM= 50 NAOKFM=F Big=F</span> </td> <td> <span>+ &lt;comment class="example.input" id="l601.fermi"&gt;<br> + Isotropic Fermi Contact Couplings<br> + Atom a.u. 
MegaHertz Gauss 10(-4) cm-1<br> + 1 C(13) 0.02539 28.54777 10.18656 9.52251<br> + 2 C(13) 0.00582 6.54434 2.33518 2.18296<br> + 13 Cl(35) 0.05688 24.94015 8.89927 8.31914<br> + --------------------------------------------------------<br> + Center ---- Spin Dipole Couplings ----<br> + 3XX-RR 3YY-RR 3ZZ-RR<br> + --------------------------------------------------------<br> + 1 Atom 0.005300 -0.061839 0.056540<br> + 2 Atom -0.039723 -0.068059 0.107782<br> + 13 Atom 0.621221 -2.038530 1.417309<br> + --------------------------------------------------------<br> + XY XZ YZ<br> + --------------------------------------------------------<br> + 1 Atom 0.000010 0.095387 0.000013<br> + 2 Atom 0.005157 0.081893 0.006262<br> + 13 Atom 0.000344 3.043747 0.000390<br> + --------------------------------------------------------<br> + </span> </td> </tr> <tr> <td> Line 188: </td> <td> Line 166: </td> </tr> <tr> <td> <span>- &lt;record id="basis"&gt;\s*Standard basis: {A,cc:basis} {X,cc:diffuse}&lt;/record&gt;<br> - }}}<br> - <br> - The first line reads a record from the file. '''repeat''' is omitted so defaults to one, so this reads a single line that starts with spaces and the phrase "Standard basis:" The terms that follow in the brackets determine what will be marked up into the xml.<br> - <br> - * '''{A,cc:basis}''' this will read the data as a string (A) and put it in a term with the '''dictRef''' cc:basis. The cc prefix is the namespace of the computational chemistry dictionary, so this term is a '''basis''' as that term is defined in the computational chemistry dictionary.<br> - * '''{X,cc:diffuse}''' the X is actually used to discard data, so this will swallow the text that comes after the character data. 
The dictRef is irrelevant in this case and could be anything.<br> - <br> - Other values for data that can be read are:<br> - * F - float<br> - * I - integer<br> - <br> - {{{<br> - &lt;templateList id="ernie"&gt;<br> - &lt;template pattern="\s*Ernie.*" endPattern=".*" id="ernie"&gt;<br> - &lt;record id="ernie" repeat="*"&gt;\s*Ernie: Thresh={E,g:thresh}\s Tol={E,g:tol}\sStrict={A,g:strict}\.\s*&lt;/record&gt;<br> - &lt;/template&gt;<br> - &lt;/templateList&gt;<br> - &lt;templateList id="symadnucl"&gt;<br> - &lt;template pattern="\s*There are.*" endPattern="\s*nuclear repulsion.*" id="symaddnuc" endOffset="1"&gt;<br> - &lt;record id="symmadapt" repeat="*" makeArray="true"&gt;\s* There are{I,cc:adapted} symmetry adapted basis functions of{A,cc:symm}symmetry\.\s*&lt;/record&gt;<br> - &lt;record id="buffer"&gt;\s*Integral buffers will be {I,g:buffer}\s*words long\.\s*&lt;/record&gt;<br> - &lt;record id="raff"&gt;\s*{A,g:raffenetti}\s{I,g:raff}\sintegral format\.\s*&lt;/record&gt;<br> - &lt;record id="twoe"&gt;\s*{X,g:twoe} integral symmetry is turned on\.\s*&lt;/record&gt;<br> - &lt;record id="basiscount"&gt;\s*{I,cc:basiscount}basis functions,{I,g:primbasis}primitive gaussians,{I,cc:cartesianbasis}cartesian basis functions\s*&lt;/record&gt;<br> - &lt;record id="alphabeta"&gt;\s*{I,cc:alphae}alpha electrons\s*{I,cc:betae}beta electrons\s*&lt;/record&gt;<br> - &lt;record id="nucrep"&gt;\s*nuclear repulsion energy\s*{F,cc:nucrepener}Hartrees\.\s*&lt;/record&gt;<br> - &lt;/template&gt;<br> - &lt;/templateList&gt;<br> - &lt;templateList id="natoms"&gt;<br> - &lt;template pattern="\s*NAtoms=.*" repeat="*" endPattern=".*" id="natoms"&gt;<br> - &lt;record id="natoms" repeat="*"&gt;\s*NAtoms={I,cc:natoms}\sNActive={I,cc:nactiveatoms}\sNUniq={I,cc:uniqatoms}\sSFac={E,g:sfac}\sNAtFMM={I,g:natfmm}.*\sBig={A,g:big}\s*&lt;/record&gt;<br> - &lt;/template&gt;<br> - &lt;/templateList&gt;<br> - &lt;templateList id="misc"&gt;<br> - &lt;template pattern="\s*((IExCor)|(ScaDFX)|(IRadAn)).*" 
repeat="*" endPattern=".*" id="misc"&gt;<br> - &lt;record id="misc"&gt;\s{1_20A,g:misc}\s*&lt;/record&gt;<br> - &lt;/template&gt;<br> - &lt;/templateList&gt;<br> - }}}<br> - <br> - This starts reading into a module with a templateRef of '''l202''' (from the id) when it encounters the regular expression ".*Enter.*l202.*" and stops when it encounters the regular expression ".*Leave Link +202 .*"<br> - <br> - The '''l202''' module is then processed by matching each line against the patterns in each of the included templates, which in this case is just '''coord.input.temp.xml'''.<br> - <br> - If we look at '''coord.input.temp.xml''':<br> - {{{<br> - &lt;template id="gau:coord" name="Coordinates"<br> - pattern=".*Input orientation:.*" &gt;<br> - &lt;comment class="example"&gt;<br> - Input orientation:<br> - ---------------------------------------------------------------------<br> - Center Atomic Atomic Coordinates (Angstroms)<br> - Number Number Type X Y Z<br> - ---------------------------------------------------------------------<br> - 1 1 0 -3.004390 0.907942 -0.042084<br> - 2 6 0 -2.114020 0.295280 -0.082179<br> - 3 7 0 -0.993069 1.029184 -0.183798<br> - ---------------------------------------------------------------------<br> - &lt;/comment&gt;<br> - &lt;record repeatCount="5" formatType="REGEX"&gt;(.*)&lt;/record&gt;<br> - &lt;record formatType="REGEX" id="charges" makeArray="true"<br> - repeatCount="*"&gt;{I,c:ser} {I,c:atnum} {I,gau:type} {F,c:x3} {F,c:y3} {F,c:z3}&lt;/record&gt;<br> - &lt;record formatType="REGEX"&gt;(.*)&lt;/record&gt;<br> - &lt;/template&gt;<br> - }}}<br> - <br> - This matches a line with the phrase "Input Orientation:" on it and as there is no endPattern, it carries on reading until the end of the module (CORRECT?).<br> - <br> - Each record in the template is then processed in turn.<br> - <br> - The first record reads five lines, and as there is no id, the lines are discarded.<br> - <br> - The next record reads as many lines (repeatCount="*") 
as match the expression contained within the record into an entity called "charges". The expression that is expected to be matched is 3 integers followed by three floats. The first integer will be labelled '''c:ser''', the second '''c:atnum''' etc. The "c" stands for '''cml''', as these are attributes in the CML dictionary, and the "g" stands for an entry in the Gaussian dictionary.</span> </td> <td> <span>+ &lt;record repeat="2"/&gt;<br> + &lt;record id="fermi.atom" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}{A,x:elementType}\({I,x:isotopeNumber}\)\s{F,cc:coupling,u:au}\s{F,cc:coupling,u:mhz}\s{F,cc:coupling,u:gauss}\s{F,cc:coupling,u:ten4cm-1}\s*&lt;/record&gt;<br> + &lt;record repeat="4"/&gt;<br> + &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xx}{F,g:spindipole.yy}{F,g:spindipole.zz}&lt;/record&gt;<br> + &lt;record repeat="3"/&gt;<br> + &lt;record id="fermi.spindipole" repeat="*" makeArray="true"&gt;\s*{I,cc:serial}\s*Atom\s*{F,g:spindipole.xy}{F,g:spindipole.xz}{F,g:spindipole.yz}&lt;/record&gt;<br> + &lt;record repeat="1"/&gt;}}}<br> + <br> + The first record just reads 2 lines from the file. As there is no id on the record, the lines are discarded.<br> + <br> + The next record reads as many lines (repeat="*") as match the expression contained within the record into an entity called "fermi.atom". The expression that is expected to be matched is an integer (I), followed by a character string (A), followed by another integer (I) and then four floats (F).<br> + <br> + The first integer will be labelled '''cc:serial''', the first string '''x:elementType''' etc. 
The "cc" stands for '''Computational Chemistry''', as these are attributes in the Computational Chemistry dictionary, and the "g" stands for an entry in the Gaussian dictionary.</span> </td> </tr> <tr> <td> Line 265: </td> <td> Line 182: </td> </tr> <tr> <td> <span>- An example result of this is the following:<br> - <br> - {{{<br> - &lt;module lineCount="31" templateRef="gau:coord"&gt;<br> - &lt;list lineCount="7" templateRef="charges"&gt;<br> - &lt;array dataType="xsd:integer" dictRef="c:ser" size="7" delimiter=""&gt;1 2 3 4 5 6 7&lt;/array&gt;<br> - &lt;array dataType="xsd:integer" dictRef="c:atnum" size="7" delimiter=""&gt;6 1 1 1 6 1 1&lt;/array&gt;<br> - &lt;array dataType="xsd:integer" dictRef="gau:type" size="7" delimiter=""&gt;0 0 0 0 0 0 0&lt;/array&gt;<br> - &lt;array dataType="xsd:double" dictRef="c:x3" size="7" delimiter=""&gt;<br> - 0.91615 1.82641 1.00333 0.95414 -0.31877 -0.7203 -0.77165<br> - &lt;/array&gt;<br> - &lt;array dataType="xsd:double" dictRef="c:y3" size="7" delimiter=""&gt;<br> - 0.07243 -0.22339 1.14742 -0.44492 -0.26078 0.42201 -1.24177<br> - &lt;/array&gt;<br> - &lt;array dataType="xsd:double" dictRef="c:z3" size="7" delimiter=""&gt;<br> - 4.99221 5.53974 4.80723 4.02871 5.75176 6.49042 5.67704<br> - &lt;/array&gt;<br> - &lt;/list&gt;<br> - }}}<br> - </span> </td> <td> <span>+ Once this pattern stops being matched, the parser will skip 4 lines (&lt;record repeat="4"/&gt;) and then the next record will process a matching block of integers and floats into arrays as described earlier.<br> + <br> + Finally we get to the last block in the template, which is a comment of class "example.output" showing what this template produces when fed the text that was in the "example.input" comment.<br> + <br> + {{{<br> + &lt;comment class="example.output" id="l601.fermi"&gt;<br> + &lt;module cmlx:templateRef="l601.fermi" xmlns="http://www.xml-cml.org/schema" xmlns:cmlx="http://www.xml-cml.org/schema/cmlx"&gt;<br> + 
&lt;list cmlx:lineCount="3" cmlx:templateRef="fermi.atom"&gt;<br> + &lt;array dataType="xsd:integer" dictRef="cc:serial" size="3"&gt;1 2 13&lt;/array&gt;<br> + &lt;array dataType="xsd:string" dictRef="x:elementType" size="3"&gt;C C Cl&lt;/array&gt;<br> + &lt;array dataType="xsd:integer" dictRef="x:isotopeNumber" size="3"&gt;13 13 35&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="cc:coupling" size="3"&gt;0.02539 0.00582 0.05688&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="cc:coupling" size="3"&gt;28.54777 6.54434 24.94015&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="cc:coupling" size="3"&gt;10.18656 2.33518 8.89927&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="cc:coupling" size="3"&gt;9.52251 2.18296 8.31914&lt;/array&gt;<br> + &lt;/list&gt;<br> + &lt;list cmlx:lineCount="3" cmlx:templateRef="fermi.spindipole"&gt;<br> + &lt;array dataType="xsd:integer" dictRef="cc:serial" size="3"&gt;1 2 13&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.xx" size="3"&gt;0.0053 -0.039723 0.621221&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.yy" size="3"&gt;-0.061839 -0.068059 -2.03853&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.zz" size="3"&gt;0.05654 0.107782 1.417309&lt;/array&gt;<br> + &lt;/list&gt;<br> + &lt;list cmlx:lineCount="3" cmlx:templateRef="fermi.spindipole"&gt;<br> + &lt;array dataType="xsd:integer" dictRef="cc:serial" size="3"&gt;1 2 13&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.xy" size="3"&gt;1.0E-5 0.005157 3.44E-4&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.xz" size="3"&gt;0.095387 0.081893 3.043747&lt;/array&gt;<br> + &lt;array dataType="xsd:double" dictRef="g:spindipole.yz" size="3"&gt;1.3E-5 0.006262 3.9E-4&lt;/array&gt;<br> + &lt;/list&gt;<br> + &lt;/module&gt;<br> + &lt;/comment&gt;<br> + }}}</span> </td> </tr> </table> </div> Declarative parsing 
syntaxhttp://quixote.wikispot.org/Declarative_parsing_syntax2012-02-29 15:58:35JensThomas <div id="content" class="wikipage content"> Differences for Declarative parsing syntax<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 10: </td> <td> Line 10: </td> </tr> <tr> <td> <span>- {{{<br> - jumbo-converters/jumbo-converters-compchem/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log<br> - }}}</span> </td> <td> <span>+ [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml]</span> </td> </tr> <tr> <td> Line 36: </td> <td> Line 34: </td> </tr> <tr> <td> <span>-</span> For example, in the case of the list in the previous section, the <span>first item in the </span>list includes the file {{{l1.<span>tem</span>p.xml}}}, which is located in </td> <td> <span>+</span> For example, in the case of the list in the previous section, the list includes the file {{{l<span>60</span>1.p<span>opanal</span>.xml}}}, which is located in </td> </tr> <tr> <td> Line 38: </td> <td> Line 36: </td> </tr> <tr> <td> <span>- {{{<br> - jumbo-converters/jumbo-converters-compchem/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates<br> - }}}</span> </td> <td> <span>+ [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601 jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601/]</span> </td> </tr> <tr> 
<td> Line 42: </td> <td> Line 38: </td> </tr> <tr> <td> <span>-</span> together with al<span>l</span> template files e<span>xcept for {{{topTemp</span>late.<span>xml}}}-</span> </td> <td> <span>+</span> together with <span>sever</span>al<span>&nbsp;other</span> template files <span>r</span>elate<span>d to the l601 link</span>. </td> </tr> <tr> <td> Line 47: </td> <td> Line 43: </td> </tr> <tr> <td> <span>- &lt;template id="l1" repeatCount="*" pattern=".*Enter.*l1[^\d].*"<br> - endPattern=".*Leave Link +1[^\d].*"&gt;<br> - &lt;comment&gt;</span> </td> <td> <span>+ &lt;template id="l601.popanal" name="Population analysis using the SCF density"<br> + repeat="*"<br> + pattern="\s*\*+\s*$\s*$\s*Population analysis using the SCF density.*"<br> + endPattern="\sN\-N\=.*" endOffset="1"<br> + xmlns:xi="http://www.w3.org/2001/XInclude"<br> + &gt;<br> + &lt;comment class="example.input" id="l601.popanal"&gt;</span> </td> </tr> <tr> <td> Line 51: </td> <td> Line 51: </td> </tr> <tr> <td> <span>- &lt;/comment&gt;<br> - &lt;templateList id='l1_list' xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> - &lt;xi:include href="l1.programversion.temp.xml"/&gt;</span> </td> <td> <span>+ &lt;/comment&gt;<br> + <br> + &lt;record repeat="5"/&gt;<br> + &lt;record repeat="2"/&gt;<br> + &lt;templateList&gt;<br> + &lt;xi:include href="l601.condensed.xml"/&gt;<br> + &lt;xi:include href="../l401/l4601.virtual.xml"/&gt;<br> + ...</span> </td> </tr> <tr> <td> Line 55: </td> <td> Line 60: </td> </tr> <tr> <td> <span>- &lt;/template&gt;</span> </td> <td> <span>+ <br> + &lt;comment class="example.output" id="l601.popanal"&gt;<br> + ...<br> + &lt;/comment&gt;<br> + <br> + &lt;/template&gt;</span> </td> </tr> <tr> <td> Line 61: </td> <td> Line 71: </td> </tr> <tr> <td> <span>- pattern=".*Enter.*l1[^\d].*"</span> </td> <td> <span>+ pattern="\s*\*+\s*$\s*$\s*Population analysis using the SCF density.*"</span> </td> </tr> <tr> <td> Line 64: </td> <td> Line 74: </td> </tr> <tr> <td> <span>-</span> which 
matches the line in the logfile that read<span>s</span> </td> <td> <span>+</span> which matches the line<span>s</span> in the logfile that read<span>:</span> </td> </tr> <tr> <td> Line 67: </td> <td> Line 77: </td> </tr> <tr> <td> <span>- Entering Link 1 = /apps/apps64/g03/l1.exe PID= 6730.</span> </td> <td> <span>+ **********************************************************************<br> + <br> + Population analysis using the SCF density.</span> </td> </tr> <tr> <td> Line 75: </td> <td> Line 87: </td> </tr> <tr> <td> <span>-</span> endPattern="<span>.*Leave Link +1[^</span>\<span>d]</span>.*" </td> <td> <span>+</span> endPattern="\<span>sN\-N\=</span>.*" </td> </tr> <tr> <td> Line 81: </td> <td> Line 93: </td> </tr> <tr> <td> <span>- Leave Link 1 at Thu Feb 17 17:30:26 2011, MaxMem= 117964800 cpu: 0.2</span> </td> <td> <span>+ N-N= 8.247004252289D+02 E-N=-3.066552593713D+03 KE= 6.044489055531D+02</span> </td> </tr> <tr> <td> Line 87: </td> <td> Line 99: </td> </tr> <tr> <td> <span>- id="l1"</span> </td> <td> <span>+ l601.popanal</span> </td> </tr> <tr> <td> Line 93: </td> <td> Line 105: </td> </tr> <tr> <td> <span>- &lt;module lineCount="87" templateRef="l1"&gt; Entering Link 1 = /apps/apps64/g03/l1.exe PID= 6730.<br> - ...<br> - &lt;/module&gt;<br> - Leave Link 1 at Thu Feb 17 17:30:26 2011, MaxMem= 117964800 cpu: 0.2<br> - ...</span> </td> <td> <span>+ <br> + &lt;module cmlx:templateRef="l601.popanal" xmlns="http...<br> + <br> + &lt;/module&gt;</span> </td> </tr> <tr> <td> Line 108: </td> <td> Line 119: </td> </tr> <tr> <td> <span>-</span> When no further match is found, the parser will proceed to the next item in the list, in this case {{{l101.<span>temp</span>.xml}}}, which will do the same process but '''without having access''' to the already captured modules. 
</td> <td> <span>+</span> When no further match is found, the parser will proceed to the next item in the list, in this case {{{l<span>60</span>1<span>/l6</span>01.<span>polariz</span>.xml}}}, which will do the same process but '''without having access''' to the already captured modules. </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 11:49:26JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 193: </td> <td> Line 193: </td> </tr> <tr> <td> <span>-</span> * ''{A,cc:basis}''' this will read the data as a string (A) and put it in a term with the '''dictRef'' cc:basis. The cc prefix is the namespace of the computational chemistry dictionary, so this term is a '''basis''' as that term is defined in the computational chemistry dictionary. </td> <td> <span>+</span> * <span>'</span>''{A,cc:basis}''' this will read the data as a string (A) and put it in a term with the '''dictRef''<span>'</span> cc:basis. The cc prefix is the namespace of the computational chemistry dictionary, so this term is a '''basis''' as that term is defined in the computational chemistry dictionary. </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 11:47:06JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 109: </td> <td> Line 109: </td> </tr> <tr> <td> <span>-</span> &lt;xi:include href="l<span>601/l6</span>01.<span>pop</span>a<span>nal</span>.xml"/&gt;<span><br> - &lt;xi:include href="l601/l601.polariz.xml"/&gt;</span> </td> <td> <span>+</span> &lt;xi:include href="l<span>3</span>01.<span>b</span>a<span>sis</span>.xml"/&gt; </td> </tr> <tr> <td> Line 124: </td> <td> Line 123: </td> </tr> <tr> <td> <span>-</span> There is then a template list, which is the container that holds the templates. The first templatelist only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template then contains a template list, which includes the other templates that will process the text found by the parent template. </td> <td> <span>+</span> There is then a template list, which is the container that holds the templates. The first templatelist only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template<span>&nbsp;itself</span> then contains a template list, which includes the other templates that will process the text found by the parent template. 
</td> </tr> <tr> <td> Line 128: </td> <td> Line 127: </td> </tr> <tr> <td> <span>-</span> [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l<span>6</span>01<span>/l6</span>01.<span>popanal</span>.xm<span>&nbsp;jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601/l601.popanal.xm</span> </td> <td> <span>+</span> [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l<span>3</span>01<span>.basis.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l3</span>01.<span>basis</span>.xm<span>l]</span> </td> </tr> <tr> <td> Line 141: </td> <td> Line 140: </td> </tr> <tr> <td> <span>- Using '''l601.popanal.xml'' as an example:<br> - <br> - {{{<br> - &lt;template id="l202" repeatCount="*" pattern=".*Enter.*l202.*"<br> - endPattern=".*Leave Link +202.*"&gt;<br> - &lt;comment&gt;<br> - ... COMMENT EXPLAINING WHAT THIS SECTION IS ABOUT ...<br> - &lt;/comment&gt;<br> - &lt;templateList id='id2' xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> - &lt;xi:include href="coord.input.temp.xml"/&gt;<br> - &lt;/templateList&gt;<br> - &lt;/template&gt;</span> </td> <td> <span>+ Using '''l601.popanal.xml''' as an example, we will go through the template and describe what each bit does.<br> + <br> + {{{<br> + &lt;template id="l301.basis" name="basis" repeat="*"<br> + pattern="\s*Standard basis.*" endPattern="\s*NAtoms=.*"<br> + endOffset="1"<br> + &gt;<br> + }}}<br> + <br> + The template starts with the template tag. The id and name define the template, repeat="*" says that it can occur multiple times in the file. 
The '''pattern''' is a regular expression stating that this template will start when it matches a line that starts with spaces followed by the phrase "Standard basis" followed by any text.<br> + <br> + The template will swallow all text until the end pattern is encountered, which is a line with spaces followed by '''NAtoms=''' followed by some arbitrary text.<br> + <br> + The endOffset indicates what happens to the file pointer within the template, i.e. whether it rolls back so that the line that was matched by the template is available for processing within the template or not.<br> + <br> + <br> + Next comes a comment:<br> + <br> + {{{<br> + &lt;comment class="example.input" id="l301.basis.09"&gt;<br> + Standard basis: 3-21G (6D, 7F)<br> + Ernie: Thresh= 0.10000D-02 Tol= 0.10000D-05 Strict=F.<br> + There are 7 symmetry adapted basis functions of AG symmetry.<br> + There are 0 symmetry adapted basis functions of B1G symmetry.<br> + There are 2 symmetry adapted basis functions of B2G symmetry.<br> + There are 4 symmetry adapted basis functions of B3G symmetry.<br> + There are 0 symmetry adapted basis functions of AU symmetry.<br> + There are 7 symmetry adapted basis functions of B1U symmetry.<br> + There are 4 symmetry adapted basis functions of B2U symmetry.<br> + There are 2 symmetry adapted basis functions of B3U symmetry.<br> + Integral buffers will be 131072 words long.<br> + Raffenetti 1 integral format.<br> + Two-electron integral symmetry is turned on.<br> + 26 basis functions, 42 primitive gaussians, 26 cartesian basis functions<br> + 8 alpha electrons 8 beta electrons<br> + nuclear repulsion energy 33.7515964544 Hartrees.<br> + IExCor= 0 DFT=F Ex=HF Corr=None ExCW=0 ScaHFX= 1.000000<br> + ScaDFX= 1.000000 1.000000 1.000000 1.000000 ScalE2= 1.000000 1.000000<br> + IRadAn= 0 IRanWt= -1 IRanGd= 0 ICorTp=0<br> + NAtoms= 6 NActive= 6 NUniq= 2 SFac= 4.00D+00 NAtFMM= 50 NAOKFM=F Big=F<br> + &lt;/comment&gt;<br> + }}}<br> + <br> + The class of "example.input" means 
that this block of text will be used in the tests for the parser (see below). The first line is the one matched by the template's '''pattern''' and the last line is the one matched by '''endPattern'''.<br> + <br> + Now we get to where the file is actually processed.<br> + <br> + {{{<br> + &lt;record id="basis"&gt;\s*Standard basis: {A,cc:basis} {X,cc:diffuse}&lt;/record&gt;<br> + }}}<br> + <br> + The first line reads a record from the file. '''repeat''' is omitted and so defaults to one, so this reads a single line that starts with spaces and the phrase "Standard basis:". The terms in the braces that follow determine what will be marked up into the XML.<br> + <br> + * '''{A,cc:basis}''' this will read the data as a string (A) and put it in a term with the '''dictRef''' cc:basis. The cc prefix is the namespace of the computational chemistry dictionary, so this term is a '''basis''' as that term is defined in the computational chemistry dictionary.<br> + * '''{X,cc:diffuse}''' the X is used to discard data, so this will swallow the text that comes after the character data. 
The dictRef is irrelevant in this case and could be anything.<br> + <br> + Other values for data that can be read are:<br> + * F - float<br> + * I - integer<br> + <br> + {{{<br> + &lt;templateList id="ernie"&gt;<br> + &lt;template pattern="\s*Ernie.*" endPattern=".*" id="ernie"&gt;<br> + &lt;record id="ernie" repeat="*"&gt;\s*Ernie: Thresh={E,g:thresh}\s Tol={E,g:tol}\sStrict={A,g:strict}\.\s*&lt;/record&gt;<br> + &lt;/template&gt;<br> + &lt;/templateList&gt;<br> + &lt;templateList id="symadnucl"&gt;<br> + &lt;template pattern="\s*There are.*" endPattern="\s*nuclear repulsion.*" id="symaddnuc" endOffset="1"&gt;<br> + &lt;record id="symmadapt" repeat="*" makeArray="true"&gt;\s* There are{I,cc:adapted} symmetry adapted basis functions of{A,cc:symm}symmetry\.\s*&lt;/record&gt;<br> + &lt;record id="buffer"&gt;\s*Integral buffers will be {I,g:buffer}\s*words long\.\s*&lt;/record&gt;<br> + &lt;record id="raff"&gt;\s*{A,g:raffenetti}\s{I,g:raff}\sintegral format\.\s*&lt;/record&gt;<br> + &lt;record id="twoe"&gt;\s*{X,g:twoe} integral symmetry is turned on\.\s*&lt;/record&gt;<br> + &lt;record id="basiscount"&gt;\s*{I,cc:basiscount}basis functions,{I,g:primbasis}primitive gaussians,{I,cc:cartesianbasis}cartesian basis functions\s*&lt;/record&gt;<br> + &lt;record id="alphabeta"&gt;\s*{I,cc:alphae}alpha electrons\s*{I,cc:betae}beta electrons\s*&lt;/record&gt;<br> + &lt;record id="nucrep"&gt;\s*nuclear repulsion energy\s*{F,cc:nucrepener}Hartrees\.\s*&lt;/record&gt;<br> + &lt;/template&gt;<br> + &lt;/templateList&gt;<br> + &lt;templateList id="natoms"&gt;<br> + &lt;template pattern="\s*NAtoms=.*" repeat="*" endPattern=".*" id="natoms"&gt;<br> + &lt;record id="natoms" repeat="*"&gt;\s*NAtoms={I,cc:natoms}\sNActive={I,cc:nactiveatoms}\sNUniq={I,cc:uniqatoms}\sSFac={E,g:sfac}\sNAtFMM={I,g:natfmm}.*\sBig={A,g:big}\s*&lt;/record&gt;<br> + &lt;/template&gt;<br> + &lt;/templateList&gt;<br> + &lt;templateList id="misc"&gt;<br> + &lt;template pattern="\s*((IExCor)|(ScaDFX)|(IRadAn)).*" 
repeat="*" endPattern=".*" id="misc"&gt;<br> + &lt;record id="misc"&gt;\s{1_20A,g:misc}\s*&lt;/record&gt;<br> + &lt;/template&gt;<br> + &lt;/templateList&gt;</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 10:38:03JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 56: </td> <td> Line 56: </td> </tr> <tr> <td> <span>-</span> For example, the Gaussian template<span>List</span>.xml file: </td> <td> <span>+</span> For example, the Gaussian t<span>opT</span>emplate.xml file: </td> </tr> <tr> <td> Line 119: </td> <td> Line 119: </td> </tr> <tr> <td> <span>- One of the templates it references is the file:</span> </td> <td> <span>+ <br> + The first few lines just declare this as a template and declare the include namespaces, so that we can use this to include other templates.<br> + <br> + There is then a comment containing some example text to show what this template parses - as will be explained later, this is also used for testing the template.<br> + <br> + There is then a template list, which is the container that holds the templates. The first templatelist only holds one template, which is the template that parses the various links (modules) within the Gaussian program. This template then contains a template list, which includes the other templates that will process the text found by the parent template.<br> + <br> + One of the templates the parent references is the file:</span> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-29 10:23:10JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 40: </td> <td> Line 40: </td> </tr> <tr> <td> <span>-</span> A '''template<span>List</span>.xml''' file, is a top-level file that creates the parent template and then defines all the templates which it itself contains. </td> <td> <span>+</span> A '''t<span>opT</span>emplate.xml''' file, is a top-level file that creates the parent template and then defines all the templates which it itself contains. </td> </tr> <tr> <td> Line 44: </td> <td> Line 44: </td> </tr> <tr> <td> <span>- Each template has a '''pattern''', which is a regular expression defining the text in the logfile where the module starts, may have an '''endPattern''', which is a regular expression defining where the module ends, and a '''repeatCount''', which describes how many times the module can occur within the file.</span> </td> <td> <span>+ Each template has a '''pattern''', which is a regular expression defining the text in the logfile where the module starts, and an '''endPattern''', which is a regular expression defining where the module ends, and a '''repeatCount''', which describes how many times the module can occur within the file.</span> </td> </tr> <tr> <td> Line 54: </td> <td> Line 54: </td> </tr> <tr> <td> <span>-</span> Each module is then parsed in turn, either by another template, or by the '''records''' in the template. </td> <td> <span>+</span> Each module is then parsed in turn, either by another template<span>&nbsp;within this template</span>, or by the '''records''' in the template. 
</td> </tr> <tr> <td> Line 58: </td> <td> Line 58: </td> </tr> <tr> <td> <span>- {{{<br> - jumbo-converters/jumbo-converters-compchem/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templateList.xml<br> - }}}</span> </td> <td> <span>+ [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/topTemplate.xml]<br> + </span> </td> </tr> <tr> <td> Line 66: </td> <td> Line 65: </td> </tr> <tr> <td> <span>- &lt;template id='gaussian.log' output="VERBOSE"&gt;<br> - &lt;templateList id='main' xmlns:xi="http://www.w3.org/2001/XInclude"&gt;<br> - &lt;xi:include href="l202.temp.xml"/&gt;<br> - &lt;xi:include href="l000.temp.xml"/&gt;<br> - &lt;xi:include href="l1.temp.xml"/&gt;<br> - &lt;xi:include href="l101.temp.xml"/&gt;<br> - ... 
MORE TEMPLATES ...<br> - &lt;xi:include href="lany.temp.xml"/&gt;</span> </td> <td> <span>+ &lt;template id='gaussian.log'<br> + xmlns:xi="http://www.w3.org/2001/XInclude"<br> + &gt;<br> + &lt;comment&gt;<br> + Entering Gaussian System, Link 0=/usr/local/gaussian/g03/g03<br> + Initial command:<br> + /usr/local/gaussian/g03/l1.exe /tmp/webmo/1/Gau-28330.inp -scrdir=/tmp/webmo/1/<br> + Entering Link 1 = /usr/local/gaussian/g03/l1.exe PID= 28333.<br> + ...<br> + *********************************************<br> + Gaussian 03: x86-Linux-G03RevB.04 2-Jun-2003<br> + 20-Nov-2006<br> + *********************************************<br> + --------------------------<br> + #N B3LYP/6-31G(d) OPT FREQ<br> + --------------------------<br> + 1/14=-1,18=20,26=3,38=1/1,3;<br> + ...<br> + ...<br> + Job cpu time: 0 days 0 hours 0 minutes 16.2 seconds.<br> + File lengths (MBytes): RWF= 12 Int= 0 D2E= 0 Chk= 7 Scr= 1<br> + Normal termination of Gaussian 03 at Mon Nov 20 14:40:23 2006.<br> + Link1: Proceeding to internal job step number 2.<br> + ------------------------------------------------------------------<br> + #N Geom=AllCheck Guess=Read SCRF=Check GenChk RB3LYP/6-31G(d) Freq<br> + ------------------------------------------------------------------<br> + 1/10=4,29=7,30=1,38=1,40=1,46=1/1,3;<br> + ...<br> + Job cpu time: 0 days 0 hours 0 minutes 12.7 seconds.<br> + File lengths (MBytes): RWF= 12 Int= 0 D2E= 0 Chk= 7 Scr= 1<br> + Normal termination of Gaussian 03 at Mon Nov 20 14:40:36 2006.<br> + <br> + &lt;/comment&gt;<br> + <br> + &lt;templateList&gt;<br> + &lt;template id="job" pattern="\s*((Link1\:\s+Proceeding to internal job step number)|(Entering Gaussian System)).*"<br> + endPattern="\s*Normal termination of.*" endOffset="1" repeat="*"&gt;<br> + &lt;templateList id='main'&gt;<br> + &lt;xi:include href="l0.entering.xml"/&gt;<br> + &lt;xi:include href="l1/l1.legal.xml"/&gt;<br> + &lt;xi:include href="l1/l1.citation.xml"/&gt;<br> + &lt;xi:include href="l1/l1.end.xml"/&gt;<br> + 
&lt;xi:include href="l716.xml"/&gt;<br> + &lt;xi:include href="l601/l601.anisospin.xml"/&gt;<br> + &lt;xi:include href="l601/l601.popanal.xml"/&gt;<br> + &lt;xi:include href="l601/l601.polariz.xml"/&gt;<br> + &lt;!-- Many more templates --&gt;<br> + &lt;/templateList&gt;<br> + &lt;/template&gt;</span> </td> </tr> <tr> <td> Line 75: </td> <td> Line 115: </td> </tr> <tr> <td> </td> <td> <span>+ </span> </td> </tr> <tr> <td> Line 78: </td> <td> Line 119: </td> </tr> <tr> <td> <span>- The first template it references is the file:<br> - <br> - {{{<br> - jumbo-converters/jumbo-converters-compchem/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l202.temp.xml<br> - }}}</span> </td> <td> <span>+ One of the templates it references is the file:<br> + <br> + [https://bitbucket.org/wwmm/jumbo-converters/src/c25586883d1b/jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601/l601.popanal.xm jumbo-converters-compchem/jumbo-converters-compchem-gaussian/src/main/resources/org/xmlcml/cml/converters/compchem/gaussian/log/templates/l601/l601.popanal.xm<br> + </span> </td> </tr> <tr> <td> Line 94: </td> <td> Line 134: </td> </tr> <tr> <td> <span>-</span> Using '''l<span>2</span>0<span>2</span>.<span>tem</span>p.xml<span>'</span>'' as an example: </td> <td> <span>+</span> Using '''l<span>6</span>0<span>1</span>.p<span>opanal</span>.xml'' as an example: </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-02-26 12:23:54JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 133: </td> <td> Line 133: </td> </tr> <tr> <td> <span>-</span> The easiest way to get to grips with SPARQL is to disect a simple query: </td> <td> <span>+</span> The easiest way to get to grips with SPARQL is to dis<span>s</span>ect a simple query: </td> </tr> </table> </div> Chempound http://quixote.wikispot.org/Chempound 2012-02-26 12:22:35 JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 197: </td> <td> Line 197: </td> </tr> <tr> <td> <span>- As the Chempound SPARQL endpoint exposes a RESTful API , we can query it directly, as the following python script demonstrates.</span> </td> <td> <span>+ As the Chempound SPARQL endpoint exposes a RESTful API, we can query it directly. The following Python script executes a SPARQL query against Chempound and then saves the result as a CSV (comma-separated values) file, so that the results of the query can be imported into a spreadsheet program for (e.g.) 
plotting a graph of the results.</span> </td> </tr> <tr> <td> Line 204: </td> <td> Line 204: </td> </tr> <tr> <td> <span>-</span> # The SPARQL query <span>you</span> want to execute </td> <td> <span>+</span> # The SPARQL query <span>we</span> want to execute </td> </tr> <tr> <td> Line 212: </td> <td> Line 212: </td> </tr> <tr> <td> <span>-</span> # url of the chempound SPARQL endpoint <span>you</span> want to query </td> <td> <span>+</span> # url of the chempound SPARQL endpoint <span>we</span> want to query </td> </tr> <tr> <td> Line 226: </td> <td> Line 226: </td> </tr> <tr> <td> <span>-</span> #Set up our GET query to the SPARQL endpoint </td> <td> <span>+</span> #<span>&nbsp;</span>Set up our GET query to the SPARQL endpoint </td> </tr> <tr> <td> Line 234: </td> <td> Line 234: </td> </tr> <tr> <td> <span>- # Accept: text/turtle, application/rdf+xml<br> - #request.add_header('Accept','application/sparql-results+json')</span> </td> <td> </td> </tr> <tr> <td> Line 239: </td> <td> Line 237: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 253: </td> <td> Line 250: </td> </tr> <tr> <td> <span>- #variables = etree.findall('.//{%s}variable' % namespace)</span> </td> <td> </td> </tr> <tr> <td> Line 258: </td> <td> Line 254: </td> </tr> <tr> <td> <span>-</span> # Loop through results adding the relevant bindings to the dictionary<br> <span>-</span> # <span>c</span>urrently only support uri </td> <td> <span>+</span> # Loop through results adding the relevant bindings to the dictionary<span>.</span><br> <span>+</span> # <span>C</span>urrently only support uri </td> </tr> <tr> <td> Line 270: </td> <td> Line 266: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-02-26 12:17:46JensThomas <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . 
</span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 202: </td> <td> Line 202: </td> </tr> <tr> <td> <span>- <br> - # url of the chempound SPARQL endpoint<br> - baseurl = "http://quixote.ch.cam.ac.uk/sparql/"<br> - <br> - # The SPARQL query<br> - query="""SELECT ?molecule</span> </td> <td> <span>+ import xml.etree.ElementTree as ET<br> + <br> + # The SPARQL query you want to execute<br> + query="""SELECT ?molecule ?inchi</span> </td> </tr> <tr> <td> Line 211: </td> <td> Line 209: </td> </tr> <tr> <td> </td> <td> <span>+ ?molecule &lt;http://www.xmlcml.org/rdf-schema#inchi&gt; ?inchi .</span> </td> </tr> <tr> <td> Line 213: </td> <td> Line 212: </td> </tr> <tr> <td> </td> <td> <span>+ # url of the chempound SPARQL endpoint you want to query<br> + baseurl = "http://quixote.ch.cam.ac.uk/sparql/"<br> + <br> + # The "comma-separated variable" file where the results should<br> + # go so they can be imported into a spreadsheet program<br> + csvFile = "/Users/jmht/sparql_results.csv"<br> + <br> + <br> + # The real work starts here!<br> + <br> + # SPARQL namespace - shouldn't need to change this<br> + namespace="http://www.w3.org/2005/sparql-results#"<br> + # NB: results format based on: http://www.w3.org/2001/sw/DataAccess/rf1/<br> + <br> + #Set up our GET query to the SPARQL endpoint</span> </td> </tr> <tr> <td> Line 214: </td> <td> Line 228: </td> </tr> <tr> <td> <span>- urlparam = { "query" : query<br> - "format" : "json",<br> - "output" : "json" }</span> </td> <td> <span>+ urlparam = { "query" : query }</span> </td> </tr> <tr> <td> Line 218: </td> <td> Line 230: </td> </tr> <tr> <td> </td> <td> <span>+ request = urllib2.Request(baseurl,querystr)</span> </td> </tr> <tr> <td> Line 220: </td> <td> Line 233: </td> </tr> <tr> <td> <span>- request = urllib2.Request(baseurl,querystr)<br> -</span> request.add_header('Accept','application/<span>j</span>son<span>'</span> <span>)<br> - 
</span>request.add_header('Accept','sparql-results<span>/</span>json') </td> <td> <span>+</span> request.add_header('Accept','application/s<span>parql-results+xml')<br> + # Accept: text/turtle, applicati</span>on<span>/rdf+xml<br> +</span> <span>#</span>request.add_header('Accept','<span>application/</span>sparql-results<span>+</span>json') </td> </tr> <tr> <td> Line 226: </td> <td> Line 239: </td> </tr> <tr> <td> <span>- print response.read()</span> </td> <td> <span>+ <br> + <br> + # We now have the results in SPARQL xml so we need to turn them into<br> + # a csv file - we use etree to do this:<br> + # http://effbot.org/zone/element-index.htm<br> + <br> + # Parse results to create etree &amp; get root element<br> + etree = ET.parse(response)<br> + root = etree.getroot()<br> + <br> + # Sparql query always returns 2 elements: head and results<br> + head,results = root[:]<br> + <br> + # Get head and create dictionary for variables<br> + #variables = etree.findall('.//{%s}variable' % namespace)<br> + resultsDict = {}<br> + for var in head:<br> + resultsDict[var.get("name")] = []<br> + <br> + # Loop through results adding the relevant bindings to the dictionary<br> + # currently only support uri<br> + nresults=len(results)<br> + for result in results:<br> + for binding in result:<br> + # One element for each binding of type: uri, literal or label<br> + # currently only deal with uri<br> + if ( len(binding) == 1 and binding[0].tag == "{%s}uri" % namespace ):<br> + resultsDict[binding.get("name")].append(binding[0].text)<br> + else:<br> + raise RuntimeError("Results only supported for uri!")<br> + <br> + <br> + # output as csv file<br> + rfile = open(csvFile,'w')<br> + <br> + # column headers<br> + headers = resultsDict.keys()<br> + rfile.write(",".join(headers)+"\n")<br> + <br> + # data<br> + for i in range(nresults):<br> + newline=[]<br> + for header in headers:<br> + newline.append(resultsDict[header][i])<br> + rfile.write(",".join(newline)+"\n")<br> + <br> + 
rfile.close()</span> </td> </tr> </table> </div> Mavenhttp://quixote.wikispot.org/Maven2012-02-23 12:27:40PeterMurrayRustadded petermr tips for Maven <div id="content" class="wikipage content"> Differences for Maven<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 76: </td> <td> Line 76: </td> </tr> <tr> <td> </td> <td> <span>+ == If you are offline ==<br> + <br> + Although Maven has an offline setting, I (petermr) have never been able to use it reliably. If you have to work offline I'd suggest that you<br> + {{{<br> + mvn clean install<br> + }}}<br> + everything you need before you get on the train. Then there is a reasonable chance that Maven will use your local repo for the jars you depend on. If it doesn't you'll have to wait till you get back home.<br> + <br> + == Your local repo ==<br> + <br> + If something goes wrong in a build you can end up with rubbish in the repo. It's worth cleaning out grot from time to time. Even deleting the whole repo. But only do this with a good connection as Maven downloads an awful lot.<br> + <br> + == Scope for compile and test ==<br> + <br> + Classes and jars for tests can be defined as<br> + {{{<br> + &lt;scope&gt;test&lt;/scope&gt;<br> + }}}<br> + and most of the time {{junit}} is so defined. However sometimes the classes with the test routines (e.g. {{org.junit.Assert}}) are used to build test utilities and will need the<br> + {{{<br> + &lt;scope&gt;compile&lt;/scope&gt;<br> + }}}<br> + <br> + == Maven and Eclipse ==<br> + <br> + Eclipse can do a lot of "smart" things such as caching classes and rebuilding the {{target}} immediately after deleting it. If Eclipse is giving problems (unresolved classes) I then run Maven to clean things. 
Maven is much stricter about the order of dependencies and you should always run Maven to verify that your code is clean (Hudson will insist on it anyway).<br> + <br> + If Eclipse is Open, then sometimes it will (at least on Windows) stop classes being deleted and crash Maven. I usually close Eclipse before running Maven.<br> + <br> + == Maven and Hudson ==<br> + <br> + Hudson is even stricter than Maven. Maven can resolve relative directories, Hudson cannot. So use the URL method - I think I have abandoned relativeUrl.</span> </td> </tr> </table> </div> Mavenhttp://quixote.wikispot.org/Maven2012-02-23 11:36:43JensThomas <div id="content" class="wikipage content"> Differences for Maven<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 233: </td> <td> Line 233: </td> </tr> <tr> <td> <span>- * '''-Dexec.mainClass''' defines the variable '''exec.mainClass''', which specifies the main class to execute. In the cmllite-validator code example, the class we wish to execute is the CmlLiteValidator</span> </td> <td> <span>+ * '''-Dexec.mainClass''' defines the variable '''exec.mainClass''', which specifies the main class to execute. In the cmllite-validator code example, the class we wish to execute is the CmlLiteValidator class, which lives in the directory: &lt;project_root&gt;/src/main/java/org/xmlcml/www/CmlLiteValidator.java In the maven convention, the class path for the file is therefore '''org.xmlcml.www.CmlLiteValidator''', so this is what we pass in using the mainClass argument. 
The main method of this class will then be executed.<br> + * '''-Dexec.args''' specifies a list of space-separated arguments that will be passed to the main method.</span> </td> </tr> </table> </div> Maven http://quixote.wikispot.org/Maven 2012-02-23 11:29:27 JensThomas <div id="content" class="wikipage content"> Differences for Maven<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 207: </td> <td> Line 207: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + <br> + == Remove all project artifacts that have been installed into your local Maven repository ==<br> + <br> + The repository is usually found at {{{~/.m2}}}. Run this command from ''&lt;project-root&gt;'':<br> + <br> + {{{<br> + mvn build-helper:remove-project-artifact<br> + }}}<br> + <br> + == Execute java classes from the command-line ==<br> + <br> + It is often useful to be able to execute individual classes from the Maven command line, particularly during development.<br> + <br> + An example for the [https://bitbucket.org/cml/cmllite-validator-code] is the following:<br> + <br> + {{{<br> + mvn \<br> + -o \<br> + exec:java \<br> + -Dexec.mainClass="org.xmlcml.www.CmlLiteValidator" \<br> + -Dexec.args="/Users/jmht/Documents/quixote/repositories/dictionary-compchem/dictionary-compchem.cml"<br> + }}}<br> + <br> + * The -o flag runs Maven in offline mode, so it doesn't check for any updates to dependencies<br> + * [http://mojo.codehaus.org/exec-maven-plugin/java-mojo.html exec:java] sets the Maven goal to execute some Java code<br> + * '''-Dexec.mainClass''' defines the variable '''exec.mainClass''', which specifies the main class to execute. 
In the cmllite-validator code example, the class we wish to execute is the CmlLiteValidator</span> </td> </tr> </table> </div> Tutorials and problemshttp://quixote.wikispot.org/Tutorials_and_problems2012-02-23 11:23:21JensThomas <div id="content" class="wikipage content"> Differences for Tutorials and problems<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 31: </td> <td> Line 31: </td> </tr> <tr> <td> <span>- </span> </td> <td> </td> </tr> <tr> <td> Line 38: </td> <td> Line 37: </td> </tr> <tr> <td> <span>- }}}<br> - <br> - * Remove all project artifacts that have been installed into your local Maven repository (usually at {{{~/.m2}}}). Run from ''&lt;project-root&gt;'':<br> - <br> - {{{<br> - mvn build-helper:remove-project-artifact<br> - }}}<br> - <br> - * Clean your cloned project folder of all compiled files. Run from ''&lt;project-root&gt;'':<br> - <br> - {{{<br> - mvn clean</span> </td> <td> </td> </tr> </table> </div> JUMBO-Convertershttp://quixote.wikispot.org/JUMBO-Converters2012-02-20 14:24:34JensThomas <div id="content" class="wikipage content"> Differences for JUMBO-Converters<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 8: </td> <td> Line 8: </td> </tr> <tr> <td> </td> <td> <span>+ = Running the converters =<br> + <br> + This page is concerned with the philosophy and design of the converters. 
If you are just interested in running them, please go to the [http://quixote.wikispot.org/Tutorials_and_problems#head-3f3979152d76625356b090db3fc81300f47cf760 Tutorials and problems] page.</span> </td> </tr> </table> </div> NWChem http://quixote.wikispot.org/NWChem 2012-02-20 10:55:53 JensThomas <div id="content" class="wikipage content"> Differences for NWChem<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 1: </td> <td> Line 1: </td> </tr> <tr> <td> </td> <td> <span>+ = NWChem =<br> + <br> + The [http://www.nwchem-sw.org/ NWChem homepage] provides more information on NWChem. This page just details some information on building/running NWChem.<br> + <br> + == Building and running on OSX ==<br> + <br> + For pre-Lion versions of OSX (i.e. OSX &lt; 10.7), pre-compiled binaries of NWChem 6.0 are available from the [http://www.nwchem-sw.org/index.php/Download#NWChem_6.0 NWChem website].<br> + <br> + The tar file will unpack to give a directory called (e.g.) '''nwchem-6.0-binary'''.<br> + <br> + In order to run the serial binary, you will need to set the environment variable NWCHEM_BASIS_LIBRARY to point to the path of the basis set library directory, which is the directory '''nwchem-6.0-binary/usr.local.lib.nwchem/libraries/'''. You need the full path, and you must not omit the trailing slash, e.g.:<br> + <br> + {{{<br> + export NWCHEM_BASIS_LIBRARY=/Users/jmht/nwchem-6.0-binary/usr.local.lib.nwchem/libraries/<br> + }}}<br> + <br> + The current serial binaries will not work on Lion, for which you will need to build NWChem as described below.<br> + <br> + === Building NWChem on OSX ===<br> + <br> + The following steps show how to build NWChem on a Mac without using MPI, i.e. 
suitable for running serially on a laptop/workstation.<br> + <br> + * Install the OSX [https://developer.apple.com/xcode/ developer tools] from Apple<br> + * Install [http://gcc.gnu.org/wiki/GFortranBinaries#MacOS Gfortran]<br> + * Download the latest NWChem source from the [http://www.nwchem-sw.org/index.php/Download website].<br> + * Unpack the source, and '''cd''' into the resulting directory (which will be called something like '''nwchem-src-2012-Feb-16''').<br> + * Create a script called '''build.sh''' with the following in it:<br> + <br> + {{{<br> + #!/bin/bash<br> + <br> + # Change NWCHEM_TOP to match the source directory on your machine.<br> + export NWCHEM_TOP=/Users/jmht/Documents/quixote/nwchem/nwchem-src-2012-Feb-16<br> + export NWCHEM_TARGET=MACX64<br> + export NWCHEM_MODULES=all<br> + <br> + export USE_MPI=""<br> + export BLASOPT=" "<br> + # Below may not be needed<br> + export MSG_COMMS=TCGMSG<br> + <br> + cd $NWCHEM_TOP/src<br> + <br> + # Uncomment the following 2 lines and run the first time you use the script to<br> + # configure nwchem. Once this has been done once, comment them out and rerun<br> + # the script to run the build<br> + #make nwchem_config<br> + #exit<br> + <br> + make \<br> + FC=gfortran \<br> + CC=gcc \<br> + USE_64TO32=y<br> + }}}<br> + <br> + Make the script executable ('''chmod +x ./build.sh''') and run it, and (with luck) it will build NWChem, which you will find at:<br> + <br> + '''nwchem-src-2012-Feb-16/bin/MACX64/nwchem'''</span> </td> </tr> </table> </div> Resources and technology http://quixote.wikispot.org/Resources_and_technology 2012-02-20 10:22:57 JensThomas <div id="content" class="wikipage content"> Differences for Resources and technology<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 24: </td> <td> Line 24: </td> </tr> <tr> <td> </td> <td> <span>+ <br> + == Optional (Opensource) Components ==<br> + * ["NWChem"] - a powerful Opensource electronic structure code.<br> + * [http://avogadro.openmolecules.net/ Avogadro] - an Opensource molecular modelling environment</span> </td> </tr> </table> </div> Creating dictionarieshttp://quixote.wikispot.org/Creating_dictionaries2012-02-20 08:30:35JensThomas <div id="content" class="wikipage content"> Differences for Creating dictionaries<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 1: </td> <td> Line 1: </td> </tr> <tr> <td> </td> <td> <span>+ = CREATING A NEW CML DICTIONARY =<br> + <br> + === Create bitbucket repository ===<br> + <br> + The repository should contain a single .cml file in its root directory, containing the dictionary. The dictionary must either be the root element of the document, or be the child of a &lt;cml&gt; that is the root of the document.<br> + <br> + The root directory can contain non-cml files and subdirectories.<br> + <br> + === Create hudson job ===<br> + <br> + In hudson click 'New Job' (in left-hand menu).<br> + <br> + Enter job name (e.g. cml-dictionary-compchem-gaussian) - this should be prefixed '''cml-dictionary-'''.<br> + Select copy existing job, and enter 'cml-dictionary-compchem'<br> + Click 'Ok'<br> + <br> + In the 'Source Code Management' section update the URLs to point at the new bitbucket repository (e.g. [https://bitbucket.org/cml/dictionary-compchem-gaussian]).<br> + <br> + In the 'Build Triggers' section tick 'Trigger builds remotely' box, and enter an Authentication Token (e.g. W3y7N3Ha). 
(This may already be set, in which case you can just keep the existing value.)<br> + <br> + Click save.<br> + <br> + <br> + === Get the bitbucket repo to trigger hudson ===<br> + <br> + Go back to the bitbucket repository and click 'Admin' on the top menu and then 'Services' in the left-hand menu.<br> + <br> + Select 'POST' from the drop down menu and click 'Add Service'.<br> + <br> + In the URL box enter the hudson trigger url in the format:<br> + <br> + '''{{{https://hudson.ch.cam.ac.uk/job/&lt;JOB_NAME&gt;/build?token=&lt;AUTH_TOKEN&gt;}}}'''<br> + <br> + e.g.<br> + <br> + '''{{{https://hudson.ch.cam.ac.uk/job/cml-dictionary-compchem-gaussian/build?token=W3y7N3Ha}}}'''<br> + <br> + Then click 'Save Settings'.<br> + <br> + Now, when you push to bitbucket a hudson job should trigger immediately. This will validate the dictionary and upload it to the CML website. If validation fails then the job will fail and the dictionary won't get uploaded.<br> + <br> + You can check the status of the jobs on the 'CML Dictionaries' page in hudson:<br> + <br> + [https://hudson.ch.cam.ac.uk/view/cml-dictionaries/]</span> </td> </tr> </table> </div> Resources and technology http://quixote.wikispot.org/Resources_and_technology 2012-02-20 08:27:22 JensThomas <div id="content" class="wikipage content"> Differences for Resources and technology<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 120: </td> <td> Line 120: </td> </tr> <tr> <td> </td> <td> <span>+ * ["Creating dictionaries"]</span> </td> </tr> </table> </div> Chempound http://quixote.wikispot.org/Chempound 2012-02-18 01:58:18 <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. 
</span> </td> </tr> <tr> <td> Line 197: </td> <td> Line 197: </td> </tr> <tr> <td> <span>-</span> As the Chempound SPARQL endpoint exposes a REST<span>FUL</span> <span>api</span>, we can query it directly, as the following python script demonstrates. </td> <td> <span>+</span> As the Chempound SPARQL endpoint exposes a REST<span>ful</span> <span>API</span>, we can query it directly, as the following python script demonstrates. </td> </tr> <tr> <td> Line 214: </td> <td> Line 214: </td> </tr> <tr> <td> <span>- urlparam = { "query" : query }</span> </td> <td> <span>+ urlparam = { "query" : query,<br> + "format" : "json",<br> + "output" : "json" }</span> </td> </tr> <tr> <td> Line 219: </td> <td> Line 221: </td> </tr> <tr> <td> <span>- request.add_header('Accept','text/html' )</span> </td> <td> <span>+ request.add_header('Accept','application/json' )<br> + request.add_header('Accept','sparql-results/json')</span> </td> </tr> </table> </div> Chempoundhttp://quixote.wikispot.org/Chempound2012-02-17 23:22:48 <div id="content" class="wikipage content"> Differences for Chempound<p><strong></strong></p><table> <tr> <td> <span> Deletions are marked with - . </span> </td> <td> <span> Additions are marked with +. </span> </td> </tr> <tr> <td> Line 191: </td> <td> Line 191: </td> </tr> <tr> <td> </td> <td> <span>+ EXPLAIN HOW TO USE THE DICTIONARIES TO DISCOVER WHAT TERMS ARE AVAILABLE.<br> + <br> + == Remote Chempound SPARQL queries with Python ==<br> + <br> + The Chempound SPARQL page will return the results as html or rdf/xml. 
The rdf/xml can of course be saved and processed offline, but it is more useful to be able to query and download the results all from within a single script.<br> + <br> + As the Chempound SPARQL endpoint exposes a RESTFUL api, we can query it directly, as the following python script demonstrates.<br> + <br> + {{{<br> + import urllib<br> + import urllib2<br> + <br> + # url of the chempound SPARQL endpoint<br> + baseurl = "http://quixote.ch.cam.ac.uk/sparql/"<br> + <br> + # The SPARQL query<br> + query="""SELECT ?molecule<br> + WHERE<br> + {<br> + ?molecule &lt;http://www.xmlcml.org/rdf-schema#formula&gt; "H 2 O 1" .<br> + }"""<br> + <br> + # Encode the parts of the query string into a form suitable for POST<br> + urlparam = { "query" : query }<br> + querystr=urllib.urlencode(urlparam)<br> + <br> + # Add the header to state what we want back<br> + request = urllib2.Request(baseurl,querystr)<br> + request.add_header('Accept','text/html' )<br> + <br> + # Get the results<br> + response = urllib2.urlopen(request)<br> + print response.read()<br> + }}}<br> + </span> </td> </tr> </table> </div>
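The script in the Chempound entry above uses urllib2, which exists only in Python 2. As a rough sketch of the same POST query under Python 3, the following may be useful. The endpoint URL and the query are taken from the page; the function names build_sparql_request and run_query are hypothetical helpers introduced here for illustration, and the default Accept value of application/rdf+xml is an assumption based on the statement that the endpoint returns html or rdf/xml. The endpoint itself may no longer be reachable.

```python
import urllib.parse
import urllib.request

# URL of the Chempound SPARQL endpoint, as given on the page.
BASEURL = "http://quixote.ch.cam.ac.uk/sparql/"

# The SPARQL query from the page: molecules with the formula of water.
QUERY = """SELECT ?molecule
WHERE
{
  ?molecule <http://www.xmlcml.org/rdf-schema#formula> "H 2 O 1" .
}"""


def build_sparql_request(baseurl, query, accept="application/rdf+xml"):
    """Build (but do not send) a POST request carrying the query.

    Hypothetical helper: the Accept default is an assumption, not a
    documented Chempound setting.
    """
    # Form-encode the query; supplying a body makes urllib send a POST.
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    request = urllib.request.Request(baseurl, data=body)
    # A second add_header call with the same key would replace this value.
    request.add_header("Accept", accept)
    return request


def run_query(baseurl, query, accept="application/rdf+xml", timeout=30):
    """Send the query and return the response body as text."""
    request = build_sparql_request(baseurl, query, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling print(run_query(BASEURL, QUERY)) would send the query and print whatever the server returns. Keeping the request construction separate from the network call makes it possible to inspect the encoded body and headers without contacting the server.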