CML-XML Quixote Meeting 22nd-23rd March STFC Daresbury Laboratory
A meeting to create the first Open distributed repository for electronic simulations
Sponsored by the Hartree Centre
For a general overview of the Quixote Project, please follow the links on the Front Page.
The meeting's EtherPad is at http://okfnpad.org/quixotemeeting.
Primary: build and populate a Chempound repository for outputs from electronic structure codes. The repository should be searchable by chemical concepts, job metadata and core physico-chemical properties.
Subsidiary: build core CML-XML parsers for 3-8 commonly used codes in atomistic QM and planewave simulations. Use Lensfield to scrape directories for files.
Chemistry (not QM methodology).
Practical achievement (running code and a working repository).
The meeting will have two main strands:
Strand 1 - ChemPound
Use case: searching the repository for a given concept across several file types
Preparation: several CML files (with index data) for several codes
Files for each data set (final CML, original TXT, index.rdf, etc.) aggregated under ORE
Aggregation uploaded by SWORD (METS, ZIP, etc., as agreed)
Index in a simple name-value search (?RDF, ?Mongo)
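The name-value search in Strand 1 could be prototyped along the following lines. This is only a minimal in-memory sketch and not the Chempound implementation: the class `NameValueIndex`, the entry identifiers and the field names are all hypothetical, and the real index may well end up in RDF or Mongo as noted above.

```python
from collections import defaultdict


class NameValueIndex:
    """Toy in-memory name-value index (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps a (name, value) pair to the set of entry ids that carry it.
        self._postings = defaultdict(set)
        self._entries = {}

    def add(self, entry_id, fields):
        """Register one repository entry with its index fields."""
        self._entries[entry_id] = dict(fields)
        for name, value in fields.items():
            self._postings[(name, value)].add(entry_id)

    def search(self, **criteria):
        """Return the sorted ids of entries matching every name=value pair."""
        if not criteria:
            return sorted(self._entries)
        matches = [self._postings.get(pair, set()) for pair in criteria.items()]
        return sorted(set.intersection(*matches))


# Entries produced by different codes share one set of index names,
# so a single query spans several original file types.
index = NameValueIndex()
index.add("gaussian-001", {"formula": "H2O", "method": "B3LYP"})
index.add("nwchem-007", {"formula": "H2O", "method": "MP2"})
index.add("turbomole-042", {"formula": "CH4", "method": "B3LYP"})
```

A query such as `index.search(formula="H2O")` then returns the matching entries regardless of which code produced them.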
Strand 2 - Lensfield, JUMBOConverters
Use case: a disk populated with calculations of a given kind where (say) the suffix is FOO.
Arbitrary directory structure.
Lensfield trawls disk for *.foo
JumboFooConverter/Parser converts to CML (failures => null files).
Conversion supported by FooDict.cml
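The Strand 2 workflow above can be sketched in a few lines. This is not Lensfield or a JUMBO converter (both are separate tools with their own interfaces); it is a hypothetical Python stand-in that only illustrates the shape of the pipeline: walk an arbitrary directory tree, convert each `*.foo` file, and write an empty ("null") file when conversion fails. The function names, the toy `ENERGY` line format and the `foo:energy` dictionary reference are assumptions made up for the example.

```python
from pathlib import Path
from typing import Callable, Dict


def toy_foo_to_cml(text: str) -> str:
    """Hypothetical converter: expects a line of the form 'ENERGY <number>'."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "ENERGY":
            value = float(parts[1])
            # Simplified CML-like fragment; a real converter emits full CML.
            return f'<cml><scalar dictRef="foo:energy">{value}</scalar></cml>'
    raise ValueError("no ENERGY line found")


def trawl_and_convert(
    root: Path, suffix: str, convert: Callable[[str], str]
) -> Dict[Path, bool]:
    """Convert every *.<suffix> file under root, writing a .cml file beside it.

    Returns a mapping from source file to True (converted) or False (failed).
    """
    results: Dict[Path, bool] = {}
    for source in sorted(root.rglob(f"*.{suffix}")):
        if not source.is_file():
            continue
        try:
            cml = convert(source.read_text(errors="replace"))
            ok = True
        except Exception:
            cml = ""  # failure => empty ("null") output file
            ok = False
        source.with_suffix(".cml").write_text(cml)
        results[source] = ok
    return results
```

Calling `trawl_and_convert(Path("/data/calcs"), "foo", toy_foo_to_cml)` (the path is illustrative) would leave one `.cml` file next to each `.foo` file, with empty files marking the failures for later inspection.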
The parsing need not be complete. The CML is used for specific data and indexing. The results COULD (if all goes well) be fed into the Chempound repo.
Small parsers can be written on the fly for codes whose format has been explored in advance. We do not have to do the whole job.
The codes should all have mini-dictionaries (some will be large).
The indexes will be normalized (e.g. dates, units, formulae, etc.). This means that users search a single conceptual model but the original data is not affected and can be retrieved as part of the ORE aggregation.
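As an illustration of what index normalization might involve, the sketch below maps energies to a single unit, dates to ISO 8601 and formulae to Hill order. The choice of hartree as the canonical unit, the accepted input formats and all function names are assumptions for the example, not decisions taken by the project. Only the index values are normalized; the original files are left untouched.

```python
import re
from collections import Counter
from datetime import datetime

# Conversion factors to hartree (canonical unit chosen for this sketch only).
HARTREE_PER_UNIT = {
    "hartree": 1.0,
    "ev": 1.0 / 27.211386245988,
}

# Input date layouts accepted by this sketch (an assumption, easily extended).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_energy(value: float, unit: str) -> float:
    """Convert an energy to hartree."""
    return value * HARTREE_PER_UNIT[unit.strip().lower()]


def normalize_date(text: str) -> str:
    """Return the date as an ISO 8601 string (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def normalize_formula(formula: str) -> str:
    """Rewrite a simple formula (no brackets or charges) in Hill order."""
    counts = Counter()
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(count) if count else 1
    if "C" in counts:
        # Hill order: carbon first, then hydrogen, then the rest alphabetically.
        rest = sorted(e for e in counts if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)
```

With these helpers, `OH2` and `H2O` reported by two different codes index to the same formula, so a user searches one conceptual model.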
NB: this is a draft timetable and is subject to change.
|Monday 21st March|
|Arrival at Daresbury Park Hotel - check-in from 15:00|
|Tuesday 22nd March|
|9:00||Taxis from Hotel to Lab|
|9:30-9:50||Introduction to Conference and Quixote Project - Jens Thomas|
|9:50-10:10||Q5cost, a possible common format for Quantum Chemistry - Elda Rossi| q5cost.pdf|
|10:10-10:30||TURBOMOLE and CML - Semi-direct vs. parser approach - Andreas Gloess| talk.pdf|
|10:30-10:50||Avogadro & Kitware - Marcus Hanwell|
|10:50-11:10||Quantum ESPRESSO, Moka & XML - Riccardo Sabatini|
|11:25-12:30||Introduction to parsers, dictionaries, CML and databases - Peter Murray-Rust et al.|
|13:30-15:30||Work on parsers/database|
|15:45-18:30||Work on parsers/database|
|19:00||Dinner at the Ring O'Bells|
|Wednesday 23rd March|
|8:30||Taxis from Hotel to Lab|
|9:30-11:00||Work on parsers/database|
|11:15-12:30||Work on parsers/database|
|13:30-15:00||Work on parsers/database|
|15:15-16:30||Work on parsers/database|
Venue and Travel
The meeting will take place at STFC Daresbury Laboratory.
For invited attendees, accommodation has been arranged for the nights of the 21st and 22nd at the Daresbury Park Hotel.
For further information about the meeting, or if you are interested in attending, please contact Jens Thomas: email@example.com