Parsing of .magres files #171
Hi @jkshenton thanks a lot for getting in touch. We will be happy to implement this. Just to speed up the development, do you have any input/output example files at hand that we can look at directly? I tried looking but couldn't find any examples. Furthermore, the more complete the folders / files you can share with us, the better, as we can prepare the parsing and cross-reference with ASE :-) |
Thanks for the super fast reply and for taking up this implementation! CCP-NC has a repository of many thousands of such .magres files: I grabbed a few random examples from there and put them in the attached tarball. The Let me know if you would like any more information to make the implementation easier. |
Perfect, thanks a lot for sharing. This is very interesting, we were not aware of such an initiative. I will take a look at the details of the project and possibly come back to you with some questions, if that's fine. We can further discuss whether you want to share the data in the database in NOMAD, and how we can help each other with computational or experimental data. |
We have recently been thinking about ways to make a version 2 of our NMR database more FAIR, including some integration/sharing with databases such as the NOMAD one, so we would be very happy to discuss this! |
Hi @jkshenton, I am coming back to this issue to let you know we are starting to work on the magnetic properties support in NOMAD (you can check a recent issue opened in #174). Before starting to work on the magres parser, I think it would be a good idea to meet on Zoom, let's say for 30 min to 1 h, so that we can understand your goals, how to merge the #174 idea with yours, and how NOMAD can help. Furthermore, I would like to discuss the workflows typically done in magres calculations, how to integrate them, and how the data in the CCP-NC is structured (and how it compares with NOMAD). What do you think? We can also talk by email ([email protected]) and organize the meeting by private email. Whatever feels more comfortable for you 🙂 |
Happy to meet and discuss our goals - I've just sent an email to arrange that. A bit more context here to help with a discussion:
The NMR-related results can include site-based (e.g. magnetic shielding and electric field gradient) tensors, pair-wise (e.g. J-couplings) tensors or global (e.g. magnetic susceptibility) tensors. Each quantity is reported along with its units. The file can also have a Although a magres file in isolation is very useful for post-processing and sharing the results of first-principles solid-state NMR calculations, we would ideally like to provide more context in our (/your) database in the future. The workflow would typically be something like:
In terms of our goals: we're currently in the planning stage of a major re-development of our database stack and we're looking at different options to improve the value the database provides to the solid-state NMR community. This includes better search/filtering functionality, better metadata capture (including workflow context) and some data visualisation options through integration of some of our other python and javascript tools. |
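For readers unfamiliar with the format, the description above (tensors reported per site, per pair, or globally, each with its units) can be illustrated with a minimal sketch. It assumes the block layout described in the magres specification linked in this issue (`[magres] ... [/magres]`, with `units <tag> <unit>` lines preceding the records). The sample text and its numbers are made up for illustration, and `parse_magres_block` and `isotropic` are hypothetical helper names, not part of NOMAD, ASE, or any CCP-NC tool.

```python
import re

# Illustrative sample only: the values are invented, not taken from a real calculation.
SAMPLE = """\
#$magres-abinitio-v1.0
[atoms]
units lattice Angstrom
lattice 5.0 0.0 0.0 0.0 5.0 0.0 0.0 0.0 5.0
units atom Angstrom
atom H H 1 0.0 0.0 0.0
atom C C 1 1.0 0.0 0.0
[/atoms]
[magres]
units ms ppm
ms H 1 30.0 0.0 0.0 0.0 29.0 0.0 0.0 0.0 31.0
ms C 1 100.0 1.0 0.0 1.0 110.0 0.0 0.0 0.0 120.0
units sus 10^-6.cm^3.mol^-1
sus 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0
[/magres]
"""


def parse_magres_block(text):
    """Collect 3x3 tensors and their units from the [magres] block (hypothetical helper)."""
    match = re.search(r"\[magres\](.*?)\[/magres\]", text, re.S)
    if match is None:
        raise ValueError("no [magres] block found")
    units, records = {}, []
    for line in match.group(1).splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and comments
        if tokens[0] == "units":
            units[tokens[1]] = tokens[2]  # e.g. units["ms"] = "ppm"
            continue
        tag = tokens[0]
        values = [float(v) for v in tokens[-9:]]  # last nine numbers form the tensor
        # Whatever sits between the tag and the tensor identifies the site(s):
        # one (label, index) for site quantities, two for pairs, none for global ones.
        site = tuple(tokens[1:-9])
        tensor = [values[0:3], values[3:6], values[6:9]]
        records.append({"tag": tag, "site": site, "tensor": tensor, "units": units.get(tag)})
    return records


def isotropic(tensor):
    """Isotropic value of a 3x3 tensor: one third of its trace."""
    return sum(tensor[i][i] for i in range(3)) / 3.0


records = parse_magres_block(SAMPLE)
for rec in records:
    print(rec["tag"], rec["site"], isotropic(rec["tensor"]), rec["units"])
```

This sketch ignores the `[atoms]` and `[calculation]` blocks and does no validation, so it is a reading aid rather than a substitute for the ASE reader or the NOMAD parser discussed in this thread.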
Very good. I think the workflow can be covered with the current NOMAD infrastructure, albeit some details we can discuss over Zoom (like which files should be included in the upload for these).
Then, FAIRmat can help with this. I am speaking internally with some engineers to see whether they can join the discussion. But for a first meeting, we can definitely sit down and see what the best options are; maybe, @ladinesa, are you available to join the discussion? If so, I will send you the emails for the Zoom. |
Thanks for including me in the discussion. Is the date already set? I will be on holiday next week, so it would be great if we could schedule it this week. |
Brief summary of our meeting:
I think these bullet points summarize the meeting. Feel free to add or ask anything. |
Thanks for sharing your summary! I think you captured the essential bits. For the magresview visualiser, I would rather link to our custom "2.0" version which essentially completely replaces the previous JMOL-based version. For the workflow / link between different DFT output files, I've attached a tarball with a very basic two step procedure that might be typical of the sort of ssNMR calculations with CASTEP that one might upload to NOMAD: 1. a geometry optimisation (seedname |
Thanks a lot, this is indeed what is needed to fully develop the parser 👍🏻 If you have more examples, do not hesitate to share them with us; the more, the better, as this will help us prepare for other options. Now, @jkshenton @ladinesa I was wondering about the workplan: I think we (either Alvin or myself) can develop the initial version of the parser. Then, in the long term and if you are convinced of using NOMAD, it is better if you (or Sathya) take over maintaining the parser. I was very recently discussing with other devs, and in the longer term you could even think about using the developed parser as an I/O wrapper for your applications (without the need for scripts all over the place). Let me know what you think. If agreed, I suggest you star this repository, and I will keep you informed of important changes that affect you. P.S.: should we also tag Sathya's GitHub profile? |
Your proposed workplan sounds good to me - thanks! As we mentioned before, the broader context would be that we would like to be able to easily (=via dashboard/API) access NMR data from any of the DFT codes that compute it. These include (non-exhaustive list): Parsing magres files is a very useful first step towards this, since they have been adopted by two of the major DFT NMR codes (CASTEP and QE) and the specification for the file format introduces the rationale behind the structure of key bits of NMR data. There's also an accompanying JSON schema , in case that is helpful. In terms of using the nomad parser as an I/O wrapper - I am all for re-using code and well-built libraries, though I would note the ongoing development of a standalone CASTEP parsing library to play such a role: https://github.com/oerc0122/castep_outputs Good idea to tag @Sathya-S3 |
Very good. I will work on the schema and parser in mid-December. Sorry, I am going on holiday for two weeks.
This is very interesting. We have to definitely join efforts here, as I don't see the point of maintaining several parsers for the same code and double the work 🙂 We will pay attention to when this is integrated in CASTEP, but in the meanwhile, @ladinesa do you mind checking the repo and seeing how it compares with our current CASTEP parser? |
will create interface to it in #184 . |
Hi @ladinesa, I'm in the process of preparing a technical stack review document for the CCP-NC main working group. The goal is to present the different development options for the CCP-NC database website. It'd be valuable to know your thoughts as well, on the below section from @JosePizarro3's meeting notes, when time permits. Thank you very much.
For reference @jkshenton |
I refer to the approach we took with the other databases supported in nomad e.g. materials project, aflow, oqmd. We would host your data in nomad and develop an app for a customised search of nmr data in central nomad. Regarding the ccp-nc website, you start with a nomad oasis deployment where you can further customise schema, visualisation etc. This will also enable the synching of data with nomad central. Depending on the long-term goals of the project, you can then |
Just wanted to say that I am almost finished with the initial version of the parser for magres. Just had a couple of minor doubts:
Thanks! |
Hi! Exciting - thanks for working on it!
Hope that at least partially answers your questions (?). |
Great, thanks a lot. Let's then put the focus first on CASTEP, test it, and then extend the support for QE if you like it.
Thanks once more! 🙂 |
Hi @jkshenton @Sathya-S3, I finished preparing a magres parser. I included the parsing of the quantities in your file format, and I managed to connect it with the CASTEP I/O files if these are present in the upload. I think it makes sense for you to check, with some examples, whether the parser works as you think it should. Then, we can set up another meeting to tackle more seriously how to integrate this parsing into your database. From my side, I think the best option would be for your CCP-NC database to be the front-end of whatever NMR data is stored in NOMAD, but I would be happy to hear your thoughts. |
@jryates Further to my email earlier today, I'm tagging you in this magres parser development thread to help move the conversation forward. best wishes, |
Addressing a few comments further up the thread:
Thank you for your work on including the magres parser in NOMAD. The magres parser looks good and works seamlessly. Parsing was quick: it took only a couple of seconds for each upload. We tested the magres parser with a few sample magres uploads (test uploads, not published): one special inorganic material, 'wadsleyite', and a well-known inorganic material, 'coesite' (where we tested two variations of symmetry information in the magres file).
Comments and questions from testing
Many thanks in advance. EDIT: Attaching the magres files we used for the test, for your reference. |
Thank you very much for testing the changes and giving feedback. Also, sorry for the long reply; I would like to comment on 3 main things which directly affect you.
New NOMAD plugins structure
NOMAD will become more modular, so that people can develop independent packages (or plugins) and use them in their own installations or in the central one after approval. This means that:
Answering questions by @Sathya-S3 and @jryates
About the symmetry, NOMAD uses a package called MatID to classify and extract symmetry information. Am I understanding correctly that the symmetry was extracted properly by NOMAD, or was it not, because these pieces of information were missing?
So the units in NOMAD are defined based on the S.I., and we handle Quantities following pint. Units can then be changed by multiplying with You can test your uploads using NORTH in NOMAD. This allows you to launch a Jupyter notebook directly in a folder where you can find your uploaded data. Maybe @ladinesa can tell you the exact details on importing the
You are totally right, thanks for spotting this. It is clearly a mistake from my side, I will fix it asap 🙂
Very good point. However, as we work with
You can let me know what you think. A screenshot or demo of option 2 might be better to fully get the idea 🙂
CCP-NC and NOMAD
We should maybe meet and talk about solutions to work from both databases. I have some feedback from other NOMAD devs, and I think we can discuss some very nice options. Let me know if you want to meet, and when. |
Hi @JosePizarro3, thank you for the detailed responses to our questions and additional new information on NOMAD platform's development direction.
We'll keep watching the link for updates.
Yes, definitely. We look forward to directly being involved in further parser development. To the part about our initial questions,
Yes, it was extracted correctly, even when we deliberately entered incomplete symmetry information in the magres header. I think @jryates' and my comment really was that the magres file's symmetry information is ignored by NOMAD. Down the line, it might be desirable to use the magres symmetry information as an extra validation check on the calculated symmetry?
I'm keen to test this further and will set some time aside for this. I'll wait first to see if @ladinesa has more information to add as you suggest. Your ideas for the atom labels, both short and long term, sound good. Please let me know if I can be of help either with the development or by providing periodic feedback during development.
CCP-NC and NOMAD
I have a positive response from the CCP-NC working group to proceed with talks with NOMAD about our partnership. I'll prepare a list of technically focussed questions surrounding CCP-NC data in a NOMAD supported database. My colleagues from Physical Sciences Data Infrastructure (PSDI) also have data-centric and logistical questions of their own to add. We'll aim to get these questions to you within a week's time. I'm aiming to arrange a sit-down between your team and us (me + PSDI) in the first instance, some time next week. If your team members are on holiday next week (around Easter time), we can aim to block a time slot for the week after. We can deal with the meeting specifics through email. |
Ok, that sounds good. But I need to understand a bit better how to do this validation, and how the symmetry operations compare with those from MatID. Give me a few days, and perhaps I will even write you an email with more specific questions. And perfect about the positive response 🥳 I am happy we can further collaborate and improve both NOMAD and the CCP-NC. I will let some colleagues know once you send me the questions and the invitation for the Zoom, as it might require some other expertise that @ladinesa or I do not have 🙂 |
Just a short follow-up:
.magres files are output by both Quantum Espresso's GIPAW and CASTEP when NMR calculations are run. It would be great to have NOMAD be able to parse them.
A parser already exists in ASE, for reference:
https://gitlab.com/ase/ase/-/blob/master/ase/io/magres.py
The specification of the file format exists here:
https://www.ccpnc.ac.uk/docs/magres/magres-format.pdf