ICAT Collaboration Meeting - 28th March 2024
Attendance
Attendees:
- Rolf Krahl
- Salman Matalgah
- Louise Davies
- Viktor Bozhinov
- Kirsty Syder
- Alex de Maria
- Marjolaine Bodin
- Kevin Phipps
- Malik Almohammad
- Alan Kyffin
- Patrick Austin
Agenda
Contact with SciCat
In addition to the potential meeting at NOBUGS, Rolf has been contacted by the SciCat developers.
Carlo Minotti is holding a SciCat meeting in July and is asking whether ICAT wants to make a contribution at it.
Alex: Elliot mentioned Kevin could present re: ICAT usage at ISIS & DLS
Kevin: first I've heard of it!
Good to have closer relations with the SciCat project, e.g. common metadata standards etc.
Kevin: are we expecting to hear more about this? I've not heard about it before now - are we going to get an email invite?
Rolf: the SciCat meeting is 2nd-4th of July; it was suggested the ICAT community should make one or more contributions to it.
Alex posted in zoom chat - email from Carlo:
this year we will host the annual SciCat f2f meeting at PSI. The meeting will have this format:
- Day 1:
- Afternoon: Opening, features presentation and institution updates
- Day 2:
- Morning: User session, users feedback
- Afternoon: Developers session 1
- Day 3:
- Early Morning: Developer session 2
- Late Morning: OPEN, review and futures plan (available online and recorded)
We would like to include, on the morning of the second day, a keynote from ICAT to share common experiences and strengthen our collaboration.
How do you find the idea? Would anyone be willing to present (even outside this group ofc)?
A couple of ideas could be:
- the scientists' perspective on data curation: SWOT
- the strengths of ICAT
- what can we learn from each other? Naively I can think of the "unstructured" and "structured" DB schema, which has pros and cons (easier finding vs easier creation). Maybe here we could expand on what we plan to use to overcome these challenges and see if we find together better solutions.
- should there exist a common export/import format?
Rolf: useful for Alex and e.g. Kevin to report on experiences from both ESRF & STFC facilities
Schema changes
Rolf: I sent out a date poll: https://terminplaner6.dfn.de/p/a7018a6ed12d29bec10b204f83d2f479-645917. Had some responses but not all.
April 2nd has 7 yeses and only 1 no (the no was Louise, and there are other STFC developers represented).
Collaborative agenda document - currently only proposals from Rolf
Kevin: we're only interested in proposals Rolf has already put forward
Alex: we're still interested in our proposals, but not urgent
Rolf: if you want it to happen you need to put it in the issues
Alex: again not urgent, but we can add them to discuss
Patrick: there are a lot of open schema issues on the icat.server issue list - if one isn't proposed to be discussed, does that mean no one wants it anymore?
Rolf: we'll have to decide; we can always go through in a second, lighter pass to close or keep open the lower priority issues
Site Updates
SESAME
Don't have a huge update - developing the final version of ICAT to go out to the public.
Still internal only.
HZB
Not much to report - still recovering from last year's incident.
ICAT still running but not accessible from outside. Discussion with security consultant, in principle green light for reverse proxy in DMZ for access to ICAT.
- Only issue is they require 2FA, which isn't supported by ICAT
Will start with opening up data publication landing pages & the OAI-PMH endpoint, since those are fully anonymous
- ICAT will have to wait until we sort out proper access management with SSO
- Kevin: has anyone considered this?
- Rolf: would probably use something like Keycloak, and use SSO to handle it. Running Keycloak ourselves is out of our scope though; we are dependent on another group.
ESRF
Working on Technique - beamlines each have PaNET ID - sent to ingest and linked in ICAT DB
Working a lot with MX crystallography, presenting processed data, filtering results/setting thresholds, letting users rank the best results
Rolf: question on Techniques, Renaud and Ioannis: multiple techniques per instrument, and not trivial to decide which technique is assigned to each dataset etc.
Alex: only on the order of 4-5 techniques, so relatively straightforward. Each dataset is expected to have a PaNET ID, and if it doesn't then we assume the parent technique (more general), e.g. x-ray spectroscopy.
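The fallback Alex describes (use the dataset's own PaNET ID if present, otherwise the more general parent technique) could be sketched roughly as below. This is only an illustration and not ESRF's ingest code: the class names, field names and the placeholder IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Beamline:
    name: str
    # Hypothetical field: the more general "parent" technique for the beamline.
    parent_technique_id: str


@dataclass
class Dataset:
    name: str
    beamline: Beamline
    # Hypothetical field: the PaNET ID sent at ingest, if any.
    technique_id: Optional[str] = None


def resolve_technique(dataset: Dataset) -> str:
    """Return the dataset's own technique ID, else the beamline's parent technique."""
    if dataset.technique_id:
        return dataset.technique_id
    return dataset.beamline.parent_technique_id


# Placeholder IDs only; these are not real PaNET terms.
beamline = Beamline(name="example-beamline", parent_technique_id="PARENT-TECHNIQUE-ID")
tagged = Dataset(name="ds1", beamline=beamline, technique_id="SPECIFIC-TECHNIQUE-ID")
untagged = Dataset(name="ds2", beamline=beamline)

print(resolve_technique(tagged))
print(resolve_technique(untagged))
```

With only 4-5 techniques per beamline, a simple lookup like this keeps every dataset linked to some technique even when the more specific ID is missing.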
ISIS
No-one from ISIS.
Louise: been testing the 6.1.0 snapshot (changes to Lucene component). Also having to roll out the CentOS 7 to Rocky 9 changes - currently have the dev stack on Rocky 9
DLS
Chris not here, Kevin reporting.
Most concerned with migrating machines off CentOS 7 and (ideally) onto Rocky 9. Lots of machines, storage stack below also needs to be done, so it's a lot of work
Kirsty: initial metadata project has concluded, scientific metadata into ICAT - fed info into one of Rolf's proposals
Component Updates
python-icat
Release 1.3.0 - 3rd release of XML ingestion stuff. Rolf expects it's mostly stable (despite saying experimental in the docs)
HZB plan for ingestion: directories corresponding to data, top level has an XML metadata file. The XML is validated, and the ingest uses XSLT.
Should be pretty extensible, but the default should work for most facilities
Kevin: documentation or examples?
Rolf: that was the most important part of the 1.3.0 release, there is a dedicated documentation section
Ingest format (XML) is a more restricted version of the "generic" format, which is either XML or YAML. Documentation page describes exactly how this works
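As a rough illustration of the directory layout in the HZB plan (one directory per dataset, with an XML metadata file at the top level), the sketch below finds such metadata files and checks that they parse. It deliberately does not use python-icat and does not perform the schema validation or XSLT step mentioned above; the file name `metadata.xml` and both function names are assumptions made for the example. The python-icat documentation section describes the actual ingest format.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import List

# Hypothetical file name; the minutes do not say what the metadata file is called.
METADATA_NAME = "metadata.xml"


def find_ingest_candidates(root: Path, metadata_name: str = METADATA_NAME) -> List[Path]:
    """Return the metadata file of every subdirectory of root that has one at its top level."""
    candidates = []
    for entry in sorted(root.iterdir()):
        metadata = entry / metadata_name
        if entry.is_dir() and metadata.is_file():
            candidates.append(metadata)
    return candidates


def is_well_formed(metadata: Path) -> bool:
    """Check only that the file parses as XML; this is not schema validation."""
    try:
        ET.parse(str(metadata))
    except ET.ParseError:
        return False
    return True
```

A caller might loop over `find_ingest_candidates(Path("/some/data/root"))` and skip any file for which `is_well_formed` returns `False` before handing the rest to the real ingest tooling.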
icat.server
Alan: been looking at places where the code is specifically single threaded or uses synchronised blocks. Made a change to make a class thread safe
ids.server
Rolf: Markus working on refactoring IDS. He's done quite a bit of work, but Rolf hasn't had time to review so can't report more specifically
AOB
Quarkus
Alan: trying to get things threadsafe with the intent of using Quarkus
Rolf: is the intent to replace Payara?
Alan: ideally get it so you can use either
Kevin: would be nice to have another option
Next F2F? NOBUGS
Not decided yet, but was proposed, and many ICAT developers will be at NOBUGS anyway.
Rolf plans to attend. SESAME planning to attend. Hopefully can be joint.
Kevin: it was mentioned last meeting right?
Louise/Patrick: yes
Patrick: when's the deadline?
Rolf: April for proposal submissions
Louise: general registration is much later though I think