ICAT Collaboration Meeting - 30th January 2025
Attendance
Attendees:
- Rolf Krahl
- Louise Davies
- Marjolaine Bodin
- Kevin Phipps
- Malik Almohammad
- Patrick Austin
- Santhosh Anandarama
- Viktor Bozhinov
- Chris Prosser
- Andy Gotz
- Allan Pinto
Agenda
Site Updates
SESAME
Portal running. Looking forward to add features to the Portal.
HZB
Not much to report. Working on setting up central online storage - set up ingest workflows.
ESRF
Still working with MX people to integrate ispyb into dataportal. Also merging in sample tracking to dataportal as well. Aiming for March to have both in.
Users can reserve a DOI - and mint when data is published so leave data opening to last moment. OSCARS project called SHARE - database, similar to Paleo database & Human Organ Atlas, dedicated to Cultural Heritage
ISIS
Adding new instruments. Certifcate updates. Updating the user office authenticator - migration from SOAP to REST - went into production last week.
Currently working on migrating backup server from CentOS 7 to Rocky 9.
DLS
New version of icat ingest client in place just before xmas. Kevin & DLS working on replacement for SQL jobs that propogate e.g. users, experiment info into ICAT - icat.sync. Python-based - using ICAT API rather than direct DB access. Went into production earlier this week.
Up & coming - testing new version of DGW & ICAT 6.1 & new search (icat.lucene v3).
Issue with tape library - tickets raised with vendor.
Sirius
ICAT in production mode. passed security screening, ready to expose to external users. Right now, data is not open to external users but will in coming months.
Working on extracting data from beamlines. Constructing pipeline to extract data from beamlines automatically to ICAT.
We're ICAT now! :tada:
Component Updates
icat.server & icat.lucene
Changes: started 3 years ago??? Been ready for a while. Main motivation: bring back datafile search (previously limited to 2B files due to max int size). Also added some new features, e.g. expanded fields able to search, sorting search results, faceted search.
New major version of icat.lucene - requires re-indexing. Minor version of icat.server as icat.lucene is an optional component. icat.server - oppurtunity to not run icat.lucene & instead run Open/Elasticsearch directly. Should work with both as they had a similar interface when the search was done. Alan is working on some further work on the search to make it more compatible with Quarkus.
icat.lucene now actually provides you with actual data, rather than just list of IDs (and you'd have to go back to ICAT to get the data from those IDs). Some more info being stored in the index, e.g. user info. Still checks via icat.server rules, but can do "rough" authz by doing searches like "show me results where I'm an investigationuser" which requires less work on icat.server authz. Also on authz, we batch authz checks rather than going 1 by 1 per result. Also added some timeouts for long running searches.
Rolf: 2 Qs - on replacing with ElasticSearch - limitation of icat.lucene was you couldn't cluster - is that still the same? Patrick: still the same for icat.lucene. We've implemented a crude implemation of sharding to get around 2B index limit. Running against ElasticSearch you can use a cluster - limitations are that it might not load balance "properly", only sends requests to one server but that server can push requests on. Focus went away from ElasticSearch as DLS wanted to keep icat.lucene
Rolf: would be nice to have comparison between icat.lucene and ElasticSearch. might be interesting to have an overview of pros & cons of each method. Patrick: supposed to have the same functionality - performance would likely be the main diff. Would be an interesting comparison.
ICAT stack containerisation/Quarkus
Alan is still working on this. Main aim is to get it running on Quarkus so it can run in k8s - main issues are around threading.
It has been brought up whether we can support both Payara & Quarkus with the same code - it might be leaning towards no but Alan isn't here to comment on that.
Context for this is new facility in STFC coming online pushing us towards containerisation.
Rolf: why need to move from Payara to run in containers? Kevin: Quarkus is aimed towards containers first. e.g. start up for Quarkus is magnitudes faster than Payara.
ids.server & python-icat
No activity
Andy: had a user ask about downloading .tar from IDS? Rolf: it's hard coded in IDS that it creates ZIP so would require a code change. Still have the issue that the codebase has lots of technical debt that needs refactoring - this would allow for easier developments of new features like this. Could easily change the current code, but the more you develop in the current code the more complicated the code gets. Andy: .tar is better if it gets corrupted Rolf: yeah, in back of mind have ideas of delivering more sensible return data. E.g. metadata in the returned data. But more complex, requires querying ICAT etc. In the future we might consider e.g. deliver an RO-Crate. Andy: yeah e.g. we don't currently deliver the licence either Rolf: yeah, thinking in the future including metadata would be nice Andy: also with ZIP file you lose original datafile datetimes for create and mod times. Rolf: might not be a ZIP limitation, might be IDS or storage limitation. Theoretically the metadata is in ICAT, so could query ICAT and use that.
Rolf: can just open an issue - even if not to solve immediately just to keep track for the future.
Kevin: did you have someone working on refactoring the IDS? Rolf: yes, but sadly that project has stalled. Difficult as more urgent things are needed at HZB right now, e.g. icat.server schema changes.
AOB
ESRF news
ESRF has been added as a trusted data repository for europe. Kicks off next week. Part of Fidelis project.
Also published paper on DRAC in synchrotron radiation news: https://www.tandfonline.com/doi/full/10.1080/08940886.2024.2432816
DOI landing pages - not compliant with google datasets. Rolf: HZB also needs to work on this as well
Presenting next week on the PaN EOSC node. 11 catalogues brought online as part of EOSC federation. Rolf: HZB not mature enough for that yet we think
DataPortal: exploring accessing multiple ICATs. That was in TopCAT though I think? Louise: Yes, but was not used! Kevin: Relies on having the same metadata Andy: Yes! Of course, would work for ESRF & ALBA as they're both ingesting the same data for MX (due to sharing ispyb specification). Andy: how did it work in TopCAT? Louise: selected the facility first, just basically two "TopCATs" sharing the same URL domain. Andy: MX has been wanting this for decades.
Rolf: PaNOSC search api and PAN data portal was supposed to be this no? search api in bad state, but should probably work more on the search api. Not equivalent to Andy's usecase as it's more specialised, but search api could help generalised.
Andy: Hiring someone in OSCARS to work on search-api. Plus the AI search project which likely wouldn't use the search api.
Rolf: useful to work on multiple tracks at same time: specilised MX usecase, and also general case search-api. (search API is not currently useful). Andy: search api does work! just not useful Louise: actually the free text search doesn't even work for ISIS as it has too much open data. (Andy - we might work on that with the guy funded by OSCARS) Rolf/Andy: need more work on the data model. Andy: need to move on. Get experience for other community. GoTriple in social sciences. Rolf: keep us posted!
Database exceptions
Kevin: Database exceptions, root cause of the stack trace is connection closed in batches. Not often but everyday, 5-10 at a time until the next time in like 8hrs-a day. Bug in Payara? Bug in glassfish? Looked online & tried to be fixed multiple times but the same bug reoccurs.
Preprod server only had one instance in like a year, but that's not used very much
Marjolaine: yes we have it on production server on Payara 5.
Issue seems to be stale connections. We have a check to see if a connection is live before using. If it's found stale it's supposed to be killed but it's not - it's returned from the pool. Then later something will pick up the stale connections.
Patrick: if we use to Quarkus then we won't have that problem!
Andy: related to when things get slow & we have to restart everything to fix it?
Marjolaine: not sure, one time we had a lot of connections and there were too many and it caused an issue.
Andy: one time last year load was so high on the server it had to be killed manually. We weren't sure what triggered it though.
Rolf: an argument for clustering/k8s env as you could get higher availability
Digital Curation Conference Workshop
https://dcc.ac.uk/events/idcc25/workshop-programme
Andy: is anyone going? it's online. Just before Digital Curation Conference & FAIRFest.
Kevin: In the Hague? Antony our group leader is going to FAIRFest