ICAT Project

Search

ICAT Collaboration Meeting - 25th April 2024

Attendance

Attendees:

  • Rolf Krahl
  • Andy Gotz
  • Salman Matalgah
  • Louise Davies
  • Kirsty Syder
  • Marjolaine Bodin
  • Kevin Phipps
  • Malik Almohammad
  • Patrick Austin
  • Gabriel Preuß
  • Santhosh Anandarama
  • Viktor Bozhinov

Agenda

New ICAT collab chair

After Alejandra left, no one in place to officially chair the montly meetings. Rotate?

Andy to chair until June Rolf to chair after that until August Meeting in September (at NOBUGS) will be effectively F2F.

  • Rolf: Is this sufficient or is something else needed as an ICAT F2F?

Schema changes

Still a few questions to discuss... Had a meeting on proposed schema changes for next major release Idea was everyone who wants one to put it on table

Rolf was the only one to table suggested changes 3 discussed:

  • Sample - Investigation relationship #231
  • Properties of SampleType #326
  • New entity - Subject #327

Bottom line is they are OK in principle. Since these were Rolf's proposal he would be happy to write PRs. First one has one already but will need updating. Not clear when but hopefully before summer. Definitely not in May.

Need to discuss (justified concern) that... some facilities need these changes "soonish". On the other hand other facilities are happy but would need time to do the upgrade. Need to discuss timeline of releases that is supposed to come. Rolf would suggest (open to discussion) lets maintain 2 major releases for the interim period. Hopefully not too long. Still in principle provide patches to v6 while the schema changes go into v7.

ESRF care most about Sample-Investigation relationship.

Mostly a question for icat.server core maintainers.

Agreement that this is appropriate course of action. Attempt Autumn for v7.0.0. Maybe by NOBUGS?

Andy: Does this influence version number?

  • In general major is for incompatable changes, which schema definitely is. So this will be a major version change.
  • Kevin: it won't affect all the components (just icat.server and icat.client) like the ICAT 6 change did as that was associated with Payara6.

NOBUGS

Malik already submitted an abstract for SESAME use case. Act as knowledge trasnfer/training - should cover all parts in use by SESAME. Haven't had confirmation on submission

  • Rolf: I can see it, so it's definitely submitted
  • Andy: will get confirmation it's accepted after review stage

Rolf submitted his abstract - latest python-icat changes (i.e. the new ingest stuff)

Hector also submitting an abstract re: NeXuS reader

ESRF going to submit one on micro-frontends

Satellite meeting

Tuesday 24th to Thursday 26th for main conference, satellite meetings before and after on the Monday/Friday.

SciCAT supports a half day discussion on metadata catalogues. 50/50 ICAT SciCAT. Planned for Friday morning

How many planning on coming to NOBUGS?

  • Obviously those who submit abstracts
  • SESAME will attend for ICAT but also for CentOS -> Rocky migration
  • STFC not discussed in depth, would expect to send at least a couple (probably won't submit any abstracts though)
    • Patrick & Kevin are helping with Malik's submission
    • People who aren't attending physically could attend remotely

Do we need an additional F2F just dedicated to ICAT?

  • Are enough ICAT people attending in person?
  • Could have F2F on Monday or Friday afternoon?

To be discussed further in May's meeting.

SciCat meeting

First week of July - 2nd to the 4th? At PSI.

Carlo contacted STFC/Elliot from DLS. Elliot can't go, Kevin can't go, Louise can go on behalf of STFC. No further discussion since Carlo's email. Committed to sending someone but not considered what to present.

Rolf also not available to attend.

Kevin: is Alex de Maria attending?

  • Andy: he's been on holiday for 10 days, so we don't know
  • Patrick: last time they had remote attendance - they might have it again
    • This is an option on the registration form again

Site Updates

SESAME

Now using ICAT on LAN, not public. Have appointment with a contractor for a cybersecurity assessment before opening it up. Had another meeting last month with Alex to start making a new frontend on a different VM that's public facing. Everything else can stay inside the network.

Already mentioned NOBUGS paper.

HZB

Discussions on cybersecurity - would be interested in the results of SESAME's review. Some parts of their instance are accessible from the outside once again. Cannot allow login from outside without 2FA. This excludes direct access to metadata catalogue. Can have landing pages and OIAPMH. For login, would need to wait for proper ID management on a separate SSO instance that does a proper login and stuff to make the catalogue again reachable from the outside.

Otherwise, nothing new to report, only some discussions that are WIP. Started to work on extra storage where experiments can drop data that didn't exist before. Also, first meeting on ID management and bringing that forward. A big discussion and will take a little while before they're at that point.

ID management is to replace LDAP?

  • Problem is don't have any management for scientific users. Have it for staff but not the visitors that do the measurements. 90% of users therefore do not have a login. Cannot change it unless having a separate id management for the scientific user sessions for the people who genreate data at HZB. This is now understood by everybody but to get to the acknowledgment that AD is not enough was an effort.
  • Need a new authenticator?
    • Don't know yet. Start by gathering requirements. Guess will have an external contractor to provide the concept and whatever comes out of that, if it delivers OIDC then we are OK. Hope that it will, one requirement is that it uses industry standards.

ESRF

Making progress for integration of technique in ICAT. Testing workflow with data acq. and PANET and IDS during acq. Integrates with DOIs from DataCite. Do some tests over the next weeks. Will also ingest new data from the human organ atlas. New portal for paleontology - processed data from the past 10 years, moved it to ICAT. Old database got attacked by SQL ingestion, so good timing on the move to ICAT. Holidays have slowed work on MX

Mentioned ICAT in talk to DAPHNE. Boost processed data, will have more of it in ICAT. Minor issues with performance of the new portal (linked to the network and/or IT installation).

Generally, things are going well.

ISIS

Working on a couple of roadmap items. Data Corrections. Some of the Investigations need to have their RB numbers corrected. ISIS User office SOAP endpoints will be migrated to REST - need to update e.g. authenticator.

For dev instance have Data Corrections, need to do this on production.

DLS

No particular updates, maybe Kevin or Patrick to talk about?

Kevin: long running queries - queries on the Datafile level, especially when filtering. Causing lots of load on the DB. Trying to investigate, it's come back off and on for a while. Change Payara timeouts. Underlying problem is we have too many Datafiles - need to start addressing that. More data is generated & users get access to more data (more complex authz)

Rolf: Can't really reduce data coming off of the instruments? Kevin: no deletion policy Andy: Database sharding? Rolf: Or only allow access to Datasets for old data

Patrick: maybe don't allow users to browse Datafiles at all, only let them restore datasets Andy: could discuss in F2F meeting

Component Updates

None

AOB

OSCARS proposal on PaNOSC search API

Majid Ounsy from Soleil was trying to encourage Max Novelli to submit a proposal. Not very successful, but might be successful for second round. Rolf thinks we should push this as he thinks there are weaknesses of PaNOSC search API

Who has deployed the API?

  • ESRF do as a proof of concept. Used for human organ atlas but will replace it with ICAT+ very soon.
  • Rolf: For some calls the pid is a path parameter (GET for an individual Dataset uses pid as a path argumnet). Slash is a separator?
    • Majorlaine says to ask Alex
    • Louise: This was discussed at the time.
    • Viktor: Someone at HZB raised it. If it was more modern then it would decode it and it would work?
      • Rolf: cannot possibly be solved. regardless of encoding url should resolve the same. If you resolve differently, you are wrong. Would work if they do what they promised and used ids on the path parameter... See this as one of the weaknesses of their data model/API spec.
  • ISIS do technically have it deployed but not the scoring. This runs off open data and is not performant for ISIS which have 100K open. Failed at 30K. Dependent on RAM. Therefore ISIS could not be part of the portal. Raised with developers at the time. ESS asked us to test the fix but we haven't done that.
    • If this was fixed we could then deploy for ISIS and get on the portal. This would be the only use case we'd care about.

No indepedent spec of the whole thing - this is another problem. Should look into in the future