ICAT Project


Meeting 115 – 29th March 2018


Present: Stuart Pullinger (SP), Chris Prosser (CP), Brian Ritchie (BR), Rolf Krahl (RK), Heike, Alex di Maria (AM), Maxime Chaillet (MC).

Apologies: Silvia Da Graca Ramos.

Actions from previous meeting

102.2 – BR still working on this.

113.1 – SP to look at IDS pull requests. SP can’t remember where he’d got to. Rolf sent responses today; back to Stuart to run Rolf’s test script. Rolf ran test, no problems.

113.2 – Catherine (CJ) to organise the Topcat development meeting. Probably not going to happen this month, maybe some time late in May. Lots of us away.

ICAT Components

Topcat (BR) – DLS had a problem with the cart, which revealed that the add-to-cart client code was casting longs (in JSON) to signed ints, resulting in overflow. Fixed and released for DLS (who are using 2.3.2); the fix has been applied to master, but not to the current release (2.3.6). Hope to release 2.4.0 rather than patch 2.3.6.
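
The overflow described above can be illustrated with a minimal sketch. This is not the Topcat client code: it is written in Python, the field name `entityId` and the helper `to_int32` are hypothetical, and the wrap-around is simulated explicitly because Python integers do not overflow on their own.

```python
import json


def to_int32(value: int) -> int:
    """Simulate casting an integer to a signed 32-bit int (two's complement wrap)."""
    wrapped = value & 0xFFFFFFFF  # keep only the low 32 bits
    # Values with the top bit set are negative in two's complement.
    return wrapped - 0x1_0000_0000 if wrapped >= 0x8000_0000 else wrapped


def fits_int32(value: int) -> bool:
    """Return True if the value survives a signed 32-bit cast unchanged."""
    return -0x8000_0000 <= value <= 0x7FFF_FFFF


# Hypothetical cart payload: an id that needs more than 32 bits of signed range.
payload = json.loads('{"entityId": 3000000000}')
entity_id = payload["entityId"]

print(fits_int32(entity_id))  # False: too large for a signed 32-bit int
print(to_int32(entity_id))    # the id silently becomes a different, negative number
```

Ids up to 2147483647 pass through such a cast unchanged, which is why a problem of this kind only shows up once ids grow beyond that value.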

IJP (BR) – progress in prototype deployment for Octopus: finally succeeded in submitting an IJP job to Scarf (Platform LSF cluster). Continuing to develop the HTCondor batch connector.

IDS (RK) – snapshot one month ago, no comments. Considering releasing ids plugin 1.4, ids storage, ids server 1.9.0, sometime in April. Stuart will test these at same time as trying out Rolf’s script. (RK: script could be tried on production as it’s read-only. SP will try on DLS pre-prod.)

Site Updates

DLS (CP) – experiencing a number of problems. Service usage is increasing: 1000 recalls in the last 6 months. There are issues with recalls moving from prep to expired. The ‘prepareData’ call is not timely in responding (exceeding the 30-minute timeout in Topcat?). DLS have a mechanism to replay recalls, but it is tedious and manual. CP thinks the root cause is the query generated by Java/JPA for large filesets: Oracle is not able to execute it quickly enough. Also experiencing problems with the Topcat download test script: during high (IDS) load, tests can time out.

RK: open issue on ids.server from DLS, “archive can be too slow” (last Sep); perhaps filling up ids cache?

CP: if several users request large files, the cache can get full.

RK: needs more info than described in the issue to determine circumstances.

HZB (RK): not too much. 2nd data publication. No workflow yet, so improvising.

ESRF (AM): DOI / landing page problems – DataCite. Not sure about features. More beamlines. No ICAT issues.

SP: After upgrading to Payara, increased memory usage on DLS preprod, related to Lucene (icat.server 4.8.0 – so the Lucene component is still integrated). Possibly some differences in default configuration between Payara and Glassfish? No-one else (including ISIS) sees similar problems.

Review Outstanding Schema Changes (SP)

The proposals are drawn from presentations at the Face to Face meeting. This review is only intended to decide whether a proposal has enough support to be considered. The proposals will be looked at in more detail at future meetings. Comments are by SP unless indicated.

Rolf’s presentation

The first and last proposals are uncontroversial; the middle proposal is larger and may need more discussion.

Persistent ID Proposal: we have persistent IDs for investigation, dataset, datafile, but not for instrument, parameter type, sample. Propose to add doi field to these entities.
Conclusion: Consider adopting this.

Data Publication Workflow: Workflow to include fields from DataCite. Would require adding new classes/relations. SP sees no problem, but thinks it needs more discussion, so suggests pushing it back (behind other changes).

SP: wonders if it has any crossover with collaboration/integrations with other data-sharing standards, e.g. OAI-PMH?
RK: suggests DataCite metadata. Depends on which objects we want to describe. For current schema, Investigation doesn’t perfectly match DataCite (would need to be lax on semantics). Dataset would need creator and maybe contributors. InvestigationUsers may not be Creators, etc.
AM: Using InvestigationUser as Creators.
RK: proposal matches DataCite.
Conclusion: Consider adopting. Discuss after other, easier, schema changes.

Add Relation from Shift to Instrument:

SP: small, easy.
Conclusion: Consider adopting this.

Steve’s presentation

Eliminate unused entities: for example: study, publication, keyword.
SP: worth keeping and reviewing.

Rename Investigation to Visit, introduce Proposal/Investigation:
SP: not sure it’s worth the effort.
AM: makes sense to us (and DLS).
Conclusion: Consider adopting this.

Add count of datafiles and total size to dataset and up:
Worth considering. No objections raised.
Conclusion: Consider adopting this.
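
As a rough illustration of what "count of datafiles and total size to dataset and up" would hold, the sketch below computes the two values on the fly for a dataset and for its parent investigation. It is only a sketch under assumptions: the class and attribute names (`file_size`, `file_count`, `total_size`) are hypothetical and not taken from the ICAT schema, and the proposal itself would presumably store these values rather than recompute them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Datafile:
    name: str
    file_size: int  # size in bytes (hypothetical attribute name)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)

    @property
    def file_count(self) -> int:
        return len(self.datafiles)

    @property
    def total_size(self) -> int:
        return sum(f.file_size for f in self.datafiles)


@dataclass
class Investigation:
    name: str
    datasets: List[Dataset] = field(default_factory=list)

    @property
    def file_count(self) -> int:
        # "and up": the parent aggregates the values of its datasets.
        return sum(ds.file_count for ds in self.datasets)

    @property
    def total_size(self) -> int:
        return sum(ds.total_size for ds in self.datasets)


raw = Dataset("raw", [Datafile("a.nxs", 1000), Datafile("b.nxs", 2500)])
reduced = Dataset("reduced", [Datafile("c.nxs", 400)])
visit = Investigation("example-visit", [raw, reduced])

print(raw.file_count, raw.total_size)      # 2 3500
print(visit.file_count, visit.total_size)  # 3 3900
```

Storing the values instead of computing them would avoid walking all datafiles on every read, at the cost of keeping the stored numbers up to date whenever a datafile is added, removed or resized.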

Remove Dataset, replacing it with Directory (possibly hierarchical).
Remove Datafile and Dataset/Directory from RDBMS.
These are more radical changes, but we should see whether DLS or others would like them.
AM: would like Datasets to be able to contain Datasets.
RK: need to discuss very thoroughly – big impact.
Conclusion: Consider adopting. Discuss after other, easier, schema changes and after getting more feedback from DLS.

Final proposal

Add JSON type to parameter fields.
RK: how would we search for individual items? Other than that, happy.
AM: considering creating another component, json-based, to add searchable data linked to ICAT.
RK: would need a prototype to experiment.
Conclusion: Consider adopting this.

Ticket Review (SP)

  • Looking at the Waffle.io 'Kanban'-style issue tracker, which displays the open issues in the icatproject GitHub repositories. There are 4 columns: Backlog (containing most of the issues), Ready (currently empty), In Progress and Done.
  • Running out of time, so only reviewing issues in ‘In Progress’.
  • ‘In Progress’ should mean the owner expects to report on it at next meeting.
  • IJP-related tickets to be reviewed by BR
  • #368 Topcat shaking screen: BR has a fix in test
  • IDS items: RK happy to keep them in ‘In Progress’
  • SP: asks all to look at Backlog and move to Ready any that are higher-priority.


Item  | Description                                                                               | Assigned
------|-------------------------------------------------------------------------------------------|-----------------
102.2 | Produce new releases of IJP components                                                    | Brian Ritchie
113.1 | Look at and test Rolf’s IDS Server and IDS Plugin pull requests                           | Stuart Pullinger
113.2 | Organise a meeting for the discussion of a Topcat replacement                             | Catherine Jones
115.1 | Look at issues in the Backlog on waffle.io and move to Ready any that are higher-priority | All
115.2 | Review IJP-related tickets in ‘In Progress’ column on waffle.io                           | Brian Ritchie
115.3 | Create a timetable for discussions of schema changes                                      | Stuart Pullinger

Next meeting will be Thursday 26th April, 3pm UK time