eResearch

Mirage Software Products

This page lists the more significant software products that have been developed as part of the Mirage project. These include:

  • The ACLS Proxy - this service provides a primitive bridge between ACLS access control and other components.
  • The Data Grabber - this service "grabs" data files written to a file share, assembles them into Datasets, and presents them as an ATOM feed.
  • The ACLS Authentication Adapter - this service provides simple HTTP authentication using an ACLS server as its authentication authority.
  • ACLSlib - this Java library implements the client and server sides of the proprietary ACLS login console protocol(s).
  • The MyTardis atom-feed app - this app ingests Datasets provided by an ATOM feed.
  • The MyTardis oai-pmh app - this app implements an OAI-PMH service that allows harvesting of RIF-CS collection metadata for Experiments.
  • The MyTardis related-info app - this app allows the user to designate an Experiment as being related to external publications or resources.
  • The MyTardis anzsrc-code app - this app allows the user to attach ANZSRC "field of research" (FoR) codes to Experiments.
  • The MyTardis data-migration app - this app moves data between different storage locations in a MyTardis implementation.

The source code for all of these components is available from GitHub (see below).  Our MyTardis components are BSD licensed (to be consistent with MyTardis), and the other components are GPLv3.

The Mirage project has also made extensive contributions to the core MyTardis codebase including enhancements to the core data model and the user interfaces.  We are continuing to contribute as time permits.

The ACLS Proxy (Eccles)

The ACLS Proxy is a service that is designed to sit between an ACLS server and a number of systems running the ACLS "login console" application.  To the ACLS server, the proxy appears to be a collection of login consoles on the same IP address using the "host id" identification scheme.  To each login console, it appears to be the ACLS server.

The proxy works by accepting requests from the login console apps, and passing them through to the real server.  When the server responds, the replies are passed back to the console app.  In the process, the proxy keeps a record of the login and logout events for each console in a shared database table for use by the Data Grabber.  The proxy also makes use of descriptors for the instruments and their consoles that are maintained using the Data Grabber's admin web UI.
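
The pass-through-and-record behaviour described above can be sketched roughly as follows.  This is a minimal Python illustration, not the Eccles implementation (which is a Java service).  The message fields (`type`, `user`, `console`, `status`), the `forward` callable and the in-memory `events` list are all hypothetical stand-ins for the real ACLS protocol and the shared database table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical message shape: the real ACLS protocol is not modelled here.
Message = Dict[str, str]


@dataclass
class RecordingProxy:
    """Relays console requests to a server and logs login/logout events."""
    forward: Callable[[Message], Message]  # stand-in for the link to the real server
    events: List[dict] = field(default_factory=list)  # stand-in for the shared table

    def handle(self, request: Message) -> Message:
        reply = self.forward(request)  # pass the request through unchanged
        # Record only login/logout requests that the server accepted.
        if request.get("type") in ("login", "logout") and reply.get("status") == "ok":
            self.events.append({
                "console": request.get("console"),
                "user": request.get("user"),
                "event": request["type"],
                "at": datetime.now(timezone.utc),
            })
        return reply  # hand the server's reply back to the console


def fake_server(request: Message) -> Message:
    """Assumed test double: accepts every user except 'mallory'."""
    return {"status": "refused" if request.get("user") == "mallory" else "ok"}
```

A consumer such as the Data Grabber would then read the recorded events rather than talking to the ACLS server itself.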

The ACLS Proxy is packaged as a free-standing Java service that is run (on Linux) via an "/etc/init.d" service script.  The source code is on GitHub in the https://github.com/UQ-CMM-Mirage/CMM-data-grabber repository, in the "eccles" subdirectory.

The Data Grabber (Paul)

The Data Grabber is the service that "watches" for files written to each instrument's file shares.  When files are written or updated, the service gathers related files to form a Dataset, copies them to a staging area, and then creates an entry in the ingestion queue.  Datasets that can be identified as belonging to a specific user (i.e. that were saved to instrument share while the user was logged in) are made available for immediate ingestion.  Other Datasets are "held" until some user claims them.
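
As an illustration of how a Dataset might be attributed to a user, the sketch below matches a file's write time against recorded login sessions.  It is only a sketch under stated assumptions: the `Session` record and the `owner_of` function are hypothetical, and the real service's matching rules may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Session:
    """Hypothetical login session record, as the proxy might store it."""
    user: str
    instrument: str
    login: datetime
    logout: Optional[datetime]  # None while the user is still logged in


def owner_of(instrument: str, written_at: datetime,
             sessions: List[Session]) -> Optional[str]:
    """Return the user logged in on `instrument` at `written_at`, else None."""
    for s in sessions:
        still_open = s.logout is None
        if (s.instrument == instrument and s.login <= written_at
                and (still_open or written_at <= s.logout)):
            return s.user
    return None  # no matching session: the Dataset would be "held"
```

A result of `None` corresponds to the "held" case, where the Dataset waits for a user to claim it.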

While the Data Grabber service is designed to run all of the time, it is sometimes necessary to shut it down for a period.  When the service restarts, it goes through a "catchup" process where it processes files that arrived while it wasn't "grabbing".  It can also be used to check that files were grabbed correctly by comparing the copies of the original files in the file shares against the grabbed copies and / or their administrative metadata.
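
The "catchup" comparison can be pictured with the following sketch, which lists files in a share that are either missing from a record of grabbed files or whose size or modification time has changed.  The `GrabRecord` mapping and the `find_ungrabbed` function are assumptions made for illustration; the real service keeps its administrative metadata in its own form.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Hypothetical record of what was grabbed: relative path -> (size, mtime_ns).
GrabRecord = Dict[str, Tuple[int, int]]


def find_ungrabbed(share: Path, grabbed: GrabRecord) -> List[str]:
    """List files in `share` that are new or changed since they were grabbed."""
    pending = []
    for path in sorted(share.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(share).as_posix()
        st = path.stat()
        if grabbed.get(rel) != (st.st_size, st.st_mtime_ns):
            pending.append(rel)  # missing from the record, or size/mtime differ
    return pending
```

Comparing size and modification time is cheap; a stricter check could compare content hashes instead, at the cost of reading every file.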

The Data Grabber is a webapp implemented using J2EE servlet technology, and is designed to run in a Tomcat 6 or 7 web container.  We use Hibernate, Spring MVC and JSPs for the web UI.  Java 7 is a prerequisite because we make use of the "WatchService" APIs introduced in Java 7.  The source code is on GitHub in the https://github.com/UQ-CMM-Mirage/CMM-data-grabber repository, in the "paul" subdirectory.

The ACLS Authentication Adapter (Benny)

The ACLS Authentication Adapter is a simple HTTP service that can be used to check username / password pairs against ACLS.  Another service (e.g. MyTardis or the Data Grabber) sends an HTTP GET request to the adapter.  The adapter responds with a 401 response containing an authentication challenge.  When the other service resends the request with the username and password, the adapter sends an ACLS console login request to the ACLS service (using a special "dummy" instrument name).  Depending on whether or not the ACLS service accepts the username and password, the adapter responds to say that login is allowed or denied.
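
The challenge-then-check exchange can be sketched as below.  This is not the Benny code (which is a Java servlet).  It assumes HTTP Basic authentication with the standard 401 challenge status, and the 200 / 403 codes used for "allowed" / "denied", the realm name and the `check` callable (standing in for the ACLS console login request) are all hypothetical.

```python
import base64
import binascii
from typing import Callable, Dict, Optional, Tuple


def parse_basic(header: Optional[str]) -> Optional[Tuple[str, str]]:
    """Decode an HTTP Basic 'Authorization' header into (username, password)."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


def respond(headers: Dict[str, str],
            check: Callable[[str, str], bool]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for one request to the adapter."""
    creds = parse_basic(headers.get("Authorization"))
    if creds is None:
        # No usable credentials yet: issue the authentication challenge.
        return 401, {"WWW-Authenticate": 'Basic realm="ACLS"'}
    # `check` stands in for the ACLS console login request (an assumption).
    return (200, {}) if check(*creds) else (403, {})
```

Keeping the credential check behind a callable means the HTTP handling can be exercised without a live ACLS server.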

The Adapter is a tiny webapp, implemented as a J2EE servlet.  The source code is on GitHub in the https://github.com/UQ-CMM-Mirage/CMM-data-grabber repository, in the "benny" subdirectory.

ACLSlib

The Eccles and Benny services depend on a common protocol library called ACLSlib.  This Java library allows an application or service to send and receive ACLS console protocol messages in both the client and server roles.  Unfortunately, the protocol has no authoritative specification, and it exists in a number of different versions.  The library is a best-effort attempt to hide this messiness.  It was produced by reverse engineering the source code of a couple of versions of the ACLS login console that were provided to the project.  We cannot guarantee that it will work with other versions, or that it will continue to work if ACLS is changed.

The ACLSlib source code is on GitHub in the https://github.com/UQ-CMM-Mirage/ACLS-protocol-library repository.

The mytardis-atom-feed app

Mirage uses an ATOM feed provided by the Data Grabber to find out what new data is available for ingestion.  This ATOM feed is processed by the atom-feed app.  This Python / Django app is specifically designed for the MyTardis framework, and works by polling the ATOM feed on a regular basis.  When it finds a new entry, it creates a MyTardis Dataset and Datafiles (and if necessary an Experiment) populated with the basic ownership metadata and Data Grabber URLs.  Then it creates an asynchronous task to "stage" the Datafiles, which causes them to be "pulled" to the repository by the MyTardis core system.
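
The "find new entries" step of each poll might look something like the sketch below, which uses only the Python standard library.  It is not the app's actual code: the `new_entries` function and the `seen_ids` set are hypothetical, and the sketch assumes that each datafile is advertised as a `rel="enclosure"` link, which may not match the Data Grabber's real feed layout.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple

ATOM = "{http://www.w3.org/2005/Atom}"  # standard Atom XML namespace


def new_entries(feed_xml: str, seen_ids: Set[str]) -> List[Tuple[str, List[str]]]:
    """Return (entry id, enclosure URLs) for entries not yet in `seen_ids`."""
    fresh = []
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id is None or entry_id in seen_ids:
            continue
        # Assumption: each datafile is advertised as a rel="enclosure" link.
        urls = [link.get("href") for link in entry.iter(ATOM + "link")
                if link.get("rel") == "enclosure" and link.get("href")]
        seen_ids.add(entry_id)  # remember it so the next poll skips it
        fresh.append((entry_id, urls))
    return fresh
```

Each returned entry would then be turned into a Dataset, with the URLs handed to the asynchronous staging task.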

The source code for the atom-feed app is on GitHub in the https://github.com/UQ-CMM-Mirage/mytardis-app-atom repository.

The oai-pmh app

The oai-pmh app is used to make the metadata for Experiments available for external harvesting using the OAI-PMH protocol.  As the app currently stands, there is support for two metadata formats: the mandatory DC format required by the protocol, and RIF-CS.  Only "public" Experiments with a non-empty description are made available.  (Other "publishability" checks are implemented in the MyTardis front-end when the user attempts to make an Experiment "public".)  This app is part of the core MyTardis codebase, and the code is available from the "master" MyTardis repository on GitHub as https://github.com/mytardis/mytardis in the tardis/apps/oaipmh directory.
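
The eligibility rule for harvesting can be expressed as a simple filter.  The sketch below is illustrative only: the `Experiment` class is a hypothetical stand-in for the MyTardis model, and treating a whitespace-only description as empty is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Experiment:
    """Hypothetical stand-in for a MyTardis Experiment record."""
    title: str
    description: str
    public: bool


def harvestable(experiments: Iterable[Experiment]) -> List[Experiment]:
    """Keep only public Experiments whose description is not blank."""
    return [e for e in experiments if e.public and e.description.strip()]
```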

The related-info app

The related-info app allows a user to relate an Experiment to an external resource; e.g. a website or publication.  When provided, this information will be included in the RIF-CS records published by the oai-pmh app. This app is part of the core MyTardis codebase, and the code is available from the "master" MyTardis repository on GitHub as https://github.com/mytardis/mytardis in the tardis/apps/related_info directory.

The anzsrc-codes app

The anzsrc-code app allows the user to attach FoR codes to Experiments.  When provided, these are included in the RIF-CS records published by the oai-pmh app. This app is part of the core MyTardis codebase, and the code is available from the "master" MyTardis repository on GitHub as https://github.com/mytardis/mytardis in the tardis/apps/anzsrc_codes directory.

The data-migration app

The CMM Mirage repository is deployed on infrastructure that does not have enough primary online disc space to meet our needs.  We are addressing this by setting up secondary storage locations on campus (e.g. using QCIF provided "cloud" facilities), and using them to hold "less important" data files.  The migration app is being developed to support this.  It supports (or will support):

  • migration of datafile replicas from one storage location to another,
  • mirroring of datafile replicas,
  • automated selection of datafiles for migration based on a scoring system, and
  • archiving of entire experiments for off-line storage.
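
To give a feel for score-based selection, the sketch below ranks datafiles and picks the highest-scoring ones until a target amount of space would be freed.  The scoring formula (size multiplied by days since last access), the `Datafile` class and the function names are all assumptions made for illustration; the app's actual scoring system is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Datafile:
    """Hypothetical stand-in for a datafile held in primary storage."""
    name: str
    size_bytes: int
    days_since_access: int


def score(f: Datafile) -> int:
    # Assumed heuristic: large files that have not been used recently score highest.
    return f.size_bytes * f.days_since_access


def select_for_migration(files: Iterable[Datafile], bytes_to_free: int) -> List[Datafile]:
    """Pick the highest-scoring files until enough space would be freed."""
    chosen, freed = [], 0
    for f in sorted(files, key=score, reverse=True):
        if freed >= bytes_to_free:
            break
        chosen.append(f)
        freed += f.size_bytes
    return chosen
```

Separating `score` from the selection loop makes it easy to swap in a different heuristic without touching the rest of the logic.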

The data-migration app is part of the core MyTardis codebase, and the code is available from the "master" MyTardis repository on GitHub as https://github.com/mytardis/mytardis in the tardis/apps/migration directory.

(At the time of writing, the MyTardis codebase has an "early release" version of the data-migration app that only knows about one replica of any given datafile.  We are currently working on a version that tracks multiple replicas.)