- Administration interface
Koala consists of multiple separate processes (red) and middleware components (blue).
Data Management (DB)
Primary datastore. Contains asset metadata, information about ingest and retrieval requests, as well as known file types, schemas, and user information. MySQL is currently supported.
Message Queue (MQ)
The message queue is used to asynchronously route ingest and retrieval requests to loader and retriever applications. RabbitMQ is currently supported.
Physically stores the assets. Multiple implementations of an Archival Storage client are available, for example TSM and filesystem.
Stores heartbeat information about the running applications, which is shown in the web application. Retrieves and caches statistics about the archived packages from the main database. Redis is currently supported.
Monitors the upload directory for new SIPs and publishes the data to the message queue for further processing. Only one scheduler may be running at any time.
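The directory monitoring described above can be illustrated with a minimal polling sketch. It is not Koala's implementation: the function name `scan_upload_directory`, the in-memory `seen` set, and the `publish` callback (standing in for a message queue client) are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable, List, Set


def scan_upload_directory(
    upload_dir: Path,
    seen: Set[str],
    publish: Callable[[str], None],
) -> List[str]:
    """Publish every entry in upload_dir that has not been seen before.

    `publish` is a hypothetical stand-in for sending a message to the queue.
    Returns the names published during this scan, in sorted order.
    """
    newly_published = []
    for entry in sorted(upload_dir.iterdir()):
        if entry.name in seen:
            continue  # already handed over in an earlier scan
        publish(entry.name)
        seen.add(entry.name)
        newly_published.append(entry.name)
    return newly_published
```

Because the `seen` set lives in a single process, two schedulers running in parallel would each publish the same SIP once, which is one plausible reason for the single-scheduler rule.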
Multiple loaders can be started simultaneously; each one connects to the message broker and processes SIPs.
Processing includes, among others, the following steps:
- validating files and metadata
- calculating checksums
- storing metadata in DB and assets in Archival Storage.
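As an illustration of the checksum step listed above, the following sketch computes a file digest in fixed-size chunks so that large assets do not have to fit into memory. The documentation does not state which hash algorithm Koala uses; SHA-256 and the function name `file_checksum` are assumptions.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks of chunk_size bytes."""
    digest = hashlib.new(algorithm)  # algorithm choice is an assumption, not Koala's documented default
    with path.open("rb") as handle:
        # iter() with a sentinel stops as soon as read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A loader could compare such a digest with the value delivered in the SIP metadata and treat a mismatch as a validation failure.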
The status of the ingest process is continuously updated and can be monitored through the Web UI or API.
A reverse proxy for SSL offloading and serving of static content.
Provides an HTTP API for third-party applications and a web UI for administrative operations.
Multiple retrievers can be started simultaneously; each one connects to the message broker and processes retrieval requests from clients. AIPs are fetched from the Archival Storage system, then converted to a DIP and saved in the HTTP-accessible downloadarea. The status of the retrieval process is continuously updated and can be monitored through the web application.
Monitors the downloadarea filesystem size and removes old packages once the high-water mark is reached.
Queries the main database, calculates statistics and saves the information in the cache.
The ingest process is modeled as a state machine with the end states done, error, and fatal. As soon as the client receives a "done" ticket from the web application, the AIP has been saved on stable storage (archival storage) and every participating system, e.g. mdqi and data management, has the required data. Every ingest of a SIP is also a "transaction", which means that a failing step leads to a recovery/rollback operation. Recovery is performed by the loader in its error state or, if the failure was fatal, by the purger application. During the recovery procedure the purger locks the corresponding workarea and blocks the loader application from starting.
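The state machine can be sketched as an enumeration plus a table of allowed transitions. Only the three end states (done, error, fatal) come from this documentation; the intermediate states `QUEUED`, `VALIDATED`, and `STORED`, as well as the transition table, are hypothetical and serve only to show the pattern.

```python
from enum import Enum


class IngestState(Enum):
    QUEUED = "queued"        # hypothetical intermediate state
    VALIDATED = "validated"  # hypothetical intermediate state
    STORED = "stored"        # hypothetical intermediate state
    DONE = "done"
    ERROR = "error"
    FATAL = "fatal"


END_STATES = frozenset({IngestState.DONE, IngestState.ERROR, IngestState.FATAL})
FAILURES = frozenset({IngestState.ERROR, IngestState.FATAL})

# Every non-final step may proceed to the next step or fail.
ALLOWED = {
    IngestState.QUEUED: FAILURES | {IngestState.VALIDATED},
    IngestState.VALIDATED: FAILURES | {IngestState.STORED},
    IngestState.STORED: FAILURES | {IngestState.DONE},
}


def advance(current: IngestState, target: IngestState) -> IngestState:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if current in END_STATES:
        raise ValueError(f"{current.value} is an end state")
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

In such a model, reaching `ERROR` or `FATAL` would be the point at which the recovery/rollback operation is triggered.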
When the client requests a specific package, a retriever application saves the data to the HTTP-accessible downloadarea. The DIPs remain accessible until a configured threshold is reached, typically 70% downloadarea usage. The purger application then removes DIPs from oldest to newest.
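The oldest-first cleanup can be sketched as a pure selection function. The documentation does not say how far the purger goes once the threshold is hit; this sketch assumes it removes packages until usage falls below the threshold again. `Package`, `select_for_removal`, and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Package:
    name: str
    size_bytes: int
    created_at: float  # POSIX timestamp; smaller means older


def select_for_removal(
    packages: Iterable[Package],
    capacity_bytes: int,
    high_water_mark: float = 0.70,
) -> List[str]:
    """Return the names of packages to delete, oldest first."""
    packages = list(packages)
    used = sum(p.size_bytes for p in packages)
    limit = capacity_bytes * high_water_mark
    to_remove = []
    for package in sorted(packages, key=lambda p: p.created_at):
        if used < limit:
            break  # assumption: stop as soon as usage is below the threshold
        to_remove.append(package.name)
        used -= package.size_bytes
    return to_remove
```

For example, with a capacity of 100 bytes and three packages of 30, 30, and 20 bytes, usage is 80% and only the oldest package is selected, which brings usage down to 50%.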
Koala can be configured to use local and remote authentication mechanisms. Local accounts are saved in the database with hashed and salted passwords. Remote accounts can be authenticated with LDAP or single sign-on via OpenID Connect, see Configuration. To allow a remote account to access Koala, it is nevertheless required to manually create a local account and assign appropriate permissions, see Permissions.
Functional accounts with permission=api must be created as local accounts.
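To illustrate what storing hashed and salted passwords for local accounts typically involves, here is a minimal sketch using PBKDF2 from the Python standard library. The actual hashing scheme used by Koala is not documented here; the algorithm, the iteration count, and the function names are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative value, not taken from Koala


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)
```

Only the salt and the digest would be stored per account; the per-account salt ensures that two users with the same password end up with different digests.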
Koala defines a set of permissions that restrict access to different UI and API functions. Permissions are specified when creating or editing users via the UI.
The following permissions are available:
- admin - grants the ability to access all UI functions
- monitor - grants the ability to view the dashboard
- api - grants the ability to execute all API functions
- api_restricted - grants the ability to execute all API functions except deletion of assets and AIPs
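The difference between api and api_restricted can be expressed as a small check. The operation names below are hypothetical, and the sketch assumes that admin and monitor govern UI access only, which the list above suggests but does not state explicitly.

```python
# Hypothetical operation names for the two deletion functions.
DELETION_OPERATIONS = frozenset({"delete_asset", "delete_aip"})


def api_call_allowed(permission: str, operation: str) -> bool:
    """Decide whether an account with the given permission may call an API operation."""
    if permission == "api":
        return True
    if permission == "api_restricted":
        return operation not in DELETION_OPERATIONS
    # Assumption: "admin" and "monitor" apply to the UI and grant no API access.
    return False
```

With this check, an api_restricted account could query an ingest status but would be refused when calling `delete_aip`.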
Ingest status query
Query ingest status by SIP name.
Ingest status result
Show result of ingest status query.
Configure accepted file types.
Show administrative actions like login/logout or deletion of assets.
Configure schemata to validate against.
Manage administrative and API user accounts and their permissions.