Kooperatives Langzeitarchiv (cooperative long-term archive). A digital archiving system written in Python. Inspired by DIAS-Core from the kopal project.
Koala is a service that can be combined with a pre-existing data repository to form a long-term archival solution. It is not designed to face end users; it is meant to run as a backend service in a service-oriented architecture. Technical metadata must be gathered and basic validation performed before data is ingested into Koala.
Example use case
The GWDG is a cooperation partner of the Deutsche Nationalbibliothek (DNB) in the field of digital long-term archiving. The DNB's task is to collect, permanently archive, comprehensively document, and bibliographically record all German and German-language publications, including digital records.
Workflow: The producer of assets (for example, a publishing company) delivers books in a digital format to the DNB. The DNB performs basic validation, extracts technical metadata, and then creates a Submission Information Package (SIP). A SIP is an archive file, such as a zip or tar file, that contains the content files and descriptive metadata. The SIP is sent to Koala, which ensures long-term data integrity and accounting.
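To make the idea of a SIP concrete, the following minimal sketch packs content files and descriptive metadata into a zip archive using only the Python standard library. The `build_sip` helper, the `content/` directory, and the `metadata.json` file name are assumptions made for illustration; the layout and metadata format that Koala actually expects are not described here and may differ.

```python
import io
import json
import zipfile


def build_sip(content_files: dict, metadata: dict) -> bytes:
    """Pack content files and descriptive metadata into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as sip:
        for name, data in content_files.items():
            # Hypothetical layout: all content files live under content/
            sip.writestr(f"content/{name}", data)
        # Hypothetical metadata file name and format (JSON chosen for brevity)
        sip.writestr("metadata.json", json.dumps(metadata, indent=2))
    return buffer.getvalue()


sip_bytes = build_sip(
    {"book.pdf": b"%PDF-1.7 example"},
    {"title": "Example Book", "publisher": "Example Verlag"},
)

# List the archive members to confirm the package structure.
with zipfile.ZipFile(io.BytesIO(sip_bytes)) as sip:
    names = sorted(sip.namelist())
print(names)
```

Building the archive in memory keeps the sketch free of temporary files; a real producer would more likely write the SIP to disk before transferring it.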
- Fast and scalable: Can be deployed on multiple hosts.
- Easy deployment and upgrade with Docker images.
- DIAS interface compatible, see: API
- Built upon proven open source components.
- Supports multiple archival storage backends: Tape archival with IBM Spectrum Protect (formerly Tivoli Storage Manager), filesystem.
- Supports SFTP for ingesting SIPs
- Supports multiple SIP archive formats: zip, tar, tar.gz, 7z.
- Secured connection: TLS for Web UI and API.
- Protected endpoints: HTTP Basic Auth for API; local DB/LDAP/Single Sign-On for Web UI.
- Web UI for: live statistics, administrative changes, auditing
- DROID signature file importer
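As an illustration of handling the four SIP archive formats listed above, here is a minimal sketch of how a client or pre-ingest check might tell them apart with the Python standard library. The `detect_sip_format` function is hypothetical and is not how Koala itself identifies archives. The 7z branch only checks the file signature, since the standard library cannot open 7z archives.

```python
import io
import tarfile
import zipfile

# Signature found at the start of 7z archives.
SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"


def detect_sip_format(data: bytes) -> str:
    """Return 'zip', 'tar', 'tar.gz' or '7z' for the given archive bytes."""
    if data.startswith(SEVEN_ZIP_MAGIC):
        return "7z"  # signature check only; the archive is not opened
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    # Try gzip-compressed tar first, then plain tar.
    for mode, label in (("r:gz", "tar.gz"), ("r:", "tar")):
        try:
            with tarfile.open(fileobj=io.BytesIO(data), mode=mode):
                return label
        except tarfile.ReadError:
            continue
    raise ValueError("unsupported SIP archive format")


def make_tar(mode: str) -> bytes:
    """Build a tiny in-memory tar archive for the demonstration below."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        payload = b"example"
        info = tarfile.TarInfo("content/book.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


detected = [detect_sip_format(make_tar("w")), detect_sip_format(make_tar("w:gz"))]
print(detected)
```

Inspecting the bytes rather than trusting the file extension avoids misclassifying a renamed upload.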
There are two main actors using the Koala system:
An API client is a remote application that uses the REST and SFTP interfaces to ingest and retrieve assets.
A human administrator controls the system through the provided web interface. The admin can monitor the running system, which includes reviewing the status of the running applications and getting live information about the growing size of the archival storage system.
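For the API client role, the sketch below shows how a request carrying HTTP Basic Auth credentials could be prepared with the Python standard library. The host name, the `/assets/example-id` path, the credentials, and the `build_api_request` helper are all hypothetical; the real endpoints are defined by the API documentation and are not reproduced here. The request is only constructed, not sent.

```python
import base64
import urllib.request


def build_api_request(base_url: str, path: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries an HTTP Basic Auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{path.lstrip('/')}",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Hypothetical host, path, and credentials; replace them with real values.
request = build_api_request(
    "https://koala.example.org/api", "/assets/example-id", "client", "secret"
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. Because Basic Auth credentials are only base64-encoded, they should always travel over the TLS-secured connection.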