Installation and Deployment

Installation

Clone repository and edit config files

root@koala:~# git clone https://gitlab-ce.gwdg.de/koala/koala.git
root@koala:~# cd koala/install/compose-koala
root@koala:/root/koala/install/compose-koala# ls -F
common.env  dias-koala/  docker-compose.yml  dsm.sys  koala.key  koala.pem  mysql_dump.sql
  • Edit config files: common.env, dsm.sys
  • Ensure the configured paths exist, for example: /dias/workarea
  • Create an SSL certificate with its private key
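
The last two preparation steps can be sketched as shell helpers. This is a minimal sketch, not the project's official procedure: the helper names ensure_paths and make_selfsigned_cert are hypothetical, koala.example.org is a placeholder hostname, and the assumption that koala.pem holds the certificate and koala.key the private key is inferred only from the file listing above. A self-signed certificate is suitable for test systems only.

```shell
#!/bin/sh
# Sketch only: helper names, hostname and file roles are assumptions.

# ensure_paths DIR...: create each directory if it is missing and
# verify that it is writable by the current user.
ensure_paths() {
    for dir in "$@"; do
        mkdir -p "$dir" || return 1
        [ -w "$dir" ] || { echo "not writable: $dir" >&2; return 1; }
    done
}

# make_selfsigned_cert HOSTNAME [DIR]: write koala.key and koala.pem
# (self-signed, valid for 365 days) into DIR (default: current directory).
make_selfsigned_cert() {
    host=$1
    out=${2:-.}
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$out/koala.key" -out "$out/koala.pem" \
        -days 365 -subj "/CN=$host" 2>/dev/null &&
        chmod 600 "$out/koala.key"
}

# Example usage (run from install/compose-koala; values are placeholders):
#   ensure_paths /dias/workarea
#   make_selfsigned_cert koala.example.org
```

For a production system the certificate would normally come from a certificate authority instead; only the file names would stay the same.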

Login to docker registry

root@koala:~# docker login docker.gitlab-ce.gwdg.de/koala/koala
Username: user@gwdg.de
Password:
Login Succeeded

Startup stack

root@koala:/root/koala/install/compose-koala# export NR_OF_CPUS=$(grep -c processor /proc/cpuinfo)
root@koala:/root/koala/install/compose-koala# docker-compose up -d --scale loader=$NR_OF_CPUS
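
As a variation on the CPU count above, the following sketch prefers nproc (which honours CPU affinity limits) and falls back to getconf. The function name cpu_count is hypothetical. The script only prints the resulting docker-compose command so that it can be reviewed before running it.

```shell
#!/bin/sh
# cpu_count: print the number of usable CPUs.
# Prefers nproc (GNU coreutils); falls back to getconf.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN
    fi
}

NR_OF_CPUS=$(cpu_count)
export NR_OF_CPUS

# Print the command instead of executing it; remove "echo" to start the stack.
echo docker-compose up -d --scale "loader=$NR_OF_CPUS"
```

One loader per CPU follows the command shown above; a smaller value may be preferable if other services share the host.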

Deployment

The koala stack is deployed as Docker containers and orchestrated with docker-compose. A sample docker-compose.yml can be found in the install directory. A basic deployment consists of the following application containers: web, scheduler, purger, loader, retriever, stats. Depending on load, additional loader and retriever containers can be spawned.
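
To check that a compose file defines all six application containers, a small filter like the following may help. It is a sketch under one assumption: that the compose service names are identical to the container names listed above (only loader is confirmed by the --scale example). The function name missing_services and the demo input are made up for illustration.

```shell
#!/bin/sh
# Assumption: compose service names equal the application container names.
EXPECTED_SERVICES="web scheduler purger loader retriever stats"

# missing_services: read service names (one per line) on stdin and print
# every expected service that is absent from the input.
missing_services() {
    present=$(cat)
    for svc in $EXPECTED_SERVICES; do
        printf '%s\n' "$present" | grep -qxF "$svc" || echo "$svc"
    done
}

# Demo with made-up input; against a real compose file use:
#   docker-compose config --services | missing_services
printf 'web\nscheduler\nloader\nmysql\n' | missing_services
```

With the demo input the filter prints purger, retriever and stats, one per line. Empty output means every expected service is defined.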

Minimum requirements:

  • x86_64 Linux
  • Docker >= 17.03.0-ce
  • 2 CPU
  • 4 GB RAM
  • HDD System 20 GB
  • HDD DB 50 GB
  • HDD Workarea 200 GB
  • HDD Downloadarea 200 GB
  • HDD Uploadarea 200 GB

Note

The size estimates assume SIPs smaller than 50 GB. Larger SIPs require appropriately dimensioned upload and work areas.
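
Some of the minimum requirements can be checked before installation with a short preflight script. The following is a sketch, not part of koala: the helper names version_ge and report are hypothetical, the RAM threshold of 3500 MiB is an assumed tolerance (a 4 GB machine usually reports slightly less than 4096 MiB), and disk sizes are not checked because the mount points depend on the local configuration.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report LABEL ACTUAL REQUIRED: print ok/WARN for an integer comparison.
report() {
    if [ "$2" -ge "$3" ]; then
        echo "ok   $1: $2 (need >= $3)"
    else
        echo "WARN $1: $2 (need >= $3)"
    fi
}

arch=$(uname -m)
if [ "$arch" = "x86_64" ]; then
    echo "ok   arch: $arch"
else
    echo "WARN arch: $arch (need x86_64)"
fi

report "CPUs" "$(getconf _NPROCESSORS_ONLN)" 2

# MemTotal is given in kB; convert to MiB and allow some tolerance.
if [ -r /proc/meminfo ]; then
    mem_mib=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    report "RAM (MiB)" "$mem_mib" 3500
fi

if command -v docker >/dev/null 2>&1; then
    v=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
    v=${v%%-*}    # strip suffixes such as "-ce"
    if [ -n "$v" ] && version_ge "$v" "17.03.0"; then
        echo "ok   docker: $v"
    else
        echo "WARN docker: ${v:-daemon not reachable} (need >= 17.03.0)"
    fi
else
    echo "WARN docker: not installed"
fi
```

The script only reports; it never aborts, so it can be run safely on any candidate host.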

Typical

Hardware

               koala-test                        koala-prod
Description    Test system                       Productive system
Applications   Database, MQ, Cache, Koala apps   Database, MQ, Cache, Koala apps
CPU            2                                 4
RAM            8 GB                              6 GB
HDD            1 TB                              256 GB
SSD            -                                 50 GB (uploadarea), 100 GB (workarea)
Platform       VM                                VM

Cluster

Variant A

Applicable if loader processing time is the bottleneck.

  • shared storage for koala-1 and koala-2
  • web application only on main instance
  • middleware only on main instance
  • ingests get distributed across two nodes
  • koala-2 does not need a public IP

[Figure: cluster-a]

Variant B

Applicable if transferring the files via SFTP is the bottleneck.

  • retriever and web application only on main instance
  • middleware only on main instance
  • ingests get distributed across two nodes
  • retrieves only on main instance
  • no shared storage needed
  • koala-2 needs a public IP
  • purger app per host needed
  • client application must query /api/ftpinfo to find out which hostname to use for uploading files

[Figure: cluster-b]
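
A client could look up the upload host in Variant B roughly as follows. This is only a sketch: the helper names ftpinfo_url and fetch_ftpinfo are hypothetical, koala.example.org is a placeholder, and neither the response format nor any required authentication of /api/ftpinfo is described here, so the raw response is printed unchanged.

```shell
#!/bin/sh
# ftpinfo_url BASE: build the ftpinfo URL for a base URL
# (a trailing slash on BASE is tolerated).
ftpinfo_url() {
    printf '%s/api/ftpinfo\n' "${1%/}"
}

# fetch_ftpinfo BASE: print the raw response of the ftpinfo endpoint.
# -f: fail on HTTP errors, -sS: quiet but still show error messages.
fetch_ftpinfo() {
    curl -fsS "$(ftpinfo_url "$1")"
}

# Example (placeholder hostname; authentication, if needed, is not covered):
#   fetch_ftpinfo https://koala.example.org
```

The client would then open its SFTP connection to whichever hostname the response names instead of assuming the main instance.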

Variant C

Applicable in the situations covered by A and B, and additionally when multiple retrieves run simultaneously or the middleware is under high read/write load.

  • same as B
  • koala-1 and koala-2 http and sftp interfaces can be used interchangeably
  • koala-db as separate host for middleware
  • koala-db does not need a public IP

[Figure: cluster-c]

Software

Frontend

[Figure: frontend]

Backend

[Figure: backend]