Building and Deploying (with Docker)

Information on building and deploying the application in Docker.

Prerequisites

  • Create a database (e.g. terminologydb) in your postgres instance with UTF-8 character encoding. For example,

    psql> CREATE DATABASE terminologydb WITH encoding 'UTF-8';

    NOTE: when redeploying a .dump file for the second time, remember to first DROP and then re-create your database as described above (see the example after this list). Otherwise, the pg_restore command will simply load additional data on top of what is already there.

  • Ensure Docker is configured so that the running container can use up to 4G of memory (the server process itself is capped at 3500M).
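
For reference, a minimal sketch of the drop-and-recreate step mentioned in the note above (assuming the example database name terminologydb):

    psql> DROP DATABASE terminologydb;
    psql> CREATE DATABASE terminologydb WITH encoding 'UTF-8';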

Details

A deployment involves three artifacts: the Docker image of the terminology service, a PostgreSQL database dump (.dump file), and a zip archive of index data.

Following are the steps to deploy the terminology server with a specified data set (the example uses the testing data set).

# Start by setting information about artifacts and your postgres config, e.g.:
dockerImage=wcinformatics/wci-terminology-service:1.2.1-20240108
dumpUrl=https://wci-us-west-2.s3-us-west-2.amazonaws.com/term-server-v2/data/wci-terminology-db-TEST-2024.dump
indexUrl=https://wci-us-west-2.s3-us-west-2.amazonaws.com/term-server-v2/data/wci-terminology-indexes-TEST-2024.zip
PGDATABASE=terminologydb
PGHOST=localhost
PGPORT=5432
PGUSER=postgres
PGPASSWORD=pgP@ssw0rd

# Choose a directory where indexes will live
indexDir=/data/index

# Restore database (see lower in this document for restoring from a plain text dump)
wget -O data.dump $dumpUrl
pg_restore -O -n public -Fc --dbname=$PGDATABASE --username=$PGUSER data.dump

# Unpack indexes
# NOTE: ensure the docker user will be able to access the index files.
# NOTE: if deploying with Kubernetes, you will want to use a persistent volume
#       (the other option is to put the data at an accessible URL and
#       the pod can be configured to download that data and unpack it locally)
mkdir -p $indexDir
wget -O $indexDir/index.zip $indexUrl
unzip $indexDir/index.zip -d $indexDir
chmod -R 777 $indexDir

# Pull and run docker image (use -d to put it in the background)
# NOTE: these commands assume "sudo" is required to run docker
#       and expose the process on port 8080 of the machine
sudo docker run -d --rm -e PGHOST=$PGHOST -e PGPORT=$PGPORT -e PGUSER=$PGUSER -e PGPASSWORD=$PGPASSWORD \
  -e PGDATABASE=$PGDATABASE -p 8080:8080 -v "$indexDir":/index $dockerImage


After launching, you should be able to access the application via http://localhost:8080/terminology-ui/index.html
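
As a quick sanity check, assuming curl is available on the host, you can confirm the UI responds before opening it in a browser:

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/terminology-ui/index.html   # expect 200 once startup completes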

Script for Loading Database and Indexes

If operating in an environment where you have local psql client tools available and connectivity to the database, you can use this handy load-data.sh script to make the process of loading (or reloading) the database a little easier. See: https://github.com/WestCoastInformatics/wci-terminology-service-in-5-minutes/tree/master/load-data

Watching Logs

Logs can be viewed by watching the docker logs (e.g. sudo docker logs -f <container>). However, the application uses a JSON logging format that can be hard to follow. We find the following perl script useful for turning the logs into a more readable form.

$ cat > jlog.pl << 'EOF'
#!/usr/bin/perl
# Reformat the service's JSON log lines as "<time> <level> <message>",
# followed by a condensed stack trace when one is present.
while(<>) {
  $et = "";
  # capture the contents of the extendedStackTrace array, if any
  if (/.*"extendedStackTrace":\[([^\]]*).*/) { $et = $1; }
  # capture the thrown exception message, if any
  if (/.*"thrown":\{.*"message":"(.*)","name".*/) {
    $em = "$1";
    # $em =~ s/(.{1,200}).*/$1/;
  }
  # capture the name field (printed alongside the thrown message below)
  if (/.*"name":"([^"]*).*/) { $name = $1; }
  # pull out level, message and timestamp from the JSON log line
  /.*"level":"([^"]*).*"message":"(.*)","(endOfBatch|thrown).*"time":"([^"]*).*/;
  $level = $1;
  $time = $4;
  $x = "$2";
  # for request-style entries with no message, fall back to "<status> <url path>"
  if (!$x && /"url":"([^"]*).*"status-code":"([^"]*)/) {
    $url = "$1";
    $status = "$2";
    $url =~ s/.*http.*\/\/.*\//\//;
    $x = "$status $url";
  }
  # unescape embedded quotes and newlines
  $x =~ s/\\"/"/g;
  $x =~ s/\\n/\n/g;
  # $x =~ s/(.{0,200}).*/$1/;
  print "$time $level $x\n" if $x;
  # print the stack trace as "file:line" entries with increasing indentation
  if ($et) {
    $indent = " ";
    print "$name: $em\n";
    foreach $trace (split /\},\{/, $et) {
      $trace =~ s/.*"file":"([^"]*).*"line":(\d+),.*/$1\:$2/;
      print "$indent$trace\n";
      if (length($indent)<20) { $indent .= " "; }
    }
  }
}
EOF
$ chmod 755 jlog.pl

With this script in hand, something like this can be done to see the logs more easily:

sudo docker logs <container> | ./jlog.pl

Connecting to Postgres with SSL

One additional environment variable, PGJDBCPARAMS, can be passed to the docker container to add parameters to the JDBC URL.

This mechanism can be used to inject SSL parameters, for example to use a non-validating SSL connection.
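
A minimal sketch, reusing the variables set earlier; with the PostgreSQL JDBC driver, sslmode=require encrypts the connection without validating the server certificate (the exact string your service expects in PGJDBCPARAMS may differ):

sudo docker run -d --rm -e PGHOST=$PGHOST -e PGPORT=$PGPORT -e PGUSER=$PGUSER -e PGPASSWORD=$PGPASSWORD \
  -e PGDATABASE=$PGDATABASE -e PGJDBCPARAMS="sslmode=require" \
  -p 8080:8080 -v "$indexDir":/index $dockerImage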

For a situation where you know the PGSSLROOTCERT value you would use to connect via psql, the same root certificate can be supplied to the container.
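
A sketch, reusing the variables set earlier and assuming the root certificate has been copied into $indexDir as root.crt (an illustrative name) as described below; the sslmode value is also illustrative:

sudo docker run -d --rm -e PGHOST=$PGHOST -e PGPORT=$PGPORT -e PGUSER=$PGUSER -e PGPASSWORD=$PGPASSWORD \
  -e PGDATABASE=$PGDATABASE -e PGSSLROOTCERT=/index/root.crt -e PGJDBCPARAMS="sslmode=verify-ca" \
  -p 8080:8080 -v "$indexDir":/index $dockerImage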

In the case of using PGSSLROOTCERT, it must be set to a path that is accessible within the docker container. The easiest way to achieve this is to place your root certificate file in $indexDir, the volume already mounted into the container, and then set PGSSLROOTCERT to /index/<cert file>. The certificate file can be a PEM encoded X509v3 certificate.
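
For example, with an illustrative file name of root.crt:

cp /path/to/root.crt $indexDir/root.crt
chmod 644 $indexDir/root.crt
# inside the container the file is then available at /index/root.crt, so set PGSSLROOTCERT=/index/root.crt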

NOTE: while you're testing this, you may want to get your psql client connecting to the server with SSL first, so you can work out the proper "sslmode" and "sslrootcert" parameters you'll want to use in PGJDBCPARAMS. psql accepts sslmode and sslrootcert as connection string parameters (or via the PGSSLMODE and PGSSLROOTCERT environment variables), which lets you simulate what we will do with the JDBC params.
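
For example, a quick connectivity test from the host (the sslmode value and certificate path are illustrative):

psql "host=$PGHOST port=$PGPORT dbname=$PGDATABASE user=$PGUSER sslmode=verify-ca sslrootcert=$indexDir/root.crt" -c 'SELECT 1'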

Troubleshooting

Need to restore postgres from a plain-text dump

This is what dumping to a plain-text format looks like:
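
A sketch of one way to do it, mirroring the schema and ownership options used for the custom-format dump above; the output file name is illustrative:

pg_dump -O -n public -Fp --dbname=$PGDATABASE --username=$PGUSER | gzip > terminologydb-plain.sql.gz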

The plain dump can be restored this way:
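
A sketch, assuming a gzipped plain-text dump like the one produced above and a freshly created, empty database:

gunzip -c terminologydb-plain.sql.gz | psql --dbname=$PGDATABASE --username=$PGUSER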