Differences

This shows you the differences between two versions of the page.

--- priv:navrh:backups [2019/03/25 18:34] – fiserp
+++ tutorial:adm:backups [2020/03/20 10:13] (current) – [Restoring IdM application] fiserp
@@ Line 1: / Line 1: @@
-====== Backups and recovery ======
+====== Server preparation - Backup and Recovery ======
 When it comes to backups, a CzechIdM deployment consists of three parts:
@@ Line 8: / Line 8: @@
 Depending on your deployment, there can also be sets of scripts, the Vault and so on. But those are highly deployment-specific.
+<note important>All backups should be shipped off the production machine. If the production server fails, it must be possible to rebuild it from scratch (using documentation) and to restore the repository and configuration from a recent backup.</note>
 ===== Repository backups =====
-Using PostgreSQL allows us to do the ''pg\_dump'' of the repository which is a primary backup strategy. Because identity manager contains company's private data, it is desirable to encrypt the repository backup. We will store all backups in the ''/opt/backup/...'' location. To create encrypted backups, we will use following shell script:
+Using PostgreSQL allows us to run ''pg\_dump'' on the repository, which is the primary backup strategy. Because the identity manager contains the company's private data, it is desirable to encrypt the repository backup. We will store all backups in the ''/opt/backup/...'' location. To create encrypted backups, we will use [[https://github.com/bcvsolutions/czechidm-monitoring/blob/master/backups/encrypted_backup.sh|this shell script]].
-<file bash enc_backup.sh>
-#!/bin/bash
-# ********************************** READ ME **********************************
-#
-# General:
-# Script is intended to do encrypted backups of whatever you implement in parts
-# "do the dump" and "pack the dump". The result of your doing should be a tar
-# archive called "current_backup.tgz". This name is automtically recognized and
-# script will take care of everything else. Presumed shell is BASH.
-#
-# Output of the script is saved into BACKUP_LOC directory in an encrypted form.
-# Each backup consists of two files - symmetric key and public key. Because en-
-# cryption is done by openssl, which cannot process an arbitrary file directly
-# with RSA, files are first encrypted with random 32B key using AES-256-CBC.
-# This 32B key is encrypted with RSA public key which is stored on the machine.
-# Private RSA key SHOULD NOT be found anywhere on the same machine. If it was,
-# you could do plain backups and not bother with this at all and security would
-# be the same.
-#
-# Needed binaries and builtins:
-# test,echo,stat,id,tar,openssl,touch,chmod,rm,mv,find,date,basename
-#
-# Setup:
-# 1) Create separate system user to run this script, do not run it as root.
-# 2) Generate public-private key pair of at least 2048b:
-#		openssl genrsa -out backups-rsa-key 2048
-#		openssl rsa -in backups-rsa-key -out backups-rsa-key.pub \
-#			-outform PEM -pubout
-# 3) The backups-rsa-key file contains private key, store it in the keepass
-#	 or somewhere safe. Do not leave it on the machine!
-# 4) Move backups-rsa-key.pub to BACKUP_ROOT, set correct privileges (400),
-#	 name it as you wish and set RSA_ENC_KEY_FILE accordingly.
-# 5) Fill in the "do the dump" and "pack the dump" parts of the script to suit
-#	 your needs.
-# 6) Adjust other settings in the script as needed. Ensure that service user
-#	 used for dumping the DB, LDAP, whatever is dedicated to this and has
-#	 read-only privileges! This is IMPORTANT!
-# 7) Run the script as a cronjob. Preferred setting is in the crontab, not in the
-#	 /etc/cron.*/whatever file. But it does not really matter.
-#
-# Recovering backups:
-# Backups are stored in BACKUP_LOC as a pair of files. One file is an actual
-# backup encrypted symmetrically. The other file is a symmetric key for the
-# specific backup. (New symmetric key is generated for each backup run.)
-# Symmetric key is encrypted with RSA.
-#
-# To recover backups, do the following:
-# 1) Get you backups, we will call them "data.tgz.e" and "key.bin.e".
-# 2) Get your private RSA key "backups-rsa-key".
-# 3) Decrypt the AES key, you will obtain "key.bin" file:
-#		openssl rsautl -decrypt -inkey backups-rsa-key \
-#			-in key.bin.e -out key.bin
-# 4) Decrypt the actual backup, you will get a tarball:
-#		openssl enc -d -aes-256-cbc -in data.tgz.e -out data.tgz \
-#			-pass file:key.bin
-# 5) Extract the tarball:
-#		tar xf data.tgz
-# *****************************************************************************
-#
-# TODO:
-#		* better backups naming
-#		* something like .d directory where backup scripts will lay to make whole
-#			thing a bit more modular
-#		* add actions like "init", "recover" and "backup" to make script more
-#			user-friendly
-#
-# Revision history:
-# 2017-05-16  Petr Fiser  <petr.fiser@bcvsolutions.eu>
-#		* removed hardwired LDAP variables (original script was for LDAP backups)
-#		* removed hardwired lockfile name
-#		* PASS_FILE made optional
-#		* backup timestamp with granularity to seconds instead of hours
-# 2016-02-25  Petr Fiser  <petr.fiser@bcvsolutions.eu>
-#		* first version of the script
-# basic setup
-export PATH="/bin:/usr/bin"
-#directory where everything happens
-#should be empty except for backup scripts, keys and BACKUP_LOC folder
-BACKUP_ROOT="/opt/czechidm/backup"
-#hic sunt backupes
-BACKUP_LOC="${BACKUP_ROOT}/repository"
-#lockfile
-RUN_LOCK="${BACKUP_ROOT}/`basename ${0}`.lock"
-BACKUP_PREFIX="backup."
-BACKUP_SUFFIX=".tgz.e"
-BACKUP_AES_KEY_PREFIX="backup."
-BACKUP_AES_KEY_SUFFIX=".aes.key.e"
-#files with public RSA key and password file
-#if the password file is not needed, leave the setting blank or the file nonexistent
-PASS_FILE="${BACKUP_ROOT}/user.passwd"
-RSA_ENC_KEY_FILE="${BACKUP_ROOT}/repo-backups-rsa.pub"
-#backups retention period
-BACKUP_KEEP_DAYS="30"
-# setup runtime variables
-NOW=$(date +"%Y-%m-%d-%H%M%S")
-BACKUP_FILE_NAME="${BACKUP_PREFIX}${NOW}${BACKUP_SUFFIX}"
-BACKUP_AES_KEY_FILENAME="${BACKUP_AES_KEY_PREFIX}${NOW}${BACKUP_AES_KEY_SUFFIX}"
-# check root, must not run as root
-if test "$EUID" -eq 0; then
-	echo "Script MUST NOT be run as root." >&2
-	exit 1
-fi
-# check lock
-if test -e "$RUN_LOCK"; then
-	echo "${RUN_LOCK} exists. Assuming ${0} already running." >&2
-	exit 1
-fi
-# check binaries we need
-if test ! -x `which tar`; then
-	echo "'tar' not found or not executable" >&2
-	exit 1
-fi
-if test ! -x `which openssl`; then
-	echo "'openssl' not found or not executable" >&2
-	exit 1
-fi
-# check files privileges (euid==owner, egid==owner, file privileges r-- --- ---)
-if [ -f "$PASS_FILE" ]; then
-	if test ! $(stat -c %a "${PASS_FILE}") -eq 400 || ! test $(stat -c %u "${PASS_FILE}") -eq "$EUID" || ! test $(stat -c %g "${PASS_FILE}") -eq `id -g`; then
-	        echo "File ${PASS_FILE} has incorrect permissions (should be 400) or owner/group (should be `stat -c %U ${0}`)." >&2
-	        exit 1
-	fi
-fi
-if test ! $(stat -c %a "${RSA_ENC_KEY_FILE}") -eq 400 || ! test $(stat -c %u "${RSA_ENC_KEY_FILE}") -eq "$EUID" || ! test $(stat -c %g "${RSA_ENC_KEY_FILE}") -eq `id -g`; then
-        echo "File ${RSA_ENC_KEY_FILE} has incorrect permissions (should be 400) or owner/group (should be `stat -c %U ${0}`)." >&2
-        exit 1
-fi
-#create lock so we cannot run it more than once
-touch "${RUN_LOCK}"
-#cd to our working dir
-cd "$BACKUP_ROOT"
-#generate symmetric key here and push it (asymmetrically encrypted) into a file. this file will accompany symmetrically encrypted tar
-#we use aes-256 to encrypt our dumps so we need 32*8=256b symmetric key
-SYM_KEY=`openssl rand -base64 32`
-#encrypt the symmetric key
-openssl rsautl -encrypt -pubin -inkey "$RSA_ENC_KEY_FILE" -out current_key.bin.e <<< "$SYM_KEY"
-chmod 600 current_key.bin.e
-#do the dump
-# say we run the actual backup and create dump1.dmp, dump2.dmp and dump3.dmp here
-#pack the dump
-#tar usage "tar [parameters] archive_name file1 [file2 file3 ...]"
-tar --remove-files -czf current_backup.tgz dump1.dmp dump2.dmp dump3.dmp
-chmod 600 current_backup.tgz
-#encrypt the dump with current symmetric key, also add a pinch of salt
-openssl enc -aes-256-cbc -salt -in "current_backup.tgz" -out "current_backup.tgz.e" -pass stdin <<< "$SYM_KEY"
-#remove unencrypted dump and key
-rm -f current_backup.tgz
-#move encrypted things to backup_loc
-mv current_backup.tgz.e "${BACKUP_LOC}/${BACKUP_FILE_NAME}"
-mv current_key.bin.e "${BACKUP_LOC}/${BACKUP_AES_KEY_FILENAME}"
-#clean up backups older than $BACKUP_KEEP_DAYS days
-find "$BACKUP_LOC" -name "${BACKUP_PREFIX}*${BACKUP_SUFFIX}" -type f -mtime "+${BACKUP_KEEP_DAYS}" -delete
-find "$BACKUP_LOC" -name "${BACKUP_AES_KEY_PREFIX}*${BACKUP_AES_KEY_SUFFIX}" -type f -mtime "+${BACKUP_KEEP_DAYS}" -delete
-#we have finished, remove lock
-rm -f "${RUN_LOCK}"
-exit 0
-</file>
 The script needs a public-private keypair to be set up. When making a backup, a fresh symmetric key is generated and used to encrypt the actual backup. The symmetric key is then encrypted with the public key and stored alongside the backup. For recovery, you need the backup and its corresponding symmetric key: first decrypt the symmetric key with the private key generated earlier, then use the symmetric key to decrypt the data backup.
 For instructions on keypair initialization, backup creation and recovery, and for the actual commands to carry out these actions, please refer to the script itself.
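The two-layer scheme described above can be sketched as a pair of shell functions. This is a minimal illustration, not the referenced script: the function names and argument order are hypothetical, and ''openssl pkeyutl'' is used here in place of the older ''openssl rsautl'' call (both default to PKCS#1 v1.5 padding for RSA).

```shell
#!/bin/bash
# Sketch of hybrid encryption: a random per-backup AES passphrase that is
# itself encrypted with the RSA public key. Function names are hypothetical.

# encrypt_backup PUBLIC_KEY INPUT OUTPUT_DATA OUTPUT_KEY
encrypt_backup() {
    key_file=$(mktemp) || return 1
    chmod 600 "$key_file"
    # 32 random bytes, base64-encoded, serve as the AES passphrase
    openssl rand -base64 32 > "$key_file" || { rm -f "$key_file"; return 1; }
    # wrap the passphrase with the RSA public key, then encrypt the payload
    openssl pkeyutl -encrypt -pubin -inkey "$1" -in "$key_file" -out "$4" &&
    openssl enc -aes-256-cbc -salt -in "$2" -out "$3" -pass "file:$key_file"
    status=$?
    # the plain passphrase must not stay on disk
    rm -f "$key_file"
    return "$status"
}

# decrypt_backup PRIVATE_KEY INPUT_DATA INPUT_KEY OUTPUT
decrypt_backup() {
    key_file=$(mktemp) || return 1
    chmod 600 "$key_file"
    # unwrap the passphrase with the RSA private key, then decrypt the payload
    openssl pkeyutl -decrypt -inkey "$1" -in "$3" -out "$key_file" &&
    openssl enc -d -aes-256-cbc -in "$2" -out "$4" -pass "file:$key_file"
    status=$?
    rm -f "$key_file"
    return "$status"
}
```

Only ''encrypt_backup'' and the public key belong on the production server; ''decrypt_backup'' is meant to run wherever the private key is kept.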
-When you obtain the repository backup, you can restore the repository:
+When you obtain the repository backup, **you can restore the repository**:
   - Stop the identity manager container.
   - Back up the current repository somewhere else, in case you need to check some data later.
   - Delete all data from the repository or drop the repository itself. Which one applies depends on how the backup was created, for example whether it contains database creation statements.
-  - Restore the data from the pgdump with ''psql [parameters] < your_backup_name.sql''
+  - Restore the data from the pgdump with ''psql [parameters] < your\_backup\_name.sql''
   - Start the identity manager.
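The ''psql'' step of the list above could look like the following sketch. It assumes a plain-SQL dump created with ''pg\_dump --create'' (optionally gzip-compressed) and a connection to the ''postgres'' maintenance database; the function name ''restore\_repository'' is hypothetical, and connection parameters (host, user) are left to your environment.

```shell
# restore_repository DUMP_FILE
# Feeds a plain-SQL dump (optionally gzip-compressed) to psql.
restore_repository() {
    dump_file="$1"
    # refuse to continue with a missing or empty dump
    if [ ! -s "$dump_file" ]; then
        echo "Dump file '$dump_file' is missing or empty." >&2
        return 1
    fi
    # decompress on the fly when needed; ON_ERROR_STOP makes psql abort
    # on the first failing statement instead of continuing
    case "$dump_file" in
        *.gz) gunzip -c "$dump_file" ;;
        *)    cat "$dump_file" ;;
    esac | psql --set ON_ERROR_STOP=1 --dbname=postgres
}
```

Stopping and starting the identity manager is deliberately left out, because the service name depends on the deployment.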
-<note important>Encrypted backups have to be shipped off the production machine. In case the production server fails, it must be possible to rebuild it from scratch (documentation) and to fill the repository from the recent backup.</note>
 **Script deployment**
@@ Line 229: / Line 53: @@
 #do the dump
 # say we run the actual backup and create dump1.dmp, dump2.dmp and dump3.dmp here
+# STRONGLY ADVISED TO GZIP YOUR BACKUPS, SCRIPT DOES NOT DO THAT FOR YOU !!!
 #pack the dump
 #tar usage "tar [parameters] archive_name file1 [file2 file3 ...]"
-tar --remove-files -czf current_backup.tgz dump1.dmp dump2.dmp dump3.dmp
+tar --remove-files -cf current_backup.tar PUT-YOUR-FILES-HERE
+chmod 600 current_backup.tar
 </code>
-And change them to (expected name of the czechidm database is "idm"):
+And change them to (expected name of the czechidm database is ''czechidm''):
 <code bash>
 #do the dump
-pg_dump --create --dbname=idm > idm.sql
+pg_dump --create -Z 9 --dbname=czechidm > czechidm.sql.gz
 #pack the dump
 #tar usage "tar [parameters] archive_name file1 [file2 file3 ...]"
-tar --remove-files -czf current_backup.tgz idm.sql
+tar --remove-files -czf current_backup.tgz czechidm.sql.gz
 </code>
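Because the shell redirection creates ''czechidm.sql.gz'' even when ''pg\_dump'' fails, it may be worth checking the dump before it is packed and encrypted. A minimal sketch, with the hypothetical helper name ''verify\_dump'':

```shell
# verify_dump FILE
# Succeeds only if FILE is non-empty and an intact gzip stream.
verify_dump() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

In the backup script, this could be called right after the dump, for example ''verify\_dump czechidm.sql.gz'', aborting the run (and removing the lock file) when it fails.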
@@ Line 250: / Line 77: @@
 </code>
-===== Application backups =====
+===== Configuration backups =====
-CzechIdM is a Java application distributed as a WAR archive. This archive is deployed inside a Apache Tomcat container. For recovery of the application, only the WAR is needed. There are many things to speed the recovery up like backing-up the whole tomcat container, but the pure minimum is the //idm.war// archive.
+Everything about the identity manager lives in ''/opt/czechidm'', and the configuration itself is in the ''/opt/czechidm/etc/'' directory. Some files there (logging configuration, Quartz configuration and most of the application property file) can be reconstructed from documentation. The crucial parts (the repository password and especially ''secret.key'') have to be backed up.
+  * Losing the repository password is not a disaster (you can always set a new password in PostgreSQL), but it is a complication.
+  * Losing ''secret.key'' effectively means **losing all contents of the [[devel:documentation:security:dev:confidential-storage|Confidential Storage]]**.
-Instead of the repository backup, the application backup should be encrypted but this is not strictly necessary. This is because the application property file contains credentials to the repository. However in standard deployments, the PostgreSQL is accessible only locally and access to whole CzechIdM server is restricted -- therefore even with the repository password, an attacker is not able to read the data. As a rule of thumb: if the RDBMS, where the repository is located, is accessible remotely through the network, the application backup should be encrypted. To do it, consult the repository backup script and set up the application backup in a similar way.
+For simplicity (and because all the files add up to only a few kilobytes) we recommend backing up the whole configuration folder. **It is vital that this backup is encrypted, as it contains the confidential storage key.**
+**Implementation**
-<note warning>If the RDBMS where the repository is located is accessible through the network, application (WAR archive) backups must also be encrypted, because they contain credentials to the repository.</note>
+  * For implementation, use the encrypted backup script described above.
+  * Create a new instance of the script.
-For backing up the application, we will use a simple shell script:
+  * Ideally, generate a separate RSA keypair for it.
-<code bash>
+  * Do not pack ''secret.key'' and the repository backup together, as this potentially lowers security. (However, this depends somewhat on where the backups are stored once off the machine.)
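For the configuration instance of the script, the "pack the dump" part might look like the sketch below. The helper name ''pack\_config'' is hypothetical. Note that, unlike the repository template, it does not use ''--remove-files'', because that would delete the live configuration.

```shell
# pack_config CONFIG_PARENT CONFIG_DIR_NAME ARCHIVE
# Archives the configuration directory and leaves the originals in place.
pack_config() {
    # -C keeps paths inside the archive relative (e.g. "etc/secret.key")
    tar -cf "$3" -C "$1" "$2" && chmod 600 "$3"
}

# Example for the layout described above (archive name as expected by
# your copy of the backup script):
# pack_config /opt/czechidm etc "$PWD/current_backup.tar"
```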
+===== Application backups =====
+CzechIdM is a Java application distributed as a WAR archive. This archive is deployed inside an Apache Tomcat container. For recovery of the application, only the WAR is needed.
+  * If you use a standard application release without any added modules or customizations, ''idm.war'' **does not need to be backed up**. You can always download the application release again.
+  * If you customized the identity manager's binary in any way, you **should rely on your version control system and development lifecycle**, and you should always be able to produce ''idm.war'' again.
+  * If you still want to back up ''idm.war'', simply copying it somewhere safe is completely sufficient. This could look something like:<file bash backup_app.sh>
 #!/bin/bash
@@ Line 273: / Line 107: @@
 	exit 1
 fi
 touch "${LOCKFILE}"
 tar cpfz "${BACKUP_DIR}/backup_app.${NOW}.tgz" /opt/tomcat/current/webapps/* 2>/dev/null || echo "Error when performing backup." >&2
 find "$BACKUP_DIR" -name "*tgz" -type f -mtime "+${BACKUP_KEEP_DAYS}" -delete
 rm "${LOCKFILE}"
 exit 0
-</code>
+</file>
-When doing the recovery, simply extract the archive and move it to the (new or also restored) tomcat container into its //webapps// folder. After starting the tomcat, everything should get up and running. In cases where the repository also needs to be restored, restore it first. Then restore the CzechIdM WAR archive.
-<note important>Backups have to be shipped off the production machine. In case the production server fails, it must be possible to rebuild it from scratch (documentation) and to restore the application from the recent backup.</note>
+When doing the recovery, simply put the CzechIdM WAR archive into the folder where you would normally deploy it and restart the Tomcat container.
 **Script deployment**
-This is a standard deployment scenario. Create the directory structure and setup the script:
+If you really want to use the ''backup\_app.sh'', here is an example deployment scenario:
 <code>
 mkdir -p /opt/backup/app
@@ Line 310: / Line 137: @@
 </code>
-In some cases, CzechIdM is not deployed with frontend and backend bundled together in the //idm.war//. When backing up such environment, the backend should be backed up the way as was just described. The frontend, which may be deployed somewhere else, should be backed up in a similar way using the same script. For example, when running frontend application from separate Apache HTTPD, you should deploy another backup script which backs up ///var/www/html/*// directory instead of ///opt/tomcat/current/webapps/*//. The frontend-only backup containst only a webpage so there is no need to encrypt it in any way.
+In some cases, CzechIdM is not deployed with the frontend and backend bundled together in ''idm.war''. When backing up such an environment, back up the backend as just described. The frontend, which may be deployed somewhere else, should be backed up in a similar way using the same script. For example, when running the frontend application from a separate Apache HTTPD, you should deploy another backup script which backs up the ''/var/www/html/\*'' directory instead of ''/opt/tomcat/current/webapps/\*''.
+===== Restoring IdM application =====
+<note>
+This is a basic DR howto for restoring the identity manager in case you lose it. It does not deal with other disaster scenarios.
+If you back up your environment in some other way, for example with virtual machine snapshots, use your own DR procedures.
+</note>
+When the application is lost due to a HW or virtualization failure, human error or a security compromise, you can restore it using backups and documentation. Here we show how to restore everything on a clean operating system installation.
+  - Install the operating system.
+  - Configure the OS according to your internal standards.
+  - Configure the OS according to [[https://wiki.czechidm.com/doku.php?id=start&do=search&q=server+preparation|Server Preparation howto]]. A snapshot of this howto should already be part of your documentation; this is important because the wiki content evolves over time.
+  - Deploy and configure CzechIdM according to [[https://wiki.czechidm.com/doku.php?id=start&do=search&q=idm+installation|IdM installation howto]]. A snapshot of this howto should already be part of your documentation; this is important because the wiki content evolves over time.
+    - When creating the database user and the CzechIdM database in PostgreSQL, use the credentials you used before the failure. Restore the database from backup, for example ''psql ... < idm-database-backup.sql''.
+    - **Do not** create a brand new configuration in ''/opt/czechidm''. Restore it from your backup.
+    - **Do not** download a new ''idm.war''; restore it from your backup.
+  - Disable all new outgoing connections from the IdM machine **except for communication between your station and the IdM server**.
+    - This way, the IdM will not start to communicate with end systems until you have checked that its data is consistent.
+    - But you will still be able to access the web UI.
+  - Start the Tomcat container and wait for the identity manager to deploy.
+  - Log into the application as an administrator (use a locally authenticated account: any account that was granted the ''superAdminRole'' role).
+  - Disable LRTs and kill all those that are running.
+  - Check data in the application: logs, audit trails, data on users and roles, event and provisioning queues. Diagnose and resolve any anomalies (especially in queues).
+  - Allow outgoing connections from the IdM machine.
+  - Test connections to all end systems, reprovision some users to end systems. Check event and provisioning queues for any errors and resolve them if needed.
+  - Test your general use-cases / UAT tests to make sure the application works as intended.
+  - Schedule LRTs.
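The step that disables new outgoing connections depends entirely on your firewall tooling. As one possible sketch, assuming plain ''iptables'' is in use and nothing else manages the ''OUTPUT'' chain, the hypothetical helper below only prints the rules for review and does not apply them. The admin station address is a parameter.

```shell
# print_isolation_rules ADMIN_IP
# Prints (does not apply) iptables rules that block new outgoing connections
# while keeping loopback, already-established traffic and the admin station.
print_isolation_rules() {
    admin_ip="$1"
    cat <<EOF
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -d ${admin_ip} -j ACCEPT
iptables -A OUTPUT -j REJECT
EOF
}
```

Loopback stays open so that the application can still reach a locally running PostgreSQL. When outgoing connections are allowed again, the same rules can be removed with ''iptables -D'' in place of ''-A''.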