Differences

tutorial:adm:backups [2019/03/25 19:03]
fiserp [Repository backups]
tutorial:adm:backups [2020/03/20 10:13] (current)
fiserp [Restoring IdM application]
Line 1: Line 1:
-====== Backups and recovery ======
+====== Server preparation - Backup and Recovery ======
  
 When it comes to backup, a CzechIdM deployment consists of three parts:
Line 10: Line 10:
 <note important>All backups should be shipped off the production machine. If the production server fails, it must be possible to rebuild it from scratch (using documentation) and to restore the repository and configuration from a recent backup.</note>
 ===== Repository backups =====
-Using PostgreSQL allows us to do a ''pg\_dump'' of the repository, which is the primary backup strategy. Because the identity manager contains the company's private data, it is desirable to encrypt the repository backup. We will store all backups in the ''/opt/backup/...'' location. To create encrypted backups, we will use the following shell script:
+Using PostgreSQL allows us to do a ''pg\_dump'' of the repository, which is the primary backup strategy. Because the identity manager contains the company's private data, it is desirable to encrypt the repository backup. We will store all backups in the ''/opt/backup/...'' location. To create encrypted backups, we will use [[https://github.com/bcvsolutions/czechidm-monitoring/blob/master/backups/encrypted_backup.sh|this shell script]].
-<file bash enc_backup.sh> +
-#!/bin/bash +
- +
-# ********************************** READ ME ********************************** +
-# +
-# General: +
-# Script is intended to do encrypted backups of whatever you implement in parts +
-# "do the dump" and "pack the dump". The result of your doing should be a tar +
-# archive called "current_backup.tgz". This name is automatically recognized and +
-# script will take care of everything else. Presumed shell is BASH. +
-# +
-# Output of the script is saved into BACKUP_LOC directory in an encrypted form. +
-# Each backup consists of two files - the encrypted archive and its encrypted +
-# symmetric key. Because encryption is done by openssl, which cannot process an +
-# arbitrary file directly with RSA, files are first encrypted with a random 32B +
-# key using AES-256-CBC. This 32B key is then encrypted with the RSA public key +
-# which is stored on the machine. +
-# Private RSA key SHOULD NOT be found anywhere on the same machine. If it was, +
-# you could do plain backups and not bother with this at all and security would +
-# be the same. +
-# +
-# Needed binaries and builtins: +
-# test,echo,stat,id,tar,openssl,touch,chmod,rm,mv,find,date,basename +
-# +
-# Setup: +
-# 1) Create separate system user to run this script, do not run it as root. +
-# 2) Generate public-private key pair of at least 2048b: +
-# openssl genrsa -out backups-rsa-key 2048 +
-# openssl rsa -in backups-rsa-key -out backups-rsa-key.pub \ +
-# -outform PEM -pubout +
-# 3) The backups-rsa-key file contains private key, store it in the keepass +
-# or somewhere safe. Do not leave it on the machine! +
-# 4) Move backups-rsa-key.pub to BACKUP_ROOT, set correct privileges (400), +
-# name it as you wish and set RSA_ENC_KEY_FILE accordingly. +
-# 5) Fill in the "do the dump" and "pack the dump" parts of the script to suit +
-# your needs. +
-# 6) Adjust other settings in the script as needed. Ensure that service user +
-# used for dumping the DB, LDAP, whatever is dedicated to this and has +
-# read-only privileges! This is IMPORTANT! +
-# 7) Run the script as a cronjob. Preferred setting is in the crontab, not in the +
-#  /etc/cron.*/whatever file. But it does not really matter. +
-# +
-# Recovering backups: +
-# Backups are stored in BACKUP_LOC as a pair of files. One file is an actual +
-# backup encrypted symmetrically. The other file is a symmetric key for the +
-# specific backup. (New symmetric key is generated for each backup run.) +
-# Symmetric key is encrypted with RSA. +
-# +
-# To recover backups, do the following: +
-# 1) Get your backups, we will call them "data.tgz.e" and "key.bin.e". +
-# 2) Get your private RSA key "backups-rsa-key". +
-# 3) Decrypt the AES key, you will obtain "key.bin" file: +
-# openssl rsautl -decrypt -inkey backups-rsa-key \ +
-# -in key.bin.e -out key.bin +
-# 4) Decrypt the actual backup, you will get a tarball: +
-# openssl enc -d -aes-256-cbc -in data.tgz.e -out data.tgz \ +
-# -pass file:key.bin +
-# 5) Extract the tarball: +
-# tar xf data.tgz +
-# ***************************************************************************** +
-# +
-# TODO: +
-# * better backups naming +
-# * something like .d directory where backup scripts will lay to make whole +
-# thing a bit more modular +
-# * add actions like "init", "recover" and "backup" to make script more +
-# user-friendly +
-# +
-# Revision history: +
-# 2017-05-16  Petr Fiser  <petr.fiser@bcvsolutions.eu> +
-# * removed hardwired LDAP variables (original script was for LDAP backups) +
-# * removed hardwired lockfile name +
-# * PASS_FILE made optional +
-# * backup timestamp with granularity to seconds instead of hours +
-# 2016-02-25  Petr Fiser  <petr.fiser@bcvsolutions.eu> +
-# * first version of the script +
- +
-# basic setup +
-export PATH="/bin:/usr/bin" +
-#directory where everything happens +
-#should be empty except for backup scripts, keys and BACKUP_LOC folder +
-BACKUP_ROOT="/opt/czechidm/backup" +
-#hic sunt backupes +
-BACKUP_LOC="${BACKUP_ROOT}/repository" +
-#lockfile +
-RUN_LOCK="${BACKUP_ROOT}/`basename ${0}`.lock" +
-BACKUP_PREFIX="backup." +
-BACKUP_SUFFIX=".tgz.e" +
-BACKUP_AES_KEY_PREFIX="backup." +
-BACKUP_AES_KEY_SUFFIX=".aes.key.e" +
-#files with public RSA key and password file +
-#if the password file is not needed, leave the setting blank or the file nonexistent +
-PASS_FILE="${BACKUP_ROOT}/user.passwd" +
-RSA_ENC_KEY_FILE="${BACKUP_ROOT}/repo-backups-rsa.pub" +
-#backups retention period +
-BACKUP_KEEP_DAYS="30" +
- +
-# setup runtime variables +
-NOW=$(date +"%Y-%m-%d-%H%M%S") +
-BACKUP_FILE_NAME="${BACKUP_PREFIX}${NOW}${BACKUP_SUFFIX}" +
-BACKUP_AES_KEY_FILENAME="${BACKUP_AES_KEY_PREFIX}${NOW}${BACKUP_AES_KEY_SUFFIX}" +
- +
-# check root, must not run as root +
-if test "$EUID" -eq 0; then +
- echo "Script MUST NOT be run as root." >&2 +
- exit 1 +
-fi +
- +
-# check lock +
-if test -e "$RUN_LOCK"; then +
- echo "${RUN_LOCK} exists. Assuming ${0} already running." >&2 +
- exit 1 +
-fi +
- +
-# check binaries we need +
-if ! command -v tar >/dev/null 2>&1; then +
- echo "'tar' not found or not executable" >&2 +
- exit 1 +
-fi +
-if ! command -v openssl >/dev/null 2>&1; then +
- echo "'openssl' not found or not executable" >&2 +
- exit 1 +
-fi +
- +
-# check files privileges (euid==owner, egid==owner, file privileges r-- --- ---) +
-if [ -f "$PASS_FILE" ]; then +
- if test ! $(stat -c %a "${PASS_FILE}") -eq 400 || ! test $(stat -c %u "${PASS_FILE}") -eq "$EUID" || ! test $(stat -c %g "${PASS_FILE}") -eq `id -g`; then +
-         echo "File ${PASS_FILE} has incorrect permissions (should be 400) or owner/group (should be `stat -c %U ${0}`)." >&2 +
-         exit 1 +
- fi +
-fi +
-if test ! $(stat -c %a "${RSA_ENC_KEY_FILE}") -eq 400 || ! test $(stat -c %u "${RSA_ENC_KEY_FILE}") -eq "$EUID" || ! test $(stat -c %g "${RSA_ENC_KEY_FILE}") -eq `id -g`; then +
-        echo "File ${RSA_ENC_KEY_FILE} has incorrect permissions (should be 400) or owner/group (should be `stat -c %U ${0}`)." >&2 +
-        exit 1 +
-fi +
- +
-#create lock so we cannot run it more than once +
-touch "${RUN_LOCK}" +
- +
-#cd to our working dir +
-cd "$BACKUP_ROOT" +
- +
-#generate symmetric key here and push it (asymmetrically encrypted) into a file. this file will accompany symmetrically encrypted tar +
-#we use aes-256 to encrypt our dumps so we need 32*8=256b symmetric key +
-SYM_KEY=`openssl rand -base64 32` +
- +
-#encrypt the symmetric key +
-openssl rsautl -encrypt -pubin -inkey "$RSA_ENC_KEY_FILE" -out current_key.bin.e <<< "$SYM_KEY" +
-chmod 600 current_key.bin.e +
- +
-#do the dump +
-# say we run the actual backup and create dump1.dmp, dump2.dmp and dump3.dmp here +
- +
-#pack the dump +
-#tar usage "tar [parameters] archive_name file1 [file2 file3 ...]" +
-tar --remove-files -czf current_backup.tgz dump1.dmp dump2.dmp dump3.dmp +
-chmod 600 current_backup.tgz +
- +
-#encrypt the dump with current symmetric key, also add a pinch of salt +
-openssl enc -aes-256-cbc -salt -in "current_backup.tgz" -out "current_backup.tgz.e" -pass stdin <<< "$SYM_KEY" +
-#remove unencrypted dump and key +
-rm -f current_backup.tgz +
- +
-#move encrypted things to backup_loc +
-mv current_backup.tgz.e "${BACKUP_LOC}/${BACKUP_FILE_NAME}" +
-mv current_key.bin.e "${BACKUP_LOC}/${BACKUP_AES_KEY_FILENAME}" +
- +
-#clean up backups older than $BACKUP_KEEP_DAYS days +
-find "$BACKUP_LOC" -name "${BACKUP_PREFIX}*${BACKUP_SUFFIX}" -type f -mtime "+${BACKUP_KEEP_DAYS}" -delete +
-find "$BACKUP_LOC" -name "${BACKUP_AES_KEY_PREFIX}*${BACKUP_AES_KEY_SUFFIX}" -type f -mtime "+${BACKUP_KEEP_DAYS}" -delete +
- +
-#we have finished, remove lock +
-rm -f "${RUN_LOCK}" +
- +
-exit 0 +
-</file>+
  
 The script needs a public-private keypair to be set up. Each backup run generates a new symmetric key, which is used to encrypt the actual backup. The symmetric key is then encrypted with the public key and stored alongside the backup. For recovery, you need the backup and its corresponding symmetric key: first decrypt the symmetric key with the private key generated earlier, then use the decrypted symmetric key to decrypt the data backup.
 For instructions on keypair initialization, backup creation and recovery, and for the actual commands that carry out these actions, please refer to the script itself.
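
The scheme described above can be tried end to end in a throwaway directory. The following is a minimal sketch, not the referenced script: the file names are illustrative, the data file is a stand-in for a real dump, and ''openssl pkeyutl'' is used here in place of the ''openssl rsautl'' call from the original script (an assumption made because OpenSSL 3.0 deprecates ''rsautl'').

```bash
#!/bin/bash
# Sketch of the hybrid (RSA + AES) backup encryption described above.
# All file names are illustrative assumptions, not part of the real script.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# One-time setup: RSA keypair. On a real server only the public key stays.
openssl genrsa -out backups-rsa-key 2048 2>/dev/null
openssl rsa -in backups-rsa-key -pubout -out backups-rsa-key.pub 2>/dev/null

# Stand-in for the tarball produced by a real backup run.
echo "pretend this is a database dump" > data.tgz

# Backup side: a fresh symmetric key for this run encrypts the data,
# then the key itself is encrypted with the public RSA key.
# (Newer OpenSSL versions may print a key-derivation warning for 'enc'.)
openssl rand -base64 32 > key.bin
openssl enc -aes-256-cbc -salt -in data.tgz -out data.tgz.e -pass file:key.bin
openssl pkeyutl -encrypt -pubin -inkey backups-rsa-key.pub -in key.bin -out key.bin.e
rm -f key.bin

# Recovery side: decrypt the symmetric key with the private RSA key,
# then decrypt the data with the recovered symmetric key.
openssl pkeyutl -decrypt -inkey backups-rsa-key -in key.bin.e -out key.bin
openssl enc -d -aes-256-cbc -in data.tgz.e -out data.recovered -pass file:key.bin

cmp -s data.tgz data.recovered && echo "roundtrip OK"
```

On a production machine the private key ''backups-rsa-key'' would be moved away right after the setup step, so only the recovery side ever needs it.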
  
-When you obtain the repository backup, you can restore the repository:
+When you obtain the repository backup, **you can restore the repository**:
   - Stop the identity manager container.
   - Back up the current repository somewhere else in case you need to check some data later.
   - Delete all data from the repository or drop the repository itself. Which of the two is needed depends on how the backup was created, for example on whether it contains database creation statements.
-  - Restore the data from the pgdump with ''psql [parameters] < your_backup_name.sql''
+  - Restore the data from the pgdump with ''psql [parameters] < your\_backup\_name.sql''
   - Start the identity manager.
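
The restore steps above can be written down as a dry-run script that only prints what it would do. This is a sketch under assumptions: the systemd unit name ''tomcat'', the safety-copy file name and the choice to drop the database (which suits a dump created with ''--create'') are all hypothetical and need to be adapted to the actual deployment.

```bash
#!/bin/bash
# Dry-run sketch of the repository restore; by default it only prints commands.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"              # set DRY_RUN=0 to really execute the steps
DB_NAME="czechidm"                   # database name used in this howto
SERVICE="tomcat"                     # assumption: systemd unit that runs IdM
BACKUP_SQL="your_backup_name.sql"    # decrypted, extracted plain SQL dump
SAFETY_COPY="pre-restore.sql"        # hypothetical name for the safety copy

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_steps() {
    run systemctl stop "$SERVICE"                                   # stop IdM
    run pg_dump --create --dbname="$DB_NAME" --file="$SAFETY_COPY"  # keep current data
    run dropdb "$DB_NAME"                                           # drop the repository
    run psql --dbname=postgres --file="$BACKUP_SQL"                 # restore the dump
    run systemctl start "$SERVICE"                                  # start IdM
}

plan=$(restore_steps)
echo "$plan"
```

Reviewing the printed plan before running with ''DRY_RUN=0'' keeps the destructive ''dropdb'' step from being executed by accident.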
  
Line 228: Line 53:
 #do the dump
 # say we run the actual backup and create dump1.dmp, dump2.dmp and dump3.dmp here
+# STRONGLY ADVISED TO GZIP YOUR BACKUPS, SCRIPT DOES NOT DO THAT FOR YOU !!!
+
 
 #pack the dump
 #tar usage "tar [parameters] archive_name file1 [file2 file3 ...]"
-tar --remove-files -czf current_backup.tgz dump1.dmp dump2.dmp dump3.dmp
+tar --remove-files -cf current_backup.tar PUT-YOUR-FILES-HERE
+chmod 600 current_backup.tar
 </code>
 And change them to (expected name of the czechidm database is ''czechidm''):
 <code bash>
 #do the dump
-pg_dump --create --dbname=czechidm > czechidm.sql
+pg_dump --create -Z 9 --dbname=czechidm > czechidm.sql.gz
 
 #pack the dump
 #tar usage "tar [parameters] archive_name file1 [file2 file3 ...]"
-tar --remove-files -czf current_backup.tgz czechidm.sql
+tar --remove-files -czf current_backup.tgz czechidm.sql.gz
 </code>
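
Because ''pg_dump -Z 9'' writes gzip-compressed output, the dump has to be decompressed again before it is fed to ''psql'' during a restore. The sketch below walks through the pack and unpack path with a stand-in file instead of a real dump; it follows the uncompressed ''current_backup.tar'' form from the script stub quoted above, which is an assumption about what the current script expects.

```bash
#!/bin/bash
# Sketch of packing and unpacking a gzipped dump; no database is needed.
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for: pg_dump --create -Z 9 --dbname=czechidm > czechidm.sql.gz
printf 'CREATE DATABASE czechidm;\n' | gzip -9 > czechidm.sql.gz

# Pack the already compressed dump; GNU tar deletes the source file afterwards.
tar --remove-files -cf current_backup.tar czechidm.sql.gz
chmod 600 current_backup.tar

# Recovery: unpack the archive, then decompress the dump.
tar -xf current_backup.tar
restored_sql=$(gunzip -c czechidm.sql.gz)
echo "$restored_sql"

# On a real system the last step would be piped into psql instead, e.g.:
#   gunzip -c czechidm.sql.gz | psql [parameters]
```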
  
Line 310: Line 138:
  
 In some cases, CzechIdM is not deployed with the frontend and backend bundled together in ''idm.war''. When backing up such an environment, back up the backend as just described. The frontend, which may be deployed somewhere else, should be backed up in a similar way using the same script. For example, when running the frontend application from a separate Apache HTTPD, you should deploy another backup script which backs up the ''/var/www/html/\*'' directory instead of ''/opt/tomcat/current/webapps/\*''.
+
+===== Restoring IdM application =====
+<note>
+This is a basic DR howto for restoring the identity manager in case you lose it. It does not deal with other disaster scenarios.
+
+If you back up your environment in some other way, for example with virtual machine snapshots, use your own DR procedures.
+</note>
+
+When the application is lost due to a HW or virtualization failure, human error or a security compromise, you can restore it using backups and documentation. Here we show how to restore everything on a clean operating system installation.
+  - Install the operating system.
+  - Configure the OS according to your internal standards.
+  - Configure the OS according to [[https://wiki.czechidm.com/doku.php?id=start&do=search&q=server+preparation|Server Preparation howto]]. A snapshot of this howto should already be part of your documentation. This is important because the wiki content evolves over time.
+  - Deploy and configure CzechIdM according to [[https://wiki.czechidm.com/doku.php?id=start&do=search&q=idm+installation|IdM installation howto]]. A snapshot of this howto should already be part of your documentation. This is important because the wiki content evolves over time.
+    - When creating the database user and the CzechIdM database in PostgreSQL, use the credentials you used before the failure. Restore the database from backup, for example ''psql ... < idm-database-backup.sql''.
+    - **Do not** create a brand new configuration in ''/opt/czechidm''. Restore it from your backup.
+    - **Do not** download a new ''idm.war''. Restore it from your backup.
+  - Disable all new outgoing connections from the IdM machine **except for communication between your station and IdM server**.
+    - This way, the IdM will not start to communicate with end systems until you have checked that its data is consistent.
+    - You will still be able to access the web UI.
+  - Start the Tomcat container and wait for the identity manager to deploy.
+  - Log into the application as an administrator (use a locally authenticated account, that is, any account that was granted the ''superAdminRole'' role).
+  - Disable LRTs and kill all those that are running.
+  - Check data in the application: logs, audit trails, data on users and roles, event and provisioning queues. Diagnose and resolve any anomalies (especially in queues).
+  - Allow outgoing connections from the IdM machine.
+  - Test connections to all end systems and reprovision some users to end systems. Check event and provisioning queues for any errors and resolve them if needed.
+  - Run your general use-cases / UAT tests to make sure the application works as intended.
+  - Schedule LRTs.
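
The two firewall steps in the list above (disable new outgoing connections, later allow them again) could look like the following when the server uses plain ''iptables''. This is only a sketch under that assumption: the howto does not prescribe a firewall tool, and hosts managed by firewalld or nftables need equivalent rules in their own syntax. The script prints the commands for review and does not change any rules itself.

```bash
#!/bin/bash
# Prints iptables commands for the lockdown and unlock steps; applies nothing.
set -euo pipefail

# Rules that stop the restored IdM from opening NEW outgoing connections.
lockdown_rules() {
    # Keep loopback traffic working (assumption: PostgreSQL runs on this host).
    echo "iptables -A OUTPUT -o lo -j ACCEPT"
    # Allow replies on connections opened from outside, e.g. your station's
    # browser or SSH session, so the web UI stays reachable.
    echo "iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
    # Reject every connection the IdM machine tries to open on its own.
    echo "iptables -A OUTPUT -m conntrack --ctstate NEW -j REJECT"
}

# The same rules as deletions, for the "allow outgoing connections" step.
unlock_rules() {
    lockdown_rules | sed 's/^iptables -A /iptables -D /'
}

lockdown=$(lockdown_rules)
unlock=$(unlock_rules)

printf '%s\n' "# apply before starting Tomcat:" "$lockdown"
printf '%s\n' "# apply after the data check:" "$unlock"
```

If the repository database runs on a different host, an additional ACCEPT rule for that host would be needed before the REJECT rule, otherwise the identity manager cannot start.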