Differences
This shows you the differences between two versions of the page.
tutorial:adm:server_os_updates [2019/12/17 07:46] fiserp [Performing the OS update] |
tutorial:adm:server_os_updates [2020/01/13 12:22] fiserp [Server updates - OS updates] |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Server updates - OS updates ====== | ====== Server updates - OS updates ====== | ||
- | <note warning> | ||
To ensure secure operation, servers in the infrastructure have to be kept up to date. This tutorial addresses the need for OS updates of the IdM server and gives basic guidelines and recommendations. | To ensure secure operation, servers in the infrastructure have to be kept up to date. This tutorial addresses the need for OS updates of the IdM server and gives basic guidelines and recommendations. | ||
Line 6: | Line 5: | ||
Each organization has some sort of schedule to apply OS patches: weekly, monthly, quarterly, never (not a good one), etc. You can patch the OS according to your strategy, but we recommend patching at least once every three months. IdM relies on packages and libraries from the operating system, and if those are not patched, the security of the whole IdM solution also deteriorates. | Each organization has some sort of schedule to apply OS patches: weekly, monthly, quarterly, never (not a good one), etc. You can patch the OS according to your strategy, but we recommend patching at least once every three months. IdM relies on packages and libraries from the operating system, and if those are not patched, the security of the whole IdM solution also deteriorates. | ||
- | ==== Things to consider ==== | + | ===== Things to consider |
Before applying updates, there are a few things to consider: | Before applying updates, there are a few things to consider: | ||
* Impact on users | * Impact on users | ||
Line 15: | Line 14: | ||
* LRTs usually run at night, so it is not strictly necessary to stop the IdM, but you have to make sure you have enough time to perform the patching (and a possible rollback) before the jobs start to execute. | * LRTs usually run at night, so it is not strictly necessary to stop the IdM, but you have to make sure you have enough time to perform the patching (and a possible rollback) before the jobs start to execute. | ||
* Restarting IdM cancels any LRT that is running at that moment; the LRT **will not resume automatically** after IdM comes back up. | * Restarting IdM cancels any LRT that is running at that moment; the LRT **will not resume automatically** after IdM comes back up. | ||
- | * Nightly LRTs usually read HR system data. This means there are dependencies between them (e.g. synchronize identities, then contracts and/or time slices, then run recompute on them and finally run HR processes which enable/disbale | + | * Nightly LRTs usually read HR system data. This means there are dependencies between them (e.g. synchronize identities, then contracts and/or time slices, then run recompute on them and finally run HR processes which enable/disable |
+ | * Impact on entity events | ||
+ | * Entity events that are currently running **are lost** on IdM restart. This usually affects one to ten events; the actual number of affected events depends on the number of '' | ||
+ | * Entity events in other states are persisted into the database so they are not lost on IdM restart. | ||
+ | * No entity events should be in the event queue at the time of the OS update. Because events are generated by LRTs or user actions, stopping the LRTs and disconnecting users from the IdM web interface is sufficient to achieve this. | ||
* Impact on end systems connected to IdM | * Impact on end systems connected to IdM | ||
* There is no direct impact on other systems. | * There is no direct impact on other systems. | ||
Line 34: | Line 37: | ||
* Define use-cases that are important for your deployment. Before and after the update, test if those use-cases work. | * Define use-cases that are important for your deployment. Before and after the update, test if those use-cases work. | ||
- | ==== Performing the OS update ==== | + | ===== Performing the OS update ===== |
+ | The following list can be used as a basis for the maintenance checklist. Feel free to customize it to better suit your needs. | ||
- Preparations | - Preparations | ||
- Prepare testing use-cases. | - Prepare testing use-cases. | ||
Line 42: | Line 46: | ||
- Perform the update | - Perform the update | ||
- Begin the maintenance. | - Begin the maintenance. | ||
+ | - Disable monitoring system notifications. | ||
- (If you use hot snapshots, make one.) | - (If you use hot snapshots, make one.) | ||
- Make sure no user or external application can access the IdM. | - Make sure no user or external application can access the IdM. | ||
Line 64: | Line 69: | ||
- (If there were changes to the database (e.g. PostgreSQL major version upgrade), make a backup of the upgraded database.) | - (If there were changes to the database (e.g. PostgreSQL major version upgrade), make a backup of the upgraded database.) | ||
- Allow users to access the IdM. | - Allow users to access the IdM. | ||
+ | - Enable monitoring system notifications. | ||
- End the maintenance. | - End the maintenance. | ||
- Wrap-up | - Wrap-up | ||
Line 73: | Line 79: | ||
< | < | ||
- | ==== Solving | + | ===== Resolving |
For maintenance actions, it is necessary to: | For maintenance actions, it is necessary to: | ||
* Know how long each task will take, and measure the actual task durations while performing them. | * Know how long each task will take, and measure the actual task durations while performing them. | ||
Line 81: | Line 87: | ||
* Know how long (at worst) the whole rollback will take (rollback time **RT**). | * Know how long (at worst) the whole rollback will take (rollback time **RT**). | ||
* Have a maintenance window that spans at least **MT**+**RT** with some extra time **ET**. | * Have a maintenance window that spans at least **MT**+**RT** with some extra time **ET**. | ||
- | * You cannot safely perform the maintenance in a shorter window; there is simply not enough time. If something goes wrong, you need at most **RT** time to perform the rollback! | + | * You cannot safely perform the maintenance in a shorter window; there is simply not enough time. If something goes wrong, you will need **RT** time to perform the rollback! |
- | * If you do not have any **ET**, if anything goes wrong you have to perform the rollback procedure. Therefore, **ET** gives you some time you can spend on solving the issue so you can carry on with updates. | + | * When you have no **ET**, if anything goes wrong you have to perform the rollback procedure. Therefore, **ET** gives you some time you can spend on solving the issue so you can carry on with updates. |
- | You should have a rollback procedure that can safely restore the deployment. This depends on your environment. | + | * You should have a rollback procedure that can safely restore the deployment. |
- | + | * This depends on your environment | |
- | Fortunately, | + | |
- | Minor issues can be generally resolved with the help of ``/boot`` and ``/etc`` backups you created before updating the OS. | + | * After restoring the snapshot, you have to perform tests (with test use-cases) to confirm the rollback was performed correctly. |
- | + | | |
- | If IdM installation gets hit, you can debug the configuration or restore it from periodic backup. Since IdM is not installed from OS packages, this basically never happens. | + | |
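
The maintenance window rule described above (the window must span at least **MT**+**RT**, ideally with some **ET** on top, and must end before the nightly LRTs start) can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names and all times and durations are hypothetical values chosen for illustration, not figures from any real deployment.

```python
from datetime import datetime, timedelta


def window_is_sufficient(window_start, window_end, mt, rt, et=timedelta(0)):
    """Return True when the window covers MT + RT + ET."""
    return window_end - window_start >= mt + rt + et


def latest_rollback_start(window_end, rt):
    """Latest moment at which a rollback can still finish inside the window."""
    return window_end - rt


# Hypothetical figures, for illustration only.
window_start = datetime(2020, 1, 13, 18, 0)
first_lrt_start = datetime(2020, 1, 13, 23, 0)  # window ends when nightly LRTs begin
mt = timedelta(hours=2)               # maintenance time (MT)
rt = timedelta(hours=1, minutes=30)   # worst-case rollback time (RT)
et = timedelta(minutes=45)            # extra time for troubleshooting (ET)

print(window_is_sufficient(window_start, first_lrt_start, mt, rt, et))
print(latest_rollback_start(first_lrt_start, rt))
```

With these assumed numbers the window is long enough, and the rollback decision has to be made no later than the time returned by `latest_rollback_start`. Any troubleshooting that would run past that point eats into **RT** and should be abandoned in favour of the rollback.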