7. Optimize and Automate – the seventh guiding principle of ITIL4
With the new and updated version of ITIL there is also an update to the 9 guiding principles first introduced with ITIL Practitioner. This new set of 7 principles provides practical help with making decisions when adopting the ITIL4 framework:
- Focus on value
- Start where you are
- Progress iteratively with feedback
- Collaborate and promote visibility
- Think and work holistically
- Keep it simple and practical
- Optimize and automate
In a series of blogs I will look into each principle, asking how it provides guidance when adopting the ITIL4 framework and when improving the service management capabilities of the IT provider. To me these guiding principles should support decision making when adopting or improving IT service management. It is also important to note that these principles have to work together. They do not work in isolation; it is not a matter of pick-and-choose.
Optimize and automate
This guiding principle is pretty similar to some of the other principles. In order to optimize you need some kind of measurement framework and agreement on criteria; in other words, you need to focus on value. It is hard to optimize a practice, process or product when you do not have a good idea of where you are. And you could say that optimizing is another word for progressing iteratively with feedback. By the time you reach the seventh principle you should have the feeling that you are getting it: ITIL is about improving IT services in cooperation with the business and other stakeholders.
To optimize or to improve
In the dictionary, to optimize is a synonym for to improve. When you optimize a process or a practice you are also improving that process or practice. In real life, however, there is a difference between the two approaches. Depending a bit on who you ask, to optimize is to find improvements within the existing constraints on delivering value, and to improve is to remove or challenge those constraints, leading to more or higher value. Optimizing a process will not bring a better outcome for the business, but it will reduce costs and risks. Improving a process will lead to a better outcome, and might lead to more costs and maybe some more risks.
Operations versus Change management
There is a long and ongoing discussion within ITIL and Service Management on the scope of change management in relation to operations management. When a ‘change’ does not impact service levels and doesn’t lead to a status update of a CI, is it still a ‘change’ to be controlled by change management? A long time ago I was involved in the implementation (improvement) of operations management as defined by the Microsoft Operations Framework (MOF was heavily inspired by ITIL). This was one of the discussions that came up when trying to define the scope of operations management. Is changing a setting in a configuration file to make a server work better a minor change that needs to be controlled by change management, or can it be done through operations management and logged in their logbook? There are multiple examples where work done through Operations Management can be seen as a form of minor change, even though it has no impact on service levels and normally does not lead to changes in the CMDB either. Looking at Service Management holistically, you can understand that a ‘change’ logged in the Operations Management logbook is still under the control of the IT Service Organization, and that there is no need for additional control through the Change Management practice.
Optimizing Service Management
When looking at IT service management from this perspective, you can support the opinion that optimizing is an activity aligned with Operations Management. Within Operations Management you tweak the configurations to improve performance without the need to actually change the setup by adding new hardware (CIs) or major new releases of software (CIs again). Operations Management is about patches and fixes, not about new releases. Change Management should be about those changes that impact the business (improvements), whereas Operations Management takes care of the minor and low-risk changes whose impact on the business is a better experience or better performance of the IT systems. Another element of MOF is the idea of monthly performance reviews: first the Operations Review within the IT department, followed by the Service Level Reviews with the business departments. The Operations Review is about understanding how the IT platform is operating and what can be done to optimize this. The Service Level Review is about understanding how the IT platform has supported the business and what can be done to improve this.
Optimize for and improve with the business
The difference between optimizing and improving can be explained as a difference in scope. Optimizing is about the scope of the IT service organization and making things better within this scope. Improving is about the scope of the Business and making things better within that scope. Optimizing makes IT systems better and Improving makes the Business better. When you want to improve the IT services you need the Business involved, at least to define the desired outcome and to articulate the value of IT support. As a side note: the 7-step Continual Improvement process in ITIL v3 was therefore more about optimizing than improving and should have been called the Continual Optimizing process.
Articulate the operations tasks first
Another observation from improving Operations Management using MOF as guidance is that when you start defining Operations Management as a process/practice you have to start writing down a lot of daily tasks in the form of work instructions. For many of these tasks there are already templates available for the products involved that can be copied from the manufacturers’ websites. The nice thing about making operations work explicit is that you can start scheduling it and making it more predictable. Often the system engineer comes to the office with the idea of checking some server for possible issues but gets bombarded with service calls, urgent changes and all other kinds of requests. At the end of the day the maintenance tasks often do not get done because of the daily madness. This can lead to service outages and all kinds of other problems that could have been prevented by doing maintenance tasks timely and thoroughly. Making these tasks explicit is a first step in creating an overview of tasks that need to be done on a daily, weekly or monthly schedule.
Start managing the operations tasks
When you start to put the operations tasks in a schedule, you can also start to assign these tasks on a daily basis. You can combine many of these tasks with the duty of monitoring systems (if no one is there to see the red flag or hear the alert, you are wasting lots of money fooling yourself with monitoring tools). These tasks and the schedule containing them will help the IT service organization to become more predictable and pro-active. When done well, the operator performing the tasks will also be better prepared to do them, and this will reduce the number of errors made under time pressure as well.
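As a minimal sketch of what such a schedule could look like in code, the Python snippet below lists the tasks that are due on a given day. The task names, and the rule that weekly tasks run on Mondays and monthly tasks on the first of the month, are assumptions made for illustration; neither ITIL nor MOF prescribes them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OpsTask:
    name: str
    frequency: str  # "daily", "weekly" or "monthly"


# Hypothetical maintenance tasks; replace with your own work instructions.
TASKS = [
    OpsTask("Check backup job results", "daily"),
    OpsTask("Review monitoring alerts", "daily"),
    OpsTask("Check free disk space on file servers", "weekly"),
    OpsTask("Review pending patches", "monthly"),
]


def tasks_due(day: date, tasks=TASKS) -> list[str]:
    """Return the names of the tasks that are due on the given day.

    Assumption: weekly tasks run on Mondays, monthly tasks on the 1st.
    """
    due = []
    for task in tasks:
        if task.frequency == "daily":
            due.append(task.name)
        elif task.frequency == "weekly" and day.weekday() == 0:
            due.append(task.name)
        elif task.frequency == "monthly" and day.day == 1:
            due.append(task.name)
    return due


if __name__ == "__main__":
    # Print today's task list, e.g. as input for the daily assignment.
    for name in tasks_due(date.today()):
        print(name)
```

Even a list this simple makes the maintenance work visible, so it can be assigned to someone instead of being squeezed out by the daily madness.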
Automate operations tasks
When a task is well described and understood you can start considering automating the task. Often system engineers and developers are quick to automate when they feel work can be done more efficiently by a computer system. That is not necessarily the smart thing to do. It can lead to suboptimal results and actually increase costs and risks. Tasks are most of the time part of a larger process and should not be considered in isolation when investigating automation. Returning to the comment about monitoring the monitoring systems, I’ve seen operations teams create a script to forward system alerts into the service desk tool as incidents. In operations they didn’t want to spend the time monitoring the alerts and assessing the impact, so they had the service desk do it for them. At least, that was the idea. In reality the service desk tool got about 400 of these alerts a day, many more than the number of calls. Most of these alerts were very cryptic, and without knowing the context it was impossible for the service desk to make any sense of them. The solution was a script to close all incidents created out of the alerts. You can see the problem here.
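To make the point concrete, here is a hedged Python sketch of how a forwarding script could keep the assessment step inside the automation instead of pushing it to the service desk. It is not the script from the story; the alert fields, the severity scale, the threshold and the `context` lookup are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical severity scale; a real monitoring tool will have its own.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


@dataclass(frozen=True)
class Alert:
    source: str    # system that raised the alert
    code: str      # raw, often cryptic, alert code
    severity: str  # one of the keys of SEVERITY_RANK


def alerts_to_incidents(alerts, context, min_severity="critical"):
    """Turn raw alerts into incident summaries the service desk can read.

    - alerts below min_severity are dropped (they stay with operations),
    - repeated (source, code) pairs are collapsed into one incident,
    - the cryptic code is replaced by a description from `context`
      when one is available.
    """
    threshold = SEVERITY_RANK[min_severity]
    seen = set()
    incidents = []
    for alert in alerts:
        if SEVERITY_RANK[alert.severity] < threshold:
            continue
        key = (alert.source, alert.code)
        if key in seen:
            continue
        seen.add(key)
        description = context.get(alert.code, f"Unknown alert {alert.code}")
        incidents.append(f"{alert.source}: {description}")
    return incidents


if __name__ == "__main__":
    alerts = [
        Alert("srv01", "E1001", "critical"),
        Alert("srv01", "E1001", "critical"),  # duplicate, collapsed
        Alert("srv02", "W2002", "warning"),   # below threshold, dropped
    ]
    context = {"E1001": "Disk almost full"}
    for incident in alerts_to_incidents(alerts, context):
        print(incident)
```

The design choice is that filtering, de-duplication and adding context happen before an incident is created, so the automation covers the process and not only the forwarding task.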
Smart automation will reduce costs and risks
Only when you understand how manual tasks play a role in maintaining and optimizing the IT platform should you start considering automation. And where possible you should prefer automating the whole process over automating individual tasks. Make automation an active and conscious decision: make it happen instead of letting it happen. The aforementioned monthly Operations Review is a good place to discuss and decide on automation projects. Where in the past we could write scripts and macros and create batch jobs, now we have Robotic Process Automation, or RPA.
Spare human interaction from over-automation
A last remark is about the tendency of many IT departments to also try to automate human interaction. It might be because introverts are more attracted to engineering careers that many IT staff do not like to talk to their colleagues or to customers. They prefer interaction through e-mail, portals or other kinds of electronic messaging and forms. On the other hand, many people like to talk to the person who might help them with an IT-related issue or question. Calling a helpdesk might not often be considered fun, but they prefer it over sending an e-mail or filling out a form. When IT departments choose to communicate mainly through systems, there is a high risk that the colleagues in the business departments no longer feel supported by the IT department.
Optimize and Automate
As a guiding principle for ITIL and Service Management, Optimize and Automate describes a situation where IT is responsible for the performance, costs and risks of the IT platform. Defining this responsibility to deliver IT services as different from the responsible use of IT services should have been a guiding principle as well. In that case it would have been clear that IT should feel the need to optimize the delivery of IT services, and that it shares with the business the responsibility to improve the use (outcome) of IT services.