I just had a conversation today with my VP (Rob Sauerwalt – check him out on Twitter – time to do some shameless kissing up to my management team) about a recent internal communication that we both saw. It was someone looking for a “readiness checklist” for the deployment of an application on the IBM Cloud. Rob and I both agreed that this seems pretty simple, and we came up with a quick checklist of things to consider.
Now this list is not specific to the IBM Cloud; it’s pretty generic. It’s just a quick checklist of things that you will want to make sure you have considered BEFORE you deploy that cloud-based application into a production environment. I am an Agile believer, so I would suggest that you address these checklist items in the SPIRIT of what they are trying to do, and that you do what makes sense. This means that each one of these areas does not need to become some 59-page piece of documentation. What you want to do is provide enough information so that the poor guy who takes your job after you get promoted can be effective, and can understand and maintain the application or system.
If you have suggestions about other things that should be on this list, please drop me a line and let me know. I would love to add them to the list, and make this generic deployment readiness checklist even better.
Production Readiness Checklist
⊗ Name and General Description of the Application – this includes the purpose of the application and the number of users that are anticipated to use the application. Also have an idea of the types of users. Is it for the general public? Only for certain roles within our organization? Is it only for your customers? Do this in two to three paragraphs – anything more is adding complexity.
⊗ Description of Needed Software/Hardware/Cloud Resources – a list of the needed software packages, and the cloud resources needed to run the application. Do you use third-party utilities or libraries? Do you run on Cloud Foundry buildpacks? Virtual machines? Do you use Cloud services for database resources? Often a high-level architectural diagram is useful to help other people understand the system. This should be done AS you build – so you can simplify things. Are your developers using different libraries to accomplish the same thing? Get them to standardize. Reduce your dependencies, reduce your complexity, and you improve your software quality.
⊗ Operating Systems and Patching Requirements – do you have specific OS requirements? Do you require a particular framework to run properly (like .NET, Eclipse, or a particular Cloud Foundry buildpack)? What OS versions have you tested and validated this application with – and do all of your components need to be on the same OS version? This becomes important when fixes get deployed to containers, virtual machines get upgraded, and maintenance activities are done.
⊗ Installation and Configuration Guidelines – you should be deploying your application in some automated manner. So your deployment and promotion scripts should be the only guide that you need… except when they aren’t. Take the time and DOCUMENT those scripts – explain WHAT you are doing and WHY, so your application can easily be reconfigured or deployed in different ways in the future.
⊗ Backup, Data Retention and Data Archiving Policies – let your operations people know what data needs to be archived and retained. How often do systems need to be backed up? How will services be restored in the event of a crash? Explain WHERE and HOW data needs to be retained. Explain what your DEVELOPMENT teams need to review on a periodic basis. This can be the biggest headache for development teams, because these are often scenarios that they have not considered. Backup plans alone are not sufficient; they need to be executed at least once before you go into production – so you are sure that they are valid and that they work.
⊗ Monitoring and Systems Management – This includes runbooks – what do we need to do while the application is running? Do we need to take the logs off of the system every day and archive them? Or do we just let logs and error reports build up until the system crashes? Should I monitor memory and heap usage on a daily basis? Should I be monitoring CPU load? Who do I notify if I see a problem, and what is a “problem”? (CPU at 50%? CPU running at 20% with spikes to 100%?) How will this application normally be supported? You may not have complete information or a full definition of “problems” when you begin, but define what you can and acknowledge that things will change as time goes on.
⊗ Incident Management – This details how you react to application incidents. These could be bugs, outages, or both. In the case of an outage, who needs to be called, and what actions should they take to collect needed data and to get the application back up and running? What logs are needed, and what kind of data will aid in debugging issues? Who is responsible for application uptime TODAY (get things back on track and running), and who is responsible for application uptime TOMORROW (who needs to find root cause, fix bugs, make design changes if needed, etc.)?
⊗ Service Level Documentation – This is the “contract” between you and your customers. How often will your application be down for maintenance? If your application is down, how long before it comes back up? Are there any billing or legal ramifications from a loss of service? Do your customers get refunds – or cash back – when your Cloud application is unavailable?
⊗ Extra Credit – DevOps pipeline – you need to have an automated pipeline for the deployment of code changes into well-defined development, test, and production environments. You need to have a solid set of policies and procedures for the initiation and automation of these deployments. Who has authority to deliver to test environments? To production environments?
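To make the Monitoring item a bit more concrete, here is one way you might write down what a “problem” is, so that it lives somewhere other than in someone's head. This is only a sketch in Python, not a recommendation for any particular monitoring tool. The names `CpuAlertRule` and `is_problem` are made up for illustration, and the 90% threshold and the three-sample window are assumed numbers that you would replace with your own. The idea is that a brief spike to 100% is treated differently from CPU load that stays high.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CpuAlertRule:
    """Hypothetical definition of a CPU "problem" (all values are assumptions)."""
    threshold_pct: float    # a sample at or above this value counts as "high"
    sustained_samples: int  # consecutive high samples needed before we alert


def is_problem(samples: Sequence[float], rule: CpuAlertRule) -> bool:
    """Return True if CPU stayed at or above the threshold for enough
    consecutive samples; isolated spikes do not count."""
    run = 0
    for sample in samples:
        if sample >= rule.threshold_pct:
            run += 1
            if run >= rule.sustained_samples:
                return True
        else:
            run = 0  # the streak of high samples is broken
    return False


# Assumed rule: alert only when CPU is at 90% or more for 3 samples in a row.
rule = CpuAlertRule(threshold_pct=90.0, sustained_samples=3)

spiky = [20, 20, 100, 20, 20, 100, 20]   # runs at 20% with spikes to 100%
sustained = [50, 92, 95, 97, 60]         # stays high for three samples

print(is_problem(spiky, rule))      # False
print(is_problem(sustained, rule))  # True
```

Even a small rule like this answers the "what is a problem?" question in a way that the next person can read, challenge, and change as you learn more about how the application behaves.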
Software Architecture Considerations
⊗ Key Support & Maintenance Items – the team that built this thing knows where the weak spots are – share that knowledge! Where does the team know that “tech debt” exists – and how is that impacting your application? This information will help the teams maintaining and upgrading your application. They will be able to do this with knowledge about how the application works, and why certain architectural choices were made.
⊗ Security Plan – Everyone is worried about the security of their applications and data on the cloud. You need to be sensitive to this when deploying cloud-based applications. Your stakeholders and users will want to know that you have considered security, and that you are protecting their data from being exposed, stolen, or used without their knowledge/consent.
⊗ Application Design – This should include a high-level description of your use case, a simple flowchart, and a list of dependencies. Give enough detail so someone can easily get started in maintaining your application code, but not so much detail that you waste time and ultimately end up with documentation that does not match the code.
Is That Everything?
That’s not everything, but it is a good minimal list of things that you should have considered and/or documented. Most applications need some sort of a support plan – who handles incoming problem tickets from customers? Do you have a support process for your end users? In your own environments and business context, you may have other things that need to be added to this list. Do you need to check for compliance with some standard or regulation? What are your policies for using Open Source software?
So this list is not meant to be exhaustive – but it is designed to make you think, and to help you ensure higher quality when deploying your Cloud applications.