This has also been published at Medium.com.
Note: This article was updated on April 29 to add links to a Leigh Williamson blog post that I found informative.
In an earlier blog post, IBM Watson Fueled Chatbots — For Our Health, I talked about the costs and benefits of a chatbot, and showed you how a chatbot is within the reach of most organizations. I then branched out and discussed how to make updates to your chatbot in my post I Love My Watson Chatbot — How Do I Update It? Now we look at doing some continuous testing, and at how to do some basic version control and deploy new chatbot updates.
How Are We Doing?
To see how well your chatbot is performing, you need to put some automated testing of your chatbot in place. This will give you some objective measures of chatbot accuracy and performance, and some insight into areas where your chatbot could stand to improve. Keep in mind that testing cognitive applications is fundamentally different from testing more traditional applications. Traditional applications rely on path testing and code coverage. Because cognitive applications are non-deterministic, we end up having to use statistical models for our testing.
I STRONGLY suggest using the Python notebooks listed below. These can be run on your own machine, or from within Watson Studio. They are a great starting point for your automated analysis of your chatbot. Over time you can improve these notebooks and expand on some of the testing techniques in them.
- Dialog Skill Analysis Notebook — a notebook that will do an analysis of your Dialog skill.
- Log Analysis Workbooks — these two notebooks focus on slightly different areas, and do an analysis of your chatbot based on your logs.
- My K-Fold Notebook — not as full-featured as the two notebooks above, but I like mine because it’s simple. For a new Python user, it also has a straightforward approach to k-fold testing that is easier to understand.
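To give you a feel for what k-fold testing does, here is a minimal sketch in plain Python. It is not taken from any of the notebooks above. The keyword-counting classifier is a made-up stand-in for your real chatbot, and the sample questions and intent names are invented for illustration. The idea is to split your labeled examples into k folds, train on k-1 of them, test on the one held out, and repeat until every fold has been the test set once.

```python
import random
from collections import Counter, defaultdict


def k_fold_splits(examples, k=5, seed=0):
    """Yield a (train, test) pair for each of k folds over the shuffled examples."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test


def train_keyword_classifier(train):
    """Hypothetical stand-in for a real intent classifier: count words per intent."""
    word_counts = defaultdict(Counter)
    for text, intent in train:
        word_counts[intent].update(text.lower().split())

    def classify(text):
        words = text.lower().split()
        scores = {intent: sum(counts[w] for w in words)
                  for intent, counts in word_counts.items()}
        return max(scores, key=scores.get)

    return classify


def k_fold_accuracy(examples, k=5, seed=0):
    """Return one accuracy value per fold (train on k-1 folds, test on the other)."""
    accuracies = []
    for train, test in k_fold_splits(examples, k, seed):
        classify = train_keyword_classifier(train)
        correct = sum(1 for text, intent in test if classify(text) == intent)
        accuracies.append(correct / len(test))
    return accuracies


# Invented sample data: (user question, expected intent)
examples = [
    ("what are the symptoms of covid", "symptoms"),
    ("is a fever one of the symptoms", "symptoms"),
    ("do i have covid symptoms", "symptoms"),
    ("what symptoms should i watch for", "symptoms"),
    ("is a cough a symptom", "symptoms"),
    ("where can i get a test", "testing"),
    ("is there a test site near me", "testing"),
    ("how do i get tested", "testing"),
    ("how long do test results take", "testing"),
    ("where is the nearest test center", "testing"),
]

scores = k_fold_accuracy(examples, k=5)
print("accuracy per fold:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

With a real chatbot you would replace the stand-in classifier with calls to your own skill, and you would want far more examples per intent than this toy data set has.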
Getting the feeling that you need to know a little bit of Python? So am I. You don’t have to be a Python expert to do any of this, but you should at least understand the basics of Python, and how to use and manipulate Python notebooks. Don’t get intimidated — you don’t need to know EVERYTHING before you start — just jump in and learn things as you go.
Making Changes Without Impacting Production
Some change management scenarios are enabled by using Watson Assistant’s new versioning capability. This versioning is a bit different from the versioning that you might be used to: it is not like the typical check-out, check-in paradigm of software development. Instead, it is more like a baseline paradigm, where a version (or baseline) is created as a snapshot of the model at some point in time. The current state of the model (which you see in the UI) is always the tip, or development version.
So if you go into your Watson Assistant Plus instance, and go into your skill, you can just click on “Versions” in the left-hand nav bar, and you will see the versions screen.
To create a new version (or baseline), just click on the “Save a new version” link in the upper right corner. You should then fill in something useful (think about naming conventions!) for the description, before pressing “Save”. We will talk more about naming conventions below, when we start to talk about some typical change management workflows.
Typical Change Management Workflows
A typical change management workflow will allow you to easily scope, contain, and then deploy changes to your chatbot. What I am proposing here is something that might not satisfy a “DevOps Purist”, since we have production and dev/test resources residing in the same service.
Keep in mind that this approach CAN easily be modified so that instead of moving the versions around within a Watson Assistant instance, we could export the model from one Watson Assistant environment (say the test environment), and then import it into another environment (say the production environment). These environments could be located in different resource groups, with individualized access and permissions tailored for each environment/resource group.
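Here is a rough sketch of what that export and import could look like in Python. I am assuming the ibm-watson Python SDK here, where (as far as I know) an `AssistantV1` client offers `get_workspace(workspace_id=..., export=True)` and `update_workspace(...)`. Please check the SDK reference before relying on the exact arguments. The `FakeAssistant` class and the workspace IDs are made up so that you can do a dry run without touching a real service.

```python
def copy_skill(source_client, source_workspace_id, target_client, target_workspace_id):
    """Export a dialog skill from one environment and overwrite the skill in another.

    Both clients are assumed to behave like the SDK's AssistantV1 client.
    """
    exported = source_client.get_workspace(
        workspace_id=source_workspace_id, export=True
    ).get_result()
    target_client.update_workspace(
        workspace_id=target_workspace_id,
        name=exported.get("name"),
        description=exported.get("description"),
        language=exported.get("language"),
        intents=exported.get("intents"),
        entities=exported.get("entities"),
        dialog_nodes=exported.get("dialog_nodes"),
        counterexamples=exported.get("counterexamples"),
        append=False,  # replace the target content instead of merging into it
    )
    return exported


class _Response:
    """Mimics the SDK response wrapper, which exposes get_result()."""

    def __init__(self, result):
        self._result = result

    def get_result(self):
        return self._result


class FakeAssistant:
    """Hypothetical in-memory stand-in for an Assistant client (dry runs only)."""

    def __init__(self, workspaces=None):
        self.workspaces = dict(workspaces or {})

    def get_workspace(self, workspace_id, export=False):
        return _Response(dict(self.workspaces[workspace_id]))

    def update_workspace(self, workspace_id, append=False, **fields):
        self.workspaces[workspace_id] = {
            key: value for key, value in fields.items() if value is not None
        }
        return _Response(self.workspaces[workspace_id])


# Dry run with invented IDs and content
test_env = FakeAssistant({
    "test-skill": {
        "name": "Covid-19 bot",
        "language": "en",
        "intents": [{"intent": "symptoms"}],
        "entities": [],
        "dialog_nodes": [],
    }
})
prod_env = FakeAssistant({"prod-skill": {}})
copy_skill(test_env, "test-skill", prod_env, "prod-skill")
print(prod_env.workspaces["prod-skill"]["name"])
```

Against real services you would pass in two real clients, one per environment, each with its own credentials and service URL. That is where the individualized access and permissions per resource group come into play.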
So without further explanation, here is a high-level view of a typical lightweight change management process for your Watson Covid-19 chatbot:
- Get suggested changes for the next cycle. Record these changes for your Covid-19 bot in some agreed-upon format or tool.
- Agree on changes and scope of changes for the next cycle. For those of you who are familiar with Agile development principles, think of this as sprint planning.
- Apply these changes in the tip (development) version of your Covid-19 bot
- Test development Covid-19 bot version — validate your changes — make final revisions
- Run automated quality tests against development bot version (yes, I wasn’t kidding about automated tests earlier, do this now — and you will thank me later)
- Create a new version in the Covid-19 bot — call it “<YYYY-MM-DD> Production”
- Get formal sign off for promotion of new version to production.
- Move the Assistant pointer for the production Covid-19 bot to the new version (“<YYYY-MM-DD> Production”)
- Back up the Covid-19 bot with an export of the dialog skill to JSON, and store the JSON file with the name “<YYYY-MM-DD>_Production.json”
- Move the Assistant pointer for the development Covid-19 bot to the tip version
- Go back to the first step and do it again.
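The naming conventions in the steps above are easy to get wrong by hand, so it can help to generate them. This is a small sketch using only the Python standard library. The function names are my own invention, and the `skill` dictionary stands in for whatever JSON export you get from your dialog skill.

```python
import json
from datetime import date
from pathlib import Path


def version_label(day):
    """Build the version name, for example '2020-04-29 Production'."""
    return f"{day.isoformat()} Production"


def backup_filename(day):
    """Build the backup file name, for example '2020-04-29_Production.json'."""
    return f"{day.isoformat()}_Production.json"


def write_backup(skill, directory, day):
    """Write the exported skill (a dict) to a dated JSON file and return its path."""
    path = Path(directory) / backup_filename(day)
    path.write_text(json.dumps(skill, indent=2, sort_keys=True), encoding="utf-8")
    return path


release_day = date(2020, 4, 29)
print(version_label(release_day))
print(backup_filename(release_day))
```

Using the ISO date format (YYYY-MM-DD) means your versions and backup files sort in chronological order when you list them by name.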
Note that I have been non-specific about which tickets, issues, or tools to use. I just trust that you are using some sort of change control, no matter how simple. You want to get your stakeholders in the habit of making change requests and then APPROVING those requests, just so we avoid any “Why did you do that?” situations. You can implement this on the IBM Cloud, which offers a free hosted Git repository and issue tracking service where you can do real lightweight change management, or you can use your existing change management tooling and infrastructure. It’s completely up to you.
Change Management for the Advanced User
The simple change management process outlined above will work for simple deployments, and is probably robust enough for a chatbot that is not expected to be deployed on a long-term basis. What about change management for more established chatbots? Or for chatbots that are part of a larger system that is currently under some more formal type of change management? What can we do in those cases?
In those cases, and for users wishing for a more robust and formal change management or DevOps infrastructure, you will want to read A comparison of DevOps choices for IBM Watson Assistant by Leigh Williamson. Leigh is a peer of mine who has spent some time thinking about this, and more importantly, he has spent some time actually implementing these types of change management systems.
So if you have been following this series of blog posts, you have deployed a longtail chatbot that answers Covid-19 questions. You have learned how to update that chatbot, and how to test and version control your changes. What’s left to do? At this point you know enough to be dangerous — it might be time to fill in some of those learning gaps. My blog posts tell you just what you need to know, but there are so many other Watson techniques and capabilities that you can use to make your end-user experience even better. Maybe you should check out some of the Watson Covid-19 specific learning resources that are out there.