Conversational Assistants and Quality with Watson Assistant – Revisited

By Daniel Toczala

Originally posted on Medium on February 11, 2020 at https://medium.com/@dtoczala/conversational-assistants-and-quality-with-watson-assistant-revisited-123fb3bb9f1f.

Note: I updated the original Conversational Assistants and Quality blog post in February 2020 to add a link to a much better testing notebook that I discovered, and to do a slight rewrite of that section. This blog post is a complete update to that original post – and it restates a lot of what I highlighted in the original post. The BIG difference is the new Python testing notebook – which is located out on GitHub, as CSM-Bot-Kfold-Test.

In early February of 2020 I was informed of this great blog post and Python notebook, on How to Design the Training Data for an AI Assistant. I REALLY liked this Python notebook MUCH better than my original k-fold notebook (from August of 2019). The other nice thing is that you can discover this Python notebook in the catalog in Watson Studio, and just apply it and have it added to your Watson Studio project. The only big difference with this notebook is that you need to have your testing data in a separate CSV file – it doesn’t break up “folds” based on your training data. In fact, it doesn’t do folds at all – just straight training and testing data.

I wasn’t a big fan of that approach; I liked my basic approach of pointing at only a Watson Assistant instance, and using all of the training data in a series of k-fold tests. Nobody wants to manage this data, that data, this file, that file… it’s an opportunity to screw things up. Most of my customers are NOT AI experts; they just want a suite of tools that they can point at their chatbot engine that will allow them to do some automated testing of their chatbot. I have also noticed that many will use ALL of their training data, and not hold back some as test data. Doing k-fold testing using all of the training data in an existing Watson Assistant instance addresses this.
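To make this concrete, below is a minimal sketch of that approach – pointing only at an existing Watson Assistant skill and running k-fold tests over all of its training data. It assumes the classic Watson Assistant v1 “skills” API and the ibm-watson Python SDK; the API key, service URL, skill ID, and fold count are placeholders, and a real notebook (like the one on GitHub) adds reporting, error handling, and per-intent metrics.

```python
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from sklearn.model_selection import StratifiedKFold
import time

# Placeholders - substitute your own credentials and skill (workspace) ID
API_KEY = "YOUR_API_KEY"
SERVICE_URL = "https://api.us-south.assistant.watson.cloud.ibm.com"
WORKSPACE_ID = "YOUR_SKILL_ID"
FOLDS = 5

assistant = AssistantV1(version="2020-04-01", authenticator=IAMAuthenticator(API_KEY))
assistant.set_service_url(SERVICE_URL)

# Pull ALL of the training examples out of the existing skill - no separate test CSV to manage
intents = assistant.list_intents(workspace_id=WORKSPACE_ID, export=True).get_result()["intents"]
examples = [(ex["text"], i["intent"]) for i in intents for ex in i["examples"]]
texts = [text for text, _ in examples]
labels = [label for _, label in examples]

correct, total = 0, 0
splitter = StratifiedKFold(n_splits=FOLDS, shuffle=True)
for fold, (train_idx, test_idx) in enumerate(splitter.split(texts, labels)):
    # Build a temporary skill trained on the other k-1 folds
    fold_intents = {}
    for i in train_idx:
        fold_intents.setdefault(labels[i], []).append({"text": texts[i]})
    temp = assistant.create_workspace(
        name=f"kfold-{fold}",
        intents=[{"intent": name, "examples": exs} for name, exs in fold_intents.items()],
    ).get_result()

    # Wait for the temporary skill to finish training
    while assistant.get_workspace(workspace_id=temp["workspace_id"]).get_result()["status"] != "Available":
        time.sleep(10)

    # Score the held-out fold against the temporary skill
    for i in test_idx:
        resp = assistant.message(workspace_id=temp["workspace_id"], input={"text": texts[i]}).get_result()
        detected = resp["intents"][0]["intent"] if resp["intents"] else None
        correct += int(detected == labels[i])
        total += 1

    assistant.delete_workspace(workspace_id=temp["workspace_id"])

print(f"{FOLDS}-fold intent accuracy: {correct / total:.2%}")
```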

However, I really liked some of the analysis that they had done of the training data, and some of the other insights that they provided. So I decided to dive in and spend a little time merging the best of both of these approaches together. First, let’s start with some basic “rules” that you should be following if you are developing a chatbot.

Getting Started with Your Conversational Assistant

Back in July of 2019, I was working with a group of like-minded people inside of IBM, and we decided to create an IBM internal chatbot that would capture a lot of the “institutional knowledge” that some of our more experienced members knew, but that didn’t seem to be captured anywhere. We wanted our newer team members to be as effective as our more seasoned members. 

We spent a week or two coming to a common vision for our chatbot.  We also mapped out a “growth path” for our chatbot, and we agreed on our roles.  I cannot begin to stress how important this is – Best Practice #1 – Know the scope and growth path for your chatbot.  We had a good roadmap for the growth of our chatbot.  We mapped out the scope for a pilot, where we wanted to be to release it to our end users, and a couple of additional capabilities that we wanted to add on once we got it deployed.

My boss graciously agreed to be our business sponsor – his role is to constantly question our work and our approach.  “Is this the most cost-effective way to do this?”, and, “Does that add any value to your chatbot?”, are a couple of the questions he constantly challenges us with.  As a technical guy, it’s important to have someone dragging us back to reality – it’s easy to get focused on the technology and lose sight of the end goal.

Our team of “developers” also got a feel for the roles we would play.  I focused on the overall view and dove deeper on technical issues; some of my co-workers served primarily as testers, some as knowledge experts (SME’s), and others served as UI specialists, focusing on the flow of conversation.  This helped us coordinate our work, and it turned out to be quite important – Best Practice #2 – Know your roles – have technical people, developers, SME’s, architects, and end users represented.  If you don’t have people in these roles, get them.

Starting Out – Building A Work Pipeline

As we started, we came together and worked in a spreadsheet (!?!), gathering the basic questions that we anticipated our chatbot being able to answer.  We cast a pretty wide net looking for “sample” questions to get us kickstarted.  If you are doing something “new”, you’ll have to come up with these utterances yourself.  If you’re covering something that already exists, there should be logs of end user questions that you can use to jumpstart this phase of your project.

Next, we wanted to make sure that we had an orderly development environment.  Since our chatbot was strictly for internal deployment, we didn’t have to worry too much about the separation of environments, so we could use the versioning capabilities of Watson Assistant.  Since our chatbot was going to be deployed on Slack, we were able to deploy our “development” version on Slack, and also deploy our “test” and “production” versions on Slack as well.  These are all tracked on the Versions tab of the Watson Assistant Skill UI.  This gives us the ability to “promote” tested versions of our skill to different environments.  All of this allowed us to have a stable environment that we could work and test in – which leads us to Best Practice #3 – Have a solid dev/test/prod environment set up for your Conversational assistant or chatbot.

How Are We Doing? – K-Fold Testing

As we started out, we began by pulling things together and seeing how our conversational assistant was doing in real-time, using the “Try It” button in the upper right-hand corner of the Watson Assistant skills screen.  Our results were hit and miss at first, so we knew that we needed a good way to test out our assistant. 

We started out with some code from a Joe Kozhaya blog post on Training and Evaluating Machine Learning Models.  I ended up modifying it a little bit, and posting it on my Watson Landing Page GitHub repo.  We also read some good stuff from Andrew Freed (Testing Strategies for Chatbots) and from Anna Chaney (Data DevOps Rules of Engagement),  and used some of those ideas as well.

In February of 2020 I was informed of this great blog post and Python notebook, on How to Design the Training Data for an AI Assistant. I liked that Python notebook MUCH better than my old k-fold notebook, but I still preferred my overall approach. So I went to work combining the best of both worlds into a new Python notebook, which provides some great insight into your chatbot. Go and find it on GitHub, where it is stored as CSM-Bot-Kfold-Test.

This highlights our next best practice – Best Practice #4 – Automate Your AI Testing Strategy.

Using Feedback

As we let our automated training process take hold, we noted that our results were not what we had hoped, and that updating things was difficult.  We also learned that taking time each week to review our Watson Assistant logs was time well spent. 

It was quite difficult to add new scope to our conversation agent, so we looked at our intents and entities again.  After some in-depth discussions, we decided to try a slightly different focus on what we considered intents.  It allowed us to make better use of the entities that we detected, and it gave us the ability to construct a more easily maintained dialog tree.  We needed to change the way that we were thinking about intents and entities.

All of this brings us to our next piece of wisdom – Best Practice #5 – Be Open-Minded About Your Intents and Entities.  All too often I see teams fall into one of two traps. 

  • Trap 1 – they try to tailor their intents to the answers that they want to give.  If you find yourself with intents like, “how_to_change_password” and “how_to_change_username”, then you might be describing answers, and not necessarily describing intents. 
  • Trap 2 – teams try to have very focused intents.  This leads to an explosion of intents, and a subsequent explosion of dialog nodes.  If you find yourself with intents like, “change_password_mobile”, “change_password_web”, “change_password_voice”, then you have probably fallen into this trap.

We found that by having more general intents, and then using context variables and entities to specify things in more detail, we have been able to keep our intents relatively well managed, our dialog trees smaller and better organized, and our entire project much easier to maintain.  So, if our intent is “find_person”, then we use context variables and entities to determine what products and roles the person should have.  Someone asking, “How do I find the program manager for Watson Assistant?”, would return an intent of “find_person”, with entities detected for “program manager” and “Watson Assistant”.  In this way, we can add additional scope without adding intents, simply by adding some entities and one dialog node. 
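As a quick illustration of that flow, here is a minimal sketch using the ibm-watson Python SDK against a hypothetical skill – the “role” and “product” entity names, the credentials, and the skill ID are placeholders, not the exact setup from our bot.

```python
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholders - use your own credentials and skill (workspace) ID
assistant = AssistantV1(version="2020-04-01", authenticator=IAMAuthenticator("YOUR_API_KEY"))
assistant.set_service_url("https://api.us-south.assistant.watson.cloud.ibm.com")

# One general intent ("find_person") plus hypothetical "role" and "product" entities
response = assistant.message(
    workspace_id="YOUR_SKILL_ID",
    input={"text": "How do I find the program manager for Watson Assistant?"},
).get_result()

print(response["intents"][0]["intent"])                           # e.g. "find_person"
print([(e["entity"], e["value"]) for e in response["entities"]])  # e.g. [("role", "program manager"), ("product", "Watson Assistant")]
```

A single “find_person” dialog node can then branch on those entity values, instead of needing a new intent (and a new dialog node) for every role and product combination.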

Why K-Fold Isn’t Enough

One thing that we realized early on was that our k-fold results were just one aspect of the “quality” of our conversational assistant.  They helped quantify how well we were able to identify user intents, but they didn’t do a lot for our detection of entities or the overall quality of our assistant.  We found that our k-fold testing told us when we needed to provide additional training examples for our classifier, and this feedback worked well.

We also found that the “quality” of our assistant improved when we gave it some personality.  We provided some random humorous responses to intents around the origin of the assistant, or more general questions like, “How are you doing today?”.  The more of a personality that we injected into our assistant, the more authentic and “smooth” our interactions with it began to feel.  This leads us to Best Practice #6 – Inject Some Personality Into Your Assistant.

Some materials from IBM will break this down into greater detail, insisting that you pay attention to tone, personality, chit-chat and proactivity.  I like to keep it simple – it’s all part of the personality that your solution has.  I usually think of a “person” that my solution is – say a 32-year-old male from Detroit, who went to college at Michigan, who loves sports and muscle cars, named Bob.  Or maybe a 24-year-old recent college graduate named Cindy who grew up in a small town in Ohio, who has dreams of becoming an entrepreneur in the health care industry someday.  This helps me be consistent with the personality of my solution.

We also noticed that we often needed to rework our Dialog tree and the responses that we were specifying.  We used the Analytics tab in the skill we were developing.  On that Analytics tab, we would often review individual user conversations and see how our skill was handling user interactions.  This led us to make changes to the wording that we used, as well as to the things we were looking for (in terms of entities) and what we were storing (in terms of conversation context).  Very small changes can result in a big change in the end-user perception.  Something as simple as using contractions (like “it’s” instead of “it is”), will result in a more informal conversation style.

The Analytics tab in Watson Assistant is interesting.  It provides a wealth of information that you can download and analyze.  Our effort was small, so we didn’t automate this analysis, but many teams DO automate the collection and analysis of Watson Assistant logs.  In our case, we just spent some time each week reviewing the logs and looking for “holes” in our assistant (questions and topics that our users needed answers for that we did not address), and trends in our data.  It has helped guide our evolution of this solution.

Summary

This blog post identifies some best practices for developing a chatbot with IBM Watson Assistant – but these apply to ANY chatbot development, regardless of technology.

  • Best Practice #1 – Know the scope and growth path for your chatbot
  • Best Practice #2 – Know your roles – have technical people, developers, SME’s, architects, and end users represented
  • Best Practice #3 – Have a solid dev/test/prod environment set up for your Conversational assistant or chatbot
  • Best Practice #4 – Automate Your AI Testing Strategy
  • Best Practice #5 – Be Open-Minded About Your Intents and Entities
  • Best Practice #6 – Inject Some Personality Into Your Assistant

Now that you have the benefit of some experience in the development of a conversational assistant, take some time to dig in and begin building a solution that will make your life easier and more productive.

Watson Discovery at the Size You Want

I just worked with a customer this week on an issue that they had – and the solution didn’t seem obvious, so I figured that I would share it with a larger audience.

The Issue

My customer has a couple of Watson Discovery instances in the IBM Cloud environment. These instances are supporting a cognitive application that they have acting in an expert assistant role – providing quick access to guidance and information to their associates. One instance is a Discovery instance which is supporting the production environment, the other is supporting their development and test environment. Both instances are Small sized.

They realize that they would like to save some money by using a smaller size instance for their development and test environment, where they think they only need an X-Small sized instance. They asked me for some guidance on how to do this.

The Background

This is not as simple a request as it might seem at first. The issue is that once you move into the Advanced sized instances (instead of the free Lite instances), your Discovery instances begin to cost you money. They can also be upgraded from one size to a larger size, but they cannot be shrunk. Why? We can always expand and allocate additional resources to an instance, but we cannot guarantee that there will be no loss of data when shrinking instances. So we don’t shrink them.

It’s probably best to start by looking at the various sizes and plans. Looking at the Discovery page on the Cloud dashboard gives you some idea of the costs and charges, but it is not easy to read. Instead, I find that the help pages on upgrading Discovery, and Discovery pricing, are much more helpful. The tables on each of these pages are informative, and when they are combined, they give you the basics of what you need to know (this is accurate at the time of publishing – November 2019).

Size           Label   Docs    Price
X-Small        XS      50k     $500/mo
Small          S       1M      $1500/mo
Medium-Small   MS      2M      $3k/mo
Medium         M       4M      $5k/mo
Medium-Large   ML      8M      $10k/mo
Large          L       16M     $15k/mo
X-Large        XL      32M     $20k/mo
XX-Large       XXL     64M     $35k/mo
XXX-Large      XXXL    100M    $45k/mo

One other IMPORTANT difference between the plans is this: each plan gives you a single environment that supports up to 100 collections and free NLU enrichments. The only exception is the X-Small plan, which will only support 4 collections. Also note that you may pay extra for news queries and custom models.

What Needs To Be Done

In order to “downsize” one of their Discovery instances from Small to X-Small, the customer will need to migrate the data themselves. What makes this difficult is that they will only have 4 collections available to them in the X-Small instance, instead of the 100 that were available in their Small instance. So they need to take these steps:

  • Create a new Discovery instance, with a size of X-Small.
  • Select the 4 (or fewer) collections that will be used in the new X-Small instance.
  • Re-ingest documents into the 4 new collections.
  • Delete the old development and test Discovery instance.

Creating a Discovery Instance of a Certain Size

The issue that my customer ran into was this: How do I create a Discovery instance of a certain size? When I look at the Discovery page on the Cloud dashboard, all I see is that I can select the Advanced plan – but no option on what size to use. So how do you do it?

It’s simple, and it’s outlined in the help docs in the section on upgrading Discovery. You first need to go in and create a new instance of the Discovery service with the Advanced plan. After you do this, the service will take some time to provision. You’ll need to wait patiently while this is done – it’s usually less than 2 minutes.

Now open your Discovery instance, by clicking on the link, and then choosing the “Launch Watson Discovery” button on the Manage page. You will now see the Discovery instance come up, and you will click on the small icon in the upper right corner of the browser to bring up a dialog that will allow you to “Create Environment”.

Hit the small icon in the upper left, and then the “Create Environment” button…

Then you will be able to select the SIZE of the Discovery instance that you want. You will see a dialog that looks similar to what is shown below:

For our X-Small option, we’ll need to click on “Development”…

You will see that you can choose from three different menus: Development (which shows the X-Small option), Small/Medium (which shows the Small through Medium-Large options), and Large (which shows Large through XXX-Large). Choose the size that you want, and then hit the “Set Up” button. This will create your Discovery environment, in the size that you want.

What If I Want To Increase the Size of my Discovery Instance?

In the above case, we had to do some specific actions to get a new instance created in a size that we wanted. We also learned that if we wanted to SHRINK in size, we needed to create a new instance and migrate data to the new instance.

What if I have been using Discovery for a while now, and I want to INCREASE in size? How do I do that? It’s actually pretty simple, and it’s also documented in the online help, in the section on upgrading Discovery. That section just provides a link to the API, but not a lot of additional explanation. I’ll give you a bit more, so it’s a bit clearer.

If you look at the Discovery API reference, you’ll see a section on Update an Environment. This is the API call that you can use to upgrade your environment (and thus, the size of your Discovery instance). The API call is spelled out in this doc, and you can get examples for whatever language or way that you want to generate this API call by just selecting a type for the example in the black window on the right. In the example below, I chose to look at this in Python.

I wanted to see the API call example in Python, but most people will just use Curl…

Just make sure to use the “size” parameter in your API call, and make sure that you use the right code for the size that you want (the label from the table earlier in this post). That’s all there is to it.
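Here is a minimal sketch of that call using the DiscoveryV1 client from the ibm-watson Python SDK – the API key, service URL, and environment ID are placeholders, and the size code (“MS” in this example) is the Label column from the pricing table earlier in this post.

```python
from ibm_watson import DiscoveryV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholders - use your own service credentials and environment ID
discovery = DiscoveryV1(version="2019-04-30", authenticator=IAMAuthenticator("YOUR_API_KEY"))
discovery.set_service_url("https://api.us-south.discovery.watson.cloud.ibm.com")

# Upgrade the environment to Medium-Small - remember, instances can grow but never shrink
result = discovery.update_environment(
    environment_id="YOUR_ENVIRONMENT_ID",
    size="MS",
).get_result()

print(result["size"], result["status"])
```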

Authentication for your App on the IBM Cloud

I like to answer customer questions on the IBM Cloud, and about using Watson services on the IBM Cloud. Sometimes I know the answer… sometimes I have to go and find the answer… and sometimes I create the answer. As I encounter interesting questions, I like to share my solutions and answers on this blog – so that way I am able to help other people who may be encountering the same issues.

Recently I have been seeing a lot of questions about how authentication on the IBM Cloud works, and what the “best practices” around authentication are. It’s a large topic, and I cannot cover it completely in this blog post, but hopefully, I can cover it to a depth that will help you make decisions about how you want to structure your IBM Cloud/Watson application.

Getting Into The IBM Cloud

You get into the IBM Cloud using your own username/password to authenticate. You might even use two-factor authentication to get in. The access to all of the services and accounts on the IBM Cloud is controlled by the IBM Cloud Identity and Access Management (IAM) service. This service validates your identity and then assigns your access and permissions based on that identity.

When an application attempts to access resources on the IBM Cloud, it too needs to have its identity validated. Most IBM Cloud users will create either a service account (with its own login, password, and credentials) or will create some specific service credentials for every cloud service.

So the first step is to have your application authenticate itself, with some sort of credentials. To do this you will make your API call to the API endpoint that you are attempting to access. This call will go through the IBM firewall, which takes care of traffic routing, ingress, load balancing, and some other “big picture” types of things. Your request is then routed to the authentication service. This service will attempt to authenticate your request.

Basic Watson Service Request Flow

It is at this point that things get interesting for an application developer. You must choose from one of two paths:

  1. You can give the authentication mechanism an API Key. The authentication service will then validate the API Key. It does this by calling the IAM Identity service. Once you have been authenticated, your request is passed along to the appropriate Watson service instance.
  2. You can give the authentication mechanism an API Token. To get this API Token, you will first need to call the IAM Identity service, give it your API Key, and ask for an API Token. You can then make your call and give the authentication mechanism this API Token.

So why would you want to take approach #2? You want to take approach #2 as much as possible because when you authenticate via an API Token, you do not make an additional call to the IAM service when authenticating. These API Tokens are good for 60 minutes.
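Here is a minimal sketch of approach #2 in Python – exchange your API Key for a token once at the public IAM token endpoint, then reuse the token as a Bearer header on each Watson call. The Assistant URL and workspace ID below are placeholders.

```python
import requests

API_KEY = "YOUR_API_KEY"

# One call to IAM to exchange the API Key for a bearer token (good for about 60 minutes)
iam_response = requests.post(
    "https://iam.cloud.ibm.com/identity/token",
    data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": API_KEY},
    headers={"Accept": "application/json"},
)
token = iam_response.json()["access_token"]

# Every subsequent service call reuses the token - no extra IAM lookup per request
assistant_url = ("https://api.us-south.assistant.watson.cloud.ibm.com"
                 "/v1/workspaces/YOUR_WORKSPACE_ID/message?version=2020-04-01")
reply = requests.post(
    assistant_url,
    headers={"Authorization": f"Bearer {token}"},
    json={"input": {"text": "Hello"}},
)
print(reply.json())
```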

How Much Do I Save?

Let’s look at a simple example. Suppose your application makes 1000 calls to the Watson service in a typical hour of usage. If you just use the API Key approach, you will make 1000 additional calls to the IAM service. Each call with the API Key will require the authentication service to make a request to the IAM service.

If you decide to use the API Token method, you will make an extra call to the IAM service to get your token, and then you will make your initial service call. This is 1 additional call to the IBM Cloud. So in a typical hour, this will save you 999 service calls within the IBM Cloud.

The savings here for a single call to a service may be difficult to measure and probably would go unnoticed by an end-user. But during periods of high traffic and high load, this reduction in traffic and load will make your application much more robust, responsive and resilient.

Why even bother with the API Key approach? It is useful for testing scenarios and for times when you just want to issue curl calls by hand, directly to the API. In these cases, it is much nicer to just do the call once, rather than having to get a token and then use the resulting token (which can be large).

How Do I Do It?

Set up a small section of code to request your API Token, and then save that token and insert it into every API call made to services on the IBM Cloud. You’ll then need to decide on how you want to handle renewing your token, since every token is only good for 60 minutes.

You can set a timer for 50 or 55 minutes and wake up to renew your token before the previous token expires, or you can just handle authentication errors in your application by refreshing your token and trying to execute your request again with the new token.
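As a sketch of the first option, here is a small (hypothetical) helper that caches the token along with its expiry time and refreshes it a few minutes early – it is not part of any official SDK, just an illustration of the pattern.

```python
import time
import requests

class IAMTokenCache:
    """Caches an IAM token and refreshes it about 5 minutes before it expires."""

    def __init__(self, api_key, refresh_margin=300):
        self.api_key = api_key
        self.refresh_margin = refresh_margin
        self._token = None
        self._expires_at = 0

    def get_token(self):
        # Refresh when there is no token yet, or when we are inside the safety margin
        if self._token is None or time.time() > self._expires_at - self.refresh_margin:
            resp = requests.post(
                "https://iam.cloud.ibm.com/identity/token",
                data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey",
                      "apikey": self.api_key},
                headers={"Accept": "application/json"},
            )
            body = resp.json()
            self._token = body["access_token"]
            self._expires_at = time.time() + body["expires_in"]  # typically 3600 seconds
        return self._token

# Usage: fetch the (possibly refreshed) token right before every service call
tokens = IAMTokenCache("YOUR_API_KEY")
headers = {"Authorization": f"Bearer {tokens.get_token()}"}
```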

Another useless IBM Cloud Billing Script …. or not

Recently I have had a customer ask me some questions about billing on the IBM Cloud. Specifically, they wanted to be able to download their billing information. Now I know that you can do this interactively from the IBM Cloud. You just go to your usage page and hit the little Export CSV button in the upper right-hand corner. This will dump out a CSV file of your current month’s usage for you.

“That’s nice…..but we want to be able to break down usage and get it into a spreadsheet”, was the response from my customer. “But CSV format IS a spreadsheet”, I thought to myself. So I decided to ask some more questions. It turns out that my customer is interested in AUTOMATING the administration of their IBM Cloud. This means that they want to dump not just one month’s worth of data, but a full 12 months of data into statements that will outline internal chargebacks (based on resource groups), as well as showing visual graphs that summarize usage and spending over the past 12 months.

Now I understood why the “button in the upper right” wasn’t going to work. They wanted some sort of script that would run automatically, to generate all of this content for their IBM Cloud user community. I had wanted to do some work with the Billing and Usage API and Resource Manager API on IBM Cloud, so I decided to script something up for them.

This project also led me to using the Resource Controller API – which is slightly different from the Resource Manager API. The Resource Controller API focuses on the various different service instances (resources) that have been created within an account. The Resource Manager API deals more with the resource groups defined within the account.

Note: If you’re reading this post and serious about doing this sort of automation for your IBM Cloud, PLEASE check out the API links above. It is possible to use different API calls to do more targeted operations – instead of the very broad implementation that I have done.

As usual, I decided to work within Watson Studio, and I decided to write my script in Python. Watson Studio includes some environments that have pre-defined collections of packages – including ones you will depend on like pandas, numpy, and IBM_Watson. It also allows me to run my script up on the IBM Cloud (not on my local machine), version my script, and dump my results to IBM Cloud’s Cloud Object Storage. My goal was to show this customer how to do a dump of just the services with usage (and not EVERY service, since there are many services out there which are on Lite plans). I also wanted to highlight how to get to some of the information available via the Resource Controller API.

I quickly learned that in order to get ALL of the information that I needed, I would need to cross-reference some of the data coming back from these various IBM Cloud API’s. That meant using all of the IBM Cloud API’s mentioned above.
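To give a feel for what that cross-referencing looks like, here is a stripped-down sketch – it is NOT the full script from the GitHub repository. The account ID, API key, and billing month are placeholders, and the field names used here (billable_cost, resource_group_id, guid) reflect my reading of the Usage Reports and Resource Controller API docs, so check them against the API references linked above before relying on them.

```python
import requests

# Placeholders - substitute your own values
API_KEY = "YOUR_API_KEY"
ACCOUNT_ID = "YOUR_ACCOUNT_ID"
MONTH = "2019-11"  # billing month in yyyy-mm form

# Exchange the API key for an IAM bearer token (see the authentication post earlier in this collection)
token = requests.post(
    "https://iam.cloud.ibm.com/identity/token",
    data={"grant_type": "urn:ibm:params:oauth:grant-type:apikey", "apikey": API_KEY},
).json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# Usage Reports API: one month of usage for the whole account
usage = requests.get(
    f"https://billing.cloud.ibm.com/v4/accounts/{ACCOUNT_ID}/usage/{MONTH}",
    headers=headers,
).json()

# Resource Controller API: the service instances in the account (first page only, for brevity)
instances = requests.get(
    "https://resource-controller.cloud.ibm.com/v2/resource_instances",
    headers=headers,
).json()["resources"]

# Count instances per resource group - a chargeback report would join per-instance usage onto these groups
groups = {}
for inst in instances:
    groups[inst["resource_group_id"]] = groups.get(inst["resource_group_id"], 0) + 1
print("Instances per resource group:", groups)

# Keep only the services that actually generated charges (skip all of the Lite plans)
for resource in usage.get("resources", []):
    cost = resource.get("billable_cost", 0)
    if cost:
        print(f'{resource.get("resource_name", resource["resource_id"])}: ${cost:.2f}')
```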

So now I have this Python code all sitting in my new GitHub repository called IBMCloudBillingScript. This script really just does a lot of the same things that the “Export CSV” button does – which makes it seem kind of useless. The reason I built it, and have shared it on GitHub, is because we often want to AUTOMATE things based on this information. This script shows you how to get (and manipulate) this kind of information about your IBM Cloud account.

Finding Custom Models in Watson Discovery

My blog posts often focus on issues that my customers are having – I solve them once for someone and then share that with the wider world in the hopes that someone else may find my guidance useful. This week, I ran into a new issue with a new customer of mine.

My customer was wondering about the use of custom models in their account, and how this impacted their usage of the Watson Discovery service. They were being charged for each custom model used by the Discovery service, but they had no idea of where they were being used. I went and looked in the UI, but found nothing that indicated where custom models had been applied.

So I dug into the API and found out how you can tell. The key is the API call to get-configuration, which returns a JSON payload with information about the configurations in your Discovery instance. Using that information, along with some calls to other API services like list_collections and list_configurations, you can find out which Discovery collections are using custom models.
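Here is a rough sketch of that logic with the ibm-watson Python SDK – list the collections, pull each collection’s configuration, and scan the configuration JSON for any “model” options buried in the enrichments. The recursive scan is my own shortcut (the exact location of the model ID can vary by enrichment type), the credentials and environment ID are placeholders, and the code in the GitHub repo mentioned below is more thorough.

```python
from ibm_watson import DiscoveryV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholders - use your own credentials and environment ID
discovery = DiscoveryV1(version="2019-04-30", authenticator=IAMAuthenticator("YOUR_API_KEY"))
discovery.set_service_url("https://api.us-south.discovery.watson.cloud.ibm.com")
ENVIRONMENT_ID = "YOUR_ENVIRONMENT_ID"

def find_models(node):
    """Recursively collect any 'model' values buried in a configuration's enrichment options."""
    models = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "model" and isinstance(value, str):
                models.append(value)
            else:
                models.extend(find_models(value))
    elif isinstance(node, list):
        for item in node:
            models.extend(find_models(item))
    return models

collections = discovery.list_collections(environment_id=ENVIRONMENT_ID).get_result()["collections"]
for coll in collections:
    config = discovery.get_configuration(
        environment_id=ENVIRONMENT_ID,
        configuration_id=coll["configuration_id"],
    ).get_result()
    models = find_models(config.get("enrichments", []))
    if models:
        print(f'Collection "{coll["name"]}" uses custom model(s): {models}')
```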

Since I had to figure this out for myself, I decided to do some quick and dirty Python code to do this for me, for any given Discovery instance. If you’re interested, you can go and get your own copy of the code out in my GitHub project called Discovery Custom Model Detect. The code is a bit rough, but it gets the job done. Feel free to pretty it up or make it more interactive.

I’m Having an Issue on IBM Cloud Part 2 – What is Happening on the IBM Cloud?

By: Daniel Toczala 

Note: This is the second blog in a series of blogs that I am co-authoring with Paula Williams, as part of an “I have an Issue…” series on the IBM Cloud. The first article, Part 1 – Why Can’t I Create Anything can be found on my WordPress site. The next article, Part 3 – Who You Gonna Call? For Your IBM Cloud Watson AI Software As A Service (SaaS) Needs can be found on Paula’s Medium site. These blog posts will cover how to deal with common issues and roadblocks for users of the IBM Cloud. 

I like helping my IBM Cloud customers, and I like dealing with the technology.  Every new technology (and even established technologies) has a learning curve.  My goal in writing this series of articles is to help you more quickly conquer that learning curve with respect to the IBM Cloud.   

Today’s article deals with understanding what is going on in the “big picture” with the IBM Cloud.  How do you know what new services are available on the IBM Cloud?  How do I know when maintenance windows will occur?  How do I find out when services are getting deprecated and retired?  If my services are down, is it just me?  Is it other people too?  Is the whole data center down? 

Checking the Current IBM Cloud Status 

When things are not working, or seem to be slow, the first place I check is the overall IBM Cloud status page.  You can find it here -> https://cloud.ibm.com/status?selected=status.  There are a few different ways to look at this page.  The first tab shows the Status of the overall cloud – which services might be unavailable and which regions are impacted.  There are four other tabs, and they show other information.  One is for Planned Maintenance, and this shows upcoming maintenance windows and describes their impact on users of the services.  It’s always good to check this once a week to see what upcoming activities may impact your users and cloud projects.  Another tab is for Security Bulletins, and this one shows important security bulletins and events that you will want to be aware of.  There is also a tab for more general IBM Cloud Announcements, which contains general cloud announcements and event notifications.  The final tab is for History, so you can see the events of the past few days, and see what the resolution of those events was. 

This is a lot of different tabs for you to check.  I have to admit, even as a frequent user of the IBM Cloud platform, I rarely check these tabs on a daily, or even weekly, basis.  Instead, I subscribe to an RSS feed that will give me push notifications of IBM Cloud events as they get posted.  For those of you unfamiliar with RSS, it is a publishing technology which allows users to “subscribe” to event streams.  There are a bunch of free RSS readers out there, just look one up and install it.  Then point your RSS reader at the IBM Cloud RSS feed.  The RSS link is on the IBM Cloud Status page – just click on the Subscribe icon on the right-hand side of the page. 

Signing Up For Email Notifications 

Another thing that IBM Cloud account owners and IBM Cloud Administrators should do is to sign up for email notifications.  You can have the account owner (your IBM Cloud account which “owns” the subscription) get notifications each month when certain events occur.   

Setting this up is easy, for the account owner.  Log into the IBM Cloud as the account owner, and then select Manage -> Billing and Usage from the top navigation bar for the IBM Cloud.  In the resulting screen, look at the menu on the left side of the browser, and select the Spending Notifications option. 

On this Spending Notifications screen, you should now be able to specify sending spending notifications to yourself for any of the conditions specified.  Set your limits, and be aware that you will be notified when you reach 80%, 90% and 100% of your specified threshold.  Your Spending Notification screen should look similar to this: 

Click in those checkboxes to make sure that you get emails sent to you whenever those threshold limits are hit. 

Why Can’t I See That Support Ticket? 

I like the IBM Cloud, but on occasion you will need to open a support ticket because you have run into an issue on the IBM Cloud, or with one of the IBM Cloud services.  In order to open up a support ticket, click on Support on the top menu bar.  In the resulting Support page, click on the Manage Cases tab, and you will see a list of support cases that you are involved with. 

Be aware of the fact that this Manage Cases page has a filter which will only show support cases that you are involved with, and that are in some open state.  You may want to go and change your filters, to be able to see additional support cases.  If you are not able to see a support case, it could be because your organization has not given you the ability to see or open support cases for the organization.  If this is the case, then you’ll need to ask your IBM Cloud administrator to give you that capability.  The Create A Service Support Case documentation page has a great description of the process used to create a support case.

If you are the IBM Cloud administrator, then you will need to go to the Manage -> Access (IAM) page, and then go to Access Groups.  Once there, create a new access group, and make sure that it follows your naming conventions.  A good example might be, “SupportTicketAccess_AG”.  Once the access group is created, you’ll see the Access Group page.  Click on the Access Policies tab, and then on the Assign Access button.  Now you will need to select Assign Access to Account Management Services.  Select the Support Center service, and then apply ALL levels of access (Administrator, Editor, Operator, and Viewer) to the support center.  Now all you need to do to give users access to all of the support tickets for an organization is to add them to this access group. 

Note that you could create finer-grained access groups, like “SupportTicketViewer_AG”, that would only allow limited capability with support tickets.  Just create the additional access groups, and change your assignment of levels of access accordingly. 

Oh My God – EVERYTHING IS BLOWING UP! 

Now I’m getting 5438 email messages a day about things going down – are things really THAT bad??  OK – maybe 5438 is a bit of an exaggeration, but you get the idea…. 

You have subscribed to get email notifications of outages in the IBM Cloud.  Nice job – you should be proud of yourself for being proactive!  Our IBM Cloud has a lot of different customers, all co-located with services in a lot of different data centers.  When our infrastructure team detects a loss of service (let’s say a machine dies, which causes some IBM Cloud service to fail for the 5 customer instances running on that machine), they want to notify our customers as soon as possible.  So we send out an automated warning email to our users.  This is all nice automation, and allows us to be “good” Cloud providers and let our customers know when things go wrong. 

Now we get to the not-so-pretty part.  At the time this happens, we cannot tell EXACTLY which 5 customer instances have gone down, so we err on the side of over-communication, and let EVERY CUSTOMER IN THAT DATA CENTER know that they MIGHT have lost service.  We didn’t want to ignore or pretend the errors weren’t happening – so we took this approach.  Unfortunately, these things happen relatively frequently, and while they are short in duration and limited in scope (only a couple of customers lose service for a short period of time), the email blast to customers is EXTENSIVE.  Inside of IBM we half-jokingly (with accompanying eye roll) refer to this as our BLAST RADIUS. 

What does this mean for you?  It means that you will get a lot of notices, only 5-10% of which will actually apply to you.  We SHOULD watch this issue though, as this is a known (and painful) issue that IBM is currently addressing and rolling out fixes for.  As these fixes and changes to the IBM Cloud get implemented, the percentage of notices that actually apply to you will increase from 5-10% to 100% (meaning we only notify you about things that WILL actually impact you). 

Conversational Assistants and Quality with Watson Assistant

By Daniel Toczala

Note: I updated this blog post in February 2020 to add a link to a much better testing notebook that I discovered, and to do a slight rewrite of that section.

Recently my team inside of IBM decided that we needed to capture some of the “institutional knowledge” that our more experienced members knew, but that didn’t seem to be captured anywhere, so that our newer team members could be as effective as our more seasoned members.  We also wanted an excuse to get our hands dirty with our own technology; some of us had been focused on other Watson technologies and needed to get reintroduced to Watson Assistant since the migration to the “skills based” UI.

I went looking for something good about developing a chatbot (or conversational assistant), with Watson Assistant, and some good lessons learned.  I found some good information, but not one spot with the kind of experience and tips that I was looking for.  So I thought it might be good to capture it in my own blog post.

Getting Started with Your Conversational Assistant

We spent a week or two coming to a common vision for our chatbot.  We also mapped out a “growth path” for our chatbot, and we agreed on our roles.  I cannot begin to stress how important this is – Best Practice #1 – Know the scope and growth path for your chatbot.  We had a good roadmap for the growth of our chatbot.  We mapped out the scope for a pilot, where we wanted to be to release it to our end users, and a couple of additional capabilities that we wanted to add on once we got it deployed.

My boss graciously agreed to be our business sponsor – his role is to constantly question our work and our approach.  “Is this the most cost-effective way to do this?”, and, “Does that add any value to your chatbot?”, are a couple of the questions he constantly challenges us with.  As a technical guy, it’s important to have someone dragging us back to reality – it’s easy to get focused on the technology and lose sight of the end goal.

Our team of “developers” also got a feel for the roles we would play.  I focused on the overall view and dove deeper on technical issues; some of my co-workers served primarily as testers, some as knowledge experts (SME’s), and others served as UI specialists, focusing on the flow of conversation.  This helped us coordinate our work, and it turned out to be quite important – Best Practice #2 – Know your roles – have technical people, developers, SME’s, architects, and end users represented.  If you don’t have people in these roles, get them.

Starting Out – Building A Work Pipeline

As we started, we came together and worked in a spreadsheet (!?!), gathering the basic questions that we anticipated our chatbot being able to answer.  We cast a pretty wide net looking for “sample” questions to get us kickstarted.  If you are doing something “new”, you’ll have to come up with these utterances yourself.  If you’re covering something that already exists, there should be logs of end user questions that you can use to jumpstart this phase of your project.

Next, we wanted to make sure that we had an orderly development environment.  Since our chatbot was strictly for internal deployment, we didn’t have to worry too much about the separation of environments, so we could use the versioning capabilities of Watson Assistant.  Since our chatbot was going to be deployed on Slack, we were able to deploy our “development” version on Slack, and also deploy our “test” and “production” versions on Slack as well.  These are all tracked on the Versions tab of the Watson Assistant Skill UI.  This gives us the ability to “promote” tested versions of our skill to different environments.  All of this allowed us to have a stable environment that we could work and test in – which leads us to Best Practice #3 – Have a solid dev/test/prod environment set up for your Conversational assistant or chatbot.

How Are We Doing? – K-Fold Testing

As we started out, we began by pulling things together and seeing how our conversational assistant was doing in real-time, using the “Try It” button in the upper right-hand corner of the Watson Assistant skills screen.  Our results were hit and miss at first, so we knew that we needed a good way to test out our assistant. 

We started out with some code from a Joe Kozhaya blog post on Training and Evaluating Machine Learning Models.  I ended up modifying it a little bit, and you can find the modified notebook in my Watson Landing Page GitHub repo, under notebooks, in a Python notebook stored as ANYBOT_Test-and-Deploy.ipynb.  We also read some good stuff from Andrew Freed (Testing Strategies for Chatbots) and from Anna Chaney (Data DevOps Rules of Engagement),  and used some of those ideas as well.  This led me to create that modified Python notebook, which I used to provide automated k-fold testing of our assistant implementation. 

In February of 2020 I was informed of this great blog post and Python notebook, on How to Design the Training Data for an AI Assistant. I really like this Python notebook MUCH better than my own K-fold notebook mentioned above. The other nice thing is that you can discover this Python notebook in the catalog in Watson Studio, and just apply it and have it added to your Watson Studio project. The only big difference with this notebook is that you need to have your testing data in a separate CSV file – it doesn’t break up “folds” based on your training data. This highlights Best Practice #4 – Automate Your AI Testing Strategy.

After all of this was in place, our team fell into a predictable rhythm of work and review of our work.  Since this was a side project for all of us, some of us contributed some weeks, and didn’t contribute on other weeks.  We were a small team of people (less than 10), so it was easy to have our team manage itself.

Using Feedback

As we let our automated training process take hold, we noted that our results were not what we had hoped, and that updating things was difficult.  We also learned that taking time each week to review our Watson Assistant logs was time well spent. 

It was quite difficult to add new scope to our conversation agent, so we looked at our intents and entities again.  After some in-depth discussions, we decided to try a slightly different focus on what we considered intents.  It allowed us to make better use of the entities that we detected, and it gave us the ability to construct a more easily maintained dialog tree.  We needed to change the way that we were thinking about intents and entities.

All of this brings us to our next piece of wisdom – Best Practice #5 – Be Open-Minded About Your Intents and Entities.  All too often I see teams fall into one of two traps.  Trap 1 – they try to tailor their intents to the answers that they want to give.  If you find yourself with intents like, “how_to_change_password” and “how_to_change_username”, then you might be describing answers, and not necessarily describing intents.  Trap 2 – teams try to have very focused intents.  This leads to an explosion of intents, and a subsequent explosion of dialog nodes.  If you find yourself with intents like, “change_password_mobile”, “change_password_web”, “change_password_voice”, then you have probably fallen into this trap.

We found that by having more general intents, and then using context variables and entities to specify things in more detail, we have been able to keep our intents relatively well managed, our dialog trees smaller and better organized, and our entire project much easier to maintain.  So, if our intent is “find_person”, then we use context variables and entities to determine what products and roles the person should have.  Someone asking, “How do I find the program manager for Watson Assistant?”, would return an intent of “find_person”, with entities detected for “program manager” and “Watson Assistant”.  In this way, we can add additional scope without adding intents, simply by adding some entities and one dialog node. 

Why K-Fold Isn’t Enough

One thing that we realized early on was that our k-fold results were just one aspect of the “quality” of our conversational assistant.  They helped quantify how well we were able to identify user intents, but they didn’t do a lot for our detection of entities or the overall quality of our assistant.  We found that our k-fold testing told us when we needed to provide additional training examples for our classifier, and this feedback worked well.

We also found that the “quality” of our assistant improved when we gave it some personality.  We provided some random humorous responses to intents around the origin of the assistant, or more general questions like, “How are you doing today?”.  The more of a personality that we injected into our assistant, the more authentic and “smooth” our interactions with it began to feel.  This leads us to Best Practice #6 – Inject Some Personality Into Your Assistant.

Some materials from IBM will break this down into greater detail, insisting that you pay attention to tone, personality, chit-chat and proactivity.  I like to keep it simple – it’s all part of the personality that your solution has.  I usually think of a “person” that my solution is – say a 32-year-old male from Detroit, who went to college at Michigan, who loves sports and muscle cars, named Bob.  Or maybe a 24-year-old recent college graduate named Cindy who grew up in a small town in Ohio, who has dreams of becoming an entrepreneur in the health care industry someday.  This helps me be consistent with the personality of my solution.

We also noticed that we often needed to rework our Dialog tree and the responses that we were specifying.  We used the Analytics tab in the skill we were developing.  On that Analytics tab, we would often review individual user conversations and see how our skill was handling user interactions.  This led us to make changes to the wording that we used, as well as to the things we were looking for (in terms of entities) and what we were storing (in terms of conversation context).  Very small changes can result in a big change in the end-user perception.  Something as simple as using contractions (like “it’s” instead of “it is”), will result in a more informal conversation style.

The Analytics tab in Watson Assistant is interesting.  It provides a wealth of information that you can download and analyze.  Our effort was small, so we didn’t automate this analysis, but many teams DO automate the collection and analysis of Watson Assistant logs.  In our case, we just spent some time each week reviewing the logs and looking for “holes” in our assistant (questions and topics that our users needed answers for that we did not address), and trends in our data.  It has helped guide our evolution of this solution.

Summary

This blog post identifies some best practices for developing a chatbot with IBM Watson Assistant – but these apply to ANY chatbot development, regardless of technology.

  • Best Practice #1 – Know the scope and growth path for your chatbot
  • Best Practice #2 – Know your roles – have technical people, developers, SME’s, architects, and end users represented
  • Best Practice #3 – Have a solid dev/test/prod environment set up for your Conversational assistant or chatbot
  • Best Practice #4 – Automate Your AI Testing Strategy
  • Best Practice #5 – Be Open-Minded About Your Intents and Entities
  • Best Practice #6 – Inject Some Personality Into Your Assistant

Now that you have the benefit of some experience in the development of a conversational assistant, take some time to dig in and begin building a solution that will make your life easier and more productive.