Recently I have seen a flood of requests from our customers looking for the best ways to deploy their Jazz solutions (RTC/RQM/RRC). Everyone wants the “optimal” deployment architecture for their environment, and everyone is fighting with these four basic concerns:
- Wanting to use the minimum amount of hardware, to save on costs and allocation of hardware resources.
- Wanting to have a highly performing Jazz solution – a stable environment with minimal performance issues
- Wanting to have an architecture that will easily scale to meet rising (or falling) demand
- Wanting to have an ability to monitor system performance and predict future hardware and organizational needs
Looking at this list, you are immediately struck by the fact that some of these concerns are opposed to each other. The easiest way to get a highly stable, high performing solution is to have an abundance of hardware resources, which is directly opposed to the first goal of specifying the minimum amount of hardware. People need to understand that the “optimal” conditions for their deployment depend heavily on the relative weight that they assign to these four basic concerns. Most people realize this, but it is important to keep it in mind as we discuss how a successful Jazz deployment will impact these factors.
So in this blog, I want to share with you some of the basic rules that I tend to follow, and the reasoning behind them. We’ll start with a couple of broad statements and highlight some classic “broken” thinking, move on to a simple architectural model, and then highlight the things that people don’t consider, but MUST consider if they want a healthy and happy Jazz solution.
Statement #1 – Virtualization is great because it is flexible
A majority of my customers are looking to deploy the Jazz CLM solution into virtual environments. This is not a bad approach; I honestly really like it. Some organizations, though, do not understand the value that virtualization has for a Jazz deployment. The values that virtualization will bring to a Jazz solution are:
- The ability to have “predefined” Jazz application instances, ready for some final configuration prior to deployment. This means easier and lower risk scalability as demand for the Jazz capabilities grows within an organization.
- The ability to modify the amount of resources used by existing Jazz application instances. If you discover that you have over-allocated or under-allocated resources, you can easily adjust the allocation.
Myth #1 – Virtualization does not provide magic elves that allow your single-CPU desktop to host applications that would normally need 3 enterprise server machines. Computing depends on basic physics at its core, and those electrons can’t move any faster...
Many IT departments look at virtualization as a way to harvest more CPU cycles, and so they cram applications into a virtual appliance and end up with an “oversubscribed” virtual environment. This works for applications that are used sporadically, and for applications that are not business critical. This does NOT work for the Jazz applications, which usually end up servicing a highly variable (but constant) workload throughout the day, and which are considered business critical.
So how does virtualization benefit someone deploying the Jazz solution? It allows architects and administrators to quickly define and deploy application servers with a common footprint. This allows a team to monitor their Jazz solution over time, and gain an understanding of how their specific organization utilizes the infrastructure. By monitoring these “common” components, it is easy for administrators to quickly determine performance bottlenecks, and identify when additional Jazz application instances are needed.
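As a rough illustration of what “oversubscribed” means here, the sketch below compares the virtual CPUs handed out on a host with the physical cores that actually exist. The class, the function names, and the 1.0 ratio limit are all hypothetical choices for this example; your hypervisor will report these numbers in its own way, and the limit you tolerate is a judgment call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    """A virtual machine as seen by a capacity planner (hypothetical model)."""
    name: str
    vcpus: int


def oversubscription_ratio(vms, physical_cores):
    """Return allocated vCPUs divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vm.vcpus for vm in vms) / physical_cores


def is_oversubscribed(vms, physical_cores, max_ratio=1.0):
    """True when more vCPUs are allocated than the chosen ratio allows.

    ASSUMPTION: max_ratio=1.0 (no sharing of cores) is an illustrative
    default for business critical workloads, not an IBM recommendation.
    """
    return oversubscription_ratio(vms, physical_cores) > max_ratio


if __name__ == "__main__":
    host_vms = [
        VirtualMachine("jazz-rtc", 4),
        VirtualMachine("jazz-rqm", 4),
        VirtualMachine("jazz-rrc", 4),
    ]
    # Three 4 core building blocks on an 8 core host: 12 vCPUs on 8 cores.
    print(oversubscription_ratio(host_vms, physical_cores=8))
    print(is_oversubscribed(host_vms, physical_cores=8))
```

A ratio above the limit is the “cramming” situation described above; for sporadically used applications you might accept a higher limit.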
Statement #2 – Your Workload is Different
Let’s begin with a myth.
Myth #2 – You Jazz guys have a number of users that can be supported by each of the Jazz applications, and you are just too lazy/scared/chicken/paranoid to tell anyone.
We understand how RTC works in general, and we understand REALLY well how RTC works in the IBM internal environments where it has been deployed. The problem that we face is that everyone who uses the Jazz solutions uses them differently.
Myth #3 – We have an “average” organization, so you should know exactly how many licenses/servers/application instances we need to deploy.
Nobody is average. Every organization follows a slightly different usage model. To add to this, each user population has a different culture, and they use the tools differently. Some cultures are obsessed with collaboration, and communicate heavily in work item discussions. Some cultures focus on continuous integration, and have a constant stream of builds and automated testing being done. Some cultures manage requirements by user story, and don’t use the requirements management capabilities (RRC) of Jazz.
None of these cultures is wrong. However, each one of these will put stress on different areas of the Jazz solution architecture. Now we can add another layer of complexity, in that usage models and cultures change over time. Teams that never used planning tools in the past begin to use the tools, and then they begin using them in different ways as they discover the practices that work best for them. Keep in mind that software engineers are like electricity: they will find the path of least resistance with ANY tool that they use. So usage patterns, best practices, and usage models will change over time. This will change your workload.
The point of this is that attempting to determine the exact workload and the capacity of a Jazz solution prior to its rollout and implementation is an exercise in futility. However, the exercise does have merit. We can rely on prior experience with deployments, and the law of large numbers, to provide an estimation of what the load on a Jazz solution should be, and what capacity (in terms of hardware, deployed resources, etc.) is needed. Keep in mind that it is an estimation, and the variables in your environment will make the actual usage vary from your estimation.
So use the estimation to help you predict hardware needs, licensing, performance, and the costs associated with these, but do so with the understanding that this is an estimation. There are far too many variables to be able to make accurate predictions in this area.
A Simple Architectural Model
Now that we have established some basic “rules”, and explored a couple of common myths, let’s get to something that you CAN use for your Jazz deployment. Let’s explore a very simple architectural model that will allow you to easily scale and monitor the performance of your Jazz solution.
The Basic Building Block
The basic building block of my Jazz architecture is the application server. This should be a 4 core system, with 8GB of memory, and at least 80GB of available disk space. It is a pretty standard system (which is why I chose it), and it is a good set of parameters to use in either physical or virtual environments. You also need at least a 100 Mbps dedicated network connection. So if you are in a virtual environment and splitting the network with another 9 virtual machines, the physical box had better have a 1 Gbps network connection.
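The network arithmetic here is simple enough to sketch in a few lines. This assumes the figures are link speeds in megabits per second, and that every virtual machine on the host needs its full dedicated share at the same time; the function name is hypothetical.

```python
def required_host_bandwidth_mbps(vm_count: int, per_vm_mbps: int = 100) -> int:
    """Bandwidth the physical host needs so each VM keeps a dedicated share.

    ASSUMPTION: shares are simply summed, with no allowance for bursting
    or for hypervisor overhead.
    """
    if vm_count < 0 or per_vm_mbps < 0:
        raise ValueError("vm_count and per_vm_mbps must be non-negative")
    return vm_count * per_vm_mbps


if __name__ == "__main__":
    # One Jazz building block plus the 9 other VMs sharing the host's link.
    total_mbps = required_host_bandwidth_mbps(vm_count=10)
    print(f"{total_mbps} Mbps needed ({total_mbps / 1000:g} Gbps)")
```

Ten virtual machines at 100 Mbps each is where the 1 Gbps figure for the physical box comes from.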
I would also deploy on WebSphere (WAS) instead of Tomcat. WebSphere supports more advanced application server features (like single sign-on), and is supposed to be more reliable than Tomcat. I have had some customers argue that they would prefer Tomcat, as it fits with their open source strategy. I am OK with Tomcat; I would just prefer WAS.
The operating system is up for debate, and depends on what your organization has for a data center strategy. I am a Linux bigot, so I would choose to deploy on Linux. The IBM deployment of Jazz for the Jazz development team (which also hosts Jazz.net) is on AIX. So if I had to be pragmatic, I guess I would choose AIX. The team developing Jazz is self-hosted on AIX, so I think they would have any AIX-specific bugs worked out pretty quickly.
So a brief recap of what I am thinking. My basic Jazz architectural building block consists of a single server resource (physical or virtual), which hosts a single instance of WAS, which serves up a single Jazz application. This is all running on top of AIX. The WAS and AIX choices are negotiable; you can use other products and technologies if you want. The hardware specs of my basic building block for a Jazz application server are:
- 4 cores
- 8GB memory
- 80GB available disk space
- 100 Mbps dedicated network connection
There is one more point on this basic building block that needs to be understood. In virtual environments, this building block may be variable. More on that later, when I discuss scalability.
Putting It All Together
So what do I need to put into place to support my Jazz environment? If I look at the Standard Topologies, I see that the one that fits this model the best, and that I have seen as being the most stable, is the E1 – Distributed Topology. This diagram is taken directly from the Jazz.net article:
E1 – Distributed Enterprise Topology
How Many Users Will This Support?
This is the section that most people will read and take out of context, and they will end up in a very unhappy situation. So please do not just read this section of the post and use it out of context, because the context here is critical.
With the hardware that I discussed above, and the topology shown above, I would expect that this implementation would support roughly 350 concurrent RTC users, 300 concurrent RQM users, and 150 concurrent RRC users. Note that if your organization is not using RTC for source control, and builds are not being done from the RTC repository, you can probably expect to support almost twice as many concurrent RTC users. Keep in mind that this is a rough guesstimate. This is not IBM policy; it is a simple rule of thumb, a spot to begin your planning from. Remember the section above, where I said that this was an estimation, and that you should monitor the performance of your solution to understand what the limitations and scaling are in YOUR environment. Every environment is different, and identical topologies implemented by different organizations can have different performance and scalability based on the differences in their usage of the environment.
One other word of caution here. I mention concurrent users. My definition of a concurrent user is someone who is actively engaged with the system. So a developer who brings up an Eclipse client and works on their code for an hour, checking in some changes, would count as a concurrent user. If they leave their Eclipse client open and go to lunch, then I would not consider them a concurrent, or active, user. So when I state that this can support 350 concurrent RTC users, that means 350 users, with some development going on, and some builds executing. It does not mean 350 people scanning individual work items, and it does not mean 350 concurrent checkins and builds.
You will see other data that suggest other numbers for a deployment like this. I am just sharing a rough approximation of what I have seen deployed in our customer environments. As subsequent releases of the Jazz tools come out, and some performance issues are addressed, I would expect that this could improve.
At this point I know that you are probably thinking, “Geez Tox, you put so many qualifications on these estimates. It makes them almost worthless.” I understand your pain, but what I have seen in numerous instances is that people treat these installations as if we were supporting automated teller machines, or a stock trading application, where performance can be measured in transactions per second. Unfortunately, the software development space is quite unbounded, and our transactions are not consistent. Changing a work item places a very different load on the system than running a report, or executing a build. We just cannot accurately forecast what the specific load characteristics of a developer population will be.
Now if we need to add additional Jazz applications, in order to scale our solution to support more concurrent users, then we would just add additional application instances, and additional corresponding logical database storage areas. Scaling is easy because we have a basic architectural building block (defined above), and we just need to place another block into this diagram for the new Jazz application instance.
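To make the “add another block” idea concrete, here is a minimal planning sketch. It uses the rule-of-thumb figures from this post (roughly 350 RTC, 300 RQM, and 150 RRC concurrent users per application instance) and assumes that capacity grows linearly as instances are added. That linearity is an assumption made for this example, not something measured, and the function names are hypothetical.

```python
import math

# Rule-of-thumb concurrent users per application instance, from this post.
# These are rough starting points for planning, not guarantees.
USERS_PER_INSTANCE = {"RTC": 350, "RQM": 300, "RRC": 150}


def instances_needed(application: str, concurrent_users: int) -> int:
    """Estimate how many building blocks an application needs."""
    if concurrent_users < 0:
        raise ValueError("concurrent_users must be non-negative")
    per_instance = USERS_PER_INSTANCE[application]
    # ASSUMPTION: each added instance contributes the same capacity.
    return math.ceil(concurrent_users / per_instance)


def plan(demand: dict) -> dict:
    """Map each application's expected concurrent users to a block count."""
    return {app: instances_needed(app, users) for app, users in demand.items()}


if __name__ == "__main__":
    print(plan({"RTC": 900, "RQM": 300, "RRC": 200}))
```

Treat the result as the first guess it is, then let monitoring of your own environment correct it.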
In virtualized environments, you may find that your particular implementation has a consistent bottleneck in one particular area. For example, we might find that memory and JVM heap utilization is very high in all of our Jazz application instances. If this were the case, then we could increase the memory in our basic building block from 8GB to 12GB, and update ALL of the virtual machines hosting our Jazz applications. In this way we remain consistent, and we leverage the flexibility of virtual technology to minimize changes to the logical architecture of our Jazz solution.
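The decision rule in that example can be sketched as follows: resize the building block only when the bottleneck shows up in every instance, and then resize all of them together. The 0.85 utilization threshold and the function name are hypothetical; the 8GB starting size and the 4GB step (8GB to 12GB) come from the example above. How you collect heap utilization is up to your monitoring tooling and is not shown.

```python
def recommend_block_memory_gb(heap_utilization: dict,
                              current_gb: int = 8,
                              step_gb: int = 4,
                              threshold: float = 0.85) -> int:
    """Return the memory size to apply to EVERY building block.

    heap_utilization maps an instance name to its sustained JVM heap
    utilization as a fraction between 0 and 1.

    ASSUMPTION: the 0.85 threshold is an illustrative value, not a
    documented Jazz or WebSphere limit.
    """
    if not heap_utilization:
        raise ValueError("heap_utilization must not be empty")
    # Only a bottleneck seen in all instances justifies changing the block.
    if all(value >= threshold for value in heap_utilization.values()):
        return current_gb + step_gb
    return current_gb


if __name__ == "__main__":
    observed = {"jazz-rtc": 0.91, "jazz-rqm": 0.88, "jazz-rrc": 0.90}
    print(recommend_block_memory_gb(observed))
```

If only one instance runs hot, the function leaves the block size alone; that case is more likely a sign that the one application needs another instance.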
Deploying change to an organization is never easy. It involves having people change their habits, and move from what they know (their current tools and processes) to something that they don’t know. This is scary for many people, and many people fear the unknown. The best way to combat this fear of the unknown is education. Let your end-users know WHAT is going on. Keep your stakeholders informed of what is happening, and WHY it is happening. Most of what I have here deals with the PHYSICAL infrastructure of a Jazz deployment, but don’t forget to address the MENTAL infrastructure that needs to change. More on that in my next blog post.