Lately, your solution has been teasing you with one of those 'catch me if you can' conflicts. It's an intricate solution: a highly available database tier; a middle tier made up of many different hosts, possibly with a virtualisation layer; and a quota allocated from the multi-tenanted corporate storage network and SAN. Plus, it all communicates across the corporate network. And that's not even the scary bit: your organisation's reputation relies on it being available all the time.
About four weeks ago, some of your customers reported that their screens just froze. The Help Desk phones rang non-stop, and after about 15 minutes everything returned to normal by itself. You thought that was the last you'd hear of it. Wrong! Dead wrong!!
What conflict resolution hell looks like
Since that day, the whole scenario has replayed itself every single day, as if playing 'catch me if you can'. It pops up at a different time each day, and you can't find any pattern to the daily occurrence. You've spent the last four weeks debugging and isolating where the issue might be, and you've logged support calls with every component vendor, from the hardware, to the storage infrastructure, to the network infrastructure. Finally, after countless cups of coffee and many sleepless nights, you find that a unique combination of two particular infrastructure components at your installation, running without a specific configuration setting, triggers the issue under high load. The combination of firmware versions involved was so rare that finding any information about the conflict was hard enough. To top it all off, the load was being generated by another application you didn't even know existed, let alone knew was sharing the same corporate infrastructure.
You find that the fix is a firmware upgrade to one of the affected components, which enables the required configuration setting. So now you have to confirm with each of the other, unaffected component vendors whether their products are certified against the firmware version you are upgrading to. All the while, the system freezes continue unabated, every single day!
Sound familiar? Then you've got to read the rest of this post.
How to avoid conflict resolution hell
Enter Oracle's Engineered Systems. Specifically, the Oracle Exadata Database Machine and the Oracle Exalogic Elastic Cloud Machine. These two Engineered Systems have been around for a while now, and their capability and functionality have improved over several generations of hardware refreshes. They come in different configurations, so you can choose one that meets your current and projected scalability requirements.
For a start, combining Oracle software components like Real Application Clusters (RAC) at the database layer and Oracle Virtual Machine (OVM) at the application layer, with redundant hardware components at every single layer, gives you very high availability. Combine that with self-contained, optimised storage within the respective machines (reminder: no more sharing!) and built-in InfiniBand connectivity for intra-machine and storage network traffic (yes, extreme InfiniBand!), and you get extreme throughput.
Now, when you do have an issue (yes, they still happen from time to time; this is the real world!), you have only a single vendor to log a support call with. The difference is that your combination of infrastructure firmware versions is no longer unique to your installation. There is a very good chance that another installation has already hit the same issue and a patch is already available. That patch would most likely be part of a bundle patch, where Oracle has done the hard work for you: determining which other infrastructure components are impacted and bundling the patches they require into a single patch.
Oracle Engineered Systems are unique in their own right and disrupt the status quo of the traditional IT department, from their optimised networking infrastructure to the new job roles they are likely to create - the Engineered Systems Administration Team, if we were to coin a term. A successful implementation requires planning and thought long before the Engineered Systems are delivered and powered up. Get this right and you have the infrastructure ready for use in days.
One thing is clear: they are a very real answer to conflict resolution hell.
In my next post, I will go into detail on how to plan your networking in advance of setting up an Oracle Exadata Database Machine or an Oracle Exalogic Elastic Cloud Machine.
From my corner in the office - Kishan