Take the Initiative, Not the Blame
Let me ask you a question: How often do you go to the doctor? Do you have an annual checkup, or do you only go when something is wrong?
We are all encouraged to be proactive in maintaining our health. Exercising, eating better and seeing your doctors annually are the keys to a longer, more fulfilling life. If this is what it takes for us to live longer, more active lives, then why aren't we doing this for our IT systems?
I am a consultant, so in many cases, I'm kinda like "Joe the bartender". I hear stories of heartache and pain quite a bit. "My environment is slow", "My backups are failing", and a host of others are phrases I hear a lot. I'm not complaining, but I think we could all potentially avoid many of these "trips to the ER" if we started giving our IT infrastructure the same type of preventative medicine we prescribe to ourselves.
Routine "checkups" are a great way to get a fresh set of eyes on your environment to see whether there are areas with room for improvement, or issues coming down the road. These checkups, or assessments, can typically be done in smaller pieces, thereby minimizing the impact on your customers/end users.
Let me describe a few that would help pre-emptively identify the majority of the problems I've seen lately:
- The Storage Assessment: This simple assessment can look at how your storage network and arrays are configured and performing. Many times, storage has to be added in increments over time, as projects require. If these additions aren't implemented correctly, they could have a negative impact on the entire storage array or network. Improperly spreading I/O loads across storage processors or disk pools can have severe consequences.
- The Virtualization Assessment: Virtualization is the hottest ticket in IT today. Why? Because it allows you to more fully utilize your hardware resources (a.k.a. more performance per dollar spent). Not all applications are created equal, so one must take EXTREME care when adding applications to a VM farm. Consideration for how an application's load profile will affect others in a given cluster is sometimes overlooked. Also, keeping your host servers identically configured and updated is a must to reduce issues that can be caused by an application moving from one host to another.
- The Networking Assessment: How many times have we all heard someone say, "There's a problem with application XYZ running slowly," only to find that, when we investigate, the issue is gone without a trace? Might be gremlins, but it's more likely a networking problem. Ethernet is the effective equivalent of everyone shouting at the same time. Only the one who shouts when no one else is shouting gets heard. If one of the people shouting is standing further away, they might not get heard as often, since it takes longer for their sound to arrive. OK, so it's a crude analogy, but if you have a network that has routing issues, everything may still work fine until a certain circumstance occurs (a sudden spike in communication to ONE server several hops away, for example), then BAM! You run into a problem. IT then goes looking for the problem, but the circumstance that created the blip is gone, and you're left wondering if someone in the user community is suffering from demonic possession. Having a third party evaluate routing and networking topologies can sometimes shed light on something overlooked by the folks closest to it.
- The Backup Assessment: How many people aren't experiencing periodic issues with their backups? It's always a top hitter for companies. Tapes fail, updates break things, communications get interrupted... stuff happens. When was the last time you assessed your backups against your strategic plans? How are you testing your backups for validity? This is most often the piece that is neglected in IT since, well, it's been working for years. This is your organization's safety net. Don't forget to keep checking it to make sure it will be able to save you should the unthinkable happen.
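To make the "How are you testing your backups for validity?" question a little more concrete, here is a minimal sketch of one kind of check: restore a backup to a scratch location, then compare checksums against the original files. Everything here is an assumption for illustration. The function names `sha256_of` and `compare_restore` are made up, the directory layout is hypothetical, and a real assessment would lean on whatever verification your backup product already provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(source_dir: Path, restored_dir: Path) -> dict[str, list[str]]:
    """Compare each file under source_dir with its restored counterpart.

    Returns relative paths grouped as 'missing' (never restored) or
    'mismatched' (restored, but the contents differ).
    """
    report: dict[str, list[str]] = {"missing": [], "mismatched": []}
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            report["missing"].append(relative.as_posix())
        elif sha256_of(source_file) != sha256_of(restored_file):
            report["mismatched"].append(relative.as_posix())
    return report
```

You would call it with something like `compare_restore(Path("/data/finance"), Path("/scratch/restore/finance"))` (both paths hypothetical). Keep in mind that files changed since the backup ran will legitimately show up as mismatched, so a check like this works best against a data set you know has been static.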
I could go on and on, but I think you get the picture. Start thinking of ways to incorporate periodic checks of key components into your annual operational strategies now. These may add to the annual expense, but can save you BIG TIME by identifying potential problems BEFORE they become problems that are expensive to fix.
Leonard Bernstein once said, "Time is the cruelest teacher; first she gives the test, then teaches the lesson." I like to think that, if we test ourselves first, the lessons may be less painful to learn.
Clay Sides is a Senior Technical Consultant with Park Place International who specializes in helping customers home in on their true objectives to ensure the biggest return on their investment. Clay gets satisfaction from working with customers to discuss their needs and provide consulting services that best address those needs. His specialties include identifying performance bottlenecks and assisting with building and implementing strategic plans for growth, sustainability and recoverability. Clay has been working in Information Systems since 1989, having spent time with United Technologies, Texas Instruments, Acer America, Palm Computing, Gateway and, most recently, 5 years with JJWild/Perot/Dell. When asked, Clay says his favorite part of his current endeavor is seeing the moment of realization in a customer’s eyes when he’s able to communicate a complicated plan, or concept, in terms they understand: “It’s a rush when you actually see you’ve been able to help someone understand something. Their stress seems to almost float away at the moment of realization.”