Thursday, April 4, 2013

A Room Full of Errors


At one point in my career I was involved in the merger of two organizations.  Both were roughly equal in size, and I was working for the CIO of the acquiring company to evaluate systems across the two organizations and decide which systems would be used going forward.

Let’s back up a ways and find out how we got into the mess - literally a room full of errors.


During the acquisition, the CEO of the acquired company let it be known that he had hired an outside consulting firm to create a new order entry/contract system that was beginning to be implemented within the company.  It was a new thick-client application, in contrast to the application we were using internally, which ran on an IBM AS/400.  The argument being made was to ditch the green screens and use the sleek-looking new interface of the new system.


So the acquisition is complete and we begin to send individuals down to review the systems, gain insight into where they are in the migration process from their old system to the new system, and evaluate whether the new system might support the merged organization.  These trips occurred twice a week, and sometimes I stayed on-site for 3 or 4 days at a stretch.

It took a while, but over time I began to hear rumblings that all was not well in the world.  Bugs that had disappeared would suddenly reappear, errors were being encountered in the migration process, contracts could not be generated, and staff were having to manually key data back into the new system that everyone thought had already been migrated.

The developers on staff also began to feel comfortable with me and started asking if they could be put in charge of the contractors who were driving the development of the new system.  They began to pull back the curtain and show me the bug reports being sent to the contractors and how the same bugs would recur on a regular basis.  They showed me the system and how data would disappear.  Not a good situation by any means.

The kicker finally came one week as we were preparing for a team meeting to discuss the status of the development work and the migrations currently being scheduled.  Immediately before that meeting I was sitting with some of the production folks, and they said I needed to see something.  We walked through the building and arrived at an office - the door was closed and the lights were off.  As they opened the door, I could see boxes stuffed with paper lining the walls of the room.  I was curious and asked what I was looking at.  These, they indicated, were all the errors from all the migrations that had already occurred.  My jaw hit the floor!  To say that I was stunned would be an understatement.  I am not kidding when I say that the walls were lined.  The boxes were stacked in piles over 6’ in height and wrapped around the room.

I quickly found my boss - the CIO - and let her in on what was happening.  Needless to say, our direction changed.  In that meeting, we pulled the plug on the new system.  The decision was made to roll the migrations back to the old system so that their sales teams could get reliable information.

Yikes, a lot went wrong here ... in hindsight, we found out a lot of things:


  1. IT had no management oversight of the contracting company.  They reported directly to the acquired company’s CEO.
  2. The contract development team was not using any type of source control system - bugs were either reappearing because developers were not validating who had what code, or changes were intentionally being made to reintroduce them.
  3. The test process used by the contract development team was woefully lacking - from what I was able to determine, there were no formalized test plans anywhere, not even simple regression tests pinned to previously fixed bugs (see the small sketch after this list).  Each developer was responsible for writing the code, testing the code, and moving it to production.
  4. Migration issues were largely ignored by the contracting company - in their eyes, those issues were not a problem with the system and therefore not theirs to fix.
  5. User issues were ignored by the contracting company - the CEO was satisfied with the system, so they didn’t feel any pressure to make corrections.
  6. The requirements documentation built by the contract company was mostly fluff - very little concrete documentation identifying what was and was not included.  The contractors were mostly flying by the seat of their pants: show me the screens you currently use, show me the reports you currently get, and we’ll do the rest.
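
For illustration only - nothing like this existed on that project - here is the kind of cheap regression test that keeps a fixed bug from silently coming back.  The contract-numbering function and the bug it guards against are made-up examples, not the actual system; the point is the pattern that formalized test plans plus source control would have enforced.

```python
# A minimal sketch of a regression test pinned to a previously fixed bug.
# The function and the "bug" are hypothetical examples for illustration only.

def format_contract_number(customer_id: int, sequence: int) -> str:
    """Build a contract number as CUSTOMER-SEQUENCE, zero-padded (made-up rule)."""
    return f"{customer_id:06d}-{sequence:04d}"

def test_contract_number_keeps_leading_zeros():
    # Hypothetical bug: leading zeros were once dropped after a bad merge.
    # With this test in the suite, reintroducing the old code fails the build.
    assert format_contract_number(42, 7) == "000042-0007"

if __name__ == "__main__":
    test_contract_number_keeps_leading_zeros()
    print("regression test passed")
```

Run it under pytest or as a plain script - either way, once a bug is fixed its test stays in the suite forever, so it cannot quietly reappear the way these did.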

First, I would like to put this on the record - if you’re generating enough errors to fill up an office, there is something seriously wrong with what you’re doing!  Period - not even up for conversation.

During the actual negotiations, we had very limited ability to confirm what was being said.  However, once the sale was complete and I was able to get on site, I should have been more aggressive in my review of the migration activity and the actual use of the system.  It ended up taking me several weeks to figure out what should have been noticed within the first few trips.

If memory serves me right, it took almost $1M to clean up the mess and get all of the data back into the original system.  Those were real dollars - not some estimate of lost productivity.  Ultimately, that part of the organization became a very successful part of the larger organization.  We moved the contractors out, hired staff to build up the internal development team, and put the few local resources they did have into leading the new hires.

Now, I’ll circle back.  Look, if you’re getting test results that tell you something is wrong, and you proceed anyway, you’re a fool.  If you’re ignoring the results, plugging away, and propagating errors into your production environment, and you think that’s OK, you’re in the wrong job.  There were many warnings in this project - but nobody listened to the people on the front line who were finding all of the issues.  If I didn’t understand it before I got involved in this situation, it became crystal clear to me afterward: if you want to be able to claim success, you need to ensure that the end users of the system are happy with the results.  It really doesn’t matter how clever a developer you are - if you’re not testing the system and ensuring that the users can do their jobs, then you’re not doing yours.

What lessons have you learned along the way that gave you better insight into the need for true testing within your lifecycle?

Tags: #SDLC #softwaredevelopment #metrics #lifecycle #qa #qualityassurance


If you'd like more information on my background: LinkedIn Profile

1 comment:

  1. Lessons are only lessons if instituted into a learned process the next time around. Essentially, proactive rather than reactive is our mantra. Forward thinking of potential hazard. So you are right on, test and retest and not by the same person. Sometimes our thoughts get hung up on what we believe to be correct when it truly is not. Thank you for sharing!
