Archive for the ‘Feature Articles’ Category

Immutable Invoices

Tuesday, November 24th, 2009

Back in the late-1990s, the Internet Service Provider where Simon C. worked was a mere micro-sized version of what they are today. Their website's original e-commerce system only needed to sell one thing — domain names, and a limited subset of them at that — so the shopping basket and invoicing parts of the system didn't need to be all that intelligent. They simply looped through each item ordered by the customer, displayed the description and prices of each one, and worked out the totals at the end. The whole process was so simple in fact that it made sense to the original developer to write the system so that the shopping cart and invoicing pages shared the same code.

Over time, the ISP grew in size to sell additional products such as new domain types and packages with a multitude of sub-products. Also, as the system grew in size, the site began running slower and slower. This gave Simon a reason to look into ways to improve the efficiency of the shopping basket and invoicing parts of the system.

However, the more familiar Simon got with the code, the harder it was for him to understand why it was allowed to remain in place for over ten years.

Elseif()s EVERYWHERE!

They say it takes an entire village to raise a child, but Simon discovered that it took over 16,000 lines of code to process the sale or print an invoice.

How could something like this be? The answer was surprisingly simple. As the company grew, it was simple to add new products. Just add a record to the master product table in the database and then add a new "elseif()" block to the shared code. Same thing went for discounts - toss in another elseif(). T-Shirts? Another elseif()... and so on, until the code was almost entirely made up of hundreds of elseif() blocks, each of which would need to be evaluated whenever a customer added or removed an item from their shopping cart. A side effect of this design was that if someone wanted to change a product's description, for example, they couldn't just edit the database record; they would have to create a new product code and leave the old one in place forever.

Simon figured, My work's cut out for me - it should be easy to improve performance! Just trim out the elseif() blocks for the old items and discounts and I'm done! Ordinarily, Simon's reasoning would be absolutely correct. However, there was a small snag in this plan, in the form of several hundred dire warnings.

¡NO TOCAR!

In looking for candidate code to trim, Simon found that most elseif() blocks were headed by warnings such as DO NOT REMOVE and DON'T TOUCH UNDER PENALTY OF DEATH, just like the following:

// xmas offer 20071206 - 20080103 DO NOT REMOVE, EVER!!
if($rowval['orderitemref'] > 10295755 &&  $rowval['orderitemref'] < 10373106 && (
    (substr($rowval['domainname'], -3, 3) == "com") ||
    (substr($rowval['domainname'], -3, 3) == "net") ||
    (substr($rowval['domainname'], -3, 3) == "org") ||
    (substr($rowval['domainname'], -3, 3) == "biz"))) {
       $thestr.='<font face="arial,verdana,helvetica" '
         .'size="-2" color="#ff0000"> at special offer price.'
         .'<br>Renewal at standard price</font><br>';
}

The reason for this was also simple - as new items, price changes and special offers had come and gone over the years, it had exposed the problem of how the system handles historical invoicing. You see, the important thing about invoices is that they must be immutable. If a customer called up to argue a charge from six months ago, customer service had better be able to pull up that same invoice and it had better match what the customer saw at the time.
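For illustration only, here is how a rule like the Christmas offer quoted above might be held as data rather than as one more elseif(). This is a sketch in Python, not the ISP's PHP, and the OfferRule record and offer_notes() helper are hypothetical names that are not part of Simon's system. The reference range and domain endings are copied from the quoted block; the note is simplified to plain text. Because old rules simply stay in the table, historical order lines would still match them without the code growing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfferRule:
    """One historical offer, stored as data instead of an elseif() block."""
    label: str
    first_ref: int    # lowest orderitemref covered (inclusive)
    last_ref: int     # highest orderitemref covered (inclusive)
    tlds: frozenset   # domain endings the offer applied to
    note: str         # text shown on the basket and invoice line


# Mirrors the quoted block: the original test is 10295755 < ref < 10373106,
# so the inclusive bounds sit one in from each end.
RULES = (
    OfferRule(
        label="xmas offer 20071206 - 20080103",
        first_ref=10295756,
        last_ref=10373105,
        tlds=frozenset({"com", "net", "org", "biz"}),
        note="at special offer price. Renewal at standard price",
    ),
)


def offer_notes(order_item_ref, domain_name, rules=RULES):
    """Return the note of every rule that matches one order line."""
    ending = domain_name[-3:]  # last three characters, like substr(..., -3, 3)
    return [
        rule.note
        for rule in rules
        if rule.first_ref <= order_item_ref <= rule.last_ref
        and ending in rule.tlds
    ]
```

A retired offer is then just an old row that is never deleted; adding a new promotion means appending to RULES rather than editing shared code.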

What Simon needed was a brilliant plan.

The Brilliant Plan

Simon figured he had two options in trying to bring down his goliath. The first option was to refactor the 16,000 lines into a system that would be able to regenerate ten years of invoices flawlessly, and then spend the rest of his life regression testing it while the number of elseif() blocks inevitably kept growing.

Alternatively, he could take the quick, dirty, and simple route: print out the historical invoices so that the CSRs could pull them up when needed. Well, not print on real paper, but rather generate batches of PDFs of every historical invoice. This would allow him to simply ditch the old code entirely.
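Simon's second option amounts to rendering each historical invoice once and from then on only reading the stored copy. A minimal sketch in Python, assuming the rendering itself is done elsewhere (by the legacy code) and handed over as bytes; archive_invoice(), load_invoice(), and the one-file-per-invoice layout are hypothetical:

```python
from pathlib import Path


def archive_invoice(invoice_id, rendered, archive_dir):
    """Store one rendered invoice; never overwrite an existing copy."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{invoice_id}.pdf"
    try:
        # "x" is exclusive-create mode: it fails if the file already exists.
        with open(target, "xb") as handle:
            handle.write(rendered)
    except FileExistsError:
        # An archived invoice is immutable, so the first rendering wins.
        pass
    return target


def load_invoice(invoice_id, archive_dir):
    """Return the stored bytes exactly as they were first archived."""
    return (Path(archive_dir) / f"{invoice_id}.pdf").read_bytes()
```

Because a second attempt to archive the same invoice leaves the first rendering untouched, what customer service pulls up later is what the customer saw at the time, no matter what happens to the product table or the shared code afterwards.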

Simon was immensely proud of this solution and presented the idea to management. "Why would we want to change," his boss rhetorically asked, "what we have works already and has been working consistently for the past ten years!"

Defeated, Simon went back to his maintenance work and added yet another elseif() block to represent yet another promo offer.


Introducing Bad Code Offsets

Wednesday, November 18th, 2009

I have never written a bad line of code.

When I tell people that, they often scoff and offer replies like “so you’re not a programmer then?” and “let me guess, you’re a coding deity or something?” Well let me say, I am a programmer and I am not Codethulu, but in the same manner that Al Gore can fly around the world in a private jet without polluting, I have negated my bad code footprint through the purchase of Bad Code Offsets.

This is all made possible through the Alliance for Code Excellence, a group for which I am proud to be the chairman. Its charter members include Jeff Atwood, Eric Sink, Jon Skeet, Jason Cohen, and several other software development community leaders who are just as passionate about quality code as I am. We stand strong with our vision:

We envision a world where software runs cleanly and correctly as it simplifies, enhances and enriches our day to day work and home lives. Mitigating the scope and negative impact of bad code on our jobs, our lives and our world is our all–consuming passion. We foresee a time when bad coding practices and their rotten fruits have been eliminated from this earth and its server farms thereby heralding a new age of software brilliance and efficacy.

Nettlesome bugs and poorly written code have been constant impediments towards realizing our full potential as programmers and engineers. Bad Code Offsets provides the vehicle for balancing the scales of poor past practice while freeing us to pursue current excellence in code development. Until the dawn of the worldwide, bug free code base, each of us can take steps towards reducing our bad code footprint and remediate the bad code that we have each individually and collectively left behind on the desktops, servers and mainframes at school, at work and at home.

While the notion of offsetting bad code instead of outright correcting it may seem like a "hack" to some, we believe it's a good approach for today's problems and today's codebase. The dollars you spend purchasing Bad Code Offsets are donated to various worthy Open Source initiatives that are carrying the fight against bad code on a daily basis. These organizations currently include jQuery, PostgreSQL, and The Apache Software Foundation.

Building a better tomorrow—one line of code at a time

Imagine a world without bugs. Not the creepy-crawly-soil-enriching bugs, more the bugs that plague all software, past and present. The bugs behind the inane error messages you see day-in and day-out. The bugs that cause multi-million dollar business disasters. The bugs responsible for (literally) crashing billion-dollar space exploration equipment.

By today’s coding standards, a world free of bugs is a far-fetched fantasy. Bugs are an ever-present part of code and about as likely as a semi-colon in a C++ program. But does it always have to be like this? Will there always have to be bad software?

While I do believe that, one day, given sufficient tools, knowledge, and experience, we will achieve the worldwide, bug-free codebase, there’s a preposterous amount of work and clean-up between here and there. Think of the swaths of bad code that we have left behind on the desktops, servers and mainframes at school, at work and at home. Add to that the bad code that is being churned out each day by unskilled colleagues and our own laziness, and we're left with a completely unwinnable battle.

That is, unless we try a radically different approach. And that's exactly what we're doing with Bad Code Offsets. It's our first, bold step towards universal code excellence.

Why Bad Code Offsets? Why Now?

Not many things in life allow us to atone for past mistakes. But by buying Bad Code Offsets, you can not only do that, but you can make up for other people's mistakes. Get them for your friends, for your peers, and of course for your code review sessions.

They're inexpensive (50¢ per SLOC) and come in a number of denominations. Plus, for a limited time, shipping is free. Buy as few (minimum of 3) or as many as you'd like.

Tonight Only: Code Offsets at the Conga Room in Los Angeles

If you're in the SoCal region this evening (Wednesday, November 18), make sure to stop by the world famous Conga Room. We'll be officially launching Bad Code Offsets at the Underground @ PDC 2009 event (free, but registration required) in Los Angeles. Pick up some Daily WTF stickers, Stack Overflow stickers, and of course, your very first Bad Code Offset.


The Standard Way

Tuesday, November 17th, 2009

Michael P. was feeling pretty tense – and really, who could blame him?

Today was no ordinary day. He was in the hot seat, presenting to the Software Advisory Committee - a multi-disciplinary group responsible for rubber stamping any and all new production application installations at MegaBank.

Much like being presented before a village's Council of Elders, if he received their blessing, he would no longer be considered among the ranks of MegaBank’s junior developers. Instead he would be shoulder-to-shoulder with the man developers in the company.

His word would have weight. People would come to him for advice. But all of this could be demolished if he did not get a thumbs-up from the Committee, which was headed by Michael’s manager, Greg.

"Why can't we give the user their password in the e-mail?" Greg asked, in the same tone of voice you might expect if someone had told you they couldn't eat their soup because the spoon was upside down.

"I go on websites all the time and they always tell me my password when I forget it!"

For a moment, Michael thought about asking "which websites?". He also considered burying his face in his palms, or perhaps banging his forehead against the desk repeatedly - anything to block out the stupidity the development manager was spouting.

However, if Michael had even the remotest of hopes in getting his first new application installed, he was going to have to play this one cool.

Life at MegaBank has no shortage of WTFs: there is a global variable used in three different ways (depending on what part of the code is running) within the same module, one method is called by sending the sequence of keyboard presses that select and then press a button, and then of course there are the never-ending chains of if-statements nested so deeply you'd need a grappling hook to climb out.

However, none of these could touch the WTFness of the Software Advisory Committee meetings. To fully grasp what this is like, imagine you have two bosses who work in alternating schedules, and each despises the decisions that the other makes. Now imagine those two bosses are the same person – this was the committee of Greg.

In these meetings, Greg would incredulously ask why the developers would dare even suggest that they might try to implement the features he had told them to implement several weeks ago, and Michael’s session with the committee wasn’t going much better.

“The only thing that sending a user a gobbledygook URL instead of their actual password accomplishes is to prove that we are a company of sloppy and inexperienced coders. In the end, you’ll be turning customers away from us, balking at our utter lack of professionalism!” Greg explained.

Michael tried to make a case by saying that the email that went out did not show a full URL; rather, it was safely behind a link labeled “Click Here.” Also, the process of picking out a new password was actually quite friendly. However, there was no chance to begin reasoning.

Greg lowered the brim of his glasses further down his nose for emphasis. “The standard way of recovering passwords is to send a user their password, not to hide it behind a link! Here, let me show you how Google does things – pay attention and please, keep an open mind.”

While Greg surfed onto Gmail to go about resetting his password, Michael could barely watch – the embarrassment was too great. He thought that he had considered every angle, and yet here he was, about to be proven a fool. However, he was saved at the last minute.

In the email that Greg had received, Gmail sent a link enabling him to set a new password.
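The link-based flow Michael was defending can be sketched roughly as follows. This is a Python illustration, not MegaBank's code: the in-memory store, the one-hour lifetime, and the function names are all assumptions. What it shows is that the email carries a random single-use token, so the stored password never has to be readable or sent anywhere.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed lifetime; the story gives no figure

# Hypothetical in-memory store: token hash -> (username, expiry timestamp)
_pending = {}


def issue_reset_link(username, base_url, now=None):
    """Create a one-time link; only a hash of the token is stored."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # URL-safe random text
    digest = hashlib.sha256(token.encode()).hexdigest()
    _pending[digest] = (username, now + RESET_TTL_SECONDS)
    return f"{base_url}?token={token}"


def redeem_token(token, now=None):
    """Return the username if the token is valid and unused, else None."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode()).hexdigest()
    entry = _pending.pop(digest, None)  # pop makes the token single use
    if entry is None:
        return None
    username, expires_at = entry
    return username if now <= expires_at else None
```

The address returned by issue_reset_link() is what would sit behind the “Click Here” link; once redeem_token() succeeds, the user is allowed to choose a new password.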


Classic WTF: Don’t Worry, We’ll Fix It!

Tuesday, November 10th, 2009

I'm at the Business of Software conference in San Francisco this week and thought it'd be the perfect opportunity to revisit a classic. Don't Worry, We'll Fix It! was originally published on November 28, 2006.


We're in a bit of a jam, an email to the support desk read, we accidentally ran an entire day's worth of transactions for 11 Oct 2009 instead of 11 Oct 2006. Can you fix this?

In the world of retail, it's not an uncommon practice to "open" for a business date that is not the current date. Think of 24-hour stores that want to "close" the day at 11:00 PM instead of midnight, or the cases when the registers are out of commission. Whatever the reason, it's a feature that customers want and a feature that T. Ferguson's company provides in their point-of-sale systems.

Obviously, there's no way for the software to know if a different date is purposeful or accidental; all it can do is default the "open" date to the current date and hope that someone would notice a mistake on the registers, receipts, etc. before the day was "closed" out. The support email was the first "problem" that T.'s company had with this feature since first offering it fifteen years ago.

Despite having a nation-wide chain of stores, with each bringing in nearly $500,000/day in sales, this company decided not to go for the extended-hours support contract. With no one to call at 9:30 PM for support, the shift manager ignored the incorrect date and "closed out" the store's point-of-sale system. He left a note for the general manager, who promptly emailed support the next morning.

The general manager also called the support line at 9:01 AM -- just after it opened -- to make sure they got the email. He was very concerned that the error would gravely impact their October reports, forecasting reports, inventory, and just about anything else that relied on that day's transactional data. The support rep assured the general manager that the development team was working on a way to fix the issue.

From a programming perspective, this was actually an easy thing to fix. All of the daily transactions are stored in a single database table, so a simple UPDATE script and a "re-close" should do the trick. They reproduced the "problem" on a test machine, ran the fix script, and watched it work like a charm. T. called up the store to let the manager know how they planned to resolve the issue.

"But," the manager asked, "what about when someone makes a return? Their printed receipt will have a different transaction date. Won't the register refuse the return?"

"Nope," T. replied, "we only use the store number, register number, and transaction number when we validate the receipts for returns."

"Sounds great," the manager said in a much less stressful tone, "what a relief! I was really worried about how bad this would be."

The script was sent to a technician to apply on site. Before running it, he noticed one thing that the development team had missed: not only was there only one day of faulty data in the database, there was only one day. Period. All the transactional history was gone!

That, of course, would present a problem when trying to process a return. Or receiving merchandise that was ordered in the past. Or verifying an employee's time clock punches. Or tracking special-ordered items. Or knowing whether the store is on pace to meet its weekly sales goal. Or just about any activity of any consequence in retail that ISN'T selling merchandise.

The technician reported this back to the development team. After a bit of digging, they figured out why only the one day of data was left: part of the register closing code purges data that's over three years old. And how does one find three-year old data when the system clock is not a reliable indicator? Why, by taking the business date of the newest transaction and subtracting three years, of course!
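The purge logic, as the developers described it, can be reproduced in a few lines. This is a sketch in Python with hypothetical function names; whether the real cutoff was inclusive is not stated in the story, so the comparison below is an assumption:

```python
from datetime import date


def purge_cutoff(business_dates, years=3):
    """Cutoff as described: newest business date minus three years."""
    newest = max(business_dates)
    try:
        return newest.replace(year=newest.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return newest.replace(year=newest.year - years, day=28)


def purge(business_dates, years=3):
    """Keep only the business dates on or after the cutoff."""
    cutoff = purge_cutoff(business_dates, years)
    return [d for d in business_dates if d >= cutoff]


# Genuine history up to the day before the mistake...
history = [date(2004, 5, 1), date(2006, 10, 9), date(2006, 10, 10)]
# ...plus the day that was opened as 11 Oct 2009 instead of 11 Oct 2006.
after_mistake = history + [date(2009, 10, 11)]

print(purge(history) == history)     # nothing is over three years old
print(purge_cutoff(after_mistake))   # the cutoff jumps to 2006-10-11
print(purge(after_mistake))          # only the mistaken day survives
```

With a normal history ending on 10 Oct 2006, nothing is removed. Add one business date of 11 Oct 2009 and the cutoff becomes 11 Oct 2006, which is later than every genuine transaction, so the close-out deletes all of them.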

The manager's day was about to get much, much worse.
 


Classic WTF: Keepin’ It Cool

Thursday, November 5th, 2009
Keepin' It Cool was originally published on October 4, 2006.

A few years ago, Phil was working as a developer on a wire transfer application at a large bank. To make sure that nothing technical would prevent the bank from extracting maximum amounts of money from its operations, every part of their system had a redundancy with fast failovers and clustering. In fact, there was even one server (and a backup of that server) whose only function was to monitor the other servers and send notifications if anything fell outside the operational norm.

When a system or process failed, the monitoring server would page the on-call support administrator, who would then log in and restore the errant system to its rightful state. On rare occasions, an actual visit to the server room was required.

One summer day, at about two in the morning, the on-call administrator (Mark) was awakened with a Critical Notification Alert that a couple of core processes -- such as the ACH Batch Script and General Ledger Job -- had crashed. Moments later, he was paged again, this time with notification that the fail-over processes had failed.

As Mark attempted to log in to the process control cluster, he received another Critical Notification Alert. And then another. And then several more. The remote access server wasn't responding to his log-in attempts, so Mark got dressed and headed downtown.

About halfway to work, his pager stopped receiving alerts altogether. That was a pretty bad sign, so he dialed into the office to access the company directory and get the numbers of secondary on-call administrators. But the phone lines were dead, too, which could only mean one thing: a bomb, a fire, or a giant robot wreaking havoc throughout the city.

When Mark arrived at work, things were very quiet. He nodded at the security guard and took the elevator to the server room. Mark approached the server room and saw that the automatic secure doors were propped open. As he entered the room, Mark felt a blast of heat and noticed two maintenance employees, both sweating profusely, working on the air conditioner units. This seemed a little strange since Mark should have been the one to call the maintenance crew out there, so he asked how they knew that the air conditioner failed.

HVAC Guy: It didn't fail. We're just changing the chiller bars and doing some other preventive maintenance.
Mark: But we've got two air conditioners in here, why are they both down?
HVAC Guy: We figured it'd be easier to do them both at the same time. Why, is there a problem with that?

Mark and his team of network administrators spent the next 36 hours or so rebuilding and restoring each server, its backup, and its backup's backup. And as for Phil, the developer who submitted this story, he got the day off.