On Budgets

What function do budgets serve?

In a case where there are well-known basic requirements and highly variable solutions, a budget is essential to curb the tendency to put wants ahead of needs. Take, for example, an automobile purchase. All you need is four wheels. Maybe you need four doors if you have two kids, or a minivan if you have three, but you certainly don’t need a sunroof, or a hemi, or a self-parking car. These are luxuries. Maybe you can afford to splurge a bit, but the budget knows there are other needs that take priority, like food, mortgage payments, and utilities.

When we try to take this concept of a budget to the workplace, it breaks down. In the workplace, budgets serve one of two purposes: (1) flagging corruption, and (2) investment payback.

Purpose 1: Flagging Corruption

When we know about how much money it should take to do a job, we give the responsible person a budget. “Marty, we’ve been attending conferences in Las Vegas for 30 years. We know it costs about $X, so you have a budget of $X plus $Y for contingency.” As long as Marty stays within that budget, keeps his receipts, and avoids having escort-service charges show up on his company Amex, he’s good to go. If his expense report comes in at twice the expected amount, then he has to justify it.

When you know how much something should cost, this is an efficient model. It delegates spending to the individual and only requires minor oversight from the accounting department.

Note that the budget wasn’t set based on how much money was in the bank; it was based on historical data, and the company decided there was a payback to sending Marty to Las Vegas. That’s a fundamental difference between home and workplace budgets: at home we frequently buy stuff with no perceived payback, but at a company every expenditure is an investment. That’s why budgets at home and at work serve fundamentally different functions.

Purpose 2: Investment Paybacks

When you don’t have good historical data on a planned expenditure, accounting still needs an estimate. Estimates feed expected expenditures back to whoever is managing the cash flow. A budget is the present cost of the expected payback, with interest.

When a company looks at starting a new project, whether it’s done internally or subcontracted, first they get an estimate. Then they compare the estimate to the expected payback, and if it pays back within a certain time-frame (usually a couple of years, but sometimes as short as a few months), they go ahead with the project. However, since projects are, by definition, something you haven’t actually done before, everyone acknowledges that the estimate won’t necessarily be accurate. They will generally approve a budget with an added contingency. They’ve done the calculations to make sure that if the project comes in under that budget, the payback will make the company (and the investors) money.

As the project progresses, the project manager needs to track the actual expenditures against the expected expenditures, and provide updated estimates to accounting. If the estimate goes higher than the approved budget, management has to re-evaluate the viability of the project. They might increase the budget if the payback is good enough, or they might scrap the project.

Tying Incentives to Budgets

For home budgets, it’s implied: if you stay within your budget, you’ll make sure to satisfy all your needs, and you’re less likely to find yourself in financial hardship. You have an incentive to stay on budget.

At work, lots of people have their bonuses tied to “performance-to-budget”. If we’re talking about the first purpose of workplace budgets (flagging corruption), where historical data provides a solid prediction of what something should cost, then it makes sense to evaluate someone on their performance to that budget. On the other hand, if we accept that project budgets are based on rather inaccurate estimates, then measuring someone against a project budget isn’t very meaningful.

In fact, it leads to all kinds of behaviors we want to avoid. First, people tend to over-estimate projects because that raises the budget; this can prevent the company from pursuing projects that would have been profitable. Second, project managers tend to play a “numbers game” – using resources from one project that’s going well to pad another project that’s going over. This destroys the accuracy of your project reports; now you won’t know which initiatives really made money. Third, the cost is evaluated at the end of the project, but the payback continues over a longer time-frame. The project manager will tend to make choices that favor lower cost at the end of the project at the expense of choices that offer higher long-term payback.

Everything I’ve read suggests that (a) project managers have very little influence over the actual success or failure of a project and (b) in any task that requires creativity and out-of-the-box problem solving, offering a performance-incentive reduces performance because you’re replacing an intrinsic motivation with an extrinsic one. The intrinsic one is more powerful.

So why do we tie performance incentives to project budgets? Is there really any research that suggests this works, or are we just extrapolating what we know about purchasing a car, or taking a trip to Las Vegas? As someone who is intrinsically motivated, I’ve found budget performance-incentives to be extremely distracting. Surely I’m not the only one, am I?

The Payback on Automated Unit Tests

I’m a Test-Driven Development (TDD) convert now. All of the “business logic” (aka “domain logic”), and more than 95% of my framework logic is covered by automated unit tests because I write the test before I write the code, and I only write enough code to pass the failing test.

It’s really hard to find anyone talking about a measurable ROI for unit testing, but it does happen. One study found it took, on average, 16% longer to develop the initial application using TDD than it did using normal development methods. Another reported “management estimates” of 15% to 35% longer development times using test-driven development. Both studies reported a very significant reduction in defects. The implication is that the payback comes somewhere in the maintenance phase.

From personal experience I would say that as I gain experience with TDD, I get much faster at it. At the beginning it was probably doubling my development time, but now I’m closer to the estimates in the studies above. I’ve also shifted a bit from “whitebox testing” (where you test every little inner function) to more “blackbox testing”/”integration testing” where you test at a much higher level. I find that writing your tests at a higher level means you write fewer tests, and they’re more resilient to refactoring (when you change the design of your software later to accommodate new features).
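
To illustrate the difference, here’s a minimal sketch of a higher-level test in NUnit syntax (the Order class is hypothetical, defined inline just to make the example self-contained):

using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

// Hypothetical domain class, included only to make the example compile.
public class Order
{
    private readonly List<decimal> _lineTotals = new List<decimal>();

    public void AddLine(decimal unitPrice, int quantity)
    {
        _lineTotals.Add(unitPrice * quantity);
    }

    public decimal Total
    {
        get { return _lineTotals.Sum(); }
    }
}

[TestFixture]
public class OrderTests
{
    // Blackbox style: exercise the public API and assert on the outcome.
    // Because the test never touches Order's internals, you can refactor
    // how the total is computed without rewriting the test.
    [Test]
    public void Total_includes_all_line_items()
    {
        var order = new Order();
        order.AddLine(10.00m, 2); // $10 x 2
        order.AddLine(5.00m, 1);  // $5 x 1

        Assert.AreEqual(25.00m, order.Total);
    }
}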

A Long Term Investment

It’s hard to justify TDD because the extra investment of effort seems a lot more real and substantial than the rather flimsy value of quality. We can measure hours easily, but quality, not so much. That means we have a bias in our measurements.

Additionally, if and when TDD pays us back, it’s in the future. It’s probably not in this fiscal year. Just like procrastination, avoiding TDD pays off now. As humans, we’re wired to value immediate value over long term value. Sometimes that works against us.

A Theory of TDD ROI

I’m going to propose a model of how the ROI works in TDD. This is scientific, in that you’ll be able to make falsifiable predictions based on this model.

Start out with your software and draw out the major modules that are relatively separate from each other. Let’s say you’re starting with a simple CRUD application that just shows you data from a database and lets you Create, Read, Update, and Delete that data. Your modules might look like this:

  • Contact Management
  • Inventory Management

If you go and implement this using TDD vs. not using TDD, I suspect you’ll see a typical 15% to 35% increase in effort using the TDD methodology. That’s because the architecture is relatively flat and there’s minimal interaction. Contact Management and Inventory Management don’t have much to do with each other. Now let’s implement two more modules:

  • Orders
  • Purchasing

These two new modules are also relatively independent, but they both depend on the Contact Management and Inventory Management modules. That just added 4 dependency relationships. The software is getting more complex, and it’s getting harder to understand the effect of small changes. The newer modules can still be changed relatively safely because nothing much depends on them, but the first two can start to cause trouble.

Now let’s add a Permissions module. Obviously this is a “cross cutting” concern – everything depends on the permissions module. Since we had 4 existing modules, we’ve just added another 4 dependency relationships.

Ok, now we’ll add a Reporting module. It depends on the 4 original modules, and it also needs Permissions information, so we’ve added another 5 dependency relationships.

Are you keeping count? We’re at 13 relationships now with just 6 modules.

Now let’s say we have to add a function that will find all customers (Contact Management) who have a specific product on order (Orders) that came from some manufacturer (Purchasing and Contact Management) and a certain Lot # (Inventory) and print a report (Reporting). Obviously this will only be available to certain people (Permissions).

That means you have to touch all 6 modules to make this change. Perhaps while you’re messing around in the Inventory Management module you notice that the database structure isn’t going to support this new feature. Maybe you have a many-to-one relationship where you realize you really should have used a many-to-many relationship. You change the database schema, and you change the Inventory module, but instead of just re-testing that module, you now have to fully re-test all the modules that depend on it: Orders, Purchasing, and Reporting. It’s likely you made assumptions about that relationship in those modules. What if you need to change those? Does the effect cascade to all the modules in the software? Likely.

It doesn’t take long to get to the point where you need to do a 100% regression test of your entire application. How many new features potentially touch all modules? How long does it take to do a full regression test? That’s your payback.

You can measure the regression test time, and if you use a tool like NDepend you can measure and graph the dependencies of an existing application. Using your source control history, you can go back and determine how many different modules were touched by each new feature and bug fix since the beginning of time. You should be able to calculate:

  • How much time it takes to regression test each module
  • Probability of each module changing during an “average” change
  • The set of modules to regression test for any given module changing

Given that, you can figure out the average time to regression test the average change.
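
Here’s a sketch of that calculation (the module names, probabilities, and hours are invented for illustration; you’d fill them in from your own source control history and test plans):

using System;
using System.Collections.Generic;
using System.Linq;

class RegressionCostModel
{
    static void Main()
    {
        // Probability that the "average" change touches each module.
        var changeProbability = new Dictionary<string, double>
        {
            { "Contacts", 0.30 }, { "Inventory", 0.25 }, { "Orders", 0.20 },
            { "Purchasing", 0.10 }, { "Permissions", 0.05 }, { "Reporting", 0.10 },
        };

        // Hours to regression test each module plus everything that depends
        // on it (changing Contacts forces re-testing Orders, Purchasing,
        // and Reporting too, so its number is large).
        var regressionHours = new Dictionary<string, double>
        {
            { "Contacts", 16 }, { "Inventory", 16 }, { "Orders", 8 },
            { "Purchasing", 8 }, { "Permissions", 24 }, { "Reporting", 4 },
        };

        // Expected regression time for the average change is just the
        // probability-weighted sum.
        double expectedHours = changeProbability
            .Sum(kv => kv.Value * regressionHours[kv.Key]);

        Console.WriteLine("Expected regression hours per change: {0}", expectedHours);
    }
}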

Obviously, for TDD to pay off, the average regression test time must be longer than the 15% to 35% premium on the time it took to write the feature (assuming you keep following TDD practices during the maintenance phase). Whatever the regression test costs in excess of that premium is payback against the initial 15% to 35% extra you spent developing the application in the first place.

What kind of numbers are we talking about?

Let’s run some numbers. A lot of places say 30 to 50% of software development time is spent testing. Let’s assume 50% is for the apps with “very interconnected dependencies”. Also, let’s say our team spends an extra 33% premium to use a TDD methodology.

Now take a project that would originally take 6 months to develop and test; with TDD the development takes about 33% longer, so +2 months. The average change takes 3 days to code and test, or 4 days with TDD. Let’s say the full regression test on something that took 6 months to develop would be about 2 days (from personal experience, a 3-month project had a regression test plan that took about 1 day to run through).

Without TDD, a feature would take 3 days to write and test, and then 2 days to do a full regression test. Using TDD, it would take 4 days to write and test, but zero time to regression test, so you gain a day.

Therefore, since you had to invest an extra 2 months (40 days assuming one developer) in the first place to do TDD, you’d see a break-even once you were in the maintenance phase and had implemented about 40 changes, each taking 4 days, which means 160 days. That’s about 8 months. That ignores the fact that regression test time keeps increasing as you add more features.
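
If you want to plug in your own numbers, the break-even arithmetic above boils down to a few lines (every value here is one of the assumptions from this example, not measured data):

// Assumptions from the example above.
double upFrontPremiumDays = 40;    // the +33% TDD premium on a 6-month project
double daysPerChangeNoTdd = 3 + 2; // 3 days to code and test + 2 days regression
double daysPerChangeTdd = 4;       // 4 days to code and test, no regression pass

double savedPerChange = daysPerChangeNoTdd - daysPerChangeTdd;         // 1 day
double changesToBreakEven = upFrontPremiumDays / savedPerChange;       // 40 changes
double workingDaysToBreakEven = changesToBreakEven * daysPerChangeTdd; // 160 days, about 8 months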

Obviously your numbers will vary. The biggest factor is the ratio of regression test time vs. the time it takes you to implement a new feature. More dependencies means more regression testing.

Conclusion

If you have a very flat architecture with few dependencies, then the TDD payback is longer (if there even is a payback). On the other hand, if you have highly interdependent software (modules built on top of other modules), TDD pays you back quickly.

A Very Fast Tutorial on Open Source Licenses

I’ve written a bit of open source software lately, and I did a lot of learning about open source licenses. Unfortunately, after you learn a lot about a topic, you tend to subconsciously assume everyone knows what you do. In the interest of catching you all up, here’s a cheat sheet with some practical tips:

Note: I use the terms “proprietary” and “commercial” with specific meanings. “Commercial” means an application that isn’t open-source and you sell copies of it for money. “Proprietary” includes “commercial” but also includes software that is only used internally, never sold.

BSD or MIT/X11 Licenses

Sometimes called the “Academic” Licenses

Technically it’s the “revised” BSD license, but it’s been around so long that it’s just the BSD license now. It’s pretty much equivalent to the MIT/X11 license. These are considered to be the least restrictive, “do-anything-you-want” licenses, as long as you keep the original notice on the code, display the copyright notice in your program, and don’t imply the author endorses your product. Also, there’s a disclaimer saying the authors aren’t responsible for anything you do with the code. Standard stuff.

You can use this code in proprietary software, and you can use it in a GPL’d work (the common term for this is “GPL-compatible”).

Apache 2.0

The Apache 2.0 license is similar in use to the BSD license, but it adds a patent grant. That is, it specifically states that if the authors hold any patents that cover the code, you can still use the code without fear of a patent claim from them. Some might argue that BSD implies a patent grant, but it’s not specific, so Apache 2.0 makes it explicit.

You can use it in proprietary software, and it is GPL-compatible, but only with version 3 of the GPL.

Mozilla Public License 1.1 and the Common Development and Distribution License

a.k.a. MPL 1.1 and CDDL

CDDL was based on the MPL 1.1.

These are your “weak copyleft” licenses (“copyleft” meaning some part of the derived work also has to be released under the same license). You can use this code in a proprietary application, but if you make any changes to the code you included, you have to release those changes publicly (though not the code of your entire work). The CDDL defines the boundary at the source file: if you change one of the original files, you have to release that file back.

These are not compatible with the GPL.

MS-PL

Microsoft Public License

You’ll see this license a lot if you do much .NET coding. A lot of the stuff you find on MS’s open source site, CodePlex, is MS-PL licensed. The short of it is that you can use it in proprietary applications, but it’s not GPL-compatible (by design – MS hates Linux). It’s also not copyleft at all.

GPL

Your strong copyleft

The GPL is the one everyone loves to hate, but it’s also popular because the Linux kernel is released under the GPLv2 license, and many/most of the tools of the Linux community are GPL-based.

Unlike the previous licenses, if you take any code from a GPL’d program and include it in your own project, you’re making a “derived work” and you must agree to release your entire derived work under the same license (or a later version if the author specifically says you can). That means you can’t use GPL’d code in your commercial application (but you can use it in internal applications).

The reason some programmers are so annoyed by it is that they’re at work Googling for some code to solve their problem, they realize someone has written an open source library to do exactly what they want, and they get all excited. Then they check the license and their heart drops: it’s GPL, and they can’t use it. A small minority go as far as to send hate mail to the author. (As someone who has released some GPL’d code, I’ve received my share of this hate mail, and I find it very silly. I’m offering something for free, under certain conditions, and you are free to take it or leave it.) What most programmers don’t seem to understand is that if you email the author, they’d probably be willing to sell you a commercial license for the code so you could use it in your program.

LGPL

The weak copyleft version

The “L” originally stood for library, but now it stands for “lesser”. It kind of works like the CDDL, but it defines the boundary at the “library” rather than the source file. It basically says you can use this library, even in a commercial application, but if you make any changes to it, you must release your new version of the library under the same (or newer…) license.

It also adds a restriction that some people overlook – you must also provide the users of your application the ability to replace the LGPL’d library with a newer or modified version of that library. Usually this means providing the binaries and compilation instructions. Consider the difficulty of meeting this obligation if your derived work is firmware on an embedded device. Version 3 of the GPL and LGPL make it very clear that you must give your users all the tools needed to replace the software on the device. This was a reaction to the TiVo, which used GPL’d code and released the code publicly, but didn’t allow anyone to further modify the code and update their TiVos with it.

The other thing you have to worry about is copying code. If you copy any code from the library into your main project, then your project becomes a derived work, and you’re essentially forced to release your whole application under the LGPL. Programmers don’t worry about this too much, but legal departments do.

AGPL

Affero?

It turns out you can take GPL’d code, run it on a server as a web application, make all the changes you want, and never release your code, because you’re not “distributing” the derived work, and distribution is what triggers the GPL. (Google has its own version of Linux that it never had to release because it’s only used internally.)

This ticked off some people who were writing GPL’d blogging and other website-type software, so someone came up with the AGPL. It changes the triggering clause: if you use AGPL’d code in a website, you have to make your changes public.

Conclusion

Those are the major licenses you’ll run into. If you’re writing commercial software, you want to look for BSD, MIT/X11, Apache 2.0, MS-PL, MPL 1.1 or CDDL code. You can also use LGPL’d code, but watch out for the extra restrictions.

If the proprietary software you’re writing is only for internal use, or you’re writing it “for hire” for another company that will only use it internally, then you’re safe to use GPL’d or LGPL’d code because you won’t trigger the distribution clause. Just be sure that you make this clear to your management/customer before you go down this path. If they decide they want to sell the software later, they’ll have a mess to clean up.

If you’re writing open source code then you need to pick a license. A BSD license is the easiest, and it’s great for little utility libraries because anyone can use it. If you’re writing an application and you want to protect against some company taking the application, adding a bunch of new features that make it incompatible, and then releasing and charging for it without ever giving you anything, then you should choose the GPL (or AGPL if it’s a web application).


On the Pathetic State of Automation Security

To start with, even PC security is pretty bad. Most programmers don’t seem to know the basic concepts for securely handling passwords (as the recent Sony data breach shows us). At least there are some standards, like the Payment Card Industry Data Security Standard.

Unfortunately, if PC security is a leaky bucket, then automation system security is about as watertight as a pasta strainer. Here are some pretty standard problems you’re likely to find if you audit any small-to-medium-sized manufacturer (and most likely any municipal facility, like, perhaps, a water treatment plant):

  • Windows PCs without up-to-date virus protection
  • USB and CD-ROM (removable media) ports enabled
  • Windows PCs not set to auto-update
  • Remote access services like RDP or Webex always running
  • Automation PCs connected to the office network
  • Unsecured wireless access points attached to the network
  • Networking equipment like firewalls with the default password still set
  • PLCs on the office network, or even accessible from the outside!

All of these security issues have one thing in common: they’re done for convenience. It’s the same reason people don’t check the air in their tires or “forget” to change their engine oil. People are just really bad at doing things that are in their long term best interest.

Unfortunately, this is becoming an issue of national security. Some have said there’s a “cyber-cold-war” brewing. After the news about Stuxnet, it’s pretty clear the war has turned “hot”.

I’m usually not a fan of regulations and over-reaching standards, but the fact is the Japanese didn’t build earthquake-resistant buildings by individual choice. They did it because the building code required it. Likewise, I’ve seen a lot of resistance to the OSHA Machine Guarding standards because they impose a lot of extra work on control system designers, and on the companies buying automation, but I’m certain that we’re better off now that the standards are being implemented.

It’s time for an automation network security standard. Thankfully, there’s one under development by ISA99, the Industrial Automation and Control System Security Committee of ISA. A couple of sections of the new standard have already been published, but it’s not done yet. Also, you have to pay ISA a ransom to read it. I don’t think that’s the best way to get a standard out there and get people using it. I also think it’s moving very slowly. We all need to start improving security immediately, not after the committee gets around to meeting a few dozen more times.

I wonder if you could piece together a creative-commons licensed standard based on the general security knowledge already available on the internet…

Beating Procrastination

I recently downloaded The War of Art: Break Through the Blocks and Win Your Inner Creative Battles (by Steven Pressfield) for my Kindle and blasted through it. It was recommended by Seth Godin, among others, and I figured I’d give it a shot.

First of all, I couldn’t help but notice the irony: certainly reading a book about procrastination is, by definition, procrastination. Or is it just sharpening the saw? Well, I had just finished pushing out FluentDwelling, so a little procrastination before starting the next project is probably understandable.

Secondly, the book is well worth a read. If you’re an artist, writer, engineer, entrepreneur, or really anyone who does creative work on a daily basis, this book is like a shot of caffeine. He starts off by describing what he calls “the Resistance”, which he personifies as the ever-present enemy of all of us: the collection of invisible forces that keeps us from starting.

It made me do a little introspection about my own Resistance. I know that the key is to start working. The biggest thing that keeps me from sitting down and starting work is the anticipation of interruption. We already know the dangers of being interrupted while working, particularly if you need to get into “the zone” to be productive in your job. The problem is that the threat of interruption loomed so large that its anticipated cost outweighed the anticipated reward of getting work done. Interruption is a physical assault. It feels like being punched in the gut, and even though I can take a few of those in the course of a day, a constant pounding wears me down.

What was the consequence? Entire days lost to quadrant 1 (urgent and important) or, worse yet, quadrant 3 (urgent but not important) activities. When you work at a place where, if you wait 15 minutes, someone will call you with some “urgent” problem, it’s too demoralizing to start working on the bigger tasks (or the bigger tasks only get done after hours and on the weekend).

It’s only in the past few years that I’ve realized how hard you have to fight for the right to work, even in the workplace. Pressfield describes every day as a battle, and he’s right. The stakes are high. The enemy is relentless. Here are some of the tactics I’ve employed:

  • No instant messenger
  • Email notifications are turned off – I check email on my schedule
  • I work from a prioritized to-do list
  • I automate myself first – increase my own productivity, then help others

What are the worst productivity killers I’ve experienced?

  • Being asked more than once a day for a status report
  • Not having the tools to do your job
  • The Web – which is a nasty double-edged sword, because it’s a productivity multiplier

I don’t win this battle every day, but my win-loss record is improving. This book is one more salvo deep into the gut of the Resistance. I’ll leave you with this quote:

We were put here on earth to act as agents of the Infinite, to bring into existence that which is not yet, but which will be, through us.

Maybe that’s a prayer to say before sitting down at my desk in the morning. I look forward to doing battle tomorrow.

Insteon and X10 Home Automation from .NET

I’ve been playing with my new Smarthome 2413U PowerLinc Modem plus some Smarthome 2456S3 ApplianceLinc modules and two old X10 modules I had sitting around.

Insteon is a vast improvement over the X10 technology for home automation. X10 always had problems with messages getting “lost”, and it was really slow, sometimes taking up to a second for the light to actually turn on. Insteon really seems to live up to its name; the signals seem to get there immediately (to a human, anyway). Insteon also offers “dual-band” technology, meaning the signals are sent both over the electrical wiring of the house, and over a wireless network. On top of this, Insteon implements “mesh networking”, acknowledgements, and retries. The mesh networking means that even if two devices can’t directly communicate, if an intermediate device can see both, it will relay the signal.

Now, while Insteon seems to have improved leaps and bounds on the hardware, the software support is abysmal. That’s not because there’s anything wrong with the API, but because they’ve put the Software Development Kit (SDK) behind a hefty license fee, not to mention a rather evil license agreement. Basically, it would preclude you from using any of their examples or source code in an open source project. Plus, they only offer support if you’ve purchased their SDK.

So, I’ve decided to offer free technical support to anyone using a 2413U for non-commercial purposes. If you want help with this thing, by all means, email me, post comments at the end of this post, whatever. I’ll be glad to help you.

Let’s start by linking to all the useful information about Insteon that they haven’t completely wiped off the internet (yet):

Now how did I find all this information? Google. SmartHome (the Insteon people) don’t seem to provide links to any of this information from their home or (non-walled) support pages, but they either let Google crawl them, or other companies or organizations have posted them on their sites (I first found the modem developer’s guide on Aartech’s site, for instance). Once you get one document, they tend to make references to the titles of other documents, so you could start to Google for the other ones by title. Basically, it was a pain, but that’s how it was done.

Now, whether you buy the 2413S (serial) or 2413U (USB), they’re both using the 2412S internally, which is an RS232 device. The 2413U just includes an FTDI USB-to-Serial converter, and you can get the drivers for this for free (you want the VCP driver). It just ends up making the 2413U look like another COM port on your PC (in my case, COM4).

So, assuming you know how to open a serial port from .NET, and you got done reading all that documentation, you’d realize that if you wanted to turn on a light (say you had a switched lamp module at Insteon address “AA.BB.CC”), you’d want to send it this sequence of bytes (where 0x means hex):

  • 0x02 – start of message to PLM
  • 0x62 – send Insteon message over the network
  • 0xAA – high byte of Insteon ID
  • 0xBB – middle byte
  • 0xCC – low byte of Insteon ID
  • 0x0F – Flags (meaning: direct message, max hops)
  • 0x12 – Command byte 1 – means “turn on lighting device”
  • 0xFF – Command byte 2 – intensity level – full on

… after which the 2413U should respond with:

0x02, 0x62, 0xAA, 0xBB, 0xCC, 0x0F, 0x12, 0xFF, 0x06

… which is essentially just echoing back what it received, and adding a 0x06, which means “acknowledge”.
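
If you want to try that exchange from .NET, here’s a minimal sketch using System.IO.Ports.SerialPort. The port name is whatever your 2413U enumerated as (COM4 in my case), and the serial settings below are what I understand the PLM defaults to be – verify them against the modem developer’s guide:

using System;
using System.IO.Ports;

class TurnOnLight
{
    static void Main()
    {
        // Assumed settings: 19200 baud, 8 data bits, no parity, 1 stop bit.
        var port = new SerialPort("COM4", 19200, Parity.None, 8, StopBits.One);
        port.ReadTimeout = 2000; // ms; give the PLM time to answer
        port.Open();

        // "Send Insteon message: turn on device AA.BB.CC at full intensity."
        byte[] command = { 0x02, 0x62, 0xAA, 0xBB, 0xCC, 0x0F, 0x12, 0xFF };
        port.Write(command, 0, command.Length);

        // The PLM echoes the 8 bytes and appends 0x06 (ACK) or 0x15 (NAK).
        var echo = new byte[9];
        for (int i = 0; i < echo.Length; i++)
        {
            echo[i] = (byte)port.ReadByte();
        }
        Console.WriteLine(echo[8] == 0x06 ? "PLM accepted the command"
                                          : "PLM did not accept the command");

        // Next you'd keep reading to catch the 11-byte 0x02 0x50 ... reply
        // from the device itself, described below.
        port.Close();
    }
}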

At that point, the 2413U has started transmitting the message over the Insteon network, so now you have to wait for the device itself to reply (if it does… someone might have unplugged it, after all). If you do get a response, it will look like this:

  • 0x02 – start of message from 2413U
  • 0x50 – means received Insteon message
  • 0xAA – high byte of peer Insteon ID
  • 0xBB – middle byte
  • 0xCC – low byte of peer Insteon ID
  • 0x?? – high byte of your 2413U Insteon ID
  • 0x?? – middle byte of your 2413U Insteon ID
  • 0x?? – low byte of your 2413U Insteon ID
  • 0x20 – Flags – means Direct Message Acknowledgement
  • 0x12 – Command 1 echo’d back
  • 0xFF – Command 2 echo’d back

If you get all that back, you have one successful transaction. Your light should now be on! Whew, that’s a lot of overhead, though, and that’s just the code to turn on a light! There are dozens of other commands you can send and receive. I didn’t want to be bit-twiddling for hours on end, so I created a little helper library called FluentDwelling; now you can write code like this:

var plm = new Plm("COM4"); // manages the 2413U
DeviceBase device;
if(plm.TryConnectToDevice("AA.BB.CC", out device))
{
    // The library will return an instance of a 
    // SwitchedLightingControl because it connects 
    // to it and asks it what it is
    var light = device as SwitchedLightingControl;
    light.TurnOn();
}

I think that’s a little simpler. FluentDwelling is free to download, open-sourced under the GPLv3, and includes a full unit test suite.

It also supports the older X10 protocol, in case you have some of those lying around:

plm.Network.X10
    .House("A")
    .Unit(2)
    .Command(X10Command.On);

There are quite a few Insteon-compatible devices out there. In addition to lighting controls, there is a Sprinkler Controller, Discrete I/O Modules, a Rain Sensor, and even a Pool and Spa Controller. That’s just getting started!

Questions to Ask your Employer When Applying for an Automation Job

If you’re going to interview for a control systems job in a plant, they’ll ask you a lot of questions, but you should also have some questions for them. To me, these are the minimum questions you need to ask to determine if a future employer is worth pursuing:

  1. Do you have up-to-date electrical drawings in every electrical panel? – When the line is down, you don’t have time to go digging.
  2. Do you have a wireless network throughout the plant? – It should go without saying that having good, reliable wireless connectivity all over your facility really helps when you’re troubleshooting issues. Got a problem with a sensor? Just set up your laptop next to the sensor, go online, look at the logic, and flag the sensor. You don’t have time to walk all over.
  3. Does every PC (including on-machine PCs) have virus protection that updates automatically? – We’re living in a post Stuxnet world. Enough said.
  4. Have you separated the office network from the industrial network? – Protection and security are applied in layers. There’s no need for Jack and Jill in accounting to be pinging your PLCs.
  5. What is your backup (and restore) policy? – Any production-critical machine must always have up-to-date code stored in a known location (on a server or in a source control system), it must be backed up regularly, and you have to test your backups by doing regular restores.
  6. Are employees compensated for working extra hours? – Nothing raises a red flag about a company’s competency more than expecting 60+ hour weeks but not paying you overtime. It means they’re reactive, not proactive. It means they don’t value experience (experienced employees have families and can’t spend as much time at the office). It probably means they scored poorly in the previous questions.

You don’t have to find a company that gets perfect on this test, but if they miss more than one or two, that’s a warning sign. If they do well, they’re a proactive company, and proactive companies are sane places to work.

Good luck!

Upgrading a Legacy VB6 Program to .NET

There is a lot of code out there written in VB6, running just fine. If you’re someone who has to maintain it, then at some point you’ll ask yourself, “should we just bite the bullet and upgrade this to .NET?”

There is, so far, no end-of-life issue on the horizon. VB6 applications will run on Windows 7, and Microsoft has vowed to support the VB6 runtime through the life of Windows 7. That will be a while, so there’s no hurry.

First, you need to do a cost-benefit analysis to determine if it’s worth upgrading. That’s a pretty big task right there. What do you gain by moving to .NET? Certainly you gain a much richer ecosystem of utilities, libraries, persistence layers, test frameworks, etc. You’ll also find it easier to hire developers who have .NET on their resume. It’s pretty hard to find a copy of Visual Studio 6 these days, unless you have an MSDN subscription. .NET features like lambda expressions, LINQ, and reflection are also big productivity boosters if you spend the time to become proficient with them. These are all valid points, but they’re hard to measure.

You’re going to need to do some ballpark estimates. I’ve actually been doing some conversions lately, so I have some real experience to throw at it. Take any VB6 application, and it’ll take you 1/3 to 1/2 of the original development time to rewrite it in .NET with the same feature set (using test-driven development). That’s my estimate… do what you will with it. So, how much maintenance work are you doing, and how much more efficient would you be after the conversion?

So let’s take an application that took one programmer 6 months to write, and that you’ve been maintaining with 50% of your time for the last year. That’s 12 months of development in the existing application. By my estimate you’ll need to spend 4 to 6 months rewriting it. Let’s say you’re twice as fast after the conversion (if you didn’t have unit tests before and you use test-driven development during the conversion, the unit tests alone should make you this much more productive, not to mention the improvements in the IDE and the full object-oriented support). In that case, the payback period is 8 to 12 months of actual planned development. If you have that much work ahead of you, and you can afford to put off working on new features entirely for half that time, you’ll break even.
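
Here’s that payback arithmetic in code form, if you want to substitute your own numbers (every input is an assumption from this example):

// 6 months initial development + a year of maintenance at 50% time.
double existingDevMonths = 6 + 6;
double rewriteMonthsLow = existingDevMonths / 3.0;   // optimistic: 4 months
double rewriteMonthsHigh = existingDevMonths / 2.0;  // pessimistic: 6 months
double speedup = 2.0; // assume you're twice as fast after the conversion

// Each month of planned work now takes 1/speedup months, saving the rest.
double breakEvenLow = rewriteMonthsLow / (1 - 1 / speedup);   // 8 months of planned work
double breakEvenHigh = rewriteMonthsHigh / (1 - 1 / speedup); // 12 months of planned work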

That’s still a really big investment. The problem is that you won’t have anything to show for it for half that time. It’s surprising how quickly management could lose faith in your endeavor if they a) don’t really understand what you’re doing and b) don’t see any tangible results for months.

There are alternatives to the all-or-nothing rewrite. First, you can use a conversion tool to convert the VB6 to VB.NET. The one that comes with Visual Studio 2005 is notoriously bad, but some of the commercially developed ones are apparently much better. Still, given VB6’s laughably bad support for the object-oriented programming paradigm, the code you get out of the conversion is going to smell more like VB6 than .NET. It will get you done faster, probably more than twice as fast, so it’s still an option. However, you won’t get a chance to re-architect the software or normalize the database, etc., in the process.

The other alternative to the “big rewrite” is to do the upgrade in an “agile” manner. Take some time to break the software into smaller modules, each of which can be upgraded in about one month or less. This will significantly lengthen the amount of time it takes you to finish the project, but you’ll have something tangible to show after each month. Most managers can wait this long. This approach has its problems too: you need to write a lot of glue code between the VB6 and .NET parts, and it can get tricky.

Normalizing a Database

If you’re in a position where you have a database as a backing store, and you need to make major database structure changes, this must affect your decision. The “big rewrite” is the most friendly to database changes: you just write a single conversion script that upgrades the existing database in-place, and you write your new version against the new schema. You have a clean slate, so you can clean up all the crufty problems in the old schema.

On the other hand, if you’re just using a conversion tool to automatically convert from VB6 to .NET, you can’t change the schema.

If you take the middle road (“agile”), you can change the database structure at the same time, but it’s much more difficult than in the “big rewrite”. As you upgrade each module, it makes sense to modify the database structure underlying that module, but unless you’re really lucky, you’ll have parts of other modules left in VB6-land that are dependent upon database tables that are changing. That means you’ll have the same problem anyone without a really good data access layer (or object-relational persistence layer) has when they go to change the database schema:

You have a whole bunch of code that looks like this: sql = "SELECT MY_COL1, MY_COL2 FROM MY_TABLE JOIN..."

Assuming you don’t have unit test coverage, how do you find all the places in your code that need to be changed when you normalize MY_COL2 out of one table into another? Of course you can start with a search and replace, but if you really have a database normalization problem, then you probably have duplicate column names all over the place. How many tables have a column called CODE or STATUS? There are many pathological cases where a simple text search is going to find too many matches and you’ll spend hours tracking down all the places where the code might change just because of one column being moved or renamed.

The most pathological case is where you have, for instance, two columns like CONTACT1 and CONTACT2 in the same table, and somewhere in the code it says sql = "UPDATE MY_TABLE SET CONTACT" & ContactNumber & " = '" & SomeValue & "'". You’re going to have a hard time finding that column name, no matter what you do.

You need to develop a smarter system. I’ve tried a couple of different approaches. I tried one system where I auto-generated unique constants for all of my table and column names in my database, and then I wrote a script that went through my source code and literally replaced all of the instances of table or column names inside of strings with the constants. When I changed the database, I regenerated the list of constants, and the compiler was able to catch all the dependencies. Unfortunately, this method has some deficiencies: the resulting SQL statements are more difficult to read, and when you go and make changes to these statements you have to be disciplined enough to use the generated constants for the table and column names, or you break the system. Overall, it saves a lot of time if you have a lot of database changes to make, but costs extra time if you have to write new code.
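
Here’s a sketch of that first approach (in C# for brevity, with hypothetical names; the generator script itself is left out):

// Hypothetical output of the schema-constants generator.
public static class Db
{
    public const string MY_TABLE = "MY_TABLE";
    public const string MY_COL1 = "MY_COL1";
    public const string MY_COL2 = "MY_COL2";
}

// Hand-written queries reference the constants. When a column moves or
// gets renamed, you regenerate Db, the old constant disappears, and every
// stale query becomes a compile error instead of a runtime surprise.
public class ContactQueries
{
    public string BuildSelect()
    {
        return "SELECT " + Db.MY_COL1 + ", " + Db.MY_COL2 +
               " FROM " + Db.MY_TABLE;
    }
}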

I tried a different variation of the system where instead of replacing the table and column names in the string directly, I added auxiliary statements nearby that used the constants for the table and column names, and these would generate compile errors if a dependency changed. This made the code easier to read, but had problems of its own.

I don’t have a perfect answer for this problem, but if you have any SQL strings embedded in your legacy VB6 application, and you want to do big changes to your database, I can tell you that you must build a tool for yourself.

Summary

If you really must convert your application from VB6 to .NET then make sure you go into it with your eyes wide open. Engage management in a frank discussion. Make sure you get a strong commitment. If they waffle at all, walk away. The last thing anyone wants is a half-converted piece of software.

Still, I’m here to tell you that it is possible, and if you do your homework, there can be a real payback. Good luck!

More about Stuxnet, on TED

I’ve been enjoying a lot more TED recently now that I can stream it directly to our living room HDTV on our Boxee Box. Today it surprised me with a talk by Ralph Langner called “Cracking Stuxnet: A 21st Century Cyber Weapon”. I talked about Stuxnet before, but this video has even more juicy details. If you work with PLCs, beware; this is the new reality of industrial automation security: