TAG | dry-principle
In programming, we have a principle called Don’t Repeat Yourself (DRY). It’s a very important idea, and I’d argue that most of the advances in programming environments over the years have been in support of this principle and its related principle, Once and Only Once (OAOO).
Unfortunately, like every “principle”, it eventually takes on the level of dogma, and the people spouting it sometimes forget why it exists. These principles aren’t ends in themselves; they’re not self-justified. They are general principles to follow, but only when they support the end-goal of solving problems in more efficient, and more maintainable ways.
Let me give you a very simplified example of how it can be carried to far. Consider the following declarations in C#:
const int MOTOR_1_START_TIMEOUT_MS = 5000;
const int MOTOR_2_START_TIMEOUT_MS = 5000;
Consider that I could write:
const int MOTOR_1_START_TIMEOUT_MS = 5000;
const int MOTOR_2_START_TIMEOUT_MS = MOTOR_1_START_TIMEOUT_MS;
const int MASTER_MOTOR_TIMEOUT_MS = 5000;
const int MOTOR_1_START_TIMEOUT_MS = MASTER_MOTOR_TIMEOUT_MS;
const int MOTOR_2_START_TIMEOUT_MS = MASTER_MOTOR_TIMEOUT_MS;
Notice that all 3 versions accomplish the same end-result, but they are semantically different. The first version means that the two motors have independent timeout values, and they’re just co-incidentally the same. The second says, “motor 2’s timeout must be the same as motor 1’s timeout.” The third says that both motors must have the same timeout.
In my opinion, any of these three versions might be correct for various systems involving two motors. However, if you follow the DRY principle without thinking about it, you’ll assert that the first version is incorrect. In fact they’d probably say the only correct version should be:
const int MOTOR_TIMEOUT_MS = 5000;
(…ignoring, for the moment, that it should probably be a configurable value rather than a constant.)
Why does this simple example matter? Consider the case of a PLC-based control system with 10 motors. Let’s say at the start that all the motors, and all the drives running them, are identical. If you’re familiar with my philosophy of PLC programming, you know that my default solution for this would be to have 10 ladder logic routines, each called MOTOR_01, MOTOR_02, etc. Each routine would basically be a copy. That really doesn’t follow the DRY principle, does it? Certainly no, not at face value.
You might not believe it, but I get the occasional “hate mail” to my blog’s email address because of some of my technical opinions here. The most recent one, comically, referred to me (and all PLC programmers for that matter) as “dinosaurs”. I’m not sure what the rest of the message said, because if you can’t be polite, I’m not going to bother listening to you. However, I believe it’s this flagrant violation of things like the DRY principle that really rubs traditional PC programmers the wrong way when you start to talk about the principles of PLC programming.
Of course, my views about PLC programming are just that – general principles that need to be evaluated in the light of each and every project. I’m just asserting that most of the time you should be following a principle of a one-to-one mapping between ladder logic and real-world hardware. That doesn’t mean it’s an unbreakable rule.
Going back to the 10 motor example, the way you structure your program should be based on a decision you make about anticipated future changes to the system.
If you write one generic routine for controlling a motor, and you call it 10 times, you’re saying, “I always expect all 10 of these motors to behave in an identical way for all of the future.” Of course, you can allow variations, but you have to do that by passing in parameters for each instance. You have to be explicit about what can vary. Adding new parameters is typically a harder task than just modifying one of the 10 existing motor routines when you need to change the behavior of one motor.
On the other hand, if you follow my principle of 10 motor routines for 10 motors, you’re saying, “I expect that we’ll rarely need to make a sweeping change to all 10 motor control routines, but that we are likely to modify one or two routines to make them perform differently than the others.” I personally believe this is usually closer to the truth. As a system ages, perhaps one motor drive will blow, and you can’t buy the original drive anymore, so you have to replace it with a new one that has different control signals. That’s a fairly typical scenario, in my experience. Also, even though you might have 10 identical drives and motors, the process may or may not be identical for each motor. They may perform vastly different functions, and it’s likely that you’ll want to change just one or two of them to access more advanced features of the drive when you refine the process. Of course, I also like that with a one-to-one mapping in a PLC, troubleshooting becomes much easier because with online monitoring you can see each control routine executing just for that motor. You can make temporary changes just to one motor routine to bypass a faulted drive, or to do a million other changes that you’ll never be able to predict when you’re writing the logic.
The fact is, we’re physically limited by the number of drives we have. The amount of time it takes to make a change to all 10 motor control routines is tiny compared to how long it takes to make physical changes to 10 drives. This effort scales with the size of the system. In PC programming, you can have a system with millions, even billions, of objects, but in the PLC world, you’re limited by physical reality. The consequences of repeating yourself aren’t always as great, and you need to take that into account, and weigh it against your other goals.
That doesn’t mean I can’t imagine a case where you really want to assert that the motors all have to operate identically, all of the time, forever in the future. There are systems with load sharing drives where the system wouldn’t operate if you mismatched the drives or motors. That’s a design decision you have to make. Principles are only there for guidance, but they are not absolute rules, and they shouldn’t be treated that way.
I spend a significant amount of my time these days doing PC programming (as opposed to PLC programming) and I’d say the most time consuming part of my job is following the DRY Principle, aka “Don’t Repeat Yourself”.
The naive interpretation of the DRY principle is that it’s about minimizing the amount of code you write. While this may sometimes be a side effect, this isn’t the point of DRY at all. What we’re trying to do is structure the program so that there’s just one place for everything, and everything is in it’s one place. This is a lot harder than it sounds because no matter how you segment your program into modules, classes, or layers there’s always something that cuts across those boundaries.
Take logging for instance. Every part of the application needs to log things to the same log file. The naive implementation of opening a file, writing a line, and closing it is short but doesn’t follow the DRY principle because we’re repeating this simple sequence of steps all over the place. The danger is that if we change our logging strategy, we now have to change multiple places in every file.
There are probably two ways that the “logging strategy” could change:
- We change where (or if) we want to log the data, like to a database instead of a file
- We change the kinds of stuff we want to log
We can handle the first case easily by moving the logging logic into a global subroutine, or if you’re more advanced you would use something like a logging service with the service locator pattern. That would let us change where we log the data, and if we include a severity level with the logging interface, we could filter our logging by severity.
The second case is really hard to solve. We have to choose each spot in the code where we want to log information. Maybe every time I catch an error in a try/catch block, but that means every time I write a try/catch block I have to add a call to my logging service. This case violates the DRY principle because the idea is just “every time I catch an error in a try/catch block”. That idea needs to be codified in one place. Ideally I should be able to extend the try/catch block of the language and say, “any time you execute a catch I also want you to do this”. That’s not possible with .NET, that I know of anyway.
Likewise, maybe I want to know every time a user carries out any action that might affect the underlying data in the database, and I’ll have to add logging code at each of those places too. If you have all of your database access going through a single layer in your application, at least you can confine the logging to that layer, but it may consist of dozens of classes representing dozens of database tables. That’s where we need something like the ADO.NET entity framework, and have all of our database entities derive from a single base class, and then we might be able to tie our logging rules in there.
The problem is that these architectural solutions are hard. They take a lot of effort, and when you just need to add logging, it’s too easy to just go through your code and add logging wherever you need it, and you’ve created a maintenance nightmare. It’s the extremely low cost of copy and paste vs. architecture that creates this problem. If you had to pay $1 every time you copied and pasted code, I bet you’d write more maintainable software.