I think I can group decisions into two types:
- Decisions where it’s really important that we make the right decision
- Decisions where it’s really important that we make any decision and everyone gets behind it
For instance, deciding what products to launch for the Christmas season is really important. The choices made will have a profound impact on the bottom line of your company. On the other hand, it didn’t really matter what side of the road we decided to drive on, but it was really important that we, as a group, made a decision, and everyone agreed to it.
Now let’s talk about how organizations make decisions. I think there are typically two approaches:
- Appeal to authority
- Appeal to committee
When we appeal to authority, the accounting department makes the cash-flow decisions, the engineering department makes the technical decisions, and the marketing department decides whether we run Super Bowl ads or Craigslist ads. The CEO can override any of these decisions when a higher-level view recognizes a different need.
When we appeal to committee, we gather all the “stakeholders” who then sit around a table, generally as equal representatives of their respective departments, and come to some kind of consensus.
I don’t think anyone’s surprised by the fact that when it comes to making decisions where being right is the most important criterion, authoritative decisions tend to be better than committee decisions. In the same way, when the success of the decision is tied to consensus rather than to its “correctness”, committee decisions probably have an edge.
Now, if you’ve spent any time around government offices, you’ll realize that almost all decisions, including planning the staff Christmas gathering, are done by committee. Very large publicly traded companies don’t seem to be much different. On the other side of the spectrum, small companies don’t need much consensus because they’re small, and they tend towards decisions based on authority. Successful entrepreneurs seem to surround themselves with knowledgeable people and trust those people to make intelligent choices. This makes them well suited to make decisions where it’s important to be right, like how much raw material to buy this month, and where to commit other scarce resources.
It’s interesting to look at the outliers too. Apple is famous for being the exception that proves the rule. Despite being a huge organization, all information seems to indicate that Jobs ruled it with authority, not committee. And since he seemed to make good decisions, they were successful. Apple shareholders beware.
Now let’s go all 7-Habits on this and put it in quadrants, dividing decisions along two axes:
[Quadrant chart: decisions plotted along two axes, how much it matters that the decision is right versus how much it matters that everyone agrees, with examples like “Drive on the left or right?” and “Chicken or fish?” marking the extremes.]
I divided it into four quadrants, numbered 1 through 4. Quadrants 2 and 3 we’ve already covered: in quadrant 2, committees really shine, and in quadrant 3, authority really shines. I’m not even going to talk about quadrant 4.
Quadrant 1 is the tricky one. The Easter Island society collapsed over exactly this kind of decision: do we allow everyone to cut down all the trees, or do we manage them centrally? Obviously they made the wrong decision, but the right decision would have required broad support, and that’s what makes quadrant 1 so difficult.
Apple beat the quadrant 1 decisions by rolling both authority and consensus into one charismatic (and knowledgeable) leader. People follow leaders who have a track record of delivering on their promises. Success is a positive spiral.
The idea that you can take a committee and make it authoritative is misguided. On the other hand, we’ve seen our share of authority figures who have walked the long road of building consensus around the right decisions. They are our political and cultural heroes.
All of this brings me to two conclusions:
First, unsurprisingly, is that we shouldn’t put big government bureaucracy in charge of quadrant 1 type decisions (and that’s a bit scary, because they certainly are in charge of those decisions now).
Second is that our system of government tends to promote leaders who are good consensus builders without promoting leaders who are likely to make the right decisions. I’m not saying it promotes leaders who are likely to make bad decisions; I’m just saying it’s neutral on the issue.
I’m not out to change the system of government, but I think a two-pronged offensive could make a dent. On one side, our domain experts tend to live in a world where consensus building doesn’t matter, because their community has the skill to recognize logically consistent arguments: scientists simply publish their findings and wait for others to confirm or disprove them, and engineers test various design alternatives and measure their performance. Unfortunately, this means our domain experts lack the soft skills necessary to convince the rest of us to do the right things. A marketing budget for these experts, perhaps paid for by some rational-minded philanthropists, could go a long way.
On the other side, the general public is hopelessly lacking in critical thinking skills. We live in a world where logic is first introduced as a university-level introductory philosophy class. It belongs in high school (along with some other suspiciously missing life-skills like food/nutrition and childcare).
Unfortunately the high school curriculum is decided on by… a committee.
To start with, even PC security is pretty bad. Most programmers don’t seem to know the basic concepts for securely handling passwords (as the recent Sony data breach shows us). At least there are some standards, like the Payment Card Industry Data Security Standard.
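To be concrete about what “securely handling passwords” means at a minimum, never storing plaintext, and using a slow, salted hash, here is a sketch using only Python’s standard library. The salt size and iteration count are typical choices of mine, not something prescribed by any standard mentioned here:

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # deliberately slow, to frustrate brute-force attacks


def hash_password(password: str) -> bytes:
    """Store salt + PBKDF2 digest, never the plaintext password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt + digest


def verify_password(password: str, stored: bytes) -> bool:
    """Recompute the digest with the stored salt and compare."""
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(candidate, digest)
```

Had the breached databases in the news stored passwords this way, a leak would have exposed salted hashes rather than reusable credentials.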
Unfortunately, if PC security is a leaky bucket, then automation system security is about as watertight as a pasta strainer. Here are some pretty standard problems you’re likely to find if you audit any small-to-medium-sized manufacturer (and most likely any municipal facility, like, perhaps, a water treatment plant):
- Windows PCs without up-to-date virus protection
- USB and CD-ROM (removable media) ports enabled
- Windows PCs not set to auto-update
- Remote access services like RDP or Webex always running
- Automation PCs connected to the office network
- Unsecured wireless access points attached to the network
- Networking equipment like firewalls with the default password still set
- PLCs on the office network, or even accessible from the outside!
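One item on that list, always-running remote access services, is easy to spot programmatically. Here is a minimal sketch; the port numbers are just common defaults for a few such services, and a real audit would check far more than open ports:

```python
import socket

# Common default ports for always-on remote access services.
SUSPECT_PORTS = {3389: "RDP", 5900: "VNC", 23: "Telnet"}


def open_suspect_ports(host, ports=SUSPECT_PORTS, timeout=0.5):
    """Return the names of suspect services accepting connections on host."""
    found = []
    for port, name in ports.items():
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connection succeeded
                found.append(name)
    return found
```

Something this simple, run against every machine on the plant network, would already catch one class of problem from the list above.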
All of these security issues have one thing in common: they exist for convenience. It’s the same reason people don’t check the air in their tires or “forget” to change their engine oil. People are just really bad at doing things that are in their long-term best interest.
Unfortunately, this is becoming a matter of national security. Some have said there’s a “cyber-cold-war” brewing. After the news about Stuxnet, it’s pretty clear the war has turned “hot”.
I’m usually not a fan of regulations and over-reaching standards, but the fact is the Japanese didn’t build earthquake resistant buildings by individual choice. They did it because the building code required it. Likewise, I’ve seen a lot of resistance to the OSHA Machine Guarding standards because it imposes a lot of extra work on Control System Designers, and the companies buying automation, but I’m certain that we’re better off now that the standards are being implemented.
It’s time for an automation network security standard. Thankfully, there’s one under development: ISA99, from the Industrial Automation and Control System Security Committee of ISA. A couple of sections of the new standard have already been published, but it’s not done yet. Also, you have to pay ISA a hefty fee just to read it. I don’t think that’s the best way to get a standard out there and get people using it. It’s also moving very slowly. We all need to start improving security immediately, not after the committee gets around to meeting a few dozen more times.
I wonder if you could piece together a creative-commons licensed standard based on the general security knowledge already available on the internet…
Ok, so I’ve complained about “Best Practices” before, but I want to revisit the topic and talk about another angle. I think the reason we go astray with “Best Practices” is the name. Best. That’s pretty absolute. How can you argue with that? How can any other way of doing it be better than the “Best” way?
Of course there are always better ways to do things. If we don’t figure them out, our competitors will. We should call these standards Baseline Practices. They represent a process for performing a task with a known performance curve. What we should be telling employees is, “I don’t care what process you use, as long as it performs at least as well as this.” That will encourage innovation. When we find better ways, that new way becomes the new baseline.
If you haven’t read Zen and the Art of Motorcycle Maintenance and its sequel, Lila: Pirsig describes two forms of quality, static and dynamic. Static quality is things like procedures and cultural norms; it’s how we pass information from generation to generation, or just between peers on the factory floor. Dynamic quality is the creativity that drives change. Together they form a ratchet-like mechanism: dynamic quality moves us from point A to point B, and static quality filters the B points, throwing out the ones that fall below the baseline.
I’ve heard more than one person say that we need to get everyone doing things the same way, and they use this as an argument in favour of best practices. I think that’s wrong. We have baseline practices to facilitate knowledge sharing. They get new employees up to speed fast. They allow one person to go on vacation while another fills in for them. They are the safety net. But we always need to encourage people to go beyond the baseline. It needs to be stated explicitly: “we know there are better ways of doing this, and it’s your job to figure out what those ways are.”
I’ve just been reading Ken McLaughlin’s recent post Top Ten Signs an Integrator is the Real Deal #7: Best Practices and Standards and I have to say, my initial reaction is one of skepticism. I think Ken’s thinking is a little too narrow on this one. Let me explain…
This isn’t the first time I’ve considered the “problem of standards” on this blog. In an earlier post, Standards for the Sake of Standards, I explained how most corporate standards eventually end up out-of-date and absurd, mostly because nobody making the standard ever thinks to write down why the standard exists, which would let future policy-makers understand the reasons and change the standard when it no longer applies. Instead, it becomes gospel.
However, that isn’t to say you could run a large organization without best practices and standards. That’s the point, isn’t it? In order to become large, you need built-in efficiency, and you get it at the expense of innovation. Big companies don’t innovate (the one notable exception is Apple, and the rebuttal to that is always, “fine, so give one example other than Apple”). Almost all innovation happens in small companies, by tightly knit groups of superstars with the chains removed. Best Practices are, in fact, put in place to clamp down on innovation, because innovation is risky, and investors hate risk. It’s better to make lots of average product for average people than exceptional products for a few people (hence McDonald’s). Paul Graham, as usual, has something insightful to add to this:
Within large organizations, the phrase used to describe this approach is “industry best practice.” Its purpose is to shield the pointy-haired boss from responsibility: if he chooses something that is “industry best practice,” and the company loses, he can’t be blamed. He didn’t choose, the industry did.
I believe this term was originally used to describe accounting methods and so on. What it means, roughly, is don’t do anything weird. And in accounting that’s probably a good idea. The terms “cutting-edge” and “accounting” do not sound good together. But when you import this criterion into decisions about technology, you start to get the wrong answers.
The reason small companies are innovative is that innovative people can’t stand corporate environments. Imagine if you were an inspired chef… could you stand working at McDonald’s? Could McDonald’s even stand to employ you? You’d be too much trouble! You’d have to work in that nice one-off restaurant called “Maison d’here” where the manager puts up with your off-beat attitude because ultimately you make good food, and you keep their small but devoted clientèle coming back. But you can’t be franchised. The manager of the restaurant can’t scale you up without making what you do into a procedure.
So back to Ken’s topic… if you are choosing a systems integrator, you need to decide whether you’re buying an accounting system (i.e. something that’s generic to all companies, and not a competitive advantage) or something that is a competitive advantage to you. When you’re automating your core business processes, you must build competitive advantage into them, and that takes innovation. If that’s the case, stay away from larger integrators with miles and miles of red tape and bureaucracy. Go for the “boutique” integrator (somewhere in the 7-to-25-person range, under $10 million per year in revenue) that can show you good references. You’re looking for a small group of passionate people. Buzzwords are a warning sign; small companies don’t have time for corporate-speak.
I’m not saying you should use the two guys in their garage. These guys are ok for your basic maintenance tasks, small changes, and local support, but you do want someone who has been around for a few years and has at least a couple of backup engineers they can pull in if there’s a problem. Make sure they have a server, with backups, and all that.
On the other hand, if what you’re automating is very large and very standard, that’s when you want to go with Ken’s approach. If you need to integrate a welding line, paint line, or whatever, there’s nothing new or innovative in that, so you want to lower the risk. You know all the big integration companies can do this, so go and get three bids, and choose the one that’s hungriest for the work. Make sure they have standards and best practices. The reduction in risk is worth it if you don’t need the innovative solution.
You can do a hybrid approach. Identify the parts of your process that could be key competitive advantages if you could find a better way to do it. This is where innovation pays off. Go out and consult with some boutique integrators ahead of time and get them working on those “point solutions”. Then go to the bigger companies to farm out the rest of your automation needs. How’s that for a “best practice”?
(This is the third part of a trilogy of blog posts about what I think is the biggest roadblock preventing significant growth in the industrial automation industry. You should read Part 1: Industrial Automation Technology is Stagnant and Part 2: Why Automation Equipment Vendors Dabble in Integration first.)
I was recently making significant additions to a PLC program, and the resulting program was too big to fit in the existing PLC. There weren’t many areas where the code could be made smaller without hurting readability, so I went out and priced the same series of PLC with more memory. I’m not going to name brands or distributors here (and I’d probably get in a lot of trouble if I did, because they don’t publish their prices), but I was taken aback by one thing: the only difference between these two controllers was the amount of memory (execution speed and features were the same), and it was over $1000 more. I realize this is “industrial” equipment, so the memory probably has increased temperature ratings, and it’s battery backed, but how much PLC memory can you buy for $1000? As of this writing, 1 gigabyte of memory for a desktop computer costs less than $50.
At $50 per Gig, 20 Gigs of PLC memory? Nope.
Ok, it’s industrial, so… 1 Gig? Nope.
Try less than half a Megabyte. Yes.
Just for the sake of comparison, how much is that per gigabyte? If half a meg costs $1000, then PLC memory is two million dollars per gigabyte. WTF? It’s not 40 times more expensive than commodity RAM; it’s forty thousand times more expensive.
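For anyone checking the math, here’s the back-of-envelope arithmetic, using the figures quoted above:

```python
# Price premium of PLC memory over commodity desktop RAM,
# using the figures quoted in the post (circa 2011).
plc_upgrade_cost = 1000        # dollars, for roughly 0.5 MB of extra memory
plc_memory_gb = 0.5 / 1024     # half a megabyte, expressed in gigabytes
commodity_dollars_per_gb = 50  # desktop RAM price at the time

plc_dollars_per_gb = plc_upgrade_cost / plc_memory_gb
markup = plc_dollars_per_gb / commodity_dollars_per_gb

print(plc_dollars_per_gb)  # 2048000.0, i.e. about two million dollars per GB
print(markup)              # 40960.0, i.e. roughly forty thousand times
```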
Do you know why they charge two million dollars per gigabyte for PLC memory? Vendor lock-in. It’s that simple. They can charge thousands of dollars because the switching costs are insanely high. I can either buy a PLC in the same series from the same manufacturer (and they make sure there’s only one distributor in my area) with more memory, or I can replace the PLC with one from another manufacturer incurring the following costs:
- Rewriting the program because the languages are not portable.
- Modifying the electrical drawings significantly.
- Rewiring the electrical panel.
- Retraining all of my existing maintenance and engineering staff on the new platform.
So it really doesn’t matter. They could charge us $5000 for that upgrade and it’s still cheaper than the alternative. Why don’t they publish their prices? Simple, they price it by how much money they think you have, not by how much it costs them to produce the equipment. Wouldn’t we all like to work in an industry like that?
How did it get this way? In the PC industry, if Dell starts charging me huge markups on their equipment, I can just switch to another supplier of PC equipment. Even in the industrial PC world, many vendors offer practically equivalent industrial PCs. You can buy a 15″ panel mount touch screen PC with one to two gigs of RAM for about $3000. Add $500 and you can probably get an 80 Gig solid state hard drive in it. Compare that with the fact that small manufacturing plants are routinely paying $4500 for a 10″ touch screen HMI that barely has the processing power to run Windows CE, and caps your application file at a few dozen megabytes!
The difference between the PC platform and the PLC platform comes down to one thing: interoperability standards. The entire explosive growth of the PC industry was based on IBM creating an open standard with the first IBM PC. Here’s the landscape of personal computers before the IBM PC (pre 1980)*:
Here’s the market share shortly after introduction of the IBM PC (1980 to 1984):
… and here’s what eventually happened (post 1984):
Nearly every subsystem of a modern PC is based on a well-defined standard. All motherboards and disk drives use the same power connectors; all expansion cards were based on the ISA, and later the PCI and AGP, standards; external devices used standard RS-232, and later USB, ports. Hardware that wasn’t interoperable just didn’t survive.
That first graph is exactly like where we are now in the automation industry with PLCs. None of the vendors have opened their architecture for cloning, so there’s no opportunity for the market to commoditize the products. Let’s look at this from the point of view of a PLC manufacturer for a moment. Opening up your platform to cloning seems like a really risky move. Wouldn’t all of your customers move to cheaper clones? Wouldn’t it drive down the price of your equipment? Yes, those things would happen.
But look at what else would happen:
- The hardware cost of your platform would drop, so everyone would leave your competitors’ platforms and come to yours. At least you’d still be in the game; your competitors wouldn’t even have a compatible platform anymore.
- By dropping the price of automation equipment, the demand for complementary products would increase. IBM didn’t make its fortune directly in the PC industry, but they rode the wave and became an enterprise services company built on top of the commodity PC business.
- The size of the industry would explode.
Why don’t we have open standards? What about IEC 61131-3? PLC vendors tell us their controllers are IEC 61131-3 compliant, but that doesn’t mean they’re compatible. The standard only specifies what languages must be supported and what basic features those languages should have. Why didn’t the committee at least specify a common file format? You may be surprised to learn that many of the major committee members were actually from the PLC equipment manufacturers themselves. The same people with a vested interest in maintaining their own vendor lock-in were the ones designing the standard. No wonder!
What about the new XML standard? Well, it’s being specified by the same group and if you read their documentation, you can clearly see that this XML standard is not meant for porting programs between PLC brands, but rather to integrate the PLC programming tools with 3rd party programs like debugging, visualization, and simulation tools. You can be certain that a common vendor-neutral file format will not be the result of this venture.
I see three ways this problem could be solved:
- An association of common manufacturers and system integrators could form and develop a truly open and interoperable standard,
- A PLC manufacturer could “go IBM” and open up their platform, or
- A single integrator or manufacturer could develop and create an open automation platform and put the rights in the public domain.
Option 1 is almost doomed to fail because it has the word committee in it, but options 2 and 3 (which are similar) have happened before in other industries and stand a good chance of success here.
In my opinion, the most likely candidate to fulfill option 2 is Beckhoff. I say this because they’ve already opened up their EtherCAT fieldbus technology as an open standard, and their entire strategy is based around leveraging existing commodity hardware. Their PLC is actually a real-time runtime on a PC based system, and their EtherCAT I/O uses commodity Ethernet chipsets. All they would really need to do is open source their TwinCAT I/O and TwinCAT PLC software. Their loss of software license revenue could be balanced by increased demand for their advanced software libraries in the short term, and increased demand for their EtherCAT industrial I/O in the longer term. Since none of the other automation vendors have a good EtherCAT offering, this could launch Beckhoff into the worldwide lead relatively quickly.
For option 3, any company that considers automation equipment to be an expense or a complementary product (i.e. manufacturing plants and system integrators) could do this. There is a long term ROI here by driving down the cost of automation equipment. Many internet companies do this on a daily basis. IBM sells servers and enterprise services, which is the natural complement of operating systems, so they invest heavily in Linux development (not to drive down the cost of Linux – it’s free – but to offer a competitive alternative to Microsoft to keep Microsoft’s software costs down). Google does everything it can to increase internet usage by the public, so it invests heavily in free services like Google Maps, Gmail, and even the Firefox web browser. The more people who surf, the more people who see Google’s advertising, and the more money they make.
If I were going to go about it, I’d build it like this:
- Base it on commodity PC hardware (x86).
- Pick a free, open-source real-time operating system like FreeRTOS.
- Document and publish (free download) the communication protocol and file format for the automation program files (e.g. ladder logic).
- Write a reference runtime that conforms to this communication protocol and publish it under a BSD license. It doesn’t have to be great, but it has to work and be useful.
- Write an extensible programming environment (Windows-based) where you can develop automation programs that conform to the standard and that communicates with any runtime that conforms to the standard. Give it away free and publish it under the GNU GPL license to prevent it from being embraced and extended in a proprietary way by one vendor.
If you did that, anyone could make hardware that’s compatible with this platform, people are free to innovate on the runtime, and anyone can make improvements to the programming environment (as long as they give their changes back to the community). Instant de-facto standard.
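To make the idea concrete, here is a toy sketch of what a vendor-neutral program file plus reference runtime might look like. Everything here, the rung format and the names alike, is invented purely for illustration; a real standard would pin down the schema, the data types, and the communication protocol:

```python
# A toy "vendor-neutral" program format and reference runtime.
# A program is a list of rungs. Each rung drives one output coil
# from a series of contacts: ("NO", name) is normally open,
# ("NC", name) is normally closed. Contacts in a rung are ANDed.
PROGRAM = [
    {"coil": "Motor", "contacts": [("NO", "Start"), ("NC", "Stop")]},
]


def scan(program, inputs, outputs):
    """One PLC scan: evaluate every rung against the input image."""
    image = {**outputs, **inputs}  # snapshot of inputs plus last outputs
    result = {}
    for rung in program:
        energized = True
        for kind, name in rung["contacts"]:
            bit = image.get(name, False)
            energized = energized and (bit if kind == "NO" else not bit)
        result[rung["coil"]] = energized
    return result
```

Because the program is plain data, the same file could be loaded by any editor, simulator, or runtime that agrees on the format, and that interchangeability is the whole point of the exercise.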
I know some of you may be grinning right now because this is the path I’m following with my next personal software project, that I’m calling “Snap” (which stands for “Snap is Not A PLC”). However, Snap is for the home automation industry, not the industrial automation industry. If any company out there wants to take up the gauntlet and do this for industrial automation equipment, you may just change this industry and solve our biggest problem.
Discussion is welcome. Please comment below or contact me by email ([email protected]).
* Graphs courtesy of ars technica.
A few years ago I wrote a program for a customer using a flowcharting language. It wasn’t just a flowcharting language; the product allowed you to use both relay ladder logic and flowcharts. But the customer had a software standard that forbade the use of ladder logic!
Imagine a simplified example. Do you think Bill Gates’ kitchen has a blender? Probably. But it’s probably not like the one you or I have. Maybe it’s made out of solid gold and maybe it has a nuclear power source, but most of all I imagine it has a fully fledged industrial control system. Now, imagine we’re programming the control system for this blender.
Of course it would have the standard start and stop buttons, but this blender is top of the line and absolutely safe. It has a sensor to determine if the lid is closed. Obviously, if we’re happily blending away and the lid flies off, we need to stop the blender, and it shouldn’t start again until the lid is back on and the start button is pressed. After all, we wouldn’t want to endanger Bill’s fingers or anything (I happen to know he does a lot of blending).
Many of you have already written this complicated control system in your mind:
Good job! No matter what happens, the lid will have to be closed or the motor simply won’t run. Of course, the software specification says that you can’t use ladder logic.
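For readers picturing the rung in code rather than in ladder, the classic start/stop seal-in circuit with the lid switch in series boils down to a single boolean assignment, evaluated once per scan (a sketch only; the feedback of the motor bit is the seal-in branch):

```python
# The seal-in rung, written as one boolean expression per scan.
# Feeding `motor` back in keeps the motor running after the Start
# button is released; `lid_closed` in series drops it out whenever
# the lid comes off.
def blender_scan(start, stop, lid_closed, motor):
    return (start or motor) and not stop and lid_closed
```

If the lid flies off mid-blend, `lid_closed` goes false and the rung drops out; the motor then stays off until the lid is back on and Start is pressed again, exactly the behaviour described above.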
Well, in flowcharting the start and stop logic is simple:
That’s pretty simple, but it doesn’t take the lid closed switch into account. At first glance, we might have to put the check for the lid closed before every decision block, so we check for the state of the lid switch, then check for the start or stop buttons. Do you see how this could get out of hand quickly?
Fortunately, the language offers a solution:
How’s that? While it’s true that if the lid ever comes off, the motor will stop, we have no way to exit gracefully. This “exception block”, as it’s called, stops whatever you were doing, turns off the motor and starts the whole process over again. I’m sure it would work fine for our simplified example here, but what if we were doing something else later on in the flowchart? What if we were tracking parts, or decelerating an unrelated axis? In the ladder logic example, the lid switch only disables the motor. In the flowchart it stops the logic and then disables the motor.
Of course, to deal with this problem in the complexities of a real machine, you end up writing two different flowcharts: one for the sequence and one for the outputs.
Now take a close look at the right-hand flowchart. It scans through both decisions on every single scan of the controller. It’s “ladder logic written sideways”: simple combinational logic that screams to be rewritten in ladder.
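That split can be sketched in code, too: a small state machine for the sequence, plus purely combinational output logic evaluated every scan. The state names and signals here are illustrative only:

```python
# Sketch of the two-chart split: state logic plus combinational logic.
def sequence_step(state, start, stop, lid_closed):
    """State logic: at most one transition per scan."""
    if state == "IDLE" and start and lid_closed:
        return "BLENDING"
    if state == "BLENDING" and (stop or not lid_closed):
        return "IDLE"
    return state


def outputs(state, lid_closed):
    """Combinational logic: outputs follow state and interlocks directly."""
    return {"motor": state == "BLENDING" and lid_closed}
```

Note that the interlock appears in the output logic as well as the sequence, so the motor can never be on with the lid off, no matter what state the sequence is in.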
In fact, writing the state logic (like auto mode sequences) in flowchart and the combinational logic (like manual mode and faults) in ladder makes a lot of sense, especially for more complicated machines. So why forbid ladder logic? Perhaps it was just to force people to start using the flowcharts.
A software standard that bans ladder logic is a bad idea. Some logic is more readable as flowcharts, and some is more readable in ladder. For years we’ve had to write everything in ladder. We were like the little boy who only had a hammer and thought every problem was a nail. If that boy suddenly traded the hammer for a screwdriver, was he any better off?
I love standards. I wish we had more of them. I wish the IEC 61131-3 programming language standard was actually a standard and not a suggestion. But of course, sometimes we end up with standards we could do without…
They’re a lot like those dumb laws you hear about in your email inbox, like, “You cannot chain your alligator to a fire hydrant.” You know that law only exists because someone, at some point, chained their alligator to a fire hydrant. Some standards become dumb because they’re no longer relevant, yet they keep hanging around because we’re so concerned with telling people what to do that we forget to say why. Maybe that’s because the why seems obvious at the time.
I was contemplating this the other day when a co-worker related to me the fable of the five monkeys (reprinted here for your convenience):
There was an interesting experiment that started with five monkeys in a cage. A banana hung inside the cage with a set of steps placed underneath it. After a while, a monkey went to the steps and started to climb towards the banana, but when he touched the steps, he set off a spray that soaked all the other monkeys with cold water. Another monkey tried to reach the banana with the same result. It didn’t take long for the monkeys to learn that the best way to stay dry was to prevent any monkey from attempting to reach the banana.
The next stage of the experiment was to remove the spray from the cage and to replace one of the monkeys with a new one. Of course, the new monkey saw the banana and went over to climb the steps. To his horror, the other monkeys attacked him. After another attempt, he learnt that if he touched the steps, he would be assaulted.
Next, another of the original five was replaced with a new monkey. The newcomer went to the steps and was attacked. The previous newcomer joined in the attack with enthusiasm!
Then, a third monkey was replaced with a new one and then a fourth. Every time a newcomer approached the steps, he was attacked. Most of the monkeys beating him had no idea why they were not allowed to climb the steps or why they were joining in the beating of the newest monkey.
After replacing the fifth monkey, none of the monkeys had ever been sprayed with water. Still, no monkey ever approached the steps. Why not? Because as far as they knew it was the way it had always been done around here… and that is how company policy begins.
Another co-worker of mine would say that this situation exists because the monkeys are only passing on data, not knowledge. The monkeys have created a culture that is immutable to change because the culture rewards following and enforcing the rules more than understanding why the rules exist.
We need standards for efficiency. Part of their value is as a mechanism for passing information between people. We don’t want to re-invent the wheel and we don’t want to repeat our mistakes, but we pass up a valuable opportunity to pass on knowledge if we don’t document the why of each standard. Imagine being a new employee and being handed the company’s electrical controls standards document. Here’s an excerpt:
Standard 10.3(a) sub. 5: All wires will be terminated with ferrules.
I can imagine why this standard might exist – too many hours spent troubleshooting electrical problems caused by loose wire strands shorting out on nearby terminals. But if you’re a brand new employee straight out of school, would you understand that’s why this standard exists? When people don’t know why, they tend to make up their own reasons.
What happens two decades down the road when we’re all using carbon nanotube wires or some other non-stranded alternative? Without knowing why the standard exists, we might try to enforce this standard in a way that doesn’t make sense. Without a clear why stated, we risk allowing this standard to become another dumb standard, making the company less efficient.
So please, let’s make a new standard for future standards. Every time you write a standard, include a short paragraph with it describing the history of the decision and the reasoning behind it. Take the opportunity to pass on your knowledge!