19
The Psychology of Automation
No comments · Posted by Scott Whitlock in Industrial Automation
It’s easy to overlook the power of human motivation in automation systems.
I’m going to assume you don’t work in a lights-out factory. That’s pretty rare. Almost all automation systems interact with people on a regular basis, and even though we have high fidelity control over our automation processes, people are notoriously difficult to predict, let alone control.
For instance, consider a process with a reject station. Finished parts are measured and parts that don’t meet specification are diverted down a chute. Good parts continue down the line.
Question: how big do you make the reject bin? Naively you might want to make it as big as possible so the operator doesn’t have to waste time emptying it. Unfortunately that means it’s just as easy to make bad parts as it is to make good parts. You’d be better off to make the reject chute only hold 3 parts, and put a full sensor on the chute that throws a fault and stops the machine when it’s fully. Then it’ll be a pain in the ass to make bad parts, and the operator will have a lot more motivation to do something about it. While you’re at it, put the reject chute on the other side of the machine so they have to walk around the machine to empty it.
Consider a machine with an e-stop button. It’s a big red button with a mushroom head that’s supposed to be easy to hit. However, I’ve seen a lot of machines where the consequence of hitting that button was major downtime because the part tracking got screwed up or the machine just didn’t recover gracefully. I once watched a pallet get bumped out of the track and it was riding along the rail of the conveyor. I hit the e-stop just before it was about to damage some equipment. I was scolded for my efforts: “never hit the e-stop,” he said. That’s the wrong motivation. You want operators to press the button when they see something wrong, so make it easy to recover.
Consider an inventory tracking system. You want people to record stuff they’ve consumed, and what cost account they’ve consumed it against. What motivation does a person standing there with a bolt in their hand have to look up that bolt in your inventory system and mark it consumed? Very little. What if you lock the door to the store room and make them request an item before you unlock the door? That’ll help, but chances are some people who don’t know exactly what they want will click the first item on the list, and just go in and browse. What if you make the inventory storage system so convoluted that the only way to find an item is to look up the storage location in the computer? Well, that might work (until your inventory system breaks).
Like water, people tend to take the easiest path down hill. You’re better off digging a channel where you want it to go than expecting it to get there under its own power. Use gravity to your advantage. Make it harder to do things the wrong way and easier to do things the right way.
15
NTSB Suggests Banning Use of Electronic Devices While Driving — Yeah?
No comments · Posted by Scott Whitlock in Uncategorized
I was a bit surprised to see all the hubbub about the NTSB recommending banning the use of cell phones and other electronic devices while driving. You see, they already passed similar legislation here in Ontario, and the sky hasn’t fallen.
As someone who used to answer my cell phone while driving (I always justified to myself that I was always on a straight stretch of road or not in town but in truth I would answer it almost any time), I can honestly say I was wrong to do that, and I was converted by the evidence. Our brains just don’t seem to be wired to handle cell phone conversations while driving, even though we do much better with other tasks like talking to a passenger.
Unfortunately our brains are also poorly wired to understand statistics. The fact that I used cell phones while driving and I‘ve never had an accident because of it is apparently all the proof I needed to know that it was safe. Of course, the real research disagrees:
There are, of course, edge cases. I know that here you’re free to dial 911. Also, you’re not allowed to be manipulating a Navigation System, but I believe you’re still allowed to have one turned on giving directions (that’s pretty reasonable – you can set it before you leave, or let your passenger set it).
For those of you who like to let their spouse know when you’re almost home, I’ve heard they can get an app where they can see your location in real-time on their phone by tracking your phone’s GPS, so there’s no need to call. (Yeah, I think that’s creepy too, and maybe they already have it installed!)
10
Safe(r) Data Collection from a PLC
3 Comments · Posted by Scott Whitlock in Industrial Automation
There’s been a lot of discussion recently about the dangers of connecting automation equipment to networks, and yet there are significant pressures to do so. Of course, I don’t ever think that you should take a PLC and put it on an internet accessible IP address, but it’s certainly common practice to connect industrial automation equipment to internal LANs to facilitate data collection. People in the front office need to push production planning information down to the production floor, and they need real-time data on what’s going on (not to mention for historical data logging, historians, etc.).
It’s all too common to throw a PLC on the same network as your front office, and I’ve seen it blow up. What happens is something invariably goes wrong on the office network (someone plugs two ports together on the switch in the boardroom, or someone brings in an infected music player and plugs it in, or the DNS server at head-office goes down and the local DNS doesn’t work correctly… I’ve seen a lot). However, you want your machines to keep going when this happens.
This is all made worse now that (a) industrial automation equipment is more commonly based on off-the-shelf commodity hardware and software (e.g. windows PCs) and (b) the people writing malware can actually spell PLC now. Up to this point there’s been some form of security through obscurity.
If you have a local IT staff that’s on the ball, you really should be getting them to handle the network layout. On the other hand, if you’re in a small facility with limited resources, there’s a lot you can do by making some simple design choices that will go a long way towards improving the reliability and security of your systems.
Most automation cells now come with an Ethernet switch already built-in. Typically this is an industrial spec. DIN-rail mounted one. It’s not fancy, but it’s supposed to survive in a panel. These Ethernet switches are there to connect your PLC to your HMI, and increasingly to connect your PLC to Ethernet-based I/O like Ethernet/IP, etc. The common (and wrong) thing to do is to drop a network cable from your plant network to your panel and plug it right into this Ethernet switch. This creates some technical problems right off the bat:
The automation devices typically have fixed IP addresses (I personally prefer this because it means these devices aren’t dependent upon an external DNS or DHCP server – two less dependencies are good). Chances are that these IP addresses won’t work on your plant network, so you have to manage those IPs at the plant level. You’re opening yourself up to someone with the wrong IP on their laptop pouncing on your PLC’s IP address, and then bam, your machine is down.
A much better way is to place some kind of Router with NAT between the plant network and the machine’s Ethernet Switch:
Now if you’re just a little manufacturer with two machines out back and your data-collection link isn’t critical, you can probably get away with one of those home routers from Best Buy that you’d use to connect your laptop and your desktop at home to your cable modem. Note that it doesn’t need to be wireless, and you’re probably better off if it isn’t. The way you hook it up is to connect the Internet (Uplink) port on the router to the plant network and run a cable from one of the ports on the LAN side to the existing Ethernet switch in the machine. If your data needs to be a bit more reliable, consider buying some kind of Cisco router with NAT capability (but you’ll be going from the $50 range to many hundreds of dollars – your choice).
Now, when you configure it, you want to make sure that you turn off the port forwarding function, the DMZ function, disallow remote administration, and block all anonymous internet traffic (these should be default settings, but it’s good to check). Also, make sure the router’s local IP address doesn’t conflict with the PLC’s and HMI’s, and make sure they have the same subnet. Typically you’ll want to either turn off DHCP on the local side, or limit it to a range that won’t conflict with the fixed IPs. DHCP is handy when you connect your laptop to the programming port. Now what you’ve done is made it somewhat invisible from the plant side. Some piece of malware scanning for devices on your plant network should just see a black hole.
Now on the PLC side, you can now initiate a connection to the data collection server even though the data collection server can’t connect to the PLC (in the same way that your home computer can connect to Google, but Google can’t get to your PC – theoretically). Note that a piece of malware on your data collection server or on one of the routers/switches in your plant network could intercept this communication, and own your PLC, but at least you’ve significantly reduced the surface area of attack. Not perfect, but reasonable at this time, depending on the sensitivity of your equipment. I’m assuming you’re not enriching uranium or providing drinking water to my community.
(I’m going to be talking specifically about Allen-Bradley products now – sorry.)
So how do you get the data from the PLC to the data collection server? In the old days you’d have some software on the server like RSSQL, and it used a product like RSLinx Enterprise and as far as I know, it initiated the connection to the PLC. That won’t work in this case. Sometimes you’d throw an OPC server in there, and have some kind of historian that would log tags to a database. That OPC server, obviously, needs to be able to initiate a connection to the PLC. To use a router with NAT, you’d need to port-forward from the router to the PLC (or to the OPC server if it was inside the machine network). That’s undoing a lot of our protection.
What you need to do is initiate the connection from the PLC, and have the Data Collection computer act as a Server. One way to do this is with a 3rd party Ethernet card, like this MVI56-GEC card from Prosoft for the ControlLogix line. I have used that in the past to connect to a server, but it involves a lot of ugly PLC coding. It’s your only option if you have to conform to someone else’s protocol though.
If you just want to write data directly into a SQL database, there are 3rd party products that will let you do this (basically a SQL Server connector card).
But there is an option without buying any new hardware. The ControlLogix/CompactLogix lines can send Unsolicited CIP messages, and you can find products that can receive these messages in the PC world, like CimQuest’s NET.LOGIX product. It can act as a server and receive data directly from the PLC – either individual tags, or even arrays of UDTs. The code on both ends is relatively simple, so all you have to pay for is the NET.LOGIX runtime license, which is cheaper than the hardware alternatives. Note that you can also do this with PLC5 and SLC500 devices, though there’s some more effort involved.
I hope that’s enlightening. This is by no means a perfect solution, but it’s reasonable for now. It doesn’t plug the laptop hole (the programming laptop is probably still your #1 vector for malware to get into your machine network). It’s susceptible to man-in-the-middle attacks between the router and the data collection server. It’s susceptible to exploitable bugs in the router’s firmware. Beware and use your own judgement.
I think I can group decisions into two types:
- Decisions where it’s really important that we make the right decision
- Decisions where it’s really important that we make any decision and everyone gets behind it
For instance, deciding what products to launch for the Christmas season is really important. The choices made will have a profound impact on the bottom line of your company. On the other hand, it didn’t really matter what side of the road we decided to drive on, but it was really important that we, as a group, made a decision, and everyone agreed to it.
Now let’s talk about how organizations make decisions. I think there are typically two approaches:
- Appeal to authority
- Appeal to committee
When appealing to authority, the accounting department has the authority to make cash-flow decisions, and the engineering department has the authority to make technical decisions, and the marketing department gets to decide whether we run Superbowl ads or Craigslist ads. The CEO can override these decisions when a higher level view recognizes a different need.
When we appeal to committee, we gather all the “stakeholders” who then sit around a table, generally as equal representatives of their respective departments, and come to some kind of consensus.
I don’t think anyone’s surprised by the fact that when it comes to making decisions where being right is the most important criteria, authoritative decisions tend to be better than committee decisions. In the same way, when success of the decision is tied to consensus rather than the “correctness” of the decision, then committee decisions probably have an edge.
Now, if you’ve spent any time around government offices, you’ll realize that almost all decisions, including planning the staff Christmas gathering, are done by committee. Very large publicly traded companies don’t seem to be much different. On the other side of the spectrum, small companies don’t need much consensus because they’re small, and they tend towards decisions based on authority. Successful entrepreneurs seem to surround themselves with knowledgeable people and trust those people to make intelligent choices. This makes them well suited to make decisions where it’s important to be right, like how much raw material to buy this month, and where to commit other scarce resources.
It’s interesting to look at the outliers too. Apple is famous for being the exception that proves the rule. Despite being a huge organization, all information seems to indicate that Jobs ruled it with authority, not committee. And since he seemed to make good decisions, they were successful. Apple shareholders beware.
Now let’s go all 7-Habits on this and put it in quadrants, dividing decisions along two axes:
| Great Risk if Wrong |
Little Risk if Wrong |
|
| Must have everyone’s support |
1: Invade Iraq? What product for Christmas? |
2: Drive on the Left or Right? |
| Don’t need everyone’s support |
3: Bail out the banks? Windows or Linux servers? |
4: Chicken or Fish? Bike-shed color? |
I divided it into four quadrants numbered 1, 2, 3, and 4. Quadrants 2 and 3 we’ve already covered. In quadrant 2, committees really shine, and in quadrant 3 authority really shines. I’m not even going to talk about quadrant 4.
Quadrant 1 is the tricky one. The Easter Island society collapsed because they were faced with a decision: do we allow everyone to cut down all the trees, or do we centrally manage it? Obviously they made the wrong decision, but the right decision would have required broad support, which is why it’s so difficult.
Apple beat the quadrant 1 decisions by rolling both authority and consensus into one charismatic (and knowledgeable) leader. People follow leaders who have a track record of delivering on their promises. Success is a positive spiral.
The idea that you can take committees and make them authoritative is misguided. On the other hand, we’ve seen our share of authority figures who’ve succeeded at the long road of building consensus around the right decisions. They are our political and cultural heroes.
All of this brings me to two conclusions:
First, unsurprisingly, is that we shouldn’t put big government bureaucracy in charge of quadrant 1 type decisions (and that’s a bit scary, because they certainly are in charge of those decisions now).
Second is that our system of government tends to promote leaders who are good consensus builders without promoting leaders who are likely to make the right decisions. I’m not saying it promotes leaders who are likely to make bad decisions; I’m just saying it’s neutral on the issue.
I’m not out to change the system of government, but I think a two-pronged offensive could make a dent: on one side our domain experts tend to live in a world where consensus building doesn’t matter because their community has the skill to recognize logical consistent arguments. Scientists simply publish their findings and wait for others to confirm or disprove them. Engineers test various design alternatives and measure their performance. Unfortunately this means our domain experts lack the soft skills necessary to convince us to do the right things. A marketing budget for these experts, perhaps paid for by some rational-minded philanthropists, could go a long way.
On the other side, the general public is hopelessly lacking in critical thinking skills. We live in a world where logic is first introduced as a university-level introductory philosophy class. It belongs in high school (along with some other suspiciously missing life-skills like food/nutrition and childcare).
Unfortunately the high school curriculum is decided on by… a committee.
21
Internet Facing Water Management Infrastructure in Texas Exploited Easily
2 Comments · Posted by Scott Whitlock in Industrial Automation
Completing the trilogy of ICS-security related blog posts, a hacker recently demonstrated how easy it was to find and log in to an internet-facing SCADA system using for water management in a town in Texas. From the article on threatpost:
The hacker, using the handle “pr0f” took credit for a remote compromise of supervisory control and data acquisition (SCADA) systems used by South Houston, a community in Harris County, Texas. Communicating from an e-mail address tied to a Romanian domain, the hacker told Threatpost that he discovered the vulnerable system using a scanner that looks for the online fingerprints of SCADA systems. He said South Houston had an instance of the Siemens Simatic human machine interface (HMI) software that was accessible from the Internet and that was protected with an easy-to-hack, three character password.
For those of us who design, build, and deploy systems like this, let’s ask ourselves what would happen if a serious incident happened and significant equipment damage was done, or worst case, people were seriously injured or killed. Don’t you think the people who worked on the system would end up in court (if not in criminal court, then at least in civil court)?
When in doubt, don’t sit these things directly on the internet. There are lots of secure remote access products available (Google for “VPN”). It’s worth it.
18
Public Water Control System Attacked
2 Comments · Posted by Scott Whitlock in Industrial Automation
Joe Weiss recently reported on the possible hacking of a public water SCADA system, apparently in Illinois. This attack, if it was an attack, caused damage to a pump by turning it on and off repeatedly.
It seems obvious that this situation is going to be repeating itself more and more. If you’re a company with industrial control systems, or you provide control system services, now’s a great time to start thinking about your control system security strategy. Do you have the necessary skills on staff? If not, where are you going to source them from?
11
Finding Internet-Connected Industrial Automation Devices
No comments · Posted by Scott Whitlock in Industrial Automation
I think most people in our industry realize you shouldn’t connect industrial automation devices to the internet, but just in case you happen to think otherwise, here’s a quick explanation why (this is old news, by the way).
You may believe that things connected to the internet are relatively anonymous. There’s no web page linking to them, so how is Google going to find them, right?
It turns out it’s relatively easy to find devices connected to the internet, and it’s kind of like the old movie WarGames where the lead character, played by Matthew Broderick, programmed his computer to dial every phone number in a specific block (555-0001, 555-0002, etc.) and record any where a modem answered. That was called “war-dialing”. In the age of the internet, you just start connecting to common port numbers (web servers are on port 80, etc.) on one IP address at a time, and logging what you find. This is called port-scanning.
It turns out that you don’t even have to do this yourself. One free service called SHODAN does this for you, and records everything it finds at the common port numbers (web servers, FTP servers, SSH daemons, etc.) and lets you search it just like Google. It turns out that (a) most modern industrial equipment is including embedded web servers and/or FTP servers to allow remote maintenance, and (b) most web servers or FTP servers respond with some kind of unique “banner” when you connect to them, announcing who or what they are.
So, if you don’t believe that you shouldn’t be putting industrial automation equipment on the internet, here’s a little experiment you can run:
- Take a ControlLogix with an ENBT card and hook it directly to the internet, so it has a real IP address.
- Wait a couple of days.
- See if your IP address shows up on this SHODAN search page.
You could try the same thing with a Modicon M340.
This query for Phoenix Contact devices is particularly scary, as one of the links is a wind turbine! I was a bit scared once I opened it (it opens a publicly accessible Java applet that’s updating all the data in real-time), so I closed it. There was no password or anything required to open the page. At least the button that says “PLC Config.” appeared to be grayed out. Let’s hope that means it’s protected by some kind of password… and that it’s hardened better than every single major corporation’s website was this year.
Just want to say thanks to DigitalBond for pointing out this SHODAN search for all Advantech/Broadwin WebAccess deployments around the world too.
Google has been developing a self-driving car.
It’s abnormal for a company to go completely outside of its comfort zone when designing new products, so why would Google go into the automotive control system industry? They claim it’s to improve safety. They’ve also offered the pragmatic assertion that “people just want it”.
Google is a web company. It makes huge gobs of money from advertising, and advertising scales with web traffic. Google’s designing a self-driving car so that drivers can spend their commute time surfing the web on mobile devices. That’s the only explanation that makes sense.
Not that I mind, of course… I’d love to be able to read on the way to work. When can I buy my self-driving car? Can I get one that’s subsidized by ads?
25
Control System Security Dilemmas
1 Comment · Posted by Scott Whitlock in Industrial Automation
It’s fascinating to watch what’s unfolding in the Industrial Control System Security front these days. Digital Bond’s SCADA Security Portal is as entertaining as any (thanks to ArchestrAnaut for pointing it out for me).
A brief recap:
- Stuxnet makes news even in the mainstream press
- Siemens shrugs it off and does absolutely nothing about it
- Security researchers, smelling smoke, start poking around PLC security and find it completely lacking
- Details about wide open backdoors inserted into common PLC hardware has now been published online
Things are not moving in a positive direction either. Those security “researchers”, many of whom seem to be selling security solutions, are digging up ways to compromise PLCs and they’re posting all that information online. Now if this forces automation vendors to stop looking the other way and start taking security seriously, then I think it can only be a good move in the long term, but you have to admit it feels a little like a tire salesman throwing roofing nails on the road in front of his store.
All of this makes you wonder, what’s a small manufacturer to do? As always, businesses need to weigh the risks and the costs and act accordingly. This isn’t easy for the decision makers. On one side there’s enormous pressure to network all of the systems together to facilitate the fast flow of information between the ERP, MES, and Plant Floor layers, but on the other side, every interconnection increases the risk of catastrophic failure. I’ve personally seen Windows worms take down automation networks. In the next few years I’m certain we’re going to see worms that can jump from PLC to PLC and probably ones that can cross from Windows to PLC and back.
Properly segregating networks and then managing them is a big IT project. That means it needs scarce resources, and those resources aren’t making money for the company. Big manufacturers have enough cash flow (and have been bitten enough times) that they can allocate resources for this kind of project, but small manufacturers are a different story.
Small companies generally lack the specialists needed to implement such systems. Almost by definition, generalists serve in small companies and specialists gravitate towards large companies. Small companies can only implement commodity solutions (unless it’s part of their core business strength). That means that while we’re all worried about what might happen if a major utility or top tier manufacturer gets hit with an automation security breach, the fact is it’s more likely that small manufacturers will be the first ones hit by a fast-spreading generalized threat. The economic impact could be just as large… those small manufacturers are feeding parts up the supply chain, and in this just-in-time environment it doesn’t take much to cause a major interruption.
What’s the solution?
Short of the automation vendors waking up and making secure products, we need better (and less expensive) tools for securely connecting our PLCs. I hate to say it, but you can’t implement modern control systems without knowing the basics of network security, VLANs, and access control.
No tags
26
The Skyscraper, the Great Wall, and the Tree
No comments · Posted by Scott Whitlock in Software
One World Trade Center will be a 105 story super-skyscraper when completed. It’s estimated total cost is $3.1 Billion.
When it’s complete, how much would it cost to add the 106th story? Would it be 105th of $3.1 Billion? No. The building was constructed from the ground up to be a certain height. The maximum height depends on the ground it’s sitting on, the foundation, the first floor, and so on. Each of those would need to be strengthened to add one more story.
It’s possible to engineer a building so that you can add more floors later, but that means over-engineering the foundation and the structure.
Contrast the building of a skyscraper with the building of the Great Wall in China. The Great Wall is infinitely scalable. It stretches for many thousands of kilometers, and each additional kilometer probably cost less than the one before because the builders became more skilled as they worked.
The difference between the skyscraper and the Great Wall is all about dependencies. In the case of the skyscraper, every story depends on everything below it. In the case of the Great Wall, each section of wall only depends on the ground underneath, and the sections of wall on either side. If one section is faulty, you can easily demolish it and rebuild it. If one section is in the wrong place, you only have to rebuild a copy of sections to go around the obstacle.
When you’re building the Great Wall, you can probably take an experienced stonemason and have him start laying stone on the first day, without much thought about the design. With a skyscraper, it’s the opposite. You spend years designing before you ever break ground.
When people first started to write software, they built it like the Great Wall. Just start writing code. The program gets longer and longer. Unfortunately, complex programs don’t have linear dependencies. Each part of the code tends to have dependencies on every other part. Out of this came two things: the Waterfall model, and Modular programming. While modular programming was a success at reducing dependencies, and eventually led to object-oriented (and message-oriented) programming, the Waterfall model was eventually supplanted by Agile methodologies.
The software developed by Agile methodologies is neither like the Great Wall (there are still many inter-dependencies) and yet nothing like the traditional skyscraper (because the many dependencies are loosely coupled).
Agile software grows organically, like a tree. You can cut off a limb that’s no longer useful, or grow in a new direction when you must.
Many programmers seem to think that Agile means you don’t have to design. This isn’t true. A tree still has an architecture. It’s just a more flexible architecture than a skyscraper. Instead of building each new layer on top of the layers beneath it, you build a scaffolding structure that supports each module independently.
The scaffolding is made up of the SOLID principles like the single responsibility principle, and dependency injection, but it also consists of external structures like automated unit tests. If Agile software is like a tree, you spend more time on the trunk than the branches and more time on the branches than the leaves. The advantage is that none of the leaves are directly connected, so you can move them. That’s what it means to be Agile.
What are the trade-offs? Agile enthusiasts seem to think that every other methodology is dead, but that’s not true. You can still go higher, faster, with a skyscraper, or get started sooner with the Great Wall.
Embedded control systems for automobiles are still built like a skyscraper. That’s because the requirements should be very well known, and future changes are unlikely. On the other hand, you’d be surprised how many businesses run significant portions of their businesses on top of Excel spreadsheets. That’s typical Great Wall methodology. It works as long as complexity stays low.
Use Agile when you need complexity and flexibility.
No tags


