There are two things that I hate in the world this morning: the term ‘IoT’, and the fact that ICS slave devices are the ones which run server software. Sometimes, two bad thoughts do make a good one. This morning is one of those times.
A common architectural theme in IoT devices is that they establish connections to a central service, usually a web application, for instructions on how to behave. In well-designed IoT devices, the connection is initiated by the device (0).
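To make the pattern concrete, here is a minimal sketch of a device-initiated push loop. The endpoint URL, device ID, and payload fields are all hypothetical stand-ins for a real collection service; the point is only that the device holds no listening socket.

```python
import json
import time
import urllib.request

COLLECTOR = "https://collector.example.com/ingest"  # hypothetical endpoint

def push_reading(value):
    # The device opens the connection; it never listens for one.
    body = json.dumps({"device_id": "sensor-01",
                       "value": value,
                       "ts": time.time()}).encode()
    req = urllib.request.Request(COLLECTOR, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read()  # any instructions ride back in the response

while True:
    push_reading(42.0)  # placeholder sensor value
    time.sleep(60)      # the device decides when to talk, not the network
```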
Contrast this with the standard ICS architecture, wherein the field device runs a listening service (often more than one). The device awaits connections from master systems, which then pull the information out of the device. This approach is the opposite of good sense: it ignores the very design principle that shapes how we architect the corporate-to-control perimeter. It asks the least capable, and also the most critical, device to shoulder the communication task that is most demanding in terms of both security and reliability.
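The master-side poll looks something like the sketch below. The address is hypothetical and the framing deliberately simplified (not a real protocol); notice that anything on the network that can reach the device can issue the exact same request.

```python
import socket

# Hypothetical slave address; note that *it* is the TCP server.
FIELD_DEVICE = ("192.0.2.10", 502)

def poll_device():
    # The master initiates; the field device must sit listening
    # and answer whoever happened to connect.
    with socket.create_connection(FIELD_DEVICE, timeout=2) as s:
        s.sendall(b"READ")  # simplified request, not a real protocol frame
        return s.recv(256)

print(poll_device())
```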
This security and reliability antipattern is a direct cause of most, if not all, of the ICS-CERT advisories concerning field devices over the last decade. Every backdoor, every protocol parsing error, and every denial-of-service bug would either be eliminated or have its exploitability reduced to ‘local network only’ if this antipattern stopped being standard operating procedure.
A basic tenet of ICS security perimeters is that the control environment should push data out to the corporate network (rather than allowing corporate systems to connect into the ICS to retrieve it). This shrinks the attack surface the control system exposes to the corporate network, since it funnels attackers into compromising one or two specific systems on the corporate side. The funneling effect comes for free with this design pattern, and does not depend on a well-configured firewall or gateway for protection (although such a device is still recommended). To pivot into the control system, an attacker must resort to client-side attacks against the ICS (really, against systems which themselves sit in a DMZ). Most of the security resources can then be spent on this small number of corporate systems and the DMZ itself: hardening the systems, keeping them patched, and so on.
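As a sketch, the push tenet can be as simple as the following one-way sender on the control side. The host name and CSV-ish payload are hypothetical; a real deployment would add sequencing, integrity checks, and encryption where the transport allows it.

```python
import socket
import time

DMZ_HISTORIAN = ("historian.dmz.example.com", 5555)  # hypothetical collector

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
while True:
    record = "plant1,tank_level,42.0,%f" % time.time()
    # Outbound only: nothing on the control side ever calls accept(),
    # so there is no inbound surface for the corporate network to probe.
    sock.sendto(record.encode(), DMZ_HISTORIAN)
    time.sleep(10)
```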
For whatever reason (1), this same logic is not applied to the interior of a control network. Instead of having field devices ‘push’ data (in the form of initiating connections to the master systems), field devices sit with listening services, waiting for any computer on the network to connect and begin issuing demands. It gets even weirder when we start thinking about these networks in terms of zones and conduits: there is no easy principle to apply where their security is concerned (2).
This is why so many field devices crash when scanned. The embedded operating system, coupled with a crummy processor, cannot cope with even basic abuse such as a SYN flood.
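A caricature of why this happens, assuming (hypothetically) a single-threaded firmware loop that services both the control scan and the listening socket: every connection a scanner opens and abandons burns cycles the scan cycle needed.

```python
import socket

def run_control_scan():
    pass  # stand-in for reading inputs and driving outputs

def build_reply(request):
    return b"\x00\x2a"  # stand-in for a protocol response

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 502))  # Modbus/TCP's well-known port
srv.listen(1)               # a backlog of one: trivially exhausted
srv.settimeout(0.01)        # poll the socket between scan cycles

while True:
    run_control_scan()
    try:
        conn, _ = srv.accept()
    except socket.timeout:
        continue
    try:
        conn.settimeout(0.5)
        conn.sendall(build_reply(conn.recv(256)))
    except socket.timeout:
        pass  # a scanner that never speaks still stalled us here
    finally:
        conn.close()
```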
IoT devices, for the most part, turn this architecture around, and in a good way. If the actual implementation is done correctly, there is no listening service on the device or the device gateway to attack. If a new IIoT (3) device is deployed, the attacker is once again forced to compromise one or two systems on the ICS LAN (the data gateway), and then to perform client-side attacks from there. Security protections can be centered on this handful of servers. While it seems counterintuitive to call this central point of failure ‘better’, it is no worse than having dozens of homogeneous, vulnerable controllers. Besides, as a server running a ‘real’ OS, it can be patched, audited, and logged far more easily.
Of course, many IoT devices are /implemented/ incorrectly, and ship with developer listening services left on them. However, the devices are quite often /architected/ better: they initiate the connection to the Cloud.
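One way to check a device you own for leftover services is a crude TCP connect sweep, sketched below. The address and port range are hypothetical; a correctly implemented push-only device should show nothing open. (Mind the earlier caveat: fragile field devices may crash when scanned, so only point this at hardware you can afford to reboot.)

```python
import socket

DEVICE = "192.0.2.20"  # hypothetical device under test

open_ports = []
for port in range(1, 1025):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(0.2)
    if s.connect_ex((DEVICE, port)) == 0:  # 0 means the connect succeeded
        open_ports.append(port)
    s.close()

print("listening services found:", open_ports or "none")
```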
Future ICS design would do well to learn from this. I expect the reality to be good IoT designs percolating into the industrial space, forcing out (or being absorbed by) the big legacy players.
image by adactio
(0) The IoT design-pattern world hasn’t really settled its terminology yet, although people are thinking about it. I’ll call this the External Collection Application (Push) pattern, or ECAP for short.
(1) I’ll blame a principle called ‘First to Market’: it was much easier to wrap Modbus in TCP by making the slave device the server. While I’m sure many engineers realized this was a bad idea, doing it another way would have meant more development time. Given a choice between robustness and more money, we humans can be pretty shortsighted.
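To see why this was the path of least resistance, here is a sketch of a Read Holding Registers request over Modbus/TCP: the serial-era PDU is simply prefixed with a small MBAP header and written to the slave, which listens on TCP port 502. The target address is hypothetical; the frame layout follows the published Modbus/TCP spec.

```python
import socket
import struct

def read_holding_registers(host, unit_id=1, start_addr=0, count=2):
    # PDU: function code 3, then starting address and register count.
    pdu = struct.pack(">BHH", 0x03, start_addr, count)
    # MBAP header: transaction id, protocol id (0), byte count of what
    # follows, and the unit id carried over from serial Modbus.
    mbap = struct.pack(">HHHB", 1, 0, len(pdu) + 1, unit_id)
    with socket.create_connection((host, 502), timeout=2) as s:
        s.sendall(mbap + pdu)
        return s.recv(260)  # MBAP header + response PDU

print(read_holding_registers("192.0.2.10"))  # hypothetical slave address
```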
(2) Most operators continue to use the Purdue Model and its associated terminology instead of ISA99. ISA99’s zone and conduit concepts usually require a SCADA Hero to step in should something go wrong: there is no easy principle to apply should ‘the new employee’ find him- or herself troubleshooting a network issue. Instead, the SCADA Hero has to consult out-of-date network diagrams and understand how the protocol works in order to fix an issue with a particular conduit.
(3) I use the term to describe industrial control products that follow IoT design principles, chief among them the ‘outbound connection’.