For a bit of history that we all know, ICS wasn’t originally built to be patched and updated on a regular basis. In an automation world that demanded static systems that could perform their function day after day with limited intervention, this wasn’t surprising. However, the ICS world now recognizes that patching and updating are necessary components of system upkeep, and owners are pressuring their vendors to support routine patching, many driven by regulatory requirements. Many vendors are responding well; some, not so well.
I’ve been hearing some rumors and anecdotes recently that I hope aren’t turning into a pattern. They follow this basic scenario:
- Owner wishes to patch their ICS system, possibly after a really ugly vulnerability assessment or even a virus incident
- Owner contacts Vendor, and says they want to patch their system
- Vendor offers to make the results of patch testing, usually of the Microsoft patches, available to Owner for a subscription fee
- Owner either hops on the subscription, or doesn’t, depending on their risk calculation at the time of the decision (usually, “That Never Happens Again” or “That’ll Never Happen Again”)
Testing every Microsoft release is necessary, but it’s neither the complete picture nor the most efficient use of owner-paid time. I’d classify the subscription as basic acceptance testing, ensuring that core product functionality is maintained after a patch. But many control systems use other components, such as OPC servers or custom scripting, that aren’t covered by basic acceptance testing. Fundamentally, basic acceptance testing helps reduce one variable in the risk equation, but does not address the others at all.
Additionally, basic acceptance testing is reasonably easy for a vendor to do. It requires systems with the latest version of the vendor software installed, maybe even several systems that are a few versions behind. This type of acceptance testing dovetails into product testing activities, which are part of normal product development. While it’s certainly clever to charge customers for a service that may already be done internally, is it providing owners the intended risk reduction? If it isn’t, the blame for patch problems, and the cost of the associated troubleshooting, may outstrip what the vendor gains over simply publishing this acceptance information and noting the limits of its effectiveness.
Now, I’m not saying that patch assessment and testing services are not valuable. I am saying that a simple testing service doesn’t provide the best reduction in risk for an owner. The valuable services help with the detailed testing, examining the interactions between control applications for areas a patch may affect. If a vendor maintained an owner-specific test system that includes 3rd party components, cyber security components among them, and tested patches against that system, more risk would be covered and the value of the service would increase.
Recently, a study by Tofino estimated a 1 in 12 chance that any given patch will affect safety or reliability, a frightening figure if accurate. Cross-referencing that 1 in 12 against the Microsoft Patch Tuesday releases over the past year, a monthly patch cycle would carry roughly a 50% chance of a failed patch operation, which would make owners take a long step back. This figure seems high to me, which could be related to the age of the control systems in the study. My personal experience, with reasonably modern control system applications (circa 2010), is that patches rarely fail due to basic Windows patch issues. Where they do often fail is where interconnection is forced between older control systems, or where legacy products had to be used to connect with old hardware/software, like PLCs or NetDDE.
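To see how the Tofino estimate compounds across a patch cycle, here’s a quick sketch. It assumes each patch carries an independent 1-in-12 chance of causing a problem (the study’s estimate); the patch counts used below are illustrative assumptions on my part, not figures from the study.

```python
# Odds that at least one patch in a batch causes a safety or
# reliability problem, assuming independent 1-in-12 per-patch risk.
def failure_probability(num_patches: int, per_patch_risk: float = 1 / 12) -> float:
    """Chance that at least one of num_patches patches misbehaves."""
    return 1 - (1 - per_patch_risk) ** num_patches

# A cycle of ~8 applicable patches already puts the odds near 50%.
print(round(failure_probability(8), 2))   # ~0.5
# Twelve patches pushes it toward two chances in three.
print(round(failure_probability(12), 2))  # ~0.65
```

The point of the sketch is simply that small per-patch risks compound quickly over a year of monthly cycles, which is why the 1-in-12 figure, if accurate, is so alarming.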
So, how many systems do you own that are legacy, from the era of Windows NT, 98, or 2000? How many control system applications do you have that were originally built for these legacy systems and are now running on modern operating systems? These are your risky systems, the ones I maintain will most likely result in patch failure and downtime. Are these being tested in that patch subscription service, or replicated effectively in a test environment?
Generation systems have a very interesting cross section of legacy and modern products when it comes to patching. They often have money to upgrade the core DCS to a relatively modern one, but lack the capability to upgrade ancillary systems, such as the baghouse, water treatment, or fuel handling. This results in a nice DCS that can be patched with near impunity, while patching ends up crashing the ancillary systems.
In my May 17th training outside of Rosemont, IL, I discuss many of the generation-specific issues, give some examples of patching problems that have happened in generation, and offer recommendations to reduce the risk of patching systems.