With cyber attacks and physical sabotage looking increasingly likely, I thought I should take the time to discuss the vulnerability space of good old fashioned railroad interlockings. I say old fashioned because at this time I do not want to re-hash the security issues associated with CBTC and PTC. Today I am just going to look at the logic that controls the switches, signals and related interlocking appliances, and the ways it could be directed to disrupt normal operations, specifically through the creation of unsafe situations.
Railway signaling is implemented by two separate yet equally important parts: the safety critical logic that detects and prevents unsafe situations, and the control systems that display information to rail controllers and transmit that information to/from field locations. In the same vein, one can attack the interlocking logic or one can attack the control systems, and in each case one can try to make the system non-functional or try to make it unsafe. So before even getting into the various types of technology, we can sort the threats and vulnerabilities into those four bins.
Skipping over mechanical or electro-mechanical plants, where the interlocking logic and the user interface are united and a human is on site to monitor things, relay based signaling is going to be the most resistant to malicious change. Relay logic is literally hard wired and extensively tested for safety, meaning there is little an attacker could do even with full control over the communications link and human interface. In terms of physical attacks and sabotage operations, on the other hand, relay logic can be modified with only basic tools. Although the mess of wires in a relay hut or room is very complicated, the concepts are straightforward and can be worked out from the documentation that is often left in each location. North American style logic is a bit simpler to modify as it relies on high-reliability components, whereas European style logic uses lower quality components with additional validation logic to check the result.
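To make the "hard wired" point concrete, here is a minimal sketch of the kind of boolean vital logic a relay plant implements, written as C for readability. The function and signal names are my own illustrations, not taken from any real plant.

/* Minimal sketch, assuming a simple route: each input stands in for
   a relay contact wired in series, and the signal can only clear
   when every contact is closed. All names are illustrative. */
#include <stdbool.h>
#include <stdio.h>

bool signal_can_clear(bool route_locked,
                      bool track_circuit_clear,
                      bool switch_in_correspondence,
                      bool no_conflicting_route)
{
    /* Series contacts: any false input drops the signal to its most
       restrictive aspect, just as a de-energized relay would. */
    return route_locked && track_circuit_clear &&
           switch_in_correspondence && no_conflicting_route;
}

int main(void)
{
    printf("%d\n", signal_can_clear(true, true, true, true));   /* 1 */
    printf("%d\n", signal_can_clear(true, false, true, true));  /* 0 */
    return 0;
}

Physically rewiring a plant amounts to editing this expression with a screwdriver, which is why relay logic is hard to attack remotely but easy to attack on site.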
Solid state or microprocessor based interlocking became all the rage starting in the 1980's and continues to command an increasing share of the market. This type of technology unfortunately imports all of the problems associated with industrial control systems and the Internet of Things from a security point of view. The good news is that these components undergo rigorous safety and regulatory compliance testing; the downside is that this tends not to include security testing. Unfortunately I can't just say "this is good because" or "this is bad because" as there are simply multiple ways that any specific vendor may have implemented its technology. Still, there are some general conclusions that can be reached.
Microprocessors run on code, and code modification and/or code injection forms the basis for most types of malicious exploitation. Under North American practice, the code is stored on Read Only Memory type modules (likely EEPROMs) and is a regulated item in that no official changes can be made without going through a regulated test procedure. The $64,000 question is whether, in any given case, the processor accepts data or merely accepts state. Accepting state means that to request a route, the only thing the interlocking logic "sees" is a voltage on a line, in the same way a direct wire unit lever interlocking machine puts a voltage on a coil to lift a relay. Accepting only state generally precludes modifying the code. On the other hand, if the interlocking processor accepts bytes of data, it is almost certain that flaws exist within the code that would allow an attacker to take full control of the interlocking process, given sufficient knowledge and preparation. The fact that many of these product lines have been around since the 80's or 90's implies that they use older types of processor with little in the way of hardware based defenses against this type of attack.
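To illustrate the state-versus-data distinction, here is a hypothetical contrast in C. Neither function reflects any actual vendor's interface; both are sketches of the two input models described above.

/* Hypothetical contrast; not any vendor's actual interface. */
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

/* Accepting state: the logic only sees a level on a line, like a
   voltage on a relay coil. There is nothing to parse, so there is
   essentially nothing to exploit beyond forcing the line itself. */
bool route_requested(uint8_t input_lines, int bit)
{
    return (input_lines >> bit) & 1u;
}

/* Accepting data: the logic must parse attacker-influenced bytes.
   An unchecked copy like this is the classic kind of flaw that can
   hand over control of the processor. */
void handle_command(const uint8_t *msg, size_t len)
{
    char buf[16];
    memcpy(buf, msg, len);  /* BUG: no check of len against sizeof(buf) */
    /* ... interpret the command in buf ... */
}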
Larger issues appear outside of North American practice, where interlocking functions are more centralized and therefore have less obvious separation between the safety critical parts and the control system. Under North American practice, interlockings and signal locations in general have to be transferable to new owners. This means that each location needs to be not only atomic, but also forwards and backwards compatible with any control system. (That pretty much makes it impossible for the signaling hardware to require data.) Under European practice, centralized interlocking/signaling systems lack these guardrails against plugging the human interface directly into the safety critical processing elements. I believe that 2 of 3 voting systems are used to achieve the desired fail-safety performance, but since Europe often considers the human interface a safety critical system, I would not be surprised if the signaling processors themselves are handling state requests and changes directly. This creates a massive vulnerability for exploitation.
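For readers unfamiliar with 2 of 3 voting, here is a minimal sketch of the idea in C; real vital computers are vastly more involved.

/* Minimal 2-out-of-3 voter: three redundant channels compute the
   same output word and the bitwise majority wins, so one faulty
   (or compromised) channel is outvoted. Illustrative only. */
#include <stdint.h>

uint8_t vote_2oo3(uint8_t a, uint8_t b, uint8_t c)
{
    /* Each output bit is set iff at least two channel bits are set. */
    return (a & b) | (a & c) | (b & c);
}

Note that voting defends against random hardware failure, not against an attacker who can feed all three channels the same malicious request, which is exactly the scenario this paragraph worries about.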
This brings us to the control systems. Here we need to look at both what is being sent and how it's being sent. If one is only sending state or state commands, as in the North American system, the signaling control network is pretty much irrelevant. Over the air or over the internet there is nothing an adversary can do except make the system unavailable (which is a problem, just not a safety problem). Even if state updates are suppressed, North American dispatchers worked successfully for years in that fashion given the limitations of early wide area CTC systems. European style area signaling schemes run into a different set of problems when signaling logic is centralized. In this case field equipment such as switches and signals act as dumb terminals and simply do whatever they are commanded to do. This is where the serious risk lies, as it is highly unlikely that a 1980's or 90's grade computing system would have much in the way of "securing" its safety critical messages beyond a parity or other redundancy check. Anyone with access to the communication link would be able to issue arbitrary commands, including the setting of conflicting routes and the display of false clear signals. Granted, many of these area schemes are not wide area and use dedicated lines that can be considered equivalent to some of the longer direct wire control situations in North America, but in an era of IT efficiencies, how tempting would it be to replace a bespoke wayside cable link with a VPN running over the internet?
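To see why a parity or redundancy check is no barrier to an active attacker, consider this sketch: whoever can inject traffic can simply recompute the check over a forged command. The frame layout, opcodes and checksum here are hypothetical.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical frame: command bytes followed by one check byte. */
uint8_t xor_checksum(const uint8_t *msg, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum ^= msg[i];
    return sum;
}

void forge_frame(uint8_t *frame, size_t len)
{
    frame[0] = 0x01;  /* hypothetical "set route" opcode */
    frame[1] = 0x2A;  /* hypothetical route number */
    frame[len - 1] = xor_checksum(frame, len - 1);  /* valid check byte */
}

Redundancy checks detect noise, not intent; without a cryptographic integrity check the receiver cannot tell a forged frame from a legitimate one.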
In terms of direct sabotage, card based microprocessor systems are somewhat harder to modify than relay ones, as they require special reprogramming tools ranging from EEPROM programmers to link cables and the almost certainly proprietary software used by the C&S maintainers. It's entirely possible that the available tools themselves would not allow for the creation of unsafe situations, thus requiring further reverse engineering. Nevertheless, cyber-physical systems still have a physical component, and it is still possible to move output wires around to create the desired result.
I hope this provides a little insight into how train control and interlocking systems can be attacked by either remote or local actors. In the grand scheme of things, physical sabotage is generally considered to be beyond the scope of technical security, apart from the presence of robust locks and an alarm system. However, during some sort of armed conflict or occupation, the possibility of such attacks would increase.
"Solid state or microprocessor based interlocking... unfortunately imports all of the problems associated with industrial control systems and Internet of Things from a security point of view... the downside is that tends not to include security testing."
ReplyDelete"On the other hand if the interlocking processor accepts bytes of data, it is almost certain that flaws exist within the code that would allow an attacker to take full control of the interlocking process given sufficient knowledge and preparation."
These statements are so not true. Microprocessor-based interlockings do accept vital information in the form of data. But, first of all, they do it in a way that leaves virtually no difference between "data" and "states".
On one hand, interlocking programs are designed to be pure functions without any side effects. Operations that may induce security problems, such as memory allocation, are not allowed by design. Moreover, some designs even ban the use of branches, under which a normal function like this:
int function(int x)
{
    if (x == 1) {
        return 2;
    } else if (x == 2) {
        return 5;
    }
    return 0;  /* default needed for valid C; inputs are assumed to be 1 or 2 */
}
must be rewritten like this:
int function(int x)
{
    return 3 * x - 1;
}
On the other hand, interlocking processors internally encode states like "1" or "0" as digital data, while communication between them also takes this form. This is mainly for safety, so that any random flip of bits will not cause a wrong-side failure. Contrary to your belief that "highly unlikely a 1980's or 90's grade computing system would have much in the way of 'securing' its safety critical messages beyond a parity or other redundancy check", it goes far beyond a simple parity check. GRS developed what is known as the NISAL algorithm in the 80's, which is still in use in Alstom's newest generation of signalling products. Here's a reference: https://ieeexplore.ieee.org/document/175397
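As a generic illustration of what "beyond a simple parity check" can mean, here is a textbook CRC-8 in C; an 8-bit CRC detects all single-bit errors and all burst errors up to eight bits long. This is not the NISAL algorithm, whose details are in the paper cited above.

#include <stdint.h>
#include <stddef.h>

/* Textbook CRC-8 (polynomial 0x07); far stronger against random
   bit flips than a single parity bit. Not the NISAL algorithm. */
uint8_t crc8(const uint8_t *msg, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= msg[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}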
Then, in the interlocking world, vital communications should either go through dedicated channels, or via open channels but with the data encrypted (an option that wasn't available until public use of cryptography was allowed). As you've mentioned in the post, a dedicated communication channel does not guarantee security, but at least it is no less secure than conventional signalling systems: one may trespass into the ROW and send false indications through signal fibers, just as one may connect a function generator to the running rails to create a wrong-side failure in conventional cab signalling.
Only non-vital data such as CTC commands are allowed to be transmitted via public channels without encryption. Due to the layered structure of microprocessor based interlockings, this will not cause a security issue. Under either American or European design, the CTC will only interface with the non-vital subsystem (known as "the upper machine" in Europe) of these interlocking machines. The vital subsystem ("the lower machine"), however, has its own proof of security: the input vital data or states are projected into vital output states via distinct controlling algorithms, often running - in Europe - on distinct processors. To succeed, the attacker must first breach the vital subsystem, then find a common mode failure of these different algorithms and exploit it, which is literally Mission Impossible, even with powerful tools like Stuxnet. Also, since the 90's, many of the controlling algorithms have been developed with so-called "formal methods", which automatically and mathematically ensure there is no vulnerability to exploit.
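To picture the "distinct controlling algorithms" idea, here is a minimal C sketch in which two deliberately different implementations of the same vital function are cross-checked, so that a single coding flaw cannot silently produce a wrong-side output. The structure and names are illustrative, not any vendor's actual design.

#include <stdbool.h>
#include <stdlib.h>

/* Channel A: table-driven implementation. */
static const bool clear_table[4] = { false, false, false, true };

bool channel_a(bool locked, bool track_clear)
{
    return clear_table[((int)locked << 1) | (int)track_clear];
}

/* Channel B: independently written boolean expression. */
bool channel_b(bool locked, bool track_clear)
{
    return locked && track_clear;
}

bool vital_output(bool locked, bool track_clear)
{
    bool a = channel_a(locked, track_clear);
    bool b = channel_b(locked, track_clear);
    if (a != b)
        abort();  /* disagreement: fail to the most restrictive state */
    return a;
}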
Last but not least, microprocessor based interlockings differ significantly from "industrial control systems and Internet of Things", even in the context of security. In Europe, interlocking machines must be SIL-4 certified, while few commercial off-the-shelf PLC controllers can get such certification.