In August 2016, users of Singapore’s normally reliable Circle Line increasingly found their daily journeys disrupted. Seemingly without warning, trains would suddenly apply their emergency brakes and come rapidly to a stop.
For the next few months, the problem persisted. Passengers became increasingly annoyed, but so too did the people at SMRT, the line’s operator. It wasn’t that they didn’t know what the problem was – that much had been diagnosed relatively quickly. Periodically, the trains would suffer signal loss, which would rightly trigger the emergency brake. The problem was that they couldn’t find anything wrong with the trains affected or the signal transmitters. This seemed to happen entirely at random.
Eventually, as much in frustration as with any hope of resolution, SMRT and the LTA, Singapore’s transport authority, began to look outside their immediate comfort zone for help. This led them to Singapore’s GovTech Data Science Division, and more specifically to data scientists Daniel Sim, Lee Shangqian and Clarence Ng.
“I take a train on the Circle Line to my office at one-north every morning,” Daniel Sim would later write. “So on 5 November, when my team was given the chance to investigate the cause, I volunteered without hesitation.”
Given the nature of their work, it is perhaps not surprising that Sim and the others in his team refused to accept that the problem was actually random. Yet the data available to disprove this was limited. SMRT could only provide the date and time of each incident, its location, the ID number of the train involved and the direction it was travelling in.
Nonetheless, Sim and his team got to work, compiling the data into a useful format and beginning its analysis. Their early efforts revealed little of promise. The incidents were spread throughout the day and happened at various locations. Nor did they seem to affect only a small group of trains.
None of this was news to SMRT. Indeed, it largely reflected what they’d concluded themselves. As dedicated data scientists, however, Sim and his colleagues weren’t finished. They kept working with the data, looking for a better way to visualise it. They decided to plot the relationship between time, trains and location in a more sophisticated way, so, inspired by Edward Tufte, they put together a Marey chart.
Original chart, via Data.gov.sg
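To give a rough sense of what such a plot involves, here is a minimal, purely hypothetical sketch in Python, not the team’s actual code: an incident log scattered against time and station. The file name, column names and station ordering below are all assumptions.

```python
# A minimal sketch of a Marey-style incident plot: time on the x-axis, station
# on the y-axis, one marker per emergency-brake incident.
# The file name and the column names (time, station, direction) are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

incidents = pd.read_csv("incidents.csv", parse_dates=["time"])

# Assign each station an index (here simply in order of first appearance;
# a real chart would use the physical order of stations along the line).
station_order = {name: i for i, name in enumerate(incidents["station"].unique())}
incidents["station_pos"] = incidents["station"].map(station_order)

fig, ax = plt.subplots(figsize=(12, 6))
for direction, group in incidents.groupby("direction"):
    ax.scatter(group["time"], group["station_pos"], s=20, label=direction)

ax.set_yticks(list(station_order.values()))
ax.set_yticklabels(list(station_order.keys()))
ax.set_xlabel("Time of incident")
ax.set_ylabel("Station")
ax.legend(title="Direction of travel")
plt.tight_layout()
plt.show()
```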
And suddenly, there on the screen, they began to see a pattern. The breakdowns always happened in sequence. Whenever a train got hit by signal interference, another train behind it, moving in the same direction, would be affected soon after.
Original chart, via Data.gov.sg
“What we’d established was that there seemed to be a pattern over time and location,” Sim wrote. “Incidents were happening one after another, in the opposite direction of the previous incident. It seemed almost like there was a ‘trail of destruction’.”
This pattern got Sim and his team thinking: maybe the trains that were experiencing the problem weren’t causing it. Maybe it was something that was being done to them. For this to be true, however, that cause would also have to be moving; otherwise, the signal loss would happen in a more consistent location. This led to an obvious question – what was happening on the other track at the time each incident occurred?
That question, in turn, led the team to a hypothesis – the line had a mysterious ‘rogue train’. One that was fine in and of itself, but whose proximity to other trains would somehow cause them problems.
It was a solid hypothesis and one that should have been easy enough to test. The problem was, the dataset that SMRT had provided Sim’s team was incomplete. SMRT didn’t have data available, at least in a format the team could work with, on where every train on the network had been at the time of each incident.
Unwilling to admit defeat, however, the team got permission to head down to Kim Chuan Train Depot that night and get creative. It wasn’t that SMRT didn’t have that data, they suspected, it was just that they weren’t thinking about data in the right way.
The rogue one
By 03:00 the next morning Sim and his team were tired, but feeling confident. At the depot, they had found the extra data source they needed – video records of every departure from every station. It had been a painstaking, manual process, but by cross-checking incident times against video recordings of station platform departures they had identified a likely candidate for their rogue train hypothesis – unit PV46.
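In essence, that cross-check is a join-and-count exercise. The team did it by hand from the depot’s video records, but as a purely hypothetical sketch of how it could be automated (the file names, column names and time window below are all assumptions), you would list, for each incident, the trains passing on the opposite track at around the same time, and look for one that keeps reappearing:

```python
# Hypothetical sketch of the cross-check the team performed manually:
# for each incident, find trains that departed the same station on the opposite
# track within a short window, then count which train keeps appearing.
# File names, column names and the time window are all assumptions.
from collections import Counter
import pandas as pd

incidents = pd.read_csv("incidents.csv", parse_dates=["time"])
departures = pd.read_csv("departures.csv", parse_dates=["time"])  # from video logs

WINDOW = pd.Timedelta(minutes=3)
candidates = Counter()

for _, incident in incidents.iterrows():
    nearby = departures[
        (departures["station"] == incident["station"])
        & (departures["direction"] != incident["direction"])
        & departures["time"].between(incident["time"] - WINDOW, incident["time"] + WINDOW)
    ]
    candidates.update(nearby["train_id"])

# A train present at far more incidents than any other is a 'rogue train' suspect.
print(candidates.most_common(5))
```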
Later that day, SMRT put the team’s theory to the test – PV46 went into service during off-peak hours and, as the team had predicted, a number of failures soon occurred. On Monday, this happened again. SMRT removed the train from service and, after careful investigation, the issue finally became clear. For reasons unknown, PV46 had developed a unique fault – its signalling hardware was emitting additional, erroneous signals. This didn’t affect its own operational pattern, but it meant that, when PV46, another unit and the trackside signalling system were in a specific alignment, the other train’s own signal would be interrupted.
By analysing the data, and by finding more of it to fill the gaps in their knowledge, Sim and his team had solved the Circle Line mystery.
Bringing ‘Big Data’ to rail
“If you look at why ‘Big Data’ and the ‘Industrial Internet’ have happened now and not, say, 15 or even 10 years ago, and ask ‘what has caused this digital revolution now?’, it’s a kind of perfect storm.”
John Raymond is a Director of Digital Services at Thales. That title brings with it both the responsibility for their cloud-based Asset Management solution for railways and metro systems around the world and an understanding that situations (and solutions) like those seen in Singapore are increasingly common.
“The cost of transmitting data has significantly reduced,” he continues. “The cost of storing data in the cloud is negligible, and the personal revolution of the internet now gives us the ability to access that data anywhere, anytime, on any platform.”
‘Big Data’ and ‘Industrial Internet’ may seem like buzzwords, but ultimately they’re just technical shorthand for the world that John describes – one in which it is possible to track, store and access data about infrastructure and networks on a level that has not previously been seen.
None of those activities in themselves would be special. In combination, however, they enable exactly the kind of activities that Sim’s team undertook in Singapore. They allow both human and, increasingly, automatic assessment of the state of a piece of infrastructure. That assessment can then be used to spot emerging problems and trigger interventions before they occur, or – as in Singapore – help make seemingly random occurrences significantly less so.
As a concept, of course, this is nothing new. The ‘predict and prevent’ maintenance methodology has existed since the moment the first linesman walked the track, tapping the rail. What’s changed, however, is the cost of collecting and storing the information. Where once actively monitoring the state of an asset required frequent, physical inspection and acres of physical storage space, now the same information can be gathered, transmitted and stored remotely.
This doesn’t just make gathering (and storing) the same information as before cheaper. It also opens up a whole new range of data sources on which decisions can be based.
“As an example, let’s take a level crossing,” Raymond says. “Twenty years ago, how interested was someone in knowing how many times that had gone up and down? Even if they were, the cost of storing that data was prohibitive. No one would have wanted to keep that data for five years at that point, because it was pointless – how were you going to use it? How were you going to transmit it? It just wasn’t worthwhile.”
Today, however, that information can be added to the pool. It becomes another data point available for establishing the health of the infrastructure and – beyond that – contributing to strategic decisions about the railways.
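To make the idea concrete, a toy sketch: if each barrier actuation is logged, a maintenance flag can be raised once the cumulative cycle count approaches the mechanism’s rated life. The log format, column names and the rated-life figure here are invented for illustration, not drawn from any real system.

```python
# Toy illustration of using a once-'pointless' data point: count barrier cycles
# per level crossing from an event log and flag any crossing nearing an assumed
# rated life for the mechanism. The log format and the 500,000-cycle figure are
# invented for illustration.
from collections import Counter
import csv

RATED_CYCLES = 500_000          # assumed duty rating for the barrier mechanism
WARN_AT = 0.9 * RATED_CYCLES    # flag when 90% of rated life is consumed

cycles = Counter()
with open("crossing_events.csv", newline="") as f:
    for row in csv.DictReader(f):           # columns assumed: crossing_id, event
        if row["event"] == "barrier_down":  # one full cycle per lowering
            cycles[row["crossing_id"]] += 1

for crossing, count in cycles.items():
    if count >= WARN_AT:
        print(f"{crossing}: {count} cycles, schedule inspection before failure")
```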
Just how quickly the railways can adapt to this change, however, remains to be seen.
“Big Data is mostly being used really for maintenance rather than operational decisions,” Raymond admits. “You’re using data to predict failure.”
“It can do more than that, but to make strategic decisions you need more data.”
Gathering that extra data isn’t always simple. The railways bring with them a unique set of challenges, not least due to the slow pace at which various layers of the underlying infrastructure change. Baking the ability to collect and transmit data into new systems is easy. Retro-fitting it is significantly harder – particularly in areas with a long railway history, like London.
“The whole point of digitisation is you need assets that can produce data,” Raymond explains. “If you’ve got Victorian infrastructure, I can’t help you. Even systems that are twenty years old… the skills and technology move so fast that there often aren’t enough people around anymore who understand how the software on them was built and really works.”
This isn’t to say that gathering data from those systems isn’t possible, just that the options are limited.
“If you’ve got a railway that is particularly old and unreliable, you could say ‘where is it unreliable?’ But it does limit what you can do.”
The devil in the analysis
Gathering the data is, of course, only half of the battle. To draw effective conclusions, it has to be both the right data and properly analysed.
“A lot of organisations go into this and think Big Data is the end result,” Raymond explains. “In a way, I suppose it is. Provided you do the analysis, and use that data in the right way.”
That analysis isn’t always easy. As Sim admits in his own, comprehensive write-up of the events in Singapore, the team there got lucky. The distribution of stations and the train frequencies on the Circle Line meant that the failures happened close together. This resulted in a clear pattern once the data was plotted correctly. That pattern would still have been there if the network had been shaped differently, but it would have been far harder for a data scientist unfamiliar with transport infrastructure to spot.
“Cross-industry, at the moment this is an area being dominated by traditional IT companies,” says Raymond. “They struggle to understand how a railway works, though. They don’t understand the challenges that the industry has. They don’t, for example, build point machines, so they don’t understand the different ways in which point machines fail.”
That understanding is critical, he says. So is accepting that the act of analysing the data, at least to begin with, is just as likely to highlight gaps in an operator’s knowledge that need to be filled as it is to yield quick solutions to perceived problems.
“Those companies that start off small, incrementally proving that Big Data has a ‘Small Data’ application, they generally get there,” he comments. “It may take longer, but each expansion – from operational to strategic – leaves them understanding how to use data, and what they should be collecting, a little bit better.”
That incremental approach is something that traditionally hasn’t sat well in rail. This is one area in which Raymond argues the industry needs to accept that planning for, collecting and analysing infrastructure data is a bit different.
“The cost of repeating the architecture if you end up having two Big Data systems isn’t enormous,” he says. “Prove to the business that you can deliver direct, digital change and the architecture will follow. That’s more important.”
“Predictive analytics and assisted decision-making are difficult to adapt to culturally, though,” he admits. “And a lot of this is about cultural change and changes to the business model. Air traffic management faced a similar issue in previous decades. Rail has lagged behind that.”
Changing road infrastructure management
Bringing that same Big Data approach to roads is the next challenge. As with rail, roads present their own unique challenges, not least because, in some ways, they are actually significantly ahead.
“With everything that’s going on in the road market with regard to driverless cars and Uber, it’s more advanced and taking a larger risk in some areas,” says Raymond.
At the same time, however, he acknowledges that the role Big Data can play in addressing the wider strategic and operational issues the road industry faces is still open to debate.
Partly this is because those issues are broad, but it’s also because the opportunities for intervention are equally broad. This makes finding the right starting points (and establishing both their cost and impact) more complex.
An obvious opportunity for Big Data exists in smoothing traffic management and reducing congestion, for example, and in various areas, this is already underway. Not only does this have a positive impact on journey times, but it helps tackle the problem of emissions-based pollution as well. Something that’s an increasingly critical issue for cities like London.
Yet this isn’t the only potential low-hanging fruit in this area. One of the biggest causes of delay in the capital is badly timed (or badly placed) roadworks. Taking the ‘predict and prevent’ methodology more effectively to the roads, allowing critical works to be scheduled for less busy times ahead of actual infrastructure failure, would have a similar effect.
Which of those two things would have more of an impact both for road users and the bodies tasked with managing local and national infrastructure? Which is more likely to trigger that cultural transition to an organisation which ‘gets’ the need to digitally iterate? What else can be done?
This is a debate which we will return to in future.
Cover photo by mailer_diablo (Self-taken (Unmodified)) [GFDL or CC BY-SA 3.0], via Wikimedia Commons