There are nine exchanges at which OpenNet fibres from customer premises are terminated. One of these, the one at Bukit Panjang, was badly damaged by a fire last week (9 Oct). The fire resulted in a variety of service disruptions to business and residential users across the island. SingTel and M1 reportedly restored service to all their affected customers on Monday (14 Oct) and Wednesday (16 Oct) respectively.
StarHub announced that as of 4pm on 14 Oct, they still had some 500 customers affected by the disruption.
Just one fire. Imagine if there were coordinated acts of terror targeted against multiple exchanges. Some exchanges can reasonably be assumed to be more important than others, and some would have more economic impact than others. Bukit Panjang can be said to be a smallish exchange, serving just suburban neighbourhoods. But see what services were impacted:
- 100 mobile base stations
- 60K fixed broadband lines
- 30K mioTV connections
- 30K voice lines
- 18 DBS bank branches
- 2 UOB bank branches
- Many DBS ATMs and AXS machines
- 11 OCBC ATMs
- Singapore Pools and Singapore Turf Club
- NETS payments at retailers
There are many more StarHub, M1, and other OpenNet connections that are likely not even counted in the above list.
SingTel’s first announcement stated that just 33 fibre cables were damaged. It sounded too good to be true. Either the fire was very small, or this exchange is really very small. It eventually emerged that the trouble was much bigger than just those 33 cables. OpenNet reported that 81 of their cables were damaged, affecting 23K fibre strands that served 46K connections.
I won’t be surprised if there are many more disruptions that have gone unreported.
My point is not to complain about the lack of clarity and transparency on the true impact of the fire. Instead, I want to point out our increasing dependency on data communication networks, and the apparent lack of redundancy businesses have put in their service provisioning.
Ordinary end-users (e.g. residential broadband customers) are already complaining about the lack of network redundancy. As someone familiar with computer networks, I can appreciate that providing redundancy at the network edges isn’t easy. It’s not impossible, but it’s usually more costly than it is worth for non-critical use.
However, the case is different for business users. I expect banking services to be unaffected by such outages. There are only 30 DBS bank branches, but 18 of them were taken out by the outage. That’s really bad, isn’t it? (There are 84 POSB and DBS branches together, but 18 out of 84 is still a rather large fraction.)
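To put those branch numbers in perspective, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above; the function and variable names are my own, purely for illustration.

```python
def fraction_affected(affected: int, total: int) -> float:
    """Return the share of branches that lost service."""
    return affected / total

# Figures quoted in the post: 18 branches affected,
# out of 30 DBS branches, or 84 POSB and DBS branches combined.
dbs_only = fraction_affected(18, 30)
dbs_and_posb = fraction_affected(18, 84)

print(f"DBS only:     {dbs_only:.0%}")
print(f"DBS and POSB: {dbs_and_posb:.1%}")
```

Either way you count it, that works out to somewhere between roughly a fifth and three fifths of the branch network knocked out by a fire at one suburban exchange.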
Realistically, what are private businesses expected to do? Today, the Singapore government is pushing OpenNet as the platform for all data communication connections. Sure, we have competition at the retail service provider level (i.e. the ISP), but the government is converging us onto a single physical communication infrastructure. Even if you buy redundant network connections, they are going to take the same physical route and run into the same exchange. One fire, like this one at Bukit Panjang, will take out whatever redundancy you have planned.
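The problem of redundant links sharing one physical path can be sketched in a few lines of code. This is a toy model only: the connection names and the elements each one depends on are hypothetical, made up for illustration, and not real provisioning data from any ISP or from OpenNet.

```python
from typing import Dict, List, Set

# Hypothetical example: two "redundant" connections bought from different
# ISPs, each listed with the physical elements it depends on.
connections: Dict[str, List[str]] = {
    "isp_a_link": ["shared_fibre_route", "bukit_panjang_exchange", "isp_a_core"],
    "isp_b_link": ["shared_fibre_route", "bukit_panjang_exchange", "isp_b_core"],
}

def shared_failure_points(paths: Dict[str, List[str]]) -> Set[str]:
    """Return the elements that every connection depends on."""
    element_sets = [set(elements) for elements in paths.values()]
    if not element_sets:
        return set()
    return set.intersection(*element_sets)

def surviving_connections(paths: Dict[str, List[str]], failed: str) -> List[str]:
    """Return the connections that do not pass through the failed element."""
    return [name for name, elements in paths.items() if failed not in elements]

# Anything in this set takes out both links at once.
print(shared_failure_points(connections))

# A failure in one ISP's core leaves the other link up...
print(surviving_connections(connections, "isp_a_core"))

# ...but a failure at the shared exchange leaves nothing.
print(surviving_connections(connections, "bukit_panjang_exchange"))
```

In this sketch, the redundancy only protects against failures in the parts that differ between the two links. Anything in the shared set is a single point of failure no matter how many ISPs you sign up with.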
It’s like how our electricity grid is. We have competition between power generating plants (though it only matters to large consumers), but there is only one grid. So what if you are a big consumer with the money to buy electricity from two different power plants? There’s still just one grid delivering power to your premises. If the grid is out, you’re still out of power.
Our MRT could very much be said to be planned the same way too. If you have a problem with the North-South line, there’s not much of a realistic alternative for commuters to take. It’s not that we need to build a standby North-South line. It’s the lack of contingency or continuity planning. Suppose there is, touch wood, a catastrophic event on the North-South line that renders it inoperable for a month. What will Singapore do?
We seem to have been planning our infrastructure, our critical ones I must add, in such a manner that they are extremely vulnerable to single points of failure. It’s not just one spot that is a single point of failure, but a whole big area where any part of it is a single point of failure.
A single common shared infrastructure like what we’ve been building makes good economic sense in the way it saves money, minimises effort, and maximises returns. But from a service resiliency point of view, we are exposing ourselves to too much risk.
Can we not let infrastructure providers/operators compete with each other right to the doorstep of, at the very least, business users? For example, could we have two or more NetCos (what OpenNet is in the NGNBN framework) laying fibre infrastructure across our island?
On a separate matter, businesses themselves don’t seem to do much to plan for their own service resiliency. Case in point: The fire at M1’s data centre earlier this year took out a pretty big part of their mobile service across Singapore. Their mobile base stations apparently just had one single uplink to that one data centre. When service at that data centre was disrupted, there was no automatic failover to a backup data centre. M1 was still apparently very proud of themselves to have restored service within 3 days.
If that’s how our businesses think, then the kind of service levels we are getting is no wonder. SingTel’s five days for restoration this time around, although most would say it took too long, isn’t nearly half as bad as M1’s incident.
Our data communication networks are like the nation’s heartbeat. It’s ironical that I recently posted about CyberShock, an event which aimed to educate us about IT security risks and how cyber attacks can impact us. We don’t need to worry about cyber attacks yet. Accidents are enough to take us out. The sad thing is, these accidents are quite easily preventable, to the extent that I’d prefer to label this outage incident as one of negligence.