Optus Internet traffic (as viewed from Cloudflare) is getting back to normal levels after today’s widespread Internet and mobile outage.
Still no real information from Optus as to the root cause. However, the Minister for Communications has said that it was not due to a cyber attack or hack, and was a “deep network” fault. “Deep network” here means the core network.
My feeling is that it was due to some maintenance gone wrong. The outage started at 4am AEDT, and 3-4am is a common time for network maintenance as it is the early hours of the morning in all states and about the lowest point for traffic levels. The maintenance may have been something like a firmware update or configuration change on core devices.
I’ve seen suggestions that the issue was due to a fault on their route reflectors. Route reflectors are basically special-purpose routers responsible for receiving and re-advertising BGP routes to all other routers in a network, which avoids the need to set up a full BGP mesh. It’s hard to speculate whether this was the cause of today’s issue without knowing the topology of Optus’ network, but it is conceivable.
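To give a rough sense of why networks use route reflectors at all, here is a small sketch comparing how many iBGP sessions a full mesh needs against a route reflector design. The router counts are made up for illustration and say nothing about Optus' actual network, and the function names are my own. It assumes every client peers with every reflector and the reflectors are fully meshed with each other.

```python
def full_mesh_sessions(routers: int) -> int:
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int) -> int:
    """iBGP sessions when each client peers only with the route reflectors.

    Assumes each client peers with every reflector and the reflectors
    form a full mesh among themselves (an illustrative design only).
    """
    client_sessions = clients * reflectors
    reflector_mesh = full_mesh_sessions(reflectors)
    return client_sessions + reflector_mesh


if __name__ == "__main__":
    # Hypothetical network: 100 routers in total, 2 of them acting as reflectors.
    print("full mesh:", full_mesh_sessions(100))
    print("with 2 route reflectors:", route_reflector_sessions(98, 2))
```

The flip side is that far fewer sessions means far more depends on those few reflectors, which is why a fault there could plausibly have a network-wide impact.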
The biggest question I have, though, is why it took about 9 hours from identifying that there was an issue to finding the cause and starting to bring services back online.
#Optus #OptusOutage #OptusFail 