Update: Instagram has confirmed, via a tweet sent in the early hours of this morning (CET), that service is now restored…
Facebook has yet to tweet confirmation of its return to stable uptime, but at the time of writing the service is at least accessible in Europe, so the company looks to be getting on top of whatever caused the downtime. We’ve reached out to the company for confirmation.
Original report follows below…
At least one security firm thinks the culprit of the Facebook outage could be a border gateway protocol routing leak.
Routing internet traffic around the world relies on the border gateway protocol (BGP), which manages how traffic is routed across the internet. BGP relies on trust between network operators not to send incorrect or malicious data. But mistakes happen, and malformed data can form a “route leak” that leads to confusion over where internet traffic should go, and can lead to massive outages.
In a BGP route leak, the routing announcements from an autonomous system, which guide information to its destination, are inaccurate and are rejected by either the receiver, the sender or an intermediary along the route that the packet is supposed to travel.
That may be what happened to Facebook.
“At approximately 12:52PM EST on March 13th, 2019, it appears that an accidental BGP routing leak from a European ISP to a major transit ISP, which was then propagated onwards to some peers and/or downstreams of the transit ISP in question, resulted in perceptible disruption of access to some well-known Internet properties for a short interval,” explained Roland Dobbins, a NETSCOUT principal engineer, in an email to TechCrunch.
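For readers who want to see how a leaked announcement can pull traffic off course, here is a minimal sketch in Python. It is a toy model, not a description of what happened on March 13th: the prefixes and AS numbers are reserved documentation values, the `best_route` helper is hypothetical, and the scenario (a leaked, more-specific prefix winning a longest-prefix-match lookup) is only one way a leak might misdirect traffic, assumed here for illustration.

```python
import ipaddress


def best_route(table, destination):
    """Return the (prefix, next_hop) pair with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # Routers prefer the most specific (longest) matching prefix.
    return max(matches, key=lambda match: match[0].prefixlen)


# Hypothetical routing table; 203.0.113.0/24 and AS64500/AS64511 are
# documentation-only values, not real networks.
table = {
    ipaddress.ip_network("203.0.113.0/24"): "AS64500 (legitimate origin)",
}

before = best_route(table, "203.0.113.10")

# Assumed leak: a more-specific announcement reaches this router through
# a network that should not be carrying it.
table[ipaddress.ip_network("203.0.113.0/25")] = "AS64511 (leaking ISP)"

after = best_route(table, "203.0.113.10")

print("before leak:", before[1])
print("after leak: ", after[1])
```

In this sketch nothing is wrong with the original route; the leaked entry simply looks more attractive, so traffic for the affected addresses follows it instead. Addresses outside the leaked range keep using the legitimate path.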
At least one professor isn’t convinced that a BGP leak explains the outage. “It is possible that a route leak could have caused/contributed to the outage event Facebook and its affiliated applications faced today. When routes are ‘leaked’ erroneously they can have a large impact to the negative on functions and availability of services,” wrote Tom Thomas, an adjunct faculty member at Tulane University.
“However, BGP is usually a static protocol, meaning that once it’s setup it rarely changes. More likely a cause of this nature would be due to a mistake in programmatic automation and various health checks that they perform to ensure optimal functionality for users. If I had to conjecture, I would suspect that the outage today was likely due to a flaw in the code that controls such functions on a high-level business wise. Consider that the impact was across several Facebook owned services therefore the likelihood of them trying to be efficient in their code and its centralization for many services is more likely the root cause,” Thomas wrote.
Facebook and its related family of apps have been down for most of Wednesday.
There’s not much more information to share at this point, but the web is freaking out (as is to be expected).
Facebook has confirmed the outage and we’ll update as we get more information.
The social media management tool Naytev also confirmed the outage. “Facebook is experiencing a large outage, impacting posting to Facebook and the ability to log into Naytev. We are actively monitoring the issue and we hope Facebook resolves it soon,” the company said in a message to customers seen by TechCrunch.
As the outage is persisting throughout the day, Facebook has taken to Twitter to respond to some of the claims that are floating around. The company earlier batted down a rumor that a DDoS attack was behind the outage.