Cloudflare's DNS service hit a critical issue during a new internal integration and went down for 17 minutes, leaving the 1.1.1.1 DNS resolver globally inaccessible.
Cloudflare recently announced the 1.1.1.1 DNS resolver, which is one of the Internet’s fastest DNS services and is built with a strong focus on preventing sophisticated DDoS attacks.
Cloudflare also uses Gatebot, a DDoS mitigation pipeline that performs hundreds of mitigations a day. Gatebot mainly protects Cloudflare's infrastructure and its customers from L3/L4 and L7 attacks.
Users who had pointed their router-level DNS resolution at 1.1.1.1 on 31 May at 17:58 UTC would have experienced a disruption of 17 minutes.
Cloudflare deploys mitigations for large DDoS attacks to reduce the CPU consumed by malicious traffic, and it has also implemented multiple layers of defense.
What Actually Went Wrong with Cloudflare’s Gatebot
Cloudflare tried to deploy new code that introduced Gatebot to the Provision API, one of Cloudflare’s internal integration points, which helps determine whether a given IP address belongs to Cloudflare.
But the Provision API did not know that 1.1.1.0/24 and 1.0.0.0/24 are special IP ranges, and the manual exception for them was not implemented during the integration work.
Previously, Cloudflare's mitigations were applied manually by its tireless System Reliability Engineers. Gatebot was introduced later to assist them and reduce the manual work.
In this case, Cloudflare forgot to carry the manual exception for its resolver ranges over into the new code while doing the integration work.
As a result, Cloudflare’s Gatebot suddenly started interpreting traffic to 1.1.1.1 as a DDoS attack on its infrastructure.
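The failure mode described above can be illustrated with a minimal sketch in Python. This is not Cloudflare's code: the function name `should_mitigate`, the `RESOLVER_RANGES` list, and the `looks_like_attack` flag are all hypothetical names chosen for illustration. The sketch assumes only that an automated system decides whether to mitigate traffic to a destination IP and that certain ranges must be exempted by hand.

```python
import ipaddress

# Hypothetical exemption list: the ranges that contain the public resolver
# addresses 1.1.1.1 and 1.0.0.1. Illustrative only, not Cloudflare's config.
RESOLVER_RANGES = [
    ipaddress.ip_network("1.1.1.0/24"),
    ipaddress.ip_network("1.0.0.0/24"),
]


def should_mitigate(dst_ip, looks_like_attack, exempt_ranges):
    """Decide whether automated mitigation applies to traffic for dst_ip.

    looks_like_attack stands in for whatever detection signal the real
    system uses; here it is just a boolean supplied by the caller.
    """
    addr = ipaddress.ip_address(dst_ip)
    # Exempt ranges are never mitigated automatically, no matter how
    # heavy or unusual the traffic looks.
    if any(addr in net for net in exempt_ranges):
        return False
    return looks_like_attack


# With the exception in place, heavy resolver traffic is left alone.
print(should_mitigate("1.1.1.1", True, RESOLVER_RANGES))  # False
# With the exception missing (empty list), the same traffic is mitigated.
print(should_mitigate("1.1.1.1", True, []))               # True
```

The second call mirrors the incident in spirit: once the exemption is lost, a legitimate but very busy service becomes indistinguishable from an attack target.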
According to Cloudflare, “The automatic systems deployed DNS mitigations for our DNS resolver IP ranges for 17 minutes, between 17:58 and 18:13 May 31st UTC. This caused 1.1.1.1 DNS resolver to be globally inaccessible.”
In this case, Cloudflare was completely transparent and said, “We want to apologize to all of our customers. We will use today’s incident to improve. The next time we mitigate 1.1.1.1 traffic, we will make sure there is a legitimate attack hitting us.”