Experiences with DDOS mitigation - Part I
02:52AM Sunday Apr 20 2008 by justin
Over the last few weeks we have seen a number of distributed denial of service (ddos) attacks aimed at dslreports.com. Each attack lasted anywhere from an hour to 48 hours. Over two weeks the attacker experimented with larger attacks and different methods.

nac.net kindly provided me with a graph that shows one attack from their perspective.

At 13:00 the excess traffic started, over 600mbit more than normal. It continued and reached a peak of 1.1 gigabit over normal at 7am the next morning, when it quickly ceased.

From our perspective, the attack looked like this. The nac.net graph above corresponds to the high spike on the first Thursday in this graph, reading from the right.

The traffic rose at 13:00 on Thursday until it quickly overwhelmed our port. When nac.net added two filters to block the traffic that was most easily identified as bad, we regained some site function (the blue line is our traffic outbound).

The next attack (the lump on the left of the graph above) started, perhaps coincidentally, at almost the same time and day of the following week. It lasted through to Saturday afternoon, although we'd figured out how to block it well before it finished. The original filters were re-applied, and what was left was the TCP traffic (web server requests on port 80) alone. Those were the ones we had to deal with.

The ddos originated from a botnet (wikipedia definition). The first botnet was fairly small and located largely in Russia. The second botnet was much larger and was mainly out of several Southeast Asian countries, as well as Russia and Poland. In the end we heard garbage traffic from well over 65,000 IPs that would otherwise never have reason to visit our little corner of the world.

What is a botnet?

It is estimated that there are hundreds and maybe thousands of (overlapping) groups of compromised PCs ready to do the bidding of a C&C (command and control) machine, or machines. Most if not all participants in the botnet are home ISP subscribers who have no clue that their PC is moonlighting for another employer (although they may wonder why their connection is slow, or their pc appears busy).

A botnet can be small or huge: 100,000+ machines with no theoretical upper limit. The more botnet members connected to the net via broadband rather than dial-up, the more effective the net. The faster and more recent the PCs, the more effective the net. The more widely geographically distributed, the more effective the net.

A botnet will employ a number of techniques aimed at resource exhaustion of a web site. If the botnet can request more resources than the target can deliver, then (and actually before that point) legitimate users are impacted via slow-downs and errors, and in the end the site just appears "down" or so slow as to be unusable.

Resources can be exhausted in several ways:

immediate compute resources of the target can be exhausted by heavy traffic in fragmented packets. These packets require CPU for re-assembly and so in theory with enough of this traffic the CPU(s) will become too busy to attend to everything else and incoming traffic will be dropped. A badly designed firewall may also operate inefficiently under load and exhaust compute resources well before the incoming bandwidth reaches maximum levels.

memory resources can be exhausted by filling up various kernel tables that are not tuned to be sufficiently large.

page production capacity can be exhausted simply by requesting so many pages that most pages fail to be produced or fetched in a reasonable enough time.

bandwidth capacity can be exhausted simply by sheer quantity of arriving packets. Nearly every site has a fixed maximum bandwidth budget (even as simple as the number of ports they have live, times their maximum port speed). When bandwidth capacity is used, there is little else the site can do. The site also faces the possibility of being charged onerously for the period of excessive incoming bandwidth - although one hopes that a web host only does this if the ddos attacks become a common pattern for the unfortunate client.

Defenses

In the linux/apache world there are a number of different defenses floating around that claim to mitigate DDOS attacks. Do they work? Well, I had plenty of time to experiment with all of them.

1. Web server anti-ddos modules (protect only against TCP attacks)

mod_evasive, mod_limitipconn, etc, are only effective against a single or handful of attacking IPs. Their limitations derive from their attempt to respond in SOME way (an error response, rather than ignoring the IPs), and lack of information these modules have on the current activity of an IP or its past activity. If you are expecting these modules to provide attack insurance then re-evaluate because even a small botnet of say 100 machines can easily succeed. One particular form of attack: "open and hold", is particularly difficult to counter by program-based ddos prevention. Admins struggle to identify open and hold attackers (see topics like this) due to the limitation apache has in reporting the IP of a new and as yet uncompleted request.
In the last week another apache module has come to my attention: mod_qos. It appears to be much more ambitious. In particular I like that it tries to give priority to IP addresses it has already seen (VIPs). Unfortunately mod_qos appears to be very "alpha" with almost nobody using it in production yet. If it is debugged it will be the best apache module to use.

2. A better web server

Apache is honestly not very good at large numbers of simultaneous connections. Although it CAN be driven to support 27,000 connections, it is a very uncomfortable situation to be handling. Lighttpd is better, but is single process / single thread. At this point I suspect that the Russian nginx webserver is the best: multiple processes, event driven, and frugal with resources even with over 10,000 long lived connections. From the documentation I suspect it can easily break the 65k barrier.
By the way: you may wonder what the point is of handling so many concurrent connections, or such a high request rate. The reason is that it is much easier to identify "bad IPs" if they are allowed, at least for a short while, to go all the way through to requesting a URL, which means getting handled by a resilient web server, and getting logged. After getting logged the logs are analyzed (hopefully in near real-time) and it is usually easier to identify which IPs need to be blocked further upstream. Which brings us to...

3. Netfilter (firewall) DROP rules

The netfilter firewall is extremely flexible - but is not fast enough at evaluating each individual packet (or even just each incoming SYN packet) against a long (1000+) list of DROP rules. It is necessary to use some netfilter extensions for the more difficult task of recognizing and dropping SYN packets against a large list of bad hosts.
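
To make the cost difference concrete, here is a minimal Python sketch (not netfilter code, just an illustration of the idea). It compares walking an ordered list of per-IP rules against a single hash lookup. The block list, the function names and the addresses are all made up for the example.

```python
import ipaddress


def linear_match(src_ip, rules):
    """Walk the rule list in order, the way a chain of per-IP DROP rules is evaluated.

    Returns (matched, number_of_rules_checked).
    """
    checked = 0
    for rule_ip in rules:
        checked += 1
        if rule_ip == src_ip:
            return True, checked
    return False, checked


def set_match(src_ip, blocked):
    """A single hash lookup, however many addresses are in the set."""
    return src_ip in blocked


# Hypothetical block list: 5,000 consecutive addresses from a private range.
rules = [str(ipaddress.IPv4Address("10.0.0.0") + i) for i in range(5000)]
blocked = set(rules)

# A legitimate visitor is the worst case for the list: every rule is checked
# before the packet is finally let through.
matched, comparisons = linear_match("192.0.2.7", rules)
print(matched, comparisons)
print(set_match("192.0.2.7", blocked))
```

The point of the sketch is that with a plain rule list the good traffic pays for every bad IP you have added, while a hashed set costs about the same whether it holds ten entries or tens of thousands.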

4. Netfilter (firewall) extensions

There are only two extensions worth anything for ddos attacks: ipt_hashlimit (comes in netfilter and the 2.6 kernel) and ipset. Hashlimit helps in identifying IPs that have risen to the level of fast consumers, and ipset handles block-lists of up to 65535 IP addresses that can be queried, loaded and unloaded from user-space. In addition, a carefully designed netfilter firewall setup will drop known bad packets as a priority, before they can be connection tracked, or traverse all the other rules. Typically netfilter firewall example scripts have a default of DROP (this is done at the end of the rule chain): your list of rules decides what is accepted. The implications are obvious: bad packets traverse the entire firewall stack before being dropped. In a ddos, bad packets are the majority of packets you receive, so you want to decide right at the start that an origin IP address is no good, and move on. If your firewall is slow in processing all the incoming packets you will see drops recorded by the network card (see "6" below) under even a moderately sized attack.
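
The idea of spotting "fast consumers" can be sketched in a few lines of Python. This is a per-IP token bucket loosely modelled on what a rate-limit match does. It is not the hashlimit implementation. The class name, the rate and the burst values are assumptions for illustration.

```python
class PerIpRateLimiter:
    """Token bucket per source IP: each request spends one token,
    and tokens refill at `rate_per_sec` up to a ceiling of `burst`."""

    def __init__(self, rate_per_sec, burst):
        self.rate = float(rate_per_sec)
        self.burst = float(burst)
        self.buckets = {}  # ip -> (tokens_left, time_last_seen)

    def over_limit(self, ip, now):
        """Record one request from `ip` at time `now` (seconds).

        Returns True if the IP has exhausted its allowance.
        """
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed since this IP was last seen.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return False
        self.buckets[ip] = (tokens, now)
        return True


# Hypothetical policy: 2 requests per second with a burst of 5.
limiter = PerIpRateLimiter(rate_per_sec=2, burst=5)
flags = [limiter.over_limit("198.51.100.9", now=0.0) for _ in range(8)]
print(flags)
```

Time is passed in explicitly so the behaviour is easy to check. An IP that keeps coming back as over the limit is a candidate for the block list.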

5. kernel tuning

A number of kernel parameters should be increased or modified to help with extremely high request rates and tcp trouble. Generally where there is a maximum, it is increased. Where there is a timeout, it is decreased. Where there is a retry number, it is decreased! One frequent gotcha is the connlimit netfilter module. If you must use connlimit (by virtue of other rules in your firewall), and you think it will be tracking packets from the botnet at least at the beginning before you have identified it, then make sure connlimit is loaded as a module with a high hashsize (eg 256000), and then tuned for a high maximum number of tracked connections! See the pleas for help when this table hits the limit.
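
The rule of thumb (maximums up, timeouts down, retries down) can be written as a small Python helper that renders sysctl.conf-style lines. The sysctl names below are real Linux parameters, but picking these particular ones is my assumption. The scaling factors and the sample starting values are placeholders too, not tuning advice.

```python
# Which direction each parameter should move. The selection is illustrative.
PARAMS = {
    "net.ipv4.tcp_max_syn_backlog": "max",
    "net.core.somaxconn": "max",
    "net.core.netdev_max_backlog": "max",
    "net.ipv4.tcp_fin_timeout": "timeout",
    "net.ipv4.tcp_synack_retries": "retries",
}


def suggest(current):
    """Apply the rule of thumb to a dict of {sysctl_name: current_value}."""
    out = {}
    for name, value in current.items():
        kind = PARAMS[name]
        if kind == "max":
            out[name] = value * 4            # placeholder factor: raise maximums
        elif kind == "timeout":
            out[name] = max(1, value // 2)   # placeholder factor: shorten timeouts
        else:
            out[name] = max(1, value - 2)    # fewer retries, but never below 1
    return out


def render(settings):
    """Format settings as sysctl.conf-style lines, sorted by name."""
    return "\n".join(f"{name} = {value}" for name, value in sorted(settings.items()))


# Sample starting values, assumed for the example.
current = {
    "net.ipv4.tcp_max_syn_backlog": 1024,
    "net.ipv4.tcp_fin_timeout": 60,
    "net.ipv4.tcp_synack_retries": 5,
}
print(render(suggest(current)))
```

Whatever values you end up with, test them under load before an attack rather than during one.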

6. network card

Your network card should be a quality one, with the most recent stable driver available. Unless your card driver has NAPI support, it will interrupt as often as once for every packet arrival.

During a ddos, your interface may get a LOT of very small packets per second. If your ethernet card driver has bugs or is inefficient, it may start to drop the packets. If your kernel is not able to process the packets quickly, the card will also show packet drops - but it will not be the fault of the driver or the card, it is instead the fault of your netfilter firewall rules (see "3" above).

Interrupts of 50k per second are still ok, but beyond that you may want to use the card in NAPI mode (often requires a driver recompilation and rmmod/insmod).
Consider that minimum-size packets over a 100mbit interface can arrive at a rate of roughly 150k per second, or about 1.5m per second for a gig-e!
For what it is worth our experience was with the Intel e1000 card, and the current driver appears to be fairly efficient and does not crash under load. I did find an Intel research paper boasting of the increases in performance and decreases in cpu used by each of their driver releases under linux 2.6 (but now I can't find it again).

7. Active mitigation

It is vital to know during an attack what the hell is going on! You need to be able to quickly identify from logs what the most frequently hit URL is, you need to sort the most frequently attacking IPs in descending order over a small time horizon, you need to be able to tcpdump, and you need to review the request headers for clues as to fingerprints. Using simple pipelines or scripts operating on data such as web server logs, or the active hashlimit table (which is reported in real time in /proc), you can then produce lists of IPs to supplement your blacklist, or modify your server configuration to dump bad requests by picking up on user-agent, headers, or other "tells".
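
As a rough sketch of the kind of script meant here, the Python below counts the busiest IPs and URLs in the last minute of a web server log. It assumes the log is in the usual common/combined access log format. The function names, window size and sample lines are invented for the example.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# ip, two ignored fields, [timestamp], "METHOD URL ...", status
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')


def parse(line):
    """Return (ip, timestamp, url) for a log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    ip, ts, _method, url, _status = m.groups()
    when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
    return ip, when, url


def top_talkers(lines, window_seconds=60, n=10):
    """Most frequent IPs and URLs within `window_seconds` of the newest entry."""
    records = [r for r in map(parse, lines) if r]
    if not records:
        return [], []
    newest = max(when for _, when, _ in records)
    cutoff = newest - timedelta(seconds=window_seconds)
    recent = [(ip, url) for ip, when, url in records if when >= cutoff]
    ips = Counter(ip for ip, _ in recent).most_common(n)
    urls = Counter(url for _, url in recent).most_common(n)
    return ips, urls


# Made-up sample lines; the last one falls outside the one-minute window.
sample = [
    '203.0.113.5 - - [20/Apr/2008:13:00:01 +0000] "GET /forum HTTP/1.1" 200 512',
    '203.0.113.5 - - [20/Apr/2008:13:00:02 +0000] "GET /forum HTTP/1.1" 200 512',
    '198.51.100.7 - - [20/Apr/2008:13:00:03 +0000] "GET / HTTP/1.1" 200 1024',
    '203.0.113.5 - - [20/Apr/2008:12:00:00 +0000] "GET /old HTTP/1.1" 200 512',
]
ips, urls = top_talkers(sample, window_seconds=60, n=2)
print(ips)
print(urls)
```

The window is measured back from the newest log entry rather than the wall clock, so the same script works on a saved log after the fact.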

Note that the netstat command becomes worse than useless with large numbers of concurrent connections. Too slow to run, too cpu intensive.

Are you prepared with some simple tools? If your blacklist contains over 10,000 IPs, do you have some scripts ready to review whether most are in a single class A or B space, or associated with a country? Do you have a database of class-B and class-C networks by country code ready? Using ipset to swap in a blocklist of networks that "turns off" entire countries can be a short-cut while you work out what else to do. Who knows, it might also give the attacker comfort that their attack is succeeding when in fact it is not.

And most importantly of all, can you safely remotely access your network while the front-door is getting pounded down? :) I heard that the Disneyland parks are built with employee-only tunnels linking everything. For good reason.
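
A minimal Python sketch of the "is most of it in one class A or B" check follows. It simply groups dotted-quad addresses by their first one or two octets. The function names, the 50% threshold and the sample blacklist are assumptions, and the country lookup is left out because it needs a database this example does not have.

```python
from collections import Counter


def prefix_counts(ips, octets):
    """Count addresses per leading prefix: octets=1 is class A sized, octets=2 class B sized."""
    counts = Counter()
    for ip in ips:
        parts = ip.split(".")
        counts[".".join(parts[:octets])] += 1
    return counts


def concentrated(ips, octets=2, share=0.5):
    """Prefixes that hold at least `share` of the whole blacklist, largest first."""
    counts = prefix_counts(ips, octets)
    total = len(ips)
    return [(prefix, n) for prefix, n in counts.most_common() if n / total >= share]


# Made-up blacklist for the example.
blacklist = ["203.0.113.1", "203.0.113.9", "203.0.57.4", "198.51.100.2"]
print(prefix_counts(blacklist, 2))
print(concentrated(blacklist, octets=2, share=0.5))
```

If one or two prefixes dominate, blocking the network is cheaper than blocking thousands of individual addresses.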

8. upstream filtering

A data center can help by at least applying filters for UDP and ICMP (if they are a large, by size, component of the attack) and these will be applied upstream from your port. They may also be able to arrange with you other more advanced filters either by trouble ticket, or even an API. But don't expect much. Proper DDOS mitigation is a costly service they have no interest in providing for free.

9. black boxes

The big names sell hardware advertised as being ddos resistant. I've no experience with them and I have to wonder whether lab tests and white papers substitute for real attacks. There are also some inline devices appearing that can be placed outside your firewall. I am (again) suspicious that these can be real set-and-forget solutions. The ultimate botnet looks exactly like real users making real requests. Whether there are any ultimate botnets out there yet is a different issue, but that is the aim for all of them. As they approach this perfection, a black box might increasingly start to issue false positives. You are in the best position to spot what a botnet is doing vs the habits of your real users and that needs eyeballs on logs.

The other problem for hardware solutions is that you need a port (to the black box) that is larger than the largest attack. If an attack is 1.2 gigabit and your expensive purchased port is 1 gig-e, then real traffic is getting blocked no matter how good your anti-ddos hardware.

10. Distributed content networks, Proxy services, etc.

The big guys handle ddos mitigation with distributed capacity. Multiple IPs for a single DNS lookup. Specialized proxy front-ends by akamai and others that are written with capacity and ddos-resistance in mind. This is too expensive a proposition for second or third tier sites. A less expensive solution is an on-demand proxy service: in the event of a DDOS, DNS lookups for your domain are switched or automatically switch to a larger proxy server that is able to handle the attack, pick apart the legit requests, pass them to you, and throw away the rest. A service such as ProxyShield can apparently be available on hot standby for a sub $1000 fee per month, and when turned on cost only a few times more than that. If however your ddos attack is going to exceed 100mbit then the fees rise proportionally.

11. Cutting off the botnet

This would be the most satisfying, and we've done that before with the undercover help of friendly ISP network admins. The process is involved: you can compare it to getting a trace on a phone call before the phone is hung up. I think you have to assume that tracking down the command and control mechanism and taking it out is a pretty rare occurrence, so it is more like a lottery ticket than a solution to anything.

12. Reporting the botnet

If you have your wits about you there is one log that you need to keep or produce, and that is an accurately timestamped log of participating IPs. It can be as simple as UTC Time and IP address. It is important to try to keep false positives out, so the list should be combed for any prior-to-attack users of the website. There is undoubtedly a correct place, or places, to report such a list but right now I don't have that info handy. Perhaps someone in the comments can expand this point. Some ISPs will act on lists like these, but most will not. Expect absolutely no cooperation from ISPs based in most overseas countries but applaud any cooperation that does happen.
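
Such a log takes very little code to produce. The Python sketch below assumes you already have (unix timestamp, IP) pairs from your logs and a set of IPs that visited before the attack. The function name, the output format and the sample data are made up for illustration.

```python
from datetime import datetime, timezone


def attack_report(events, known_good):
    """Build 'UTC-time IP' lines, oldest first, skipping pre-attack visitors.

    events: iterable of (unix_timestamp, ip)
    known_good: set of IPs that used the site before the attack
    """
    lines = []
    for ts, ip in sorted(events):
        if ip in known_good:
            continue  # keep likely false positives out of the report
        stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        lines.append(f"{stamp} {ip}")
    return lines


# Made-up events; 192.0.2.10 is treated as a regular visitor and dropped.
events = [
    (1208696401, "203.0.113.5"),
    (1208696400, "198.51.100.7"),
    (1208696402, "192.0.2.10"),
]
for line in attack_report(events, known_good={"192.0.2.10"}):
    print(line)
```

Keeping the timestamps in UTC matters because whoever receives the list has to match it against their own logs, which may be in another timezone.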

Final Thoughts

Our data-center believes that the best way to mitigate a DDOS attack is to "not piss people off". If you can be online and arrange your affairs so that you never piss anyone off then good luck to you. Unfortunately, however, botnets are getting cheap enough to build or rent that they will be deployed more frequently, and in retaliation for ever smaller disagreements. Supposedly also gangs will extort money to stop an attack. If you are in the online gambling business that might be the biggest risk.

In my view the best defense is to simply be a harder target than average. Spending some time to make sure that a small attack has no effect will probably discourage bigger efforts and save you from running into the arms of an anti-ddos vendor the moment one person tries a few things to tie up your site! As a side-effect it will also probably make the site faster and more able to handle non-ddos "slashdot effects" as well.

I'll update this article further if I can, and do a second one with more technical details that we have found effective.

Forums » Experiences with DDOS mitigation - Part I

MattE
Obama '08
Premium
join:2003-07-20
Jamestown, NC

Good Writeup

Good info Justin. Can't wait for the 2nd article with technical specifics.

Also, did NAC really tell you to just "not piss people off"? Wow.

justin
Australian
join:1999-05-28
Brooklyn, NY

Re: Good Writeup

No, they didn't tell me. Just an opinion they had on the best way to mitigate. I can see where they are coming from it sucks to have other customers knocked around.

Calliope
Premium
join:2005-09-19
Madison, WI

Re: Good Writeup

Thank you for the article. Although I do not understand it in its entirety, it is certainly enlightening and answered a few questions for me.

SND2005
Premium
join:2001-09-15
Im Over Here
·CWLab

said by justin:

No, they didn't tell me. Just an opinion they had on the best way to mitigate. I can see where they are coming from it sucks to have other customers knocked around.
Would it be possible to switch over to a "CAPTCHA" type mechanism to weed out real traffic during an attack? Possibly you could route requests from BOTs back to themselves, and route positive CAPTCHA replies to the real web or backup web address? Sorta a pseudo secret handshake?

justin
Australian
join:1999-05-28
Brooklyn, NY

Re: Good Writeup

That would be even harder than serving them a normal page upon request.

Some newer web servers have introduced a way of giving preference and priority to existing users (on the site before it got full up), or users with an encrypted token that they pick up when they login.

La Luna
Surviving Ashraful
Premium
join:2001-07-12
Warwick, NY

Thank you....

....for the informative and easy (mostly) to understand write up justin, much appreciated!

jonnyz
Premium
join:2003-03-20
Canfield, OH

DShield

DShield loves logs like this and is a good security site in general.
--
Join the
RC5 team.
Jman99

join:2007-04-24
Etobicoke, ON

Poirot mode

Is there any evidence that links Rogers or Bell to the hiring of the botnet?