Prior to implementing the HTTP_REFERER blacklist described previously, I investigated the source of the faked HTTP requests. If they were all coming from the same place, I could simply block access from that address.
But the attacks are distributed: they come from many IP addresses on many networks. Here’s an example, showing the request count and source address for all hits to this site containing the word "poker" in the past week:
    nsa /var/log/httpd : cat debris_access_log | grep poker | awk '{print $1}' | sort | uniq -c | sort -rn | head
         91 65.165.84.11
         27 68.22.118.212
         20 12.172.137.13
         14 195.30.153.194
         13 38.223.231.8
         13 212.211.130.248
         12 203.199.92.158
         11 65.88.84.205
         10 168.11.16.22
          9 82.148.70.171

Just to confirm that the methodology above isn’t whacked, here are the faked REFERERs from the top IP address:
    nsa /var/log/httpd : grep 65.165.84.11 debris_access_log | awk '{print $11}' | sort | uniq -c | sort -rn | head
          8 "http://www.nutzu.com/poker-hands.html"
          8 "http://www.nutzu.com/free-texas-hold-em.html"
          7 "http://www.nutzu.com/internet-poker.html"
          7 "http://www.nutzu.com/free-online-poker.html"
          6 "http://www.nutzu.com/world-series-of-poker.html"
          6 "http://www.nutzu.com/strip-poker.html"
          5 "http://www.nutzu.com/poker-tournament.html"
          4 "http://www.nutzu.com/texas-holdem-poker.html"
          4 "http://www.nutzu.com/rules-of-poker.html"
          4 "http://www.nutzu.com/poker-tables.html"

How could the referer spammers be operating from so many different networks? Here's my best guess: all those IP addresses represent Wintel machines that have been hijacked by viruses and trojan horses, and they're running distributed REFERER attacks without the knowledge of their owners. The machines are probably sending tons of spam email, too.
So when I previously said "this is all Google's fault," what I really meant is "this is all Microsoft's fault."
(In Microsoft's defense, they've only been working on making Windows more secure for two years... I'm sure they'll have some meaningful progress to report RSN.)
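
For anyone who wants to repeat the two log queries above, here is a rough sketch of them as shell functions. The function names top_ips and referers_for are made up for illustration, and the sketch assumes the log is in Apache's Combined Log Format, so that field 1 is the client address and field 11 is the quoted referer. One deliberate difference from the one-liners: the address is compared exactly against field 1, because a bare grep 65.165.84.11 treats each dot as "any character" and will also match the address wherever else it appears on a line.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original post).
# Assumes Combined Log Format: $1 is the client IP, $11 is the quoted referer.

# top_ips LOGFILE STRING
# Request count per client address, for lines containing STRING.
top_ips() {
    grep -F -- "$2" "$1" |
        awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' |
        sort -k1,1nr -k2,2 |
        head
}

# referers_for LOGFILE IP
# Referer count for requests whose first field is exactly IP.
referers_for() {
    awk -v ip="$2" '$1 == ip { count[$11]++ }
                    END { for (r in count) print count[r], r }' "$1" |
        sort -k1,1nr -k2,2 |
        head
}
```

With those defined, the two queries would be "top_ips debris_access_log poker" and "referers_for debris_access_log 65.165.84.11". The counts come out without the leading padding that uniq -c adds, but should otherwise mean the same thing.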