Identify and block malicious HTTP traffic with IPtables

So I was looking through my webservers’ access_log files and this popped up every couple of days:

93.157.0.142 - - [14/Dec/2010:16:01:19 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
72.167.164.72 - - [17/Dec/2010:02:02:54 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
74.55.205.98 - - [18/Dec/2010:03:06:49 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
150.217.19.5 - - [19/Dec/2010:14:36:52 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
173.201.39.105 - - [21/Dec/2010:08:16:35 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
74.55.205.98 - - [24/Dec/2010:14:43:28 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13

This is a truncated list, but each one of these “romanian blackhats” would attempt a few other directories as well. These are not really critical intrusion attempts but they do indicate drones that scan the Internet for potential security holes in webservers (read Phil’s Getting A Little Sick of ZmEu). I don’t want these hosts to access my server in any way since, well, they don’t really need to. I could’ve blocked each one of those IPs by hand but I decided to script it and crontab it.

The first thing I needed was a chain that would handle all of these bad IP addresses:

[root@demon ~]# iptables -N bad_traffic
[root@demon ~]# iptables -A INPUT -j bad_traffic
[root@demon ~]# iptables -A INPUT -p tcp -m tcp --dport 80 -j ACCEPT

The two INPUT rules must be appended in the order shown above, because iptables evaluates rules from top to bottom: you want to DROP bad traffic before you ACCEPT any web connection.

This script will add a rule for each IP with the DROP target in the bad_traffic chain, if it is not already in the chain:

#!/usr/bin/env perl
# badht - Bad HTTP Traffic blocker
#
# Scans an Apache access log file for bad
# requests and blocks the IP responsible
#
# Usage: badht <access_log> [iptables_chain]
#
# ./badht /var/log/httpd/access_log bad_traffic
#
# badht will use the chain 'bad_traffic' unless
# otherwise specified

use strict;
use warnings;
use POSIX qw(strftime);

die("Usage: $0 </var/log/httpd/access_log> [iptables_chain]\n") if !$ARGV[0];
my $log = $ARGV[0];

my $chain = ($ARGV[1] ? $ARGV[1] : "bad_traffic");

my @bad = `grep w00tw00t $log|cut -f1 -d" "|sort -u`;
my @ablk = `/sbin/iptables -S $chain|grep DROP|awk '{print \$4}'|cut -d"/" -f1`;

foreach my $ip (@bad) {
    if (!grep $_ eq $ip, @ablk) {
        chomp $ip;
        `/sbin/iptables -A $chain -s $ip -j DROP`;
        print strftime("%b %d %T",localtime(time))." badht: blocked bad HTTP traffic from: $ip\n";
    }
}
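
The `grep | cut | sort -u` pipeline can also be done in Perl itself, which avoids passing the log path through the shell. This is only a minimal sketch, assuming Common Log Format lines with the client address as the first field; `bad_ips` is a hypothetical helper and not part of badht:

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique client IPs (sorted) of all
# log lines that contain the given marker string.
sub bad_ips {
    my ($marker, @lines) = @_;
    my %seen;
    for my $line (@lines) {
        next unless index($line, $marker) >= 0;
        # Only accept a first field that looks like a dotted-quad IPv4
        # address, so nothing unexpected ever reaches the iptables call.
        next unless $line =~ /^(\d{1,3}(?:\.\d{1,3}){3})\s/;
        $seen{$1} = 1;
    }
    return sort keys %seen;
}

# Three of the log lines shown earlier; one address appears twice.
my @sample = (
    '93.157.0.142 - - [14/Dec/2010:16:01:19 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
    '74.55.205.98 - - [18/Dec/2010:03:06:49 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
    '74.55.205.98 - - [24/Dec/2010:14:43:28 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
);
print "$_\n" for bad_ips('w00tw00t', @sample);
```

In a real run the lines would come from reading the access log through a filehandle instead of the sample array.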

By the way, it’s a good idea to block ALL incoming traffic from these IP addresses (the `iptables -A $chain -s $ip -j DROP` call in the script does not restrict the rule to a port or protocol), because chances are they have already attempted to brute-force your SSH service:

[root@demon admin]# grep -E "sshd.*Failed password for.*from ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)" /var/log/secure|wc -l
103
[root@demon admin]#

… within just 7 days of bringing demon.* online! These packets are just wasted CPU cycles from compromised hosts and they should be dropped before they get to any of my services.
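
To see which addresses are behind those failures rather than just the total, the same pattern can be applied line by line in Perl. This is a minimal sketch: `failed_ssh_by_ip` is a hypothetical helper, and the sample lines are made up for illustration (the addresses come from the documentation range 192.0.2.0/24):

```perl
use strict;
use warnings;

# Hypothetical helper: count "Failed password" sshd entries per source IP.
sub failed_ssh_by_ip {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        # Same pattern as the grep above, capturing the address.
        $count{$1}++
            if $line =~ /sshd.*Failed password for.*from ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)/;
    }
    return %count;
}

# Made-up sample lines in the usual /var/log/secure layout.
my @sample = (
    'Dec 20 10:00:01 demon sshd[1234]: Failed password for root from 192.0.2.10 port 4242 ssh2',
    'Dec 20 10:00:03 demon sshd[1234]: Failed password for invalid user admin from 192.0.2.10 port 4243 ssh2',
    'Dec 20 10:05:00 demon sshd[1300]: Accepted password for admin from 192.0.2.20 port 5000 ssh2',
);
my %hits = failed_ssh_by_ip(@sample);
printf "%-15s %d\n", $_, $hits{$_} for sort keys %hits;
```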

Anyway… when I execute badht I get this output:

[root@demon admin]# ./badht /var/log/httpd/access_log bad_traffic
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 150.217.19.5
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 173.201.39.105
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 72.167.164.72
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 74.55.205.98
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 93.157.0.142
[root@demon admin]# ./badht /var/log/httpd/access_log bad_traffic
[root@demon admin]# iptables -L bad_traffic -n
Chain bad_traffic (1 references)
target     prot opt source               destination
DROP       all  --  150.217.19.5         0.0.0.0/0
DROP       all  --  173.201.39.105       0.0.0.0/0
DROP       all  --  72.167.164.72        0.0.0.0/0
DROP       all  --  74.55.205.98         0.0.0.0/0
DROP       all  --  93.157.0.142         0.0.0.0/0
[root@demon admin]#

As you can see, the second time I ran the script it skipped the already-blocked IPs and printed nothing.
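
The skip works because the script compares each candidate against what `iptables -S` already lists for the chain. Here is a minimal sketch of that check, using a hash lookup instead of a `grep` inside the loop. `blocked_ips` and `new_ips` are hypothetical helpers, and the rule lines are assumed to have the `-A <chain> -s <ip>/32 -j DROP` shape that `iptables -S` prints:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the source address out of each DROP rule
# as printed by `iptables -S <chain>`.
sub blocked_ips {
    my (@rules) = @_;
    my @ips;
    for my $rule (@rules) {
        push @ips, $1 if $rule =~ m{-s (\d+\.\d+\.\d+\.\d+)(?:/32)? -j DROP};
    }
    return @ips;
}

# Hypothetical helper: candidates that are not blocked yet.
sub new_ips {
    my ($candidates, $blocked) = @_;
    my %blocked = map { $_ => 1 } @$blocked;
    return grep { !$blocked{$_} } @$candidates;
}

my @rules = (
    '-N bad_traffic',
    '-A bad_traffic -s 150.217.19.5/32 -j DROP',
    '-A bad_traffic -s 74.55.205.98/32 -j DROP',
);
my @todo = new_ips([ '150.217.19.5', '93.157.0.142' ], [ blocked_ips(@rules) ]);
print "would block: @todo\n";
```

Only 93.157.0.142 is left over here, since 150.217.19.5 already has a DROP rule.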

I don’t want to run this manually, so I’ll let crontab handle it:

[root@demon ~]# crontab -lu root
*/30 * * * * ~/admin/badht /var/log/httpd/access_log bad_traffic >> /var/log/bad_traffic 2>&1
[root@demon ~]#

… this will run twice an hour and send all output to /var/log/bad_traffic. You can increase the frequency but you should keep in mind that this may needlessly slow the system down on large access_log files.

Note: The rules created by badht are temporary and will be lost on system reboot or when the iptables ‘service’ is restarted. Remember to periodically save the iptables rules, or at least the ‘bad_traffic’ chain. Since the crontab is persistent, badht will recreate the rules the next time it runs, as long as the ‘bad_traffic’ chain itself exists and the offending requests are still in the access log.

Search Keyword Highlighting with Perl

Here’s how to break down a search string into unique keywords and highlight them using HTML:

# user input
my $search_string = "I really really need this";

# highlighting: get the _unique_ keywords
# use grep() to make sure we skip short keywords
my @kw = grep {length($_) > 2} split /[^\w]/,$search_string;

# expected output: keyword1|keyword2|keyword3
my $regex_keys = join "|", sort keys %{{ map { $_ => 1 } @kw }};

# wrap each match in an HTML tag (<b> is an assumption here; any markup works)
$data =~ s!($regex_keys)!<b>$1</b>!ig;

print $data;

This obviously assumes you’ve received a line of text $data from a database of some sort.

The interesting part is:

keys %{{ map { $_ => 1 } @kw }}

… which creates an anonymous hash with all the elements of the array as keys, each with a dummy value of 1. The keys function in turn returns the keys, and since map would’ve overwritten any duplicate keys, they are now unique!
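
Put together as a function, the idea looks roughly like this. It is a minimal sketch: `highlight` is a hypothetical helper, the `<b>` tag is an assumed choice of markup, and `quotemeta` is added as a precaution even though keywords made only of `\w` characters contain nothing special:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every occurrence of a search keyword in <b> tags.
sub highlight {
    my ($data, $search_string) = @_;
    # Keep only keywords longer than two characters.
    my @kw = grep { length($_) > 2 } split /[^\w]/, $search_string;
    return $data unless @kw;    # an empty alternation would match everywhere
    # De-duplicate via hash keys, then build keyword1|keyword2|...
    my $regex_keys = join "|", map { quotemeta($_) } sort keys %{{ map { $_ => 1 } @kw }};
    $data =~ s!($regex_keys)!<b>$1</b>!ig;
    return $data;
}

print highlight("You really need to read this", "I really really need this"), "\n";
# prints: You <b>really</b> <b>need</b> to read <b>this</b>
```

The early return matters: with no usable keywords the pattern would be empty and match at every position in the text.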

Cool right?