
Force Google Chrome to start in Incognito mode regardless of how it is invoked

I’ve recently switched to Google’s Chrome mainly because of its excellent implementation of Windows 7’s Integrity Levels. Even when running under an Administrative context, the child processes will run with Low Integrity (IE9 doesn’t do this and FF never respects WIL regardless of context). This is a good thing, since any exploits that may compromise the browser directly (either through JS JIT or Flash plugins) can’t really do much damage when running with Low Integrity. More on Integrity Levels here: http://msdn.microsoft.com/en-us/library/bb625957.aspx

Add a NoScript-like extension such as ScriptNo by ‘Andrew Y’, and Chrome is a safe and fast browser in an otherwise hostile world wide web.

Much like its competitors, Chrome offers an Incognito mode which will discard any browser data after the session ends. This is great, however there is no way (that I could find) to tell Chrome to always start in this mode. Yes, you can change the shortcut on your desktop and add the -incognito switch, but this is not a foolproof solution. If Chrome is your default browser, Start > Run > http://www.google.com will not launch it in Incognito mode. If any application starts the browser without using your shortcut (through protocol or file associations), the browser will start in normal mode.

There is no .conf or .ini or .json file you can edit to tell Chrome to always start in Incognito mode, which seems like a strange omission from the Chrome dev team. By altering a few default settings, FF and IE can be told to remove all traces of browser data upon exiting. The only thing in Chrome that comes close is under the Privacy\Cookies section. You can remove all cookies and “other site data” when exiting the browser, but this is not the equivalent of Ctrl + Shift + Del.

What we can do is modify some registry settings and tell Windows to start a batch file instead of the chrome.exe main application. When Chrome is made the default browser, among other things, it modifies a few registry keys to tell Windows where to go when associating a protocol with an application (in our case: HTTP and HTTPS).

So let’s tell it to use a batch file (start_chrome_incognito.cmd) instead of chrome.exe:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\ChromeHTML\shell\open\command]
@="\"C:\\Tools\\start_chrome_incognito.cmd\" -- \"%1\""

[HKEY_CLASSES_ROOT\http\shell\open\command]
@="\"C:\\Tools\\start_chrome_incognito.cmd\" -- \"%1\""

[HKEY_CLASSES_ROOT\https\shell\open\command]
@="\"C:\\Tools\\start_chrome_incognito.cmd\" -- \"%1\""

[HKEY_CLASSES_ROOT\htmlfile\shell\open\command]
@="\"C:\\Tools\\start_chrome_incognito.cmd\""

[HKEY_CLASSES_ROOT\htmlfile\shell\opennew\command]
@="\"C:\\Tools\\start_chrome_incognito.cmd\" %1"

Save this as chrome_file_association_fix.reg and run it. Note that you can’t use environment variables in these command paths. The most likely reason is that a plain string in a .reg file is written as a REG_SZ value, and Windows only expands variables such as %LocalAppData% in REG_EXPAND_SZ values.
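If your batch file lives somewhere other than C:\Tools, hand-editing five escaped command strings is error-prone. Here is a minimal Python sketch that generates the same .reg text for any script path. The names build_reg, reg_escape and KEYS are made up for this illustration; the key list simply mirrors the entries above.

```python
# Sketch: build the .reg text shown above for an arbitrary batch file location.
# build_reg, reg_escape and KEYS are illustrative names, not part of Chrome or Windows.

# (registry key under HKEY_CLASSES_ROOT, text appended after the quoted script path)
KEYS = [
    (r"ChromeHTML\shell\open\command", ' -- "%1"'),
    (r"http\shell\open\command", ' -- "%1"'),
    (r"https\shell\open\command", ' -- "%1"'),
    (r"htmlfile\shell\open\command", ""),
    (r"htmlfile\shell\opennew\command", " %1"),
]


def reg_escape(value):
    """Escape backslashes first, then double quotes, as .reg string values require."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_reg(script_path):
    """Return the full .reg file text pointing every key at script_path."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, suffix in KEYS:
        command = '"%s"%s' % (script_path, suffix)
        lines.append("[HKEY_CLASSES_ROOT\\%s]" % key)
        lines.append('@="%s"' % reg_escape(command))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_reg(r"C:\Tools\start_chrome_incognito.cmd"))
```

Redirect the output to a .reg file and import it as usual. The script path still has to be a literal path, for the same reason as above.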

You cannot add the switches directly to the registry key. This would be more convenient since it wouldn’t require a separate batch file to maintain, but this breaks the host process that attempts to start the application.

Create a start_chrome_incognito.cmd in your C:\Tools folder and put this into it:

@echo off
start /D"%LocalAppData%\Google\Chrome\Application\" chrome.exe -incognito --purge-memory-button --memory-model=low %*
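:: for XP use the following (Chrome installs under Local Settings there, not under %AppData%)
:: start /D"%UserProfile%\Local Settings\Application Data\Google\Chrome\Application\" chrome.exe -incognito --purge-memory-button --memory-model=low %*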

Add whatever options you want before %* and you should be good to go. If you are on Windows XP still, upgrade. If you can’t upgrade then make sure you use the appropriate path to chrome.exe in your batch file.

Now when you start Chrome using something like Start > Run > http://www.google.com you will be browsing in Incognito mode.

Hacky but it works.

Google, please add an option to do this natively, thanks.

Quickly rename a large number of files with PowerShell and Regular Expressions

There have been a few times in the past where I’ve had to rename a large number of files for various reasons (e.g., to remove a common piece of text from the name) and I’ve always resorted to PowerShell.

Piping dir into a where and matching the files I wanted to rename was effective but tedious. Cue the mass_rename.ps1 script:

$ext = $args[0];
$dir = $args[1];

$what = $args[2];
$with = $args[3];

$whatif = $args[4];

$count = 0;

if ($args.length -lt 4) {
    write-host "Invalid parameters" -fore red;
    ""
    write-host "   .\mass_rename.ps1 <ext> <dir> <what> <with> [-whatif]";
    ""
    write-host " Example (don't do any replacing, -whatif):";
    write-host "   .\mass_rename.ps1 .docx c:\Documents 'version 1\.1' 'version 1.2' -whatif";
    ""
    exit 1;
}

ls -recurse -path $dir | ?{ (!$_.psiscontainer) -and ($_.name.endswith($ext)) -and ($_.name -imatch $what) } | %{
    if ($whatif -eq "-whatif") {
        write-host("whatif: '" + $_.fullname + "' -> '" + ($_.name -ireplace $what,$with) + "'");
    }
    else {
        $from = $_.fullname;
        $to = ($_.name -ireplace $what,$with);
        mv -literalpath $from -destination ($_.directoryname + "\" + $to) -force;
        write-host "Renamed '$from' -> '$to'" -fore yellow;
        $count++;
    }
}

write-host "Done. Processed $count files." -fore green

The script accepts 4 parameters, with an optional -whatif as the 5th. It is fairly self-explanatory, with one caveat: the <what> parameter is a regular expression. Keep this in mind when, for example, you are trying to match a period (.), as you will have to escape it (as in the example usage).

The -whatif parameter will only output the before and after file names without modifying the files themselves.
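To see why the escaping matters, here is a small Python sketch of the same matching logic, working on plain file names instead of the file system (much like -whatif). plan_renames is a name I made up for this illustration; it is not part of the script above.

```python
import re


def plan_renames(names, ext, what, with_):
    """Return (old, new) pairs for names ending in ext that match the regex `what`.

    Matching is case-insensitive, like -imatch / -ireplace in the script.
    Nothing is renamed; this only computes the plan.
    """
    pattern = re.compile(what, re.IGNORECASE)
    plan = []
    for name in names:
        if name.endswith(ext) and pattern.search(name):
            plan.append((name, pattern.sub(with_, name)))
    return plan


if __name__ == "__main__":
    names = ["Report version 1.1.docx", "notes version 1x1.docx", "todo.txt"]
    # Escaped period: only the literal "version 1.1" matches.
    print(plan_renames(names, ".docx", r"version 1\.1", "version 1.2"))
    # Unescaped period: "." matches any character, so "1x1" is caught as well.
    print(plan_renames(names, ".docx", "version 1.1", "version 1.2"))
```

With the unescaped pattern, "notes version 1x1.docx" would be renamed too, which is probably not what you wanted.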

That’s it, set the execution policy and enjoy.

Identify and block malicious HTTP traffic with IPtables

So I was looking through my webservers’ access_log files and this popped up every couple of days:

93.157.0.142 - - [14/Dec/2010:16:01:19 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
72.167.164.72 - - [17/Dec/2010:02:02:54 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
74.55.205.98 - - [18/Dec/2010:03:06:49 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
150.217.19.5 - - [19/Dec/2010:14:36:52 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
173.201.39.105 - - [21/Dec/2010:08:16:35 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13
74.55.205.98 - - [24/Dec/2010:14:43:28 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13

This is a truncated list, but each one of these “romanian blackhats” would attempt a few other directories as well. These are not really critical intrusion attempts but they do indicate drones that scan the Internet for potential security holes in webservers (read Phil’s Getting A Little Sick of ZmEu). I don’t want these hosts to access my server in any way since, well, they don’t really need to. I could’ve blocked each one of those IPs by hand but I decided to script it and crontab it.

The first thing I needed was a chain that would handle all of these bad IP addresses:

[root@demon ~]# iptables -N bad_traffic
[root@demon ~]# iptables -A INPUT -j bad_traffic
[root@demon ~]# iptables -A INPUT -p tcp -m tcp --dport 80 -j ACCEPT

The two INPUT rules should be appended in the order shown above. iptables evaluates rules from top to bottom, so you want to DROP bad traffic before you ACCEPT any web connection.

This script will add a rule for each IP with the DROP target in the bad_traffic chain, if it is not already in the chain:

#!/usr/bin/env perl
# badht - Bad HTTP Traffic blocker
#
# Scans an Apache access log file for bad
# requests and blocks the IP responsible
#
# Usage: badht <access_log> [iptables_chain]
#
# ./badht /var/log/httpd/access_log bad_traffic
#
# badht will use the chain 'bad_traffic' unless
# otherwise specified

use strict;
use warnings;
use POSIX qw(strftime);

die("Usage: $0 </var/log/httpd/access_log> [iptables_chain]") if !$ARGV[0];
my $log = $ARGV[0];

my $chain = ($ARGV[1] ? $ARGV[1] : "bad_traffic");

my @bad = `grep w00tw00t $log|cut -f1 -d" "|sort -u`;
my @ablk = `/sbin/iptables -S $chain|grep DROP|awk '{print \$4}'|cut -d"/" -f1`;

foreach my $ip (@bad) {
    if (!grep $_ eq $ip, @ablk) {
        chomp $ip;
        `/sbin/iptables -A $chain -s $ip -j DROP`;
        print strftime("%b %d %T",localtime(time))." badht: blocked bad HTTP traffic from: $ip\n";
    }
}

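By the way, it’s a good idea to block ALL incoming traffic coming from these IP addresses, not just HTTP (that is what the iptables -A ... -j DROP call in the script does, since the rule names no protocol or port), because chances are they have already attempted to brute-force your SSH service: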

[root@demon admin]# grep -E "sshd.*Failed password for.*from ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)" /var/log/secure|wc -l
103
[root@demon admin]#

… within just 7 days of bringing demon.* online! These packets are just wasted CPU cycles from compromised hosts and they should be dropped before they get to any of my services.

Anyway… when I execute badht I get this output:

[root@demon admin]# ./badht /var/log/httpd/access_log bad_traffic
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 150.217.19.5
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 173.201.39.105
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 72.167.164.72
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 74.55.205.98
Dec 25 15:56:44 badht: blocked bad HTTP traffic from: 93.157.0.142
[root@demon admin]# ./badht /var/log/httpd/access_log bad_traffic
[root@demon admin]# iptables -L bad_traffic -n
Chain bad_traffic (1 references)
target     prot opt source               destination
DROP       all  --  150.217.19.5         0.0.0.0/0
DROP       all  --  173.201.39.105       0.0.0.0/0
DROP       all  --  72.167.164.72        0.0.0.0/0
DROP       all  --  74.55.205.98         0.0.0.0/0
DROP       all  --  93.157.0.142         0.0.0.0/0
[root@demon admin]#

As you can see the second time I ran the script it skipped the already-blocked IPs and said nothing.
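The "skip what is already blocked" step is just a set difference. Here is a minimal Python sketch of that logic, working on text only and never calling iptables. The function names are invented for this example, and the sample rule line assumes the usual iptables -S format (-A chain -s ip/32 -j DROP).

```python
def offending_ips(log_lines, marker="w00tw00t"):
    """Collect the client IP (first field) of every log line containing marker."""
    return {line.split(" ", 1)[0] for line in log_lines if marker in line}


def blocked_ips(rules_output):
    """Extract source IPs from DROP rules in `iptables -S <chain>` style output."""
    blocked = set()
    for rule in rules_output.splitlines():
        fields = rule.split()
        if "DROP" in fields and "-s" in fields:
            source = fields[fields.index("-s") + 1]
            blocked.add(source.split("/", 1)[0])  # strip the /32 mask
    return blocked


def ips_to_block(log_lines, rules_output):
    """IPs seen in the log that do not have a DROP rule yet, sorted."""
    return sorted(offending_ips(log_lines) - blocked_ips(rules_output))


if __name__ == "__main__":
    log = [
        '93.157.0.142 - - [14/Dec/2010:16:01:19 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
        '74.55.205.98 - - [18/Dec/2010:03:06:49 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
        '74.55.205.98 - - [24/Dec/2010:14:43:28 -0500] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 404 13',
    ]
    rules = "-N bad_traffic\n-A bad_traffic -s 74.55.205.98/32 -j DROP\n"
    print(ips_to_block(log, rules))
```

On a second run every offender is already in the blocked set, so the difference is empty and nothing gets printed or added.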

I don’t want to run this manually, so I’ll let crontab handle it:

[root@demon ~]# crontab -lu root
*/30 * * * * ~/admin/badht /var/log/httpd/access_log bad_traffic >> /var/log/bad_traffic 2>&1
[root@demon ~]#

… this will run twice an hour and send all output to /var/log/bad_traffic. You can increase the frequency but you should keep in mind that this may needlessly slow the system down on large access_log files.

Note: The rules created by badht are temporary and will be lost on system reboot or when the iptables ‘service’ is restarted. Remember to periodically save the iptables rules, or at least the ‘bad_traffic’ chain. Since the crontab is persistent, badht will recreate all the rules the next time it runs.

Re-writing vcstool to use existing Mozharness handlers

It turns out that the MozTools class I had created in 0.1.0 was actually a duplicate of an existing Mozharness class: BaseScript. BaseScript and BaseConfig together handle logging and command-line arguments (much better than the crude MozTools), so naturally vcstool should inherit these handlers and make use of their robust methods.

Since vcstool no longer has direct access to optparse (BaseScript does the parsing and locking), the main issue became returning the value of --vcs and creating the VCS object BEFORE initializing the BaseScript functionality. If MercurialVCS were the class inheriting BaseScript, it wouldn’t be possible to determine what VCS object to create without actually creating the MercurialVCS object first.

The solution was to create a wrapper class that would handle the creation of the object. This class would actually inherit BaseScript, giving it access to the parsed command-line options, while also providing a method to create the VCS class and return it based on the --vcs switch selection:

class VCSWrapper(BaseScript):
    supported_vcs = ('hg',)

    def __init__(self, config_options=None):
        BaseScript.__init__(self, config_options=config_options)

    def get_vcs_object(self):
        vcs = self.config.get('vcs', None)
        if vcs == 'hg':
            return MercurialVCS(wrapper_obj=self)
        else:
            # str() keeps the message from raising TypeError when --vcs is missing
            self.error('Unsupported VCS: ' + str(vcs)
                       + ', please use one of the following: '
                       + ' '.join(self.supported_vcs))
            return None

So now vcstool would create a wrapper object to handle logging and command-line options as well as a vcs object (returned by VCSWrapper.get_vcs_object()) to do the checkouts and whatnot.
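To try the wrapper idea without a Mozharness checkout, here is a self-contained sketch. FakeBaseScript is a stand-in I wrote for this example (it only stores a config dict and collects error messages); it is not the real BaseScript, and the dict-based lookup is a variation on the if/else above rather than the code in the changeset.

```python
class FakeBaseScript(object):
    """Stand-in for Mozharness' BaseScript: holds config, records errors."""

    def __init__(self, config=None):
        self.config = dict(config or {})
        self.errors = []

    def error(self, message):
        self.errors.append(message)


class MercurialVCS(object):
    """Placeholder VCS class; keeps a reference to the wrapper for config/logging."""

    def __init__(self, wrapper_obj):
        self.wrapper = wrapper_obj


class VCSWrapper(FakeBaseScript):
    # Maps the value of the vcs option to the class that handles it.
    vcs_classes = {'hg': MercurialVCS}

    def get_vcs_object(self):
        name = self.config.get('vcs')
        vcs_class = self.vcs_classes.get(name)
        if vcs_class is None:
            self.error('Unsupported VCS: %s, please use one of the following: %s'
                       % (name, ' '.join(sorted(self.vcs_classes))))
            return None
        return vcs_class(wrapper_obj=self)


if __name__ == "__main__":
    wrapper = VCSWrapper({'vcs': 'hg'})
    print(type(wrapper.get_vcs_object()).__name__)
```

The nice part of the mapping is that adding another VCS later is a one-line change to vcs_classes.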

One more thing: to simplify the whole thing, vcstool now accepts two extra switches: --repo and --dest. These are meant to replace hgtool’s [repo] and [dest] unnamed arguments. Because of the way BaseScript handles unnamed arguments, this is the less complicated solution for passing these values to vcstool.
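For illustration only, this is roughly what named switches look like with Python’s plain argparse. It is not how BaseScript declares its options, and the 'hg' default and the required --repo are assumptions I made for the sketch.

```python
import argparse


def build_parser():
    """Hypothetical stand-alone parser mirroring the --vcs, --repo and --dest switches."""
    parser = argparse.ArgumentParser(prog="vcstool.py")
    # Default of 'hg' is an assumption for this sketch.
    parser.add_argument("-s", "--vcs", default="hg", help="version control system to use")
    parser.add_argument("--repo", required=True, help="repository to pull from")
    parser.add_argument("--dest", default=None, help="directory to check out into")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--repo", "http://example.org/repo", "--dest", "checkout"])
    print(args.vcs, args.repo, args.dest)
```

Named switches can appear in any order and never get confused with each other, which is what makes them easier to handle than positional [repo] [dest].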

If you want to see this changeset’s files, click here.

P.S.: We have opened bug 606963 to track the progress of this port. All the patches to go from vcstool version 0.0 to 0.1.1 (and above) will be attached there.

Porting hgtool to Mozharness

I’ve started work on Mozharness by moving hgtool functionality to Mozharness. The tool originally allows you to specify a bunch of options including: what revision/branch to get, which json properties file to use and so on. This is, however, specific to HG and we want to make this a bit more general. We want vcstool to use the appropriate class to handle pulling of data from a repository.

So far:

I’ve created the MercurialVCS class which will handle tasks such as updates and checkouts. Later on we will have a library for all of the popular VCSs (e.g., GitVCS, SubversionVCS, BazaarVCS) which vcstool can use based on the command line specification.

$ python vcstool.py -h
Usage: vcstool.py [-s|--vcs VCS] [-p|--props-file] [-l|--log-level LEVEL] [-r|--rev REVISION] [-b|--branch BRANCH] repo [dest]
...

The log level can now be modified using the --log-level switch for vcstool.

I’ve also ported commands.py to the MozTools class. It is responsible for tasks such as run_cmd (execute a command, such as ‘hg clone …’) and get_output (execute a command and return its stdout/stderr). MercurialVCS (and others) make use of these commands during a pull. The MozTools class also creates the logging object, so VCS child classes can reference self.log.{info|debug|whatever}.
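If you haven’t seen helpers like these before, here is a rough idea of what they do, written with the standard subprocess module. These are my own minimal versions for illustration, not the code from commands.py or MozTools, and they skip the logging entirely.

```python
import subprocess
import sys


def run_cmd(command):
    """Run command (a list of arguments) and return its exit status."""
    return subprocess.call(command)


def get_output(command):
    """Run command and return stdout and stderr together as one string."""
    proc = subprocess.Popen(command,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return output.decode("utf-8", "replace")


if __name__ == "__main__":
    # The Python interpreter itself is used as the command so the demo
    # runs anywhere; a VCS class would pass something like ['hg', 'clone', ...].
    print(run_cmd([sys.executable, "-c", "import sys; sys.exit(0)"]))
    print(get_output([sys.executable, "-c", "print('hello')"]).strip())
```

Passing the command as a list avoids going through a shell, so arguments with spaces need no extra quoting.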

You can follow the development progress on this blog (top-right) or here: http://git.slacknet.ca/mozharness.git/rss

Migrating from Subversion (svn) to Mercurial (hg)

I’ve been using Subversion as my primary VCS for quite a while now and I loved it. There were a couple of things that were odd but for the most part, it worked well for one developer. With the Mozharness project, I was introduced to a new VCS concept called Distributed VCS.

In a nutshell, each developer has their own repository with their own detailed history of changes (changesets). This makes it easier to merge code (with fewer problems) when it comes time to put the application into production. If you have been using CVS/SVN for your version control, Mercurial concepts may get a bit confusing, and a chart that gives you hg equivalents to svn commands will complicate things further.

I found a site that actually has a re-education section for the Subversioners willing to switch to Mercurial/Git: HGinit. If you’ve never used a VCS (distributed or otherwise) this is a great resource to get started.

I’ve set up my own Mercurial central repository: http://code.darkminds.org/hg/mozharness

EDIT: I’ve converted the mercurial repos to git: http://git.slacknet.ca

All I had to do was clone Mozharness from Mozilla’s central repository and it was ready to go. Compared to Subversion, setting up the published repository was actually quite painless and this tutorial can help you get one going too.

Oh and migration from SVN (or Git, or CVS, or Bazaar) couldn’t be easier:

$ hg convert http://nexusframework.svn.sourceforge.net/svn/nexusframework

Search Keyword Highlighting with Perl

Here’s how to break down a search string into unique keywords and highlight them using HTML:

# user input
my $search_string = "I really really need this";

# highlighting: get the _unique_ keywords
# use grep() to make sure we skip short keywords
my @kw = grep {length($_) > 2} split /[^\w]/,$search_string;

# expected output: keyword1|keyword2|keyword3
my $regex_keys = join "|", sort keys %{{ map { $_ => 1 } @kw }};

# wrap every match in an HTML tag; <b> is an assumption here, use whatever markup you like
$data =~ s!($regex_keys)!<b>$1</b>!ig;

print $data;

This obviously assumes you’ve received a line of text $data from a database of some sort.

The interesting part is:

keys %{{ map { $_ => 1 } @kw }}

… which creates an anonymous hash with all the elements of the array as keys with a dummy 1 value. The keys function in turn returns the keys and since map would’ve overwritten any duplicate keys, they are now unique!
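For comparison, here is the same trick sketched in Python, where a set plays the role of the anonymous hash. The highlight function and the <b> tag are my own choices for this example, not something the Perl snippet prescribes.

```python
import re


def highlight(search_string, data, min_length=3):
    """Wrap every occurrence of a search keyword in data with <b>...</b>.

    Keywords shorter than min_length are skipped, duplicates are removed
    with a set, and matching is case-insensitive.
    """
    words = [w for w in re.split(r"\W", search_string) if len(w) >= min_length]
    unique = sorted(set(words))
    if not unique:
        # An empty alternation would match everywhere, so bail out early.
        return data
    pattern = re.compile(
        "(" + "|".join(re.escape(w) for w in unique) + ")", re.IGNORECASE)
    return pattern.sub(r"<b>\1</b>", data)


if __name__ == "__main__":
    print(highlight("I really really need this", "I Really need THIS now"))
```

The early return covers a case worth keeping in mind in the Perl version too: if every keyword is filtered out, the joined pattern is empty.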

Cool right?