Re-writing vcstool to use existing Mozharness handlers

It turns out that the MozTools class I created in 0.1.0 was actually a duplicate of an existing Mozharness class: BaseScript. BaseScript and BaseConfig together handle logging and command-line arguments (much better than the crude MozTools), so naturally vcstool should inherit these handlers and make use of their robust methods.

Since vcstool no longer has direct access to optparse (BaseScript does the parsing and locking), the main issue became reading the value of --vcs and creating the VCS object BEFORE initializing the BaseScript functionality. If MercurialVCS were the class inheriting from BaseScript, it wouldn’t be possible to determine which VCS object to create without actually creating the MercurialVCS object first.

The solution was to create a wrapper class that handles the creation of the object. This class inherits from BaseScript, giving it access to the parsed command-line options, and provides a method that creates and returns the VCS object based on the --vcs switch:

class VCSWrapper(BaseScript):
    supported_vcs = ('hg',)

    def __init__(self, config_options=None):
        BaseScript.__init__(self, config_options=config_options)

    def get_vcs_object(self):
        vcs = self.config.get('vcs', None)
        if vcs == 'hg':
            return MercurialVCS(wrapper_obj=self)
        else:
            # str() keeps the message from raising TypeError when --vcs is unset
            self.error('Unsupported VCS: ' + str(vcs)
                       + ', please use one of the following: '
                       + ' '.join(self.supported_vcs))
            return None

So now vcstool would create a wrapper object to handle logging and command-line options as well as a vcs object (returned by VCSWrapper.get_vcs_object()) to do the checkouts and whatnot.
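The wrapper-as-factory idea can be tried out on its own, without Mozharness. This is only a rough sketch: FakeBaseScript is a hypothetical stand-in for BaseScript (it just stores a config dict and remembers error messages), and the VCS_CLASSES registry is my own illustration, not something vcstool contains.

```python
class FakeBaseScript(object):
    """Hypothetical stand-in for Mozharness' BaseScript (not the real class)."""

    def __init__(self, config=None):
        self.config = dict(config or {})
        self.errors = []

    def error(self, message):
        # The real BaseScript logs; this stand-in only remembers the message.
        self.errors.append(message)


class MercurialVCS(object):
    """Placeholder VCS class; it keeps a reference to the wrapper."""

    def __init__(self, wrapper_obj):
        self.wrapper = wrapper_obj


class VCSWrapper(FakeBaseScript):
    # Registry mapping the --vcs value to the class that implements it.
    VCS_CLASSES = {'hg': MercurialVCS}

    def get_vcs_object(self):
        name = self.config.get('vcs')
        vcs_class = self.VCS_CLASSES.get(name)
        if vcs_class is None:
            self.error('Unsupported VCS: %s, please use one of: %s'
                       % (name, ' '.join(sorted(self.VCS_CLASSES))))
            return None
        return vcs_class(wrapper_obj=self)


wrapper = VCSWrapper({'vcs': 'hg'})
vcs = wrapper.get_vcs_object()
print(type(vcs).__name__)
```

With a registry like this, supporting another VCS would mean adding one dictionary entry instead of another if/else branch.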

One more thing: to simplify the whole thing, vcstool now accepts two extra switches: --repo and --dest. These replace hgtool’s unnamed [repo] and [dest] arguments. Because of the way BaseScript handles unnamed arguments, this is the less complicated way of passing these values to vcstool.
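To show what named switches buy you, here is a minimal optparse sketch. It assumes nothing about how BaseScript wires its options internally (the post does not show that); build_parser and the help strings are made up for illustration.

```python
from optparse import OptionParser


def build_parser():
    # Hypothetical parser; the real vcstool gets its options through BaseScript.
    parser = OptionParser(usage='%prog --vcs VCS --repo REPO [--dest DEST]')
    parser.add_option('--vcs', dest='vcs',
                      help='version control system to use')
    parser.add_option('--repo', dest='repo',
                      help='repository to pull from')
    parser.add_option('--dest', dest='dest',
                      help='directory to check out into')
    return parser


options, leftover = build_parser().parse_args(
    ['--vcs', 'hg', '--repo', 'http://example.com/repo', '--dest', 'build'])
print(options.repo, options.dest, leftover)
```

Every value arrives under a name, so nothing depends on the order of positional arguments.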

If you want to see this changeset’s files, click here.

P.S.: We have opened bug 606963 to track the progress of this port. All the patches to go from vcstool version 0.0 to 0.1.1 (and above) will be attached there.

Porting hgtool to Mozharness

I’ve started work on Mozharness by moving hgtool functionality to Mozharness. The original tool lets you specify a number of options, including which revision/branch to get, which JSON properties file to use, and so on. This is, however, specific to Mercurial (hg) and we want to make it a bit more general. We want vcstool to use the appropriate class to handle pulling data from a repository.

So far:

I’ve created the MercurialVCS class, which will handle tasks such as updates and checkouts. Later on we will have a library for all of the popular VCSs (e.g. GitVCS, SubversionVCS, BazaarVCS) which vcstool can use based on the command-line specification.

$ python vcstool.py -h
Usage: vcstool.py [-s|--vcs VCS] [-p|--props-file] [-l|--log-level LEVEL] [-r|--rev REVISION] [-b|--branch BRANCH] repo [dest]
...

The log level can now be modified using the --log-level switch for vcstool.

I’ve also ported commands.py to the MozTools class. It is responsible for tasks such as run_cmd (execute a command, such as ‘hg clone …’) and get_output (execute a command and return its stdout/stderr). MercurialVCS (and others) make use of these commands during a pull. The MozTools class also creates the logging object, so VCS child classes can reference self.log.{info|debug|whatever}.
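For readers who have not seen commands.py, the two helpers behave roughly like the sketch below. These are simplified stand-ins written for this post, not the actual MozTools methods, and they skip the logging.

```python
import subprocess
import sys


def run_cmd(cmd):
    """Run a command (a list of arguments) and return its exit status."""
    return subprocess.call(cmd)


def get_output(cmd):
    """Run a command and return its combined stdout/stderr as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return output.decode('utf-8', 'replace')


# Use the running Python interpreter so the example does not depend on hg.
print(get_output([sys.executable, '-c', 'print("hello")']).strip())
```

A VCS class would call something like run_cmd(['hg', 'clone', repo, dest]) for the checkout and get_output when it needs to inspect what the command printed.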

You can follow the development progress on this blog (top-right) or here: http://git.slacknet.ca/mozharness.git/rss

Migrating from Subversion (svn) to Mercurial (hg)

I’ve been using Subversion as my primary VCS for quite a while now and I loved it. There were a couple of things that were odd but for the most part, it worked well for one developer. With the Mozharness project, I was introduced to a new VCS concept called Distributed VCS.

In a nutshell, each developer has their own repository with their own detailed history of changes (changesets). This makes it easier to merge code (with fewer problems) when it comes time to put the application into production. If you have been using CVS/SVN for your version control, Mercurial concepts may get a bit confusing, and a chart that gives you hg equivalents to svn commands will complicate things further.

I found a site that has a re-education section for Subversion users willing to switch to Mercurial/Git: HGinit. If you’ve never used a VCS (distributed or otherwise) this is a great resource to get started.

I’ve set up my own Mercurial central repository: http://code.darkminds.org/hg/mozharness

EDIT: I’ve converted the mercurial repos to git: http://git.slacknet.ca

All I had to do was clone Mozharness from Mozilla’s central repository and it was ready to go. Compared to Subversion, setting up the published repository was quite painless, and this tutorial can help you get one going too.

Oh, and migration from SVN (or Git, or CVS, or Bazaar) couldn’t be easier once the bundled convert extension is enabled in your .hgrc:

$ hg convert http://nexusframework.svn.sourceforge.net/svn/nexusframework

Search Keyword Highlighting with Perl

Here’s how to break down a search string into unique keywords and highlight them using HTML:

# user input
my $search_string = "I really really need this";

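# highlighting: get the unique keywords
# use grep() to make sure we skip short keywords
my @kw = grep {length($_) > 2} split /[^\w]/,$search_string;

# expected format: keyword1|keyword2|keyword3 (here: need|really|this)
my $regex_keys = join "|", sort keys %{{ map { $_ => 1 } @kw }};

# the HTML wrapped around the match was lost from this listing;
# <b>...</b> is an assumed stand-in for the original markup
$data =~ s!($regex_keys)!<b>$1</b>!ig;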

print $data;

This obviously assumes you’ve received a line of text $data from a database of some sort.

The interesting part is:

keys %{{ map { $_ => 1 } @kw }}

… which creates an anonymous hash with every element of the array as a key and a dummy value of 1. keys then returns those keys, and since map would have overwritten any duplicates, they are now unique!
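For comparison, here is a rough Python version of the same trick, where a set does the job of the hash keys. The function name, the min_length parameter and the <b> tag used for highlighting are my own choices for the sketch.

```python
import re


def highlight(text, search_string, min_length=3):
    # Split on non-word characters and drop short keywords.
    words = [w for w in re.split(r'\W+', search_string) if len(w) >= min_length]
    # A set removes duplicates, much like the hash keys in the Perl version.
    keywords = sorted(set(words))
    if not keywords:
        return text
    pattern = re.compile('(' + '|'.join(re.escape(k) for k in keywords) + ')',
                         re.IGNORECASE)
    # The <b> tag is an assumed choice of highlight markup.
    return pattern.sub(r'<b>\1</b>', text)


print(highlight('You really need to read this', 'I really really need this'))
```

The duplicate "really" in the search string ends up in the pattern only once, which is the whole point of the exercise.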

Cool right?

Drive Backup over SSH Compressed with Gzip

If you’ve worked hard to configure your Linux machine and can’t afford to lose it due to drive failure, try creating an image of it using dd periodically.

It wouldn’t make much sense to store the image of the drive on the drive itself, but luckily dd reads and writes plain streams, so you can combine it with ssh and gzip to store your stuff off-site.

# dd if=/dev/sda | ssh user@backup.remotehost.com dd of=/backup/drive.img

At this point the drive.img file is uncompressed and as large as the drive itself. If you’re going over the internet this will take a really long time and kill your bandwidth.

Try this:

# dd if=/dev/sda | gzip | ssh user@backup.remotehost.com dd of=/backup/drive.img.gz

Notice the gzip pipe right before the ssh command, which compresses the stream before it gets sent to backup.remotehost.com.
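What gzip does in that pipeline is compress the data chunk by chunk as it flows past, so the whole image never has to fit in memory. A minimal Python sketch of the same idea follows; compress_image is a hypothetical helper, and it works on a throwaway file rather than a real device.

```python
import gzip
import os
import shutil
import tempfile


def compress_image(source_path, target_path, chunk_size=1024 * 1024):
    """Copy source_path into a gzip file without loading it into memory."""
    with open(source_path, 'rb') as source:
        with gzip.open(target_path, 'wb') as target:
            shutil.copyfileobj(source, target, chunk_size)
    return os.path.getsize(target_path)


# Demonstration on a throwaway file instead of a real device such as /dev/sda.
workdir = tempfile.mkdtemp()
raw = os.path.join(workdir, 'drive.img')
with open(raw, 'wb') as handle:
    handle.write(b'\0' * (4 * 1024 * 1024))   # 4 MiB of zeros, like empty space
compressed_size = compress_image(raw, raw + '.gz')
print(os.path.getsize(raw), compressed_size)
shutil.rmtree(workdir)
```

Unused, zero-filled regions compress extremely well, which is a big part of why drive images tend to shrink so much.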

You can also tell dd to create an image of a specific partition only (specify /dev/sda2 as the input).

To restore a drive image, log into backup.remotehost.com and type:

# dd if=/backup/drive.img.gz | ssh root@livecd.host "gzip -d | dd of=/dev/sda"

You should only restore to a drive that is not in use (possibly an OS running off of a Live CD?).

Warning: These commands are not idiot-proof and it’s all too easy to wipe the wrong drive!

I was able to reduce my drive.img.gz by almost 75% using gzip! You may find other compression tools to be better or worse depending on the data you are imaging.

Recently I was working on a project where we configured secure MySQL replication between Fedora host A and Fedora host B. There was no RAID or any other kind of redundancy, so, being paranoid, I quickly imaged drive B to drive A and vice versa.

Warning: keep in mind that creating an image of an OS that uses logical volumes (LVM) may not restore to a new drive properly.

How to speed up RPM building

If you’ve run into a situation where you wanted to repeatedly test some code in your .spec file after the %build process, you know you have to wait for the application to re-compile every time. For larger RPMs this waiting compounds to a lot of lost time.

During my build for CodeLite I ran into several issues, mostly when cleaning up the rpmlint messages from the resulting package. The average build time was around 16 minutes with the optimal make -j value. It doesn’t sound like much but if you have to compile 5 or 6 or 7 times, it adds up.

The .spec macros clean the working directories every time you execute rpmbuild -ba, and for good reason. Ideally you should re-compile every time to ensure a repeatable build environment. However, if you are impatient like me, try the following:

I assume you…

  • have already tested and confirmed your package compiles successfully with rpmbuild -ba
  • are confident the build process can be skipped safely without compromising the rest of the .spec file
  • understand that this is for testing purposes only and the .patch created below will not be included in the final .rpm
  • understand this particular .patch file should only be used on the system on which it was created

First of all, expand the source tarball to two separate directories: pkg and pkg.compiled.

Navigate to the pkg.compiled directory and go through the configure and make process, mirroring the %build process from your working .spec file. You do not need to ‘make install’ at this point.

Change the directory to the pkg.compiled parent and run the following:

$ diff -Naur pkg pkg.compiled > ~/rpmbuild/SOURCES/app-version-skip_make.patch

You will be using this patch file to “trick” the %build process into skipping most of the compiling process. I say most because make may detect inconsistencies in the source directory after the patch and go through some re-linking.

The next step is to add this patch to your .spec file:

...
Patch99:    app-version-skip_make.patch
%prep
%setup -q
%patch99 -p1
...
%build
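#%%configure
make
...

Notice I’ve commented out %configure because the patch file takes care of this too. The macro is written as %%configure inside the comment because rpm expands macros even in commented lines, and since %configure expands to several lines, a plain #%configure would still run most of them.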

You can now go through the rpmbuild -ba process again and you should see a very significant speed increase.

Using this, I was able to make the CodeLite build roughly four times faster!
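Speedup factors and percentage reductions are easy to mix up, so here is a small sketch of how they relate. The 16-minute baseline is the average build time mentioned earlier; the factor of 4 is an assumed, illustrative value rather than a careful measurement.

```python
def percent_reduction(speedup):
    """Percentage of build time saved for a given speedup factor."""
    return (1.0 - 1.0 / speedup) * 100.0


baseline_minutes = 16.0        # average full build time quoted above
assumed_speedup = 4.0          # illustrative value, not a measurement

print(baseline_minutes / assumed_speedup)   # minutes per patched build
print(percent_reduction(assumed_speedup))   # percent of time saved
```

However large the speedup factor gets, the time saved stays below 100%.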

Good luck, and remember to remove this patch from your final RPM.

Creating a .spec-ial RPM for CodeLite 2.7.0

I have decided to make an RPM for CodeLite since there isn’t one in the Fedora repositories. The source tarball contains a .spec file; however, rpmlint produces a bunch of errors and warnings on the resulting .rpm, so I started from scratch.

Having built this package manually already, I knew what the prerequisites were. These are also listed in the *.tar.gz/BuildInfo.txt. See my previous post for more information on compiling manually.

The rpm built successfully after making the necessary modifications to the %files directive and all that was left was rpmlint-ing the resulting package and .spec file.

I got a bunch of warnings and errors but the most cryptic was:

codelite.i386: W: unstripped-binary-or-object /usr/lib/codelite/*.so

This was repeated for every object and binary produced by make. Usually if you tell rpmlint to be more informative using the -i switch: rpmlint -i *.rpm, you can follow the instructions or clues to fix each error/warning. In this case the -i switch had no information for that message-id and Google produced no answers. I found a Common Rpmlint Issues page on the Fedora wiki, but there was nothing about “unstripped objects”. I talked to someone on #fedora-devel@Freenode and they pointed me to the strip utility.

I narrowed the *.so and binary objects to 3 strip lines and added them to the %install section as follows:

%install
rm -rf $RPM_BUILD_ROOT
make install DESTDIR=$RPM_BUILD_ROOT
strip -s -v %{buildroot}%{_bindir}/%{name}{,_indexer,_cppcheck}
strip -s -v %{buildroot}%{_libdir}/%{name}/*.so
strip -s -v %{buildroot}%{_libdir}/%{name}/debuggers/*.so

Make sure you add the strip lines after your rm -rf and make commands or the stripping will fail.

I also got a bunch of devel-file-in-non-devel-package messages from rpmlint, but these were aimed at a templates/* folder. They seemed to be example projects and thus not-critical (I should do further testing to ensure they are actually non-critical).

One error I got was the world-writable message on codelite-icons.zip in %{_datadir}/codelite/. The vendor configure script explicitly set the permissions to 0666 on this file. This was interesting because it means that they were either careless when capturing files for the configure script’s initial creation, or the application actually needs access to modify this zip file under non-privileged user context. The latter is more likely, however it seems like a bad design choice as this could cause conflicts between users’ distinct working environments.

There were also backup files (files that end in ~) left behind in the templates/* folder, prompting rpmlint to give backup-file-in-package error messages. If you want to remove all these files from your install, add this to the %install section of your .spec file:

find %{buildroot} -type f -name "*~" -exec rm -f '{}' \;
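If you would rather check what such a cleanup is going to delete before wiring it into the .spec file, the same walk is easy to reproduce in Python. This is just a sketch: remove_backup_files is a hypothetical helper and the temporary directory stands in for the build root.

```python
import os
import tempfile


def remove_backup_files(root):
    """Delete files ending in '~' below root; return their paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith('~'):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)


# Demonstration in a temporary directory standing in for the build root.
buildroot = tempfile.mkdtemp()
for name in ('main.cpp', 'main.cpp~'):
    open(os.path.join(buildroot, name), 'w').close()
print([os.path.basename(p) for p in remove_backup_files(buildroot)])
print(sorted(os.listdir(buildroot)))
```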

I, however, decided to do away with this problematic templates directory by preventing the Makefile from copying it during its install phase. To do this I had to modify the original configure script and comment out the appropriate lines (see attached .patch file).

Since modifying the vendor/upstream tarball is against best practices, the only way to do this is to create a patch and let rpmbuild take care of the modifications on the fly.

$ diff -u configure configure.fixed > ~/rpmbuild/SOURCES/codelite-2.7.0.4375-configure_fix.patch

The patch should be unique so as to not conflict with current or future patches. The next step is to modify the .spec file telling it to make the necessary modifications before %configure creates the Makefile:

...
BuildRoot:      %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
Patch0:        %{name}-%{version}-configure_fix.patch
...
%prep
%setup -q
%patch0

%build
%configure

%patch0 is a macro for the patch command and therefore accepts the usual switches such as -pN (where N is the number of directories to strip from the file being patched, see manpage). I had created the patch in the same directory as the original configure (after %setup happens the working directory is %{_builddir}).
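The effect of -pN is easiest to see on a concrete path. The sketch below mimics the basic behaviour (drop the first N slash-separated components of the file name found in the patch); strip_components is a hypothetical helper, the example path is made up, and corner cases such as repeated slashes are ignored.

```python
def strip_components(path, count):
    """Drop the first `count` slash-separated components, like patch -pN."""
    parts = path.split('/')
    if count >= len(parts):
        raise ValueError('cannot strip %d components from %r' % (count, path))
    return '/'.join(parts[count:])


for level in (0, 1, 2):
    print(level, strip_components('codelite-2.7.0.4375/sdk/configure', level))
```

A patch created from the parent of the source directory typically needs -p1, while one created next to the file itself, as here, has no leading directory to strip.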

After all this, rpmlint was pretty clean. The only outstanding warning was no-manual-page-for-binary, and since the vendor tarball did not contain any manpages, I can safely ignore this.

P.S.: If rpmlint complains about not having a dictionary for en_US (enchant-dictionary-not-found) install aspell and aspell-en.

Download codelite-2.7.0.4375-configure_fix.patch

SSD TRIM in Windows 7

If you’ve recently purchased an SSD you should probably be aware of the write performance issues that arise over time. TRIM was designed to keep your drive’s write performance consistent throughout the life of the drive. What is TRIM?

Windows 7 supports TRIM natively and you probably won’t need to mess around with it. However, if you want to check whether TRIM is turned on or off, use the following command:

C:\>fsutil behavior query disabledeletenotify
DisableDeleteNotify = 0

0 – TRIM is enabled
1 – TRIM is disabled
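Because the flag is inverted (it says whether delete notifications are disabled), it is easy to read it backwards in a script. Here is a small sketch that interprets the output line shown above; trim_enabled is a hypothetical helper and it assumes the line contains "DisableDeleteNotify = 0" or "DisableDeleteNotify = 1".

```python
import re


def trim_enabled(fsutil_output):
    """Return True/False from 'DisableDeleteNotify = N', or None if absent."""
    match = re.search(r'DisableDeleteNotify\s*=\s*([01])', fsutil_output)
    if match is None:
        return None
    # The setting is inverted: 0 means delete notifications (TRIM) are on.
    return match.group(1) == '0'


print(trim_enabled('DisableDeleteNotify = 0'))
print(trim_enabled('DisableDeleteNotify = 1'))
```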

Windows 7, for example, will query the drive’s RPM and if it responds with 0, it assumes the drive is an SSD. It then turns on TRIM and disables disk defrags as they are no longer needed.

If your system reports TRIM as disabled you can enable it by setting the property to 0:

C:\>fsutil behavior set disabledeletenotify 0
DisableDeleteNotify = 0

… or if you want to showcase TRIM vs. No-TRIM:

C:\>fsutil behavior set disabledeletenotify 1
DisableDeleteNotify = 1

If you’re in the market for an SSD, remember you get what you pay for, so if that drive is relatively cheap make sure it supports TRIM. Also keep in mind that not all operating systems support TRIM, some may need patches and some may not support it at all for the time being.

The performance improvement is significant when deploying or capturing on an SSD. I highly recommend this upgrade for your lab environment.

SSH RSA1 Publickey Authentication Issues

Recently I’ve had to do some basic publickey authentication using Fedora 13 x86/x64, so I started by creating the id file using ssh-keygen -t rsa. After copying the *.pub file to the remote host and making sure the permissions are properly set, I tried connecting.
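Permissions are the usual suspect when publickey authentication fails, so they are worth ruling out first. As a rough sketch, this is the kind of check the ssh client applies to a private key: any access for group or others is considered too open. too_open is a hypothetical helper, the example assumes a POSIX system, and a temporary file stands in for ~/.ssh/id_rsa.

```python
import os
import stat
import tempfile


def too_open(path):
    """True if group or others have any access to path (mode & 0o077)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) != 0


# Demonstration on a temporary file standing in for ~/.ssh/id_rsa.
handle, key_path = tempfile.mkstemp()
os.close(handle)
os.chmod(key_path, 0o644)
print(too_open(key_path))
os.chmod(key_path, 0o600)
print(too_open(key_path))
os.remove(key_path)
```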

The pubkey authentication failed and the remote /var/log/secure wasn’t showing anything interesting. On the client I typed the following to troubleshoot further:

slave$ ssh -vvv user@master
Connecting to master…
...
debug1: Connection established.
debug1: identity file /home/user/.ssh/id_rsa type -1
debug3: Not a RSA1 key file /home/user/.ssh/id_rsa.
debug2: key_type_from_name: unknown key type '---- BEGIN'
debug3: key_read: missing keytype
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
...
debug2: key_type_from_name: unknown key type '---- END'
debug3: key_read: missing keytype
...
[pubkey auth fails]
slave$

I looked for an answer on Google, but nothing seemed to help. I was puzzled because I had accomplished the exact same thing on the exact same distro days earlier.

Well it turns out there is a bug in the openssh 5.4p1 (build 1) package that is shipped with the live CD image.

This fixed the issue immediately:

slave# yum update openssh

OpenSSH 5.4p1 (build 3) will be downloaded and everything should be fine.

To figure out exactly what fixed it, I downloaded the source RPM for build 3 and extracted it. The .spec file’s %changelog referenced bug #595935, which documents the same problem I encountered.

If you are building from source I’ve attached the .patch to this post. Good luck!

Download OpenSSH patch #595935

Building CodeLite and finding its dependencies

I found CodeLite (2.6.0.4189 as of today) on sourceforge.net and it seems like a useful tool for programming under Linux. It’s got the goodness of syntax highlighting for a bunch of languages:

C/C++
Java
Perl
XML
Makefile
Lua
Diff files
PHP
JavaScript
Python
HTML
ASP

It’s cross-platform and fairly easy to compile and install. It can debug (built-in gdb support), compile and the highlighting for Makefile/Perl/C/Python/Diff should prove quite useful for SBR. See their website for more info on the features.

I’ve done the following under Fedora 13 x86, but the process should be similar for all the other popular platforms:

$ wget [some sourceforge mirror]/codelite-2.6.0.4189.tar.gz
$ tar xvf codelite-2.6.0.4189.tar.gz
$ cd codelite-*
$ ./configure
$ make
# make install
$ codelite

That’s it. Now obviously there are dependencies, and a C++ compiler is needed to do the build (g++ to be exact). Fedora 13 and the wonderful yum make this process easy.

Download and install the following packages:

wxGTK-devel (you will need wx-config to build the Makefile)
gcc-c++ (the g++ binary needed to ‘make’)

I looked for a *g++* package in the repositories but couldn’t find anything. I thought it was odd, since a GNU C++ compiler should be fairly common, so I did this:

# yum install "*/*g++"

This was great and all but it listed something like 47 packages. I obviously did not need that many and looking through the list, gcc-c++ stood out.

I figured that when looking to install dependencies it’s a good idea to keep them as lean as possible. This way I can figure out exactly which package fixed which issue, as well as avoid or at the very least minimize conflicts with other applications.

wxGTK-devel is the package name in Fedora 13’s repositories, so it should be easy to install. Before running ./configure, make sure wx-config is in $PATH.
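A quick way to confirm the build tools are reachable before kicking off ./configure is to look them up on $PATH. The sketch below uses Python 3’s shutil.which; missing_tools is a hypothetical helper, and the tool list is simply the ones mentioned in this post.

```python
import shutil


def missing_tools(names):
    """Return the tools from names that cannot be found on $PATH."""
    return [name for name in names if shutil.which(name) is None]


print(missing_tools(['wx-config', 'g++', 'make']))
```

An empty list means everything was found; anything printed still needs to be installed or added to $PATH.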

It also looks like CodeLite is trying to sync with a VCS using svn at start-up but it is not a requirement.

This is the final product: