2007-05-18 spamcop
Some people say SpamCop is good, others say it's bad. Personally, I still don't know what to think. I run a home mail server, and I often have to accept mail from dubious sources so that I can exchange mail with friends who may be using something that's not run quite so well. Yes, they need to ask their providers to fix the problem, but I also need to communicate with them. There is a happy medium, I suppose.
Anyway, back to the matter in hand. What I have found of great use is bogofilter, which I have written about before. This is a great little binary that does Bayesian classification of spam and ham mail and keeps a statistical record of what it has seen. For a while now I have had a cron job that scans a 'spam to learn' and a 'ham to learn' directory of my Maildir. But today I thought, "Hang on! Why not send this spam off to SpamCop for processing at the same time?" Sounds simple enough, doesn't it? It sure was.
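#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64;   # only needed if the encode_base64 line is re-enabled

# Locate bogofilter on the PATH.
my $bogo=`which bogofilter`;
my $spam=$ENV{'HOME'};
my $sc = "submit.72uCiA8LhYgoRhMw\@spam.spamcop.net";
my $from = "ed-sc\@is-cool.net";
$bogo =~ s/\n//;
die( "bogofilter not found in PATH" ) if( $bogo eq "" );
$spam =~ s/\/$//; # remove trailing slash, if it exists

# Maildir folders: ham to learn, where learned ham is moved to, spam to learn.
my $ham = "$spam/Maildir/.jokes.ham_not_spam/cur";
my $mvham = "$spam/Maildir/cur/";
$spam .= "/Maildir/.jokes.spam_to_learn/cur";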
sub send_to_spamcop {
    my $file = shift;

    # Read the whole message into memory.
    open( my $in, "<", $file ) or
        die( "cannot open file $file: $!" );
    my $file_contents = join( "", <$in> );
    close($in);

    #my $msg = encode_base64( $file_contents, "\n" );
    my $msg = $file_contents . "\n";
    my $boundary = "MP_XwhykYAupd_k5MVqakW.wfZ";

    my $headers = "To: <$sc>\nFrom: <$from>\n" .
        "Mime-Version: 1.0\n" .
        "Content-Type: multipart/mixed; " .
        "boundary=$boundary";

    # Two parts: an empty text part, then the spam as message/rfc822.
    # A blank line separates each part's headers from its body.
    my $wire = "$headers\n\n" .
        "--$boundary\n" .
        "Content-Type: text/plain; charset=US-ASCII\n" .
        "Content-Transfer-Encoding: 7bit\n" .
        "Content-Disposition: inline\n" .
        "\n" .
        "\n" .
        "--$boundary\n" .
        "Content-Type: message/rfc822\n" .
        "Content-Transfer-Encoding: 7bit\n" .
        "Content-Disposition: inline\n" .
        "\n" .
        "$msg\n\n" .
        "--$boundary--\n\n";

    # List-form pipe open, so no shell is involved.
    open( my $out, "|-", "/usr/sbin/sendmail", "-f$from", $sc ) or
        die( "cannot run sendmail: $!" );
    print( $out $wire );
    close($out);
}
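sub updatespam {
    my $file = shift;
    my $action = shift;

    # -s registers the message as spam, -n as non-spam; -I names the input file.
    # The list form avoids passing the file name through a shell.
    my @cmd = ( $bogo, "-" . ( $action eq "spam" ? "s" : "n" ), "-I", $file );
    open( my $out, "-|", @cmd ) or die( "cannot run $bogo: $!" );
    while( defined( my $i = <$out> ) ) {
        print( STDERR "$i");
    }
    close($out);

    if( $action eq "spam" ) {
        # Report the spam, then delete it.
        send_to_spamcop($file);
        unlink($file);
    }
    else {
        # Move the learned ham back into the inbox.
        my $fn = $file;
        $fn =~ s/^.*\///;
        rename($file, "$mvham$fn") or die( "Cannot move file $file: $!" );
    }
}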
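# Learn (and report) everything in the spam folder.
opendir( my $d, $spam ) or die( "cannot open $spam: $!" );
while( defined( my $f = readdir($d) ) ) {
    my $fn = "$spam/$f";
    next if( -d $fn );   # skips . and .. as well
    print( "$fn\n" );
    updatespam($fn, "spam");
}
closedir($d);

# Learn everything in the ham folder.
opendir( $d, $ham ) or die( "cannot open $ham: $!" );
while( defined( my $f = readdir($d) ) ) {
    my $fn = "$ham/$f";
    next if( -d $fn );
    print( "$fn\n" );
    updatespam($fn, "ham");
}
closedir($d);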
2007-05-17 session manager
I came across a nice extension for Firefox. It irked me that Opera has been saving my sessions on exit for a long time, while Firefox only does this if you `kill` it or power the box off. After some googling I found Session Manager, which offers a bit more flexibility than Opera's default save on close, such as saving the tabs that you recently closed too. A very useful tool to have indeed!
In other news, I found an article about MS Word where the end user cannot save in a format other than Word! How narrow minded!
2007-05-09 weird problem
Came across a problem today with Apache 1.3 where it ran out of file descriptors, and I think my understanding of the problem got in the way of finding a solution. In my eyes the /proc/ file system contains real-time values that restrict system processes. However, it seems that it was not until I killed the parent process (service) that "apachectl restart" could pick up the new value of /proc/sys/fs/file-max.
Not only this but the reported open file descriptors (using lsof) did not count as high as the kernel limit. To be on the safe side I inserted ulimit -n 65535 prior to executing apache in the service run file.
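One possible explanation for the mismatch: file-max is a system-wide ceiling, while each process also has its own descriptor limit (the one ulimit -n sets), which it inherits from its parent. The following is only a rough sketch for comparing the two views on a Linux box; the function names are my own invention for illustration, and it assumes /proc/sys/fs/file-nr and /proc/<pid>/fd are readable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse the contents of /proc/sys/fs/file-nr: allocated handles,
# allocated-but-unused handles, and the system-wide maximum (file-max).
sub parse_file_nr {
    my $text = shift;
    my ( $allocated, $unused, $max ) = split( ' ', $text );
    return { allocated => $allocated, unused => $unused, max => $max };
}

# Count the descriptors one process holds by listing /proc/<pid>/fd.
# Returns undef if the directory cannot be read (no such pid, no permission).
sub count_fds {
    my $pid = shift;
    opendir( my $d, "/proc/$pid/fd" ) or return undef;
    my @fds = grep { /^\d+$/ } readdir($d);
    closedir($d);
    return scalar(@fds);
}

# System-wide view.
if( open( my $in, "<", "/proc/sys/fs/file-nr" ) ) {
    my $nr = parse_file_nr( scalar(<$in>) );
    close($in);
    print( "system: $nr->{allocated} allocated, limit $nr->{max}\n" );
}

# Per-process view, using this script's own pid as a stand-in.
my $mine = count_fds($$);
print( "process $$: $mine descriptors open\n" ) if( defined($mine) );
```

Pass the pid of the Apache parent to count_fds (as root) to see how close that one process is to its own limit, rather than to the kernel's.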
On my path to finding (read fudging) a solution I found this article which promises much "performance tuning" information. As the article states, it's a bit out of date, but it's all good sysadmin knowledge.
2007-05-03 dealing with the spammers (again)
It's interesting to see a spammer return to the systems at work. This time all my scripts were waiting to see his tell-tale signs of activity. I was alerted about 2 minutes after he dropped some spam and promptly took action.
One thing I advise fellow administrators is to get their Perl skills in order. Perl is ideal for anyone who has a few tasks to do that require some manipulation of text and, as we know, UNIX is heavily reliant on text configuration. This is perhaps one of the biggest differences between DOS/Windows and the nixes. Most commands in UNIX operating systems output text, so it is very useful to have some programs that can manipulate it.
An exception to this is the qmail queue data files. These have to be processed as text. When the spammer drops his egg I have scripts that go through the mail directories and count the number of messages that have a large number of recipients.
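A rough sketch of the recipient count is below. It is not my production script. It assumes the stock qmail queue layout, where each file under queue/local and queue/remote holds NUL-terminated recipient records beginning with T (to do) or D (done). The queue path and the threshold of 50 are placeholder values, and count_recipients is a name made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count recipient records in the raw contents of one envelope file.
# Records are NUL-terminated and start with T (to do) or D (done).
sub count_recipients {
    my $data = shift;
    my $n = 0;
    foreach my $rec ( split( /\0/, $data ) ) {
        $n++ if( $rec =~ /^[TD]/ );
    }
    return $n;
}

my $queue     = "/var/qmail/queue";   # assumed default location
my $threshold = 50;                   # hypothetical cut-off

# Walk the split directories and report messages with many recipients.
foreach my $kind ( "local", "remote" ) {
    foreach my $file ( glob("$queue/$kind/*/*") ) {
        open( my $in, "<", $file ) or next;
        binmode($in);
        my $data = join( "", <$in> );
        close($in);
        my $n = count_recipients($data);
        print( "$file: $n recipients\n" ) if( $n >= $threshold );
    }
}
```

The queue is only readable by root and the qmail users, so this needs to run with suitable privileges.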
There are many other things that I have noticed about this single spammer. Although he always uses a different IP block, he does have some common traits in his behaviour. Anyone who has read the book The Cuckoo's Egg will know what I am talking about.
Unfortunately there is still some tidying up to be done but life gets in the way and I have things to do right now.
2007-05-01 upstart
Recently I had a bit of confusion with the new init system. These things happen... Things get changed, I don't pay attention... Anyone reading the messages would know what is going on... I was in a bit of a rush and skipped all that.
During an upgrade DJB's svscan was omitted from the boot sequence. On a working system it's not really difficult to figure out how to replace the inittab entry
SV:123456:respawn:/command/svscanboot
with the upstart equivalent, but this is how I did it anyway. First, create the file /etc/event.d/svscan containing something like the following:
start on runlevel-2
start on runlevel-3
start on runlevel-4
start on runlevel-5
stop on shutdown
exec /command/svscanboot
respawn
Going through it line by line (near enough): we're telling upstart that on runlevels 2, 3, 4 and 5 the "exec" command should run. On shutdown (runlevels 0 and 6) it should be killed. The exec line runs /command/svscanboot; should that die, respawn will restart it.
Now to me, that's a whole lot more config than the original inittab entry that we had to begin with! Furthermore, the daemontools package from DJB handled all this quite well. We don't really care about runlevels all that much these days; besides startup, "running" and shutdown, we don't really spend much time thinking about them.
Personally I think the daemontools package handles most of the things that upstart is providing. I see from their documentation that the project looks like it is going to become a be-all-and-end-all for system process starting, but what happens when the init process has a problem of its own? Daemontools just runs whatever should be running and is symlinked in /service; upstart looks like a much more complex system to me.