Mornings at the Spam Bank

“Ahhh, there’s nothing like the sweet smell of spam in the morning…”

We have a morning ritual around here that’s been going on for years, but in the last few months it’s become what I might dare to call “slightly fun”.

Every morning, no matter where I am in the world, as long as I have a computer and net access, I pull up my shell account & check the mail. I start off typing in more .mailog. That gives me sort of a scrolling screen of all the mail that was headed for my domain the past 24 hours and what happened to it.

“Want to look younger?….ayup; get your Boom Box at No Charge….ayup; Hot Sexy Videos 18 +….ayup; Cheer me Up!….ayup; Finally. Buy Viagra at a discount….ayup; You blocked my ICQ…Illegal blonde studs….Size Does Matter!…..ayup, ayup, ayup.”

There’s usually between 1 and 2 thousand of these little puppies there every day and I scroll the mailog to make sure that no important mail got sent the wrong way. It usually doesn’t, since my domain host started implementing SpamAssassin. Which is where the slightly fun part comes in.

SpamAssassin is an open source application that now incorporates Bayesian filtering. Someone else can explain what that does, but what it means to me is that I now have two files full of email that I maintain. One has over 1000 spams in it; one has over 1000 good emails in it. New spams and new good emails are added every day, usually automatically, and when the files get too large, I delete them and start over. The Bayesian part of the deal is that SpamAssassin “learns” from both of those files. As more and more new mail comes into each one, SpamAssassin gets smarter & smarter, at least in terms of my personal mail.
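For the curious, here is a rough sketch of what “learning” from a spam pile and a good pile can look like. It is a toy naive Bayes word counter in Python, not SpamAssassin’s actual algorithm; the class name ToyBayesFilter, the tokenizer, and the sample messages are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class ToyBayesFilter:
    """Toy naive Bayes filter (hypothetical; NOT how SpamAssassin does it)."""

    def __init__(self):
        # One word tally and one message tally per pile of mail.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Add one message to the 'spam' or 'ham' pile."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that a message is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior: how much of the learned mail was in this pile (add-one smoothed).
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing so an unseen word never zeroes out the score.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training mail, loosely in the spirit of the subject lines above.
bayes = ToyBayesFilter()
bayes.learn("Buy Viagra at a discount", "spam")
bayes.learn("Hot videos, get your boom box at no charge", "spam")
bayes.learn("Lunch on Friday? Let me know", "ham")
bayes.learn("The domain renewal invoice is attached", "ham")

print(bayes.spam_probability("discount Viagra"))
print(bayes.spam_probability("invoice for lunch"))
```

The more mail lands in each pile, the better the word tallies reflect what my own spam and my own good mail look like, which is the “gets smarter” part. A real filter does a good deal more than count words, so treat this as a cartoon.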

Every morning, when I’m sure that any false positives and false negatives have been moved into their proper files, I type:
sa-learn --spam --mbox mail/spam.mail
Then I type:
sa-learn --ham --mbox mail/good.mail
That’s slightly fun to me. The process used to be more of a hair-pulling chore as I examined each spam that made it through my old filters and tried to figure out why my old filters thought I needed a larger penis or Jenny “Just saying hey!”.

Out of the 1 to 2 thousand emails each day, all but about 50 are spam. Why do I get that much? Well, managing a domain that’s been around since 1994 and has several members is probably one reason. Refusing to post with a munged address on Usenet or web forums is probably another. Tradition is tradition. Whatever, it’s too much to “just hit the delete key” and I’m grateful for a shell account with procmail installed.

On a related note, in an attempt to learn the finer points of SpamAssassin, I’ve been frequenting various anti-spam forums. One of them recently reminded me of how the net used to be and it was heartwarming to see a return to the old days. I didn’t read all thousand or more posts, but the gist of it was that some anti-spammers got hold of a big fish and went to work on him. It was a joyous reminder of how the folks on the net would band together to “take care of business”. One group was ripping his entrails out while another team went to work on his legs with an ax. No, not literally (standard Homeland Security disclaimer).

Jeezus, I’m a donkey, not a psycho.

posted by elburro @ 09:25:14

5 thoughts on “Mornings at the Spam Bank”

  1. where’d the quoted text go? Censoring this site, burro?!? Sheesh. Cobb County is negatively impacting you. (Figures that you’ve transitioned to corporatespeak as well)(Pentecostal Republicanism, too?)

    I meant to ask whether SpamAssassin would work for those of us not running a mail server or without a shell account.

    I also inquire if, upon reflection, perhaps you should reconsider that “not a psycho” statement.
    Fri.July.03 @ 14:26:46

  2. Quoted text? I’m not sure what you meant. I only censor myself when it gets too weird and possibly incriminating. Since I’ve moved to Cobb County, I’m basically a Republican. If I want to vote, that is.
    It’s kind of like in grammar school when the teachers told us that they were technically able to vote in Communist Russia except that there was only one party on the ballot….

    As far as SpamAssassin, you could check out the site; I’m not sure if there’s a Win-version that you can stick on a client machine. I’m attempting to set it up at home with sendmail on an old 266 MHz Linux box I had lying around. I am trying anything to keep from “tweaking” the new WinXP box. It’s the one thing that actually works around here.

    As far as the psycho statement, don’t worry, I’ve been reconsidering it all morning. At this rate I’ll get another post up by October.
    Fri.July.03 @ 16:39:46

  3. Spam. It’s overwhelming us all. Procmail isn’t an option for me (for more than one reason), so I rely on the small chunk that Earthlink’s “Spaminator” might trim before it hits my mailbox, and then use Mailwasher (mailwasher.net), which identifies most spam while it’s still on the server (as well as viruses), deletes it, and sends a bounce for good measure. Only then do I actually download e-mail to my computer.

    I couldn’t tell you how much I get per day. But for every legitimate e-mail I download, I surely delete 10 on the server.

    “I am trying anything to keep from ‘tweaking’ the new WinXP box. It’s the one thing that actually works around here.”

    Teehee. I can’t hardly break mine, so far. But I’m about to try again, now that Earthlink won’t let us have 3 IPs per DSL connection. Have to go get one of those Netgear routers and see if I can mess up both my computer and Susan’s more fragile Win98 system.

    “At this rate I’ll get another post up by October.”

    Stubborn donkey.
    Sun.July.03 @ 15:26:48

  4. Mailwasher’s a good app, since it does its stuff directly on the server and uses the major spam databases. SpamPal (spampal.org?) is the other.

    The Netgear router will be so simple you won’t believe it. One thing you actually need on both computers though is to “share” folders or drives (right-click/Sharing/properties). Also trivial, but it can lead to major hairpulling if overlooked.

    Speaking of Earthlink, one of their other new policies is causing a shitstorm over at the ELNK forum on DSLreports. Apparently they finally tested and brought online some serious fast NNTP servers. They had the best Usenet by far for ISP-provided NNTP. Now that the testing is over and it’s all set up, they just limited the monthly download quota to 1.5 GB. That would be fine for me, but apparently a bunch of people are downloading 20 to 40 gigs of legal Grateful Dead and Phish files as well as a bunch of legal open source software and this will seriously cramp their style.
    Sun.July.03 @ 20:05:48
