totient: (Default)
[personal profile] totient
It's not like I get all that much spam. But some of it causes Eudora 4.2 to crash, and I don't feel like using the adware version or paying all over again for a program that's fairly similar in features. So a while ago I installed CRM114, and I've been training it ever since. It got to 98 or 99% accuracy quickly, but stubbornly refuses to get any better than that. And the failures happen in both directions: spam gets through, and legitimate mail gets flagged.

Recently I've been trying a compromise between TOE and TEFT: if a message comes through with a confidence level under 100, I train on it, whether or not it was classified correctly. That hasn't really helped CRM114 converge any faster. I think fast convergence really requires shared data sets, à la Google.
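For the curious, the three training policies can be sketched roughly as below. This is only an illustration in Python, not how CRM114 itself is driven: the `Verdict` type, the `should_train` function, and the policy name "compromise" are all made up for the example, and the confidence scale is assumed to be one where 100 means "quite sure".

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    is_spam: bool       # what the filter decided
    confidence: float   # how sure it was; the scale is an assumption


def should_train(verdict: Verdict, actually_spam: bool,
                 policy: str = "compromise",
                 threshold: float = 100.0) -> bool:
    """Decide whether to feed this message back to the filter."""
    if policy == "TEFT":
        # Train Everything: every message is training data.
        return True
    wrong = verdict.is_spam != actually_spam
    if policy == "TOE":
        # Train On Error: only misclassified messages.
        return wrong
    # Compromise: errors, plus anything the filter was unsure about.
    return wrong or verdict.confidence < threshold
```

With the compromise policy, a correctly classified message at confidence 40 still gets trained, while one at confidence 300 is left alone.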

Speaking of which, I've also got a Gmail account (with the same username as this one). I use it for signing up for commercial services that I think will sell my address or otherwise be annoying, and mostly only check that address when I am expecting a particular piece of mail. I thought of giving up on maintaining my own Bayesian filters and just forwarding all my mail to Gmail (which, near as I can tell, uses pattern-based filtering and ever-vigilant professional pattern authors), but Eudora 4.2 doesn't talk POP over SSL, and I want to be able to read mail offline, and to search current and historic mail together. And I do like Bayesian filters' ability to give me only that portion of a mailing list's traffic that will actually interest me, even when the non-interesting parts aren't spam per se.

So, why not filter just the spam to Gmail? Forwarding just the high-confidence spam should keep my Eudora from crashing, and I'll still get the false positives on my Eudora client where I can see them. But the Eudora mailbox filter that separated low-confidence from high-confidence spam ran after whatever was making Eudora crash, so it couldn't do the job. Fortunately, rewriting the filter in procmail wasn't too hard, and now my high-confidence spam goes to my Gmail account, where it can rot for 30 days before Google automatically deletes it.
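The routing decision itself is simple. Here is a rough sketch of the logic in Python rather than the actual procmail recipe. The `X-Filter-Verdict` header, its "label score" layout, and the cutoff of 100 are all invented for the example; a real setup would match whatever header the classifier actually writes.

```python
from email import message_from_string

FORWARD_THRESHOLD = 100.0  # assumed cutoff for "high confidence"


def route(raw_message: str) -> str:
    """Return "forward" for high-confidence spam, "local" otherwise."""
    msg = message_from_string(raw_message)
    # Hypothetical header of the form "spam 250" or "good 80".
    parts = msg.get("X-Filter-Verdict", "").split()
    if len(parts) != 2:
        return "local"  # unscored mail stays where I can see it
    label, score_text = parts
    try:
        score = float(score_text)
    except ValueError:
        return "local"
    if label.lower() == "spam" and score >= FORWARD_THRESHOLD:
        return "forward"
    return "local"
```

Anything that isn't clearly marked as high-confidence spam falls through to local delivery, so a missing or mangled header errs on the side of keeping the message.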

CRM114: Crash's Bayesian mail filtering program.
TOE: Train On Error.
TEFT: Train Everything.