slamb is currently certified at Journeyer level.

Name: Scott Lamb
Member since: 2001-01-11 05:20:44
Last Login: 2012-12-05 07:01:30




Learn about me at my homepage - Scott Lamb.


Recent blog entries by slamb

akira, re: deleting code

One of my most productive days was throwing away 1000 lines of code.
Ken Thompson

Interesting. One of my most productive days was throwing away 15000 lines of code.

A consequence of the increased scale of systems? Maybe; probably also apples and oranges. These 15000 lines of code were written by a poorly supervised contractor, and Ken Thompson's 1000 lines were probably his own work.

spam flags

I mentioned before that Thunderbird and Mac Mail have slightly different flags for indicating that a message is ham rather than spam. Well, their interaction seemed to be even weirder than that alone would explain - if a message was marked as not junk in Mac Mail, no attempt to mark it as junk in Thunderbird would stick. Look for NonJunk in the Thunderbird source and you'll find this (reformatted to fit your television):

PRBool messageClassified = PR_TRUE;
if (FindInReadable(NS_LITERAL_CSTRING("NonJunk"), keywords...)
  mDatabase->SetStringProperty(uidOfMessage, "junkscore", "0");
// Mac Mail uses "NotJunk"
else if (FindInReadable(NS_LITERAL_CSTRING("NotJunk"), keywords...)
  mDatabase->SetStringProperty(uidOfMessage, "junkscore", "0");
// ### TODO: we really should parse the keywords into
// space delimited keywords before checking
else if (FindInReadable(NS_LITERAL_CSTRING("Junk"), keywords...)
  PRUint32 newFlags;
  dbHdr->AndFlags(~MSG_FLAG_NEW, &newFlags);
  mDatabase->SetStringProperty(uidOfMessage, "junkscore", "100");
  messageClassified = PR_FALSE;

On startup, Thunderbird says that a message is not junk if Mac Mail said it was NotJunk. When marking a message as Junk, it doesn't clear Mac Mail's NotJunk flag. Brilliant! How could this plan possibly fail?

What annoys me is that Thunderbird added this feature after Mac Mail but made a subtle change that broke interoperability. Then they realized their parsing sucked and they were interpreting Mac Mail's NotJunk as saying Junk. They fixed it with this hack job and the bug popped up elsewhere - now Thunderbird's attempt to change the marking to junk won't stay across restarts. A little forethought and there wouldn't have been this mess.
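
For illustration, here is a minimal Python sketch of the two fixes hinted at above: splitting the keyword string into tokens before checking it (what the TODO comment asks for), and clearing the ham keywords when marking a message as junk. It is not Thunderbird's code. The function names are made up for this example, and the rule that an explicit Junk keyword wins over a leftover ham keyword is my assumption.

```python
# Ham keyword spellings taken from the post: Thunderbird's and Mac Mail's.
HAM_KEYWORDS = {"NonJunk", "NotJunk"}
JUNK_KEYWORD = "Junk"


def classify_by_substring(keywords: str) -> str:
    """Imitate the substring checks from the excerpt, in the same order.

    The ham checks have to come first because "Junk" is a substring of
    both "NonJunk" and "NotJunk".
    """
    if "NonJunk" in keywords or "NotJunk" in keywords:
        return "ham"
    if "Junk" in keywords:
        return "junk"
    return "unknown"


def classify_by_token(keywords: str) -> str:
    """Split on whitespace first, then compare whole keywords."""
    tokens = set(keywords.split())
    # Assumption: an explicit Junk keyword beats a stale ham keyword.
    if JUNK_KEYWORD in tokens:
        return "junk"
    if tokens & HAM_KEYWORDS:
        return "ham"
    return "unknown"


def mark_junk(tokens: set) -> set:
    """Return the keyword set after marking junk, with ham keywords cleared."""
    return (tokens - HAM_KEYWORDS) | {JUNK_KEYWORD}
```

With whole-token matching the order of the checks stops mattering for correctness, and clearing both ham spellings in mark_junk means the next startup has nothing contradictory to read.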

8 Jun 2007 (updated 8 Jun 2007 at 05:08 UTC) »
Training server-side Bayesian filters

Last night I worked on an unobtrusive way to train SpamAssassin's Bayesian database. (Autotraining sure spam and ham as it's delivered is nice, but you at least need a way of correcting its mistakes or it will keep making them.) The sa-learn utility is quite easy to use, but how do you specify what messages to feed to it? I haven't seen any good glue for this. You want to feed it messages which have been examined and categorized, and ideally you want to feed it each message exactly once. (sa-learn does realize that it's seen a message before, but it still takes some processing time to do even that.)

I decided to harness the power of RFC 2060. My trainer connects via IMAP4rev1, executes a SEARCH command for candidates (letting the server do the work of an arbitrarily complex query), downloads the messages and pipes them through sa-learn, flags them as learned (so the next search will skip them), and disconnects. I implemented it using imapfilter, and so far it works quite well. This approach would even work well if the SpamAssassin machine were separate from the mail store machine.
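
The loop described above can be sketched in Python with imaplib and subprocess. This is not the imapfilter script, just an approximation of the same steps under a few assumptions: the keyword name "Learned" is hypothetical, the search criteria are passed in by the caller, and sa-learn is given one message at a time on stdin.

```python
import imaplib
import subprocess

LEARNED = "Learned"  # hypothetical keyword marking already-trained messages


def sa_learn(kind: str, message: bytes) -> None:
    """Pipe one raw message to sa-learn on stdin; kind is 'spam' or 'ham'."""
    subprocess.run(["sa-learn", f"--{kind}"], input=message, check=True)


def train(conn, kind: str, criteria: str, learn=sa_learn) -> int:
    """Feed each message matching `criteria` to the learner, then flag it.

    `conn` is an imaplib-style connection with a mailbox already selected
    read-write. Returns the number of messages learned.
    """
    # Let the server run the query and skip anything already learned.
    typ, data = conn.uid("SEARCH", None, f"({criteria} UNKEYWORD {LEARNED})")
    if typ != "OK":
        raise RuntimeError("SEARCH failed")
    uids = data[0].split() if data and data[0] else []

    learned = 0
    for uid in uids:
        # BODY.PEEK[] fetches the whole message without setting \Seen.
        typ, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        if typ != "OK":
            continue
        # imaplib returns the message literal as the second item of a tuple.
        body = next((p[1] for p in parts if isinstance(p, tuple)), None)
        if body is None:
            continue
        learn(kind, body)
        # Flag it so the next SEARCH will not return it again.
        conn.uid("STORE", uid, "+FLAGS", f"({LEARNED})")
        learned += 1
    return learned


def run_once(host: str, user: str, password: str) -> None:
    """Example wiring; host, credentials and mailbox are placeholders."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select("INBOX")
        train(conn, "spam", "KEYWORD Junk")
        train(conn, "ham", "KEYWORD NonJunk")
    finally:
        conn.logout()
```

Fetching and learning one message at a time keeps memory use flat, at the cost of starting sa-learn once per message.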

In the process, I noticed that Thunderbird updates spam status on the IMAP server in the Junk and NonJunk keywords. Mac Mail does the same, in the Junk and NotJunk keywords (plus a few others). Did you see it? One uses NonJunk, the other NotJunk. How hard would it have been to get these guys in a room to fight this one out? Grr. They have a weird interaction because they just didn't put any thought into it.

I also tried out Lua for the first time, as it's imapfilter's extension language. Turns out I hate it. I really wanted to like it. I had been thinking of using it all over an embedded product for rapid development with few resources. It's minimalist, fast, and so on. But it's just unpleasant to use. Maybe it's too minimalist. I would have liked a separate array type (rather than just "tables" / associative arrays), and I hate "high-level" languages without exceptions. imapfilter's library is also a bit limiting - its fetch_message and pipe_to do everything in memory. That makes me more irritated that Lua doesn't just have an array slice syntax I can use to pass message lists to fetch_message. And it means I have to spawn sa-learn a bunch of times for reasonable memory consumption, and starting a Perl process heavy with modules takes a long time.

I might end up rewriting my trainer in Python using either imaplib and subprocess or twisted.mail.imap4 and twisted.internet.process. I'm not real impressed with either mail API, though. I like the JavaMail API better, but forking and interacting with child processes from Java (or even Jython) sounds painful.

2 May 2007 (updated 2 May 2007 at 22:40 UTC) »
clarkbw, re: security choices

C. You are connected to a site pretending to be … Something evil could be going on! Someone might be trying to trick you! Though odds are this isn’t true, it’s likely that guilt or the legal department required us to put this dialog up just for this case.

No, no, no, no, no! This text is the entire purpose of SSL. If it's really unlikely, then thousands of people wouldn't have created an entire ecosystem around validating identities. You have to realize that a private conversation is totally worthless if you don't know who you are talking to, and if nothing warns you when that validation fails, why would you have validation at all? This text wasn't added by lawyers; it was added by people who just spent man-centuries creating cryptosystems which would be absolutely worthless if this text were not displayed.

This dialog box shouldn't say "don't worry, this is probably something wrong with their setup. Just go on, send them your credit card number like always." That would defeat the purpose of the system so badly I'm having trouble coming up with an analogy. It's sort of like a policeman seeing someone trying to pick a lock and opening it for them, then standing by, smiling, as they walk off with all the valuables the lock was protecting. If you downplay the security concerns of sending important information over this link, you're basically telling the lock "sometimes keys screw up, just let him in." (I warned you the analogy sucked.)

It should be alarming! It needs to be alarming enough that if someone goes to their bank's website and sees this dialog box, they won't enter their password. Instead, they'll call their bank on the telephone and tell them that they've spotted fraud. This is the correct action - it's either true or it will get the correct people angry at the security people who screwed up the configuration. It's very rare for a major bank to totally botch their security setup like this.

On the other hand, it shouldn't be so alarming that it will prevent people from browsing some random untrusted website which they have no intention of sending important information to. It's not uncommon for people to require SSL on a site, not bother paying the money to have it signed by a widely-trusted CA, and have instructions for people with particularly sensitive passwords to import the certificate into their browser. That's not a site configuration problem, either - it's a "you haven't given the computer a way to verify their identity" problem.

I agree that examining a certificate and finding the problem is unrealistic for most people. Maybe the details of the certificate should be in an "Advanced" pull-out or something.

2 May 2007 (updated 2 May 2007 at 00:17 UTC) »
clarkbw, re: security choices

I'm not convinced there's a problem with the status quo. For the 90% of people you describe, the SSL certificate dialog box comes down to this:

Your connection to … is insecure. It's likely that people are trying to steal your money.

Give them my money | Cancel

My parents don't understand X.509 PKI, but they do understand that they care if a connection is secure if and only if they plan to send financial credentials over it. They know - and the computer doesn't - what information they are planning to send. Thus, they are capable of responding to this dialog correctly 100% of the time. Choosing either option for them would be right less than 100% of the time. A complicated voting scheme would be right less than 100% of the time.



slamb certified others as follows:

  • slamb certified slamb as Apprentice
  • slamb certified Akira as Journeyer
  • slamb certified trukfixer as Apprentice
  • slamb certified ak as Journeyer
  • slamb certified raph as Master
  • slamb certified markonen as Apprentice
  • slamb certified zanee as Apprentice
  • slamb certified chipx86 as Journeyer
  • slamb certified gstein as Master
  • slamb certified sussman as Master
  • slamb certified jerenkrantz as Master
  • slamb certified tmorgan as Apprentice
  • slamb certified habes as Journeyer
  • slamb certified ncm as Master
  • slamb certified chant as Apprentice
  • slamb certified ramoth4 as Apprentice
  • slamb certified skx as Journeyer
  • slamb certified MartySchrader as Journeyer
  • slamb certified returnoftheredi as Journeyer
  • slamb certified cdfrey as Apprentice
  • slamb certified DeepNorth as Apprentice

Others have certified slamb as follows:

  • badvogato certified slamb as Apprentice
  • trukfixer certified slamb as Apprentice
  • slamb certified slamb as Apprentice
  • CaptainNemo certified slamb as Journeyer
  • ak certified slamb as Journeyer
  • fejj certified slamb as Apprentice
  • braden certified slamb as Apprentice
  • Stevey certified slamb as Apprentice
  • lerdsuwa certified slamb as Apprentice
  • mpr certified slamb as Journeyer
  • mishan certified slamb as Apprentice
  • markonen certified slamb as Journeyer
  • ramoth4 certified slamb as Journeyer
  • richdawe certified slamb as Journeyer
  • ncm certified slamb as Journeyer
  • redi certified slamb as Journeyer
  • fxn certified slamb as Journeyer
  • nikole certified slamb as Master
  • MartySchrader certified slamb as Journeyer
  • zanee certified slamb as Journeyer
  • Omnifarious certified slamb as Journeyer
