90 minutes, 13 companies, six minutes apiece. (Six minutes… sort of like the elevator pitch at Sears Tower.)
Update: In a follow-up conversation, Ross Mayfield told me the 30% figure apparently comes from a Gartner Group study. The term they used for this class of email is “occupational spam.”
This was incredibly cool. I want this in my email client right now.
PubSub’s search is “prospective,” in that they maintain no index or archive. Rather, users write and save queries. In most cases there are no immediate results, because there’s no index to search. Instead, PubSub monitors thousands of inbound links and matches new content against the saved queries in real time. So, write a query today, and check back tomorrow for fresh results.
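To make the “prospective” idea concrete, here is a minimal sketch in Python. It is not PubSub’s implementation, which wasn’t described in the session. The class names, the all-terms-must-appear matching rule, and the naive loop over every saved query are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SavedQuery:
    """A stored query; hypothetical structure, not PubSub's."""
    owner: str
    terms: frozenset  # assumption: an item matches when it contains every term
    results: list = field(default_factory=list)


class ProspectiveMatcher:
    """Keeps saved queries, not documents. Nothing is indexed or archived."""

    def __init__(self):
        self.queries = []

    def save(self, owner, text):
        # Saving a query returns no results: there is no archive to search.
        query = SavedQuery(owner, frozenset(text.lower().split()))
        self.queries.append(query)
        return query

    def publish(self, item_text):
        # Each new item is checked against every saved query as it arrives.
        words = set(item_text.lower().split())
        matched = [q for q in self.queries if q.terms <= words]
        for q in matched:
            q.results.append(item_text)
        return matched


matcher = ProspectiveMatcher()
query = matcher.save("alice", "prospective search")
print(query.results)  # empty right after saving
matcher.publish("Notes on prospective search engines")
print(query.results)  # now holds the item published after the query was saved
```

The loop in `publish` does work proportional to the number of saved queries for every item, so a system at the scale described here would presumably need something smarter than this.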
The amazing thing about PubSub, the thing that merits a third paragraph, is the processing speed of the system. They currently have about 700,000 saved queries, and they monitor millions of websites. Every new website update is matched against all 700k saved queries. It comes out to over a trillion matches per day. And they run it on one dual-CPU Xeon box. That’s just astounding to me. They claim they can scale to 1.5M saved queries per box. Figure the typical user has 5 saved queries… that’s 300,000 users per box. Seems like cheap scaling to me.
Update: In a follow-up conversation, Bob Wyman told me that the figure of “2-3 trillion” matches per box per day is actually an “effective” rate, because there is some duplication in the query list. The number of unique matches per day is smaller.
Most users, he said, save about 3 queries. “Their name, their blog…” I offered. “And their employer,” said Wyman. Me, me, and me.
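The back-of-the-envelope numbers above are easy to check. The sketch below uses only the figures quoted in this post. The derived update rate is my own arithmetic, not something PubSub stated, and it takes “over a trillion” as exactly one trillion.

```python
SAVED_QUERIES_NOW = 700_000            # "about 700,000 saved queries"
MATCHES_PER_DAY = 1_000_000_000_000    # "over a trillion"; assumed to be exactly 1e12
QUERIES_PER_BOX = 1_500_000            # claimed capacity per box
SECONDS_PER_DAY = 24 * 60 * 60


def users_per_box(queries_per_user):
    """Users one box could serve at a given number of saved queries each."""
    return QUERIES_PER_BOX // queries_per_user


# If every update is matched against every saved query, the number of updates
# is matches divided by queries. This is an inference, not a quoted figure.
updates_per_day = MATCHES_PER_DAY / SAVED_QUERIES_NOW
updates_per_second = updates_per_day / SECONDS_PER_DAY

print(users_per_box(5))             # my guess of 5 queries per user
print(users_per_box(3))             # Wyman's figure of about 3 per user
print(round(updates_per_day))       # implied website updates per day
print(round(updates_per_second, 1)) # implied updates per second
```

At Wyman’s figure of about 3 queries per user, the same box would cover 500,000 users rather than 300,000. The implied load works out to roughly 1.4 million updates a day, or about 16 to 17 per second.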
I pointed out that PubSub faces a nontrivial usability challenge, in that users submit a query and get zero results. Whether or not it’s a superior experience in the future, in the moment it feels like a failure. Wyman acknowledged this and said they’re a switch-throw away from using their own retrospective search engine to show the most recent 32 matches. They have the data, but they’ve chosen not to show it.
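For what it’s worth, the “switch” could look something like the sketch below: keep a bounded buffer of recent items, and when a query is saved, seed it with the latest matches from that buffer. This is purely hypothetical. The class name, the history size, and the all-terms matching rule are assumptions, and only the limit of 32 comes from the conversation.

```python
from collections import deque

RECENT_LIMIT = 32  # the "most recent 32 matches" figure from the conversation


class BackfillingMatcher:
    """Hypothetical matcher that seeds new queries from recent history."""

    def __init__(self, history_size=10_000):  # history size is an assumption
        self.history = deque(maxlen=history_size)  # oldest items drop off
        self.queries = {}  # query terms -> bounded deque of matching items

    def publish(self, item_text):
        # Remember the item, then deliver it to every matching saved query.
        self.history.append(item_text)
        words = set(item_text.lower().split())
        for terms, results in self.queries.items():
            if terms <= words:
                results.append(item_text)

    def save(self, text):
        # Backfill: scan recent history so a new query is not empty.
        terms = frozenset(text.lower().split())
        results = deque(maxlen=RECENT_LIMIT)
        for item in self.history:
            if terms <= set(item.lower().split()):
                results.append(item)
        self.queries[terms] = results
        return results


matcher = BackfillingMatcher()
for n in range(40):
    matcher.publish(f"post {n} about widgets")
recent = matcher.save("widgets")
print(len(recent))  # capped at RECENT_LIMIT
```

The user would see results the moment the query is saved, and the prospective matching carries on from there.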