comment-2008-12-23-6
by earl, 5795 days ago
Yes, as expected, things are getting much better with the merged index.

Indexing time stayed the same at roughly 8m7s. merge.py suggested merging all segments into one, which I did, and then I rebuilt the skipfile with buildskips.py.

With the same timing method as before, most queries now take between 0.1s and 3s. For queries with large result sets, most of that time is spent reading the actual mail content, so if you page through the results there would be no noticeable delay before you get the first page of content.

That's once again with all settings left at their defaults. Maybe I'll do a test run with a smaller chunksize later, but it's already fast enough:

$ time ./fts mbox 'kragen from ' 'bicicleta ' |
formail -3 -s formail -z -x Subject:
naming of "self" in Bicicleta
rational number class in Bicicleta's language
various approaches to primitive functions

real 0m0.128s
user 0m0.056s
sys 0m0.028s

(That requires a small patch of msgs.py to ignore the IOError that's caused by formail closing the pipe.)
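
A patch along those lines might look roughly like the sketch below. It is not the actual msgs.py code: write_messages and its arguments are made-up names for illustration, and the only assumption is that messages get written to a file-like object in a loop. When a reader such as formail -3 exits after the first three messages, the next write fails with EPIPE, which is swallowed here; any other I/O error is still raised.

```python
import errno


def write_messages(messages, out):
    """Write each message to `out`; stop quietly if the reader closed the pipe.

    Hypothetical helper, not taken from msgs.py.
    Returns the number of messages written successfully.
    """
    written = 0
    for msg in messages:
        try:
            out.write(msg)
            out.flush()
        except IOError as e:
            # EPIPE means the downstream process (e.g. formail) went away.
            if e.errno != errno.EPIPE:
                raise  # anything else is a real error
            break
        written += 1
    return written
```

Called as write_messages(msgs, sys.stdout), the loop simply ends once the downstream command stops reading, instead of dying with a traceback.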