2008-05-24
by earl, 6055 days ago
Kragen Sitaker and Aristotle Pagaltzis on strlen performance with UTF-8: "GCC is better at writing x86 assembly than I (Kragen) am. Aristotle is better at writing x86 assembly than GCC is. [The] penalty for counting [..] or iterating over the characters of a UTF-8 string [..] is very small."

I think the conclusion that "indexing into" a UTF-8 string also carries only a small penalty is the result of an editorial mistake in Kragen's write-up, considering that in the introduction he quotes Aristotle: "All you lose with a variable-width encoding is direct random access to arbitrary indices in the string." Otherwise, a very fine write-up. I just love this style of exposition, with a clearly defined hypothesis, working code and proper performance measurements. Just like early CACM, and I think that's the way a lot of computer science (still) should be presented.
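To make the distinction between counting and indexing concrete, here is a minimal C sketch. It is not Kragen's or Aristotle's code; the function names `utf8_count` and `utf8_at` are made up for illustration, and both functions assume valid, NUL-terminated UTF-8 input. Counting only has to skip continuation bytes (those of the form 10xxxxxx), which is why it stays close to a plain strlen, whereas reaching the n-th character still means scanning from the start of the string.

```c
#include <stddef.h>

/* Hypothetical helper: count code points in a valid, NUL-terminated
 * UTF-8 string. Every byte that is not a continuation byte (10xxxxxx)
 * starts a new code point. */
static size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: return a pointer to the first byte of the
 * index-th code point (0-based), or NULL if the string has fewer
 * code points. There is no shortcut: this is a linear scan, which is
 * the lost "direct random access" from the Aristotle quote. */
static const char *utf8_at(const char *s, size_t index)
{
    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            if (index == 0)
                return s;
            index--;
        }
    }
    return NULL;
}
```

For the four-byte string "a\xc3\xa9z" (a, é, z), `utf8_count` returns 3 and `utf8_at(s, 2)` points at byte offset 3. A single pass over the string costs about the same either way, but code that calls `utf8_at` in a loop pays for a fresh scan on every call.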
earl.strain.at • esa3 • online for 8692 days • c'est un vanilla site