by earl, 4827 days ago
MonetDB/X100 - A DBMS in the CPU Cache (2005) is a highly recommended read for those interested in modern main-memory data processing (read: those who share my passion for K). A few gems:

"[The] actual data processing in the operators is performed by a set of execution primitives - simple, specialized and CPU-efficient functions. [..] The simplicity of these primitives allows compilers to produce code that achieve [instructions per cycle] of over 2 and [..] spend only a few cycles per tuple. In the case of a multiplication of two values, X100 spends 2 cycles per tuple, whereas MySQL spends 49."

MonetDB/X100 combines a decomposed storage model (inverted, column-oriented; choose your preferred naming) with tuple-oriented processing and tries to limit materialization of temporary results as far as possible. Columns are further split up into 'vectors', which are sized to optimize cache usage. A set of such attribute vectors forms the basic unit of processing for the 'primitives'.
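
To make the vector idea a bit more concrete, here is a minimal sketch in Python (chosen for readability only; the real primitives are tight compiled loops). All names in it (VECTOR_SIZE, vectors, map_mul, sum_of_products) are made up for illustration and are not taken from X100, and the vector size is an arbitrary placeholder rather than a tuned value.

```python
# Hypothetical sketch of vector-at-a-time processing over two columns.
# Names and the vector size are illustrative assumptions, not X100's API.

VECTOR_SIZE = 1024  # placeholder; a real system would size this to fit the cache


def vectors(column, size=VECTOR_SIZE):
    """Yield consecutive fixed-size slices ('vectors') of one column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def map_mul(a, b):
    """Primitive: element-wise product of two equally long vectors."""
    return [x * y for x, y in zip(a, b)]


def sum_of_products(col_a, col_b, size=VECTOR_SIZE):
    """Consume two columns one vector pair at a time.

    Only one vector-sized temporary exists at any moment; the full
    product column is never materialized.
    """
    total = 0
    for va, vb in zip(vectors(col_a, size), vectors(col_b, size)):
        total += sum(map_mul(va, vb))
    return total


price = [10, 20, 30, 40, 50]
qty = [1, 2, 3, 4, 5]
total = sum_of_products(price, qty, size=2)
```

The point of the sketch is the shape of the loop nest: the outer loop walks vectors, the primitive does nothing but a simple loop over one vector, and the intermediate result lives only as long as one vector does.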

"[..] primitives in X100 are generated from a so-called signature request and a code pattern. This allows X100 to generate compound primitives that execute an entire expression subtree (e.g. (a*1.19+b)*0.95). Currently, primitive generation is predefined, but the ultimate goal is to allow a query optimizer to generate and compile compound primitives in a just-in-time fashion."
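
As a rough illustration of the "signature request plus code pattern" idea, the sketch below instantiates a textual loop template for a whole expression, so that the entire subtree runs in a single pass over the vectors. This is my own toy reconstruction under stated assumptions, not X100's generator: PATTERN, generate_primitive and the column names are hypothetical, and using exec at run time stands in for what the quote describes as predefined (ahead-of-time) generation.

```python
import re

# Hypothetical code pattern: one loop over a vector, with the per-element
# expression filled in by the generator below.
PATTERN = """\
def {name}(n, res, {params}):
    for i in range(n):
        res[i] = {expr}
"""


def generate_primitive(name, columns, expr):
    """Instantiate PATTERN for `expr`, where column names stand for elements.

    Every occurrence of a column name in `expr` is rewritten to an indexed
    access (a -> a[i]) in a single substitution pass.
    """
    names = "|".join(re.escape(c) for c in columns)
    indexed = re.sub(rf"\b({names})\b", r"\1[i]", expr)
    source = PATTERN.format(name=name, params=", ".join(columns), expr=indexed)
    namespace = {}
    exec(source, namespace)  # illustration only; never do this with untrusted input
    return namespace[name]


# Compound primitive for the expression from the quote: (a*1.19+b)*0.95
compound = generate_primitive("map_compound", ["a", "b"], "(a * 1.19 + b) * 0.95")

a = [100.0, 200.0, 300.0]
b = [1.0, 2.0, 3.0]
res = [0.0] * len(a)
compound(len(a), res, a, b)
```

Compared with chaining three separate primitives (multiply, add, multiply), the compound version writes no intermediate vectors at all, which is the benefit the quote is after.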

A bit more detail is given in Boncz et al. (2005), suggested reading if you find the shorter paper stimulating :)

Zukowski, M., Boncz, P. A., Nes, N., Heman, S. (2005) MonetDB/X100 - A DBMS In The CPU Cache. IEEE Data Engineering Bulletin, 28(2), June 2005.

Boncz, P. A., Zukowski, M., Nes, N. (2005) MonetDB/X100: Hyper-Pipelining Query Execution. In Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), Asilomar, CA, USA, January 2005.