Research group

Machine Learning and Information Management Lab

PosDB: Distributed Column-Store

Project supervisors: George Chernishev, Kirill Smirnov
Status: Active

PosDB is a disk-based column-store DBMS engine designed for processing OLAP queries in a shared-nothing environment. It is written entirely from scratch and aims to become a platform for studying distributed query processing in column-stores.

Currently, query execution in PosDB is based on the Volcano model with block-oriented processing and late materialization. Various physical operators have been developed for relational operations such as join, aggregation, and selection. Some auxiliary operators were developed to support intraquery parallelism and network communication. Data distribution is achieved using horizontal range partitioning and data replication.
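The execution model described above can be illustrated with a small sketch. This is not PosDB code: the class names (Scan, Select, Materialize), the helper functions, the block size, and the sample columns are all hypothetical and chosen only for illustration. The sketch assumes that operators exchange blocks of row positions through a Volcano-style next() call, and that values are fetched into tuples only by the topmost operator, which is one common way to realize late materialization. The partition_for helper shows one plausible form of horizontal range partitioning, with made-up boundaries.

```python
from bisect import bisect_right
from typing import Callable, List, Optional, Sequence, Tuple

BLOCK_SIZE = 2  # hypothetical block size, kept tiny for the demonstration


class Scan:
    """Leaf operator: emits blocks of row positions, not values."""

    def __init__(self, row_count: int) -> None:
        self.row_count = row_count
        self.cursor = 0

    def next(self) -> Optional[List[int]]:
        if self.cursor >= self.row_count:
            return None  # end of stream
        end = min(self.cursor + BLOCK_SIZE, self.row_count)
        block = list(range(self.cursor, end))
        self.cursor = end
        return block


class Select:
    """Filters positions, reading only the column used by the predicate."""

    def __init__(self, child, column: Sequence[int],
                 predicate: Callable[[int], bool]) -> None:
        self.child = child
        self.column = column
        self.predicate = predicate

    def next(self) -> Optional[List[int]]:
        while True:
            block = self.child.next()
            if block is None:
                return None
            kept = [pos for pos in block if self.predicate(self.column[pos])]
            if kept:  # skip blocks in which no position qualified
                return kept


class Materialize:
    """Top operator: converts positions into tuples as the last step."""

    def __init__(self, child, columns: Sequence[Sequence[int]]) -> None:
        self.child = child
        self.columns = columns

    def next(self) -> Optional[List[Tuple[int, ...]]]:
        block = self.child.next()
        if block is None:
            return None
        return [tuple(col[pos] for col in self.columns) for pos in block]


def run(root) -> List[Tuple[int, ...]]:
    """Pulls blocks from the root operator until the stream is exhausted."""
    rows: List[Tuple[int, ...]] = []
    while (block := root.next()) is not None:
        rows.extend(block)
    return rows


def partition_for(key: int, upper_bounds: Sequence[int]) -> int:
    """Range partitioning: index of the first range whose bound exceeds key."""
    return bisect_right(upper_bounds, key)


# Two columns of a hypothetical fact table, stored separately.
year = [1996, 1997, 1997, 1998, 1997]
revenue = [10, 20, 30, 40, 50]

plan = Materialize(Select(Scan(len(year)), year, lambda y: y == 1997),
                   [year, revenue])
result = run(plan)
print(result)  # [(1997, 20), (1997, 30), (1997, 50)]
```

In this sketch the revenue column is read only for the positions that survive the selection, which is the main benefit late materialization is meant to provide. With hypothetical boundaries [1997, 1998], partition_for places keys below 1997 in partition 0, the key 1997 in partition 1, and keys from 1998 upward in partition 2.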

The current version of PosDB can execute all queries from the Star Schema Benchmark in both centralized and distributed environments.

Participants

Publications