Research group

Machine Learning and Information Management Lab

PosDB: Distributed Column-Store

Project supervisors: George Chernishev, Kirill Smirnov
Status: Active

PosDB is a disk-based column-store DBMS engine designed for processing OLAP queries in a shared-nothing environment. It is written entirely from scratch and is intended as a platform for studying distributed query processing in column-stores.

Currently, query execution in PosDB is based on the Volcano model with block-oriented processing and late materialization. Various physical operators have been developed for relational operations such as join, aggregation, and selection. Some auxiliary operators were developed to support intraquery parallelism and network communication. Data distribution is achieved using horizontal range partitioning and data replication.
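As a rough illustration of the ideas named above, the following minimal sketch shows pull-based (Volcano-style) operators that exchange blocks of row positions and read column values only at the end of the plan (late materialization), plus a helper that routes a key to a horizontal range partition. It is not PosDB code: the class names, the toy table, the block size, and the partition bounds are all hypothetical, and Python generators stand in for the classic open/next/close interface.

```python
from bisect import bisect_right
from typing import Dict, Iterator, List

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for illustration

# Toy column-store table (hypothetical data): one list per column.
TABLE: Dict[str, List[int]] = {
    "quantity": [5, 30, 12, 45, 8, 27, 50, 3],
    "revenue": [10, 20, 30, 40, 50, 60, 70, 80],
}


class Scan:
    """Leaf operator: emits blocks of row positions, not values."""

    def __init__(self, num_rows: int) -> None:
        self.num_rows = num_rows

    def blocks(self) -> Iterator[List[int]]:
        for start in range(0, self.num_rows, BLOCK_SIZE):
            yield list(range(start, min(start + BLOCK_SIZE, self.num_rows)))


class Select:
    """Keeps positions whose value in the predicate column exceeds a threshold."""

    def __init__(self, child, column: List[int], threshold: int) -> None:
        self.child, self.column, self.threshold = child, column, threshold

    def blocks(self) -> Iterator[List[int]]:
        # Pull one block at a time from the child operator.
        for positions in self.child.blocks():
            kept = [p for p in positions if self.column[p] > self.threshold]
            if kept:
                yield kept


class Materialize:
    """Late materialization: reads values only for surviving positions."""

    def __init__(self, child, column: List[int]) -> None:
        self.child, self.column = child, column

    def blocks(self) -> Iterator[List[int]]:
        for positions in self.child.blocks():
            yield [self.column[p] for p in positions]


def total_revenue(min_quantity: int) -> int:
    """SUM(revenue) over rows with quantity > min_quantity."""
    plan = Materialize(
        Select(Scan(len(TABLE["quantity"])), TABLE["quantity"], min_quantity),
        TABLE["revenue"],
    )
    return sum(sum(block) for block in plan.blocks())


def partition_for(key: int, upper_bounds: List[int]) -> int:
    """Index of the range partition holding `key`.

    `upper_bounds` are sorted, exclusive upper bounds; keys at or above
    the last bound fall into one extra, final partition.
    """
    return bisect_right(upper_bounds, key)
```

In this sketch the revenue column is touched only for rows that pass the quantity filter, which is the point of deferring materialization. In a distributed setting, something like `partition_for` could decide which node stores a given row.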

The current version of PosDB can execute all queries from the Star Schema Benchmark in both centralized and distributed environments.

Participants

Publications

  • George Chernishev, Vyacheslav Galaktionov, Valentin Grigorev, Evgeniy Klyuchikov, Kirill Smirnov
    In Proceedings of the Second Conference on Software Engineering and Information Management. Saint Petersburg, Russia. CEUR Workshop Proceedings, vol. 1864.
  • George Chernishev, Viacheslav Galaktionov, Valentin Grigorev, Evgeniy Klyuchikov, Kirill Smirnov
    PosDB: a Distributed Column-Store Engine
    In Proceedings of A.P. Ershov Informatics Conference (the PSI Conference Series, 11th edition), Moscow, Russia.
  • George Chernishev
    The design of an adaptive column-store system
    Journal of Big Data, 4:5.
  • George Chernishev
    Towards Self-management in a Distributed Column-Store System
    In: Morzy T., Valduriez P., Bellatreche L. (eds) New Trends in Databases and Information Systems. ADBIS 2015. Communications in Computer and Information Science, vol 539. Springer, Cham.