The challenging Vs of Big Data fall into two groups. Veracity and Variety require sophisticated statistical analysis, including machine learning, while Volume and Velocity cannot be handled without efficient technical solutions. Our research interests span both directions. Most of our experience comes from the fields of Information Retrieval and Databases. Currently we are focused on the following topics:
Theoretical Machine Learning (ML):
- Tree models, sequence analysis, ensembles
- GPU enabled ML
Search Engines (SE) and Information Retrieval (IR):
- ML for Ranking, SE User Behaviour Analysis, SE performance, SE evaluation
- Storage and processing of scientific, graph, etc. data
Information Management:
- Stream processing, declarative computation in Big Data environments
- Efficient storage and index structures, e.g., column-oriented databases
- Optimization and execution of declarative queries and workflows
- Holistic application, optimization and tuning
- Data quality
- Consistency and high reliability
Besides research projects, we teach the following special courses:
- Machine Learning @Computer Science Centre (https://compscicenter.ru/courses/machine-learning-1/2016-autumn/, https://compscicenter.ru/courses/machine-learning-2/2017-spring/)
- Search Engine Architecture @St.Petersburg State University
- Information Management @St.Petersburg State University
- Database algorithms @St.Petersburg State University
Students interested in research problems in our areas of interest are welcome to join our lab. The best way to learn more about our research is to take our courses or attend open seminars (to be done). New projects are launched regularly; sometimes it is also possible to join an ongoing project or extend its scope. Please contact the project leader for information on a specific project.
All students willing to join our projects must be skilled in statistics or programming, preferably both. The most successful candidates will be invited to join one of the projects as regular team members.
Publications
- In Proceedings of the Second Conference on Software Engineering and Information Management. Saint Petersburg, Russia. CEUR Workshop Proceedings, 1864.
- Proceedings of the Second Conference on Software Engineering and Information Management. Saint Petersburg, Russia.
- Message from the editors. CEUR Workshop Proceedings, 1864.
- In Marite Kirikova, Kjetil Nørvåg, George A. Papadopoulos, Johann Gamper, Robert Wrembel, Jérôme Darmont, and Stefano Rizzi, editors, New Trends in Databases and Information Systems - ADBIS 2017 Short Papers and Workshops, AMSD, BigNovelTI, DAS, SW4CH, DC, Nicosia, Cyprus, September 24-27, 2017, Proceedings, volume 767 of Communications in Computer and Information Science, pages 275–284. Springer.
- In Yassine Ouhammou, Mirjana Ivanovic, Alberto Abelló, and Ladjel Bellatreche, editors, Model and Data Engineering - 7th International Conference, MEDI 2017, Barcelona, Spain, October 4-6, 2017, Proceedings, volume 10563 of Lecture Notes in Computer Science, pages 208–222. Springer.
- PosDB: a Distributed Column-Store Engine. In Proceedings of A.P. Ershov Informatics Conference (the PSI Conference Series, 11th edition), Moscow, Russia.
- CEUR Workshop Proceedings, 1864.
- PosDB: A Survey of Architecture. Programming and Computer Software.
- The design of an adaptive column-store system. Journal of Big Data, 4:5.
- Selected Papers of the XVIII International Conference on Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2016), Ershovo, Moscow Region, Russia.
- In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, pp. 2251-2252.
- In Proceedings of DAMDID / RCDL'2016 (local), pages 132–137, Ershovo.
- A parallel R-tree bulk-loading for shared-memory architecture. In Proceedings of CIMSP'15.
- Towards Self-management in a Distributed Column-Store System. In: Morzy T., Valduriez P., Bellatreche L. (eds) New Trends in Databases and Information Systems. ADBIS 2015. Communications in Computer and Information Science, vol 539. Springer, Cham.
- In Selected Papers of the XVI All-Russian Scientific Conference "Digital Libraries: Advanced Methods and Technologies, Digital Collections", Dubna, Russia.
- R-tree re-evaluation effort: a report. Technical report.
- The Study of Multidimensional R-Tree-Based Index Scalability in Multicore Environment. In: Voronkov A., Virbitskaite I. (eds) Perspectives of System Informatics. PSI 2014. Lecture Notes in Computer Science, vol 8974. Springer, Berlin, Heidelberg.
- To sort or not to sort: the evaluation of R-Tree and B+-Tree in transactional environment with ordered result requirement. Proceedings of the Institute for System Programming of the RAS, vol. 26, no. 4.
- To Sort or not to Sort: The Evaluation of R-Tree and B+-Tree in Transactional Environment with Ordered Result Set Requirement. SYRCoDIS, vol. 1031, pp. 27-34.
- Implementing the Read Committed isolation level for tree data structures (in Russian). Proceedings of the Third Interuniversity Scientific Conference on Problems of Informatics SPISOK-2012.
- ACM SIGMOD Programming Contest: an opportunity to study distinguished aspects of database systems and software engineering (in Russian). Computer Tools in Education, 6.
- Benchmarking Inter and Intra Operator Parallelism on Contemporary Desktop Hardware. SYRCoDIS, pp. 62-67.
- On two methods of star query execution (in Russian). In Proceedings of the SPISOK conference, pp. 253-257.
- Networking and multithreading architectural aspects of distributed DBMS (in Russian). Software & Systems (Programmnye Produkty i Sistemy), no. 1.
- Empirical study of parallel SQL query execution. Proceedings of the Institute for System Programming of the RAS, vol. 21.
- Distributed Database Query Engine. Contest poster, ACM SIGMOD/PODS 2010, Indianapolis.
- ScienceDirect goes social: a social network for scientists integrated with online digital library. Contest poster, ACM SIGIR 2010, Geneva.
Group Members
- Boris Novikov, Head of Laboratory
- Igor Kuralenok, Researcher
- George Chernishev, Lead scientist
- Kirill Smirnov, Lead scientist
- Anastasia Birillo, Researcher
- Nikita Bobrov, Researcher
- Viacheslav Galaktionov, Researcher
- Valentin Grigorev, Researcher
- Evgeniy Klyuchikov, Researcher
- Nikita Marshalkin, Researcher
- Nickolay Saveliev, Researcher
- Vsevolod Sevostyanov, Researcher
- Artem Trofimov, Researcher