Title: Optimizing (R)SQL: Efficient Analysis of Role-Based Data
Abstract: Modern software systems face a variety of demanding challenges. Emerging hardware, changing environments, and changing development requests push traditional software modeling processes to their limits. As pointed out by recent research, immutable runtime objects are likely to become the Achilles heel of modern software systems. To model long-lived, highly adaptive, and seamlessly extensible systems, the concept of roles has been introduced. Like any traditional runtime object, role objects can be organized in a relational schema and persisted in a database.
To gain business insights, database systems need to answer complex queries as fast as possible. On existing infrastructures, this task boils down to finding efficient query execution plans. Although it has been a core research topic for decades, this task is far from solved, and designing a query optimizer tailored to role-based data once again reveals the weak spots of existing query optimizers: (i) frequently changing data distributions, (ii) arbitrarily complex filter conditions, (iii) strong attribute correlations, and (iv) complex join paths.
This thesis therefore develops a novel selectivity estimator that achieves precise estimates independent of filter complexity, attribute correlation, the data types used, and data updates. The thesis subsequently studies the connection between precise filter selectivity estimates and good join orderings. These insights enable us to tackle complex select-project-join queries with a novel enumeration scheme that leads to robust join orderings and scales particularly well with the number of joined relations. As query response times are dictated not only by the join order but also by the selection of appropriate physical operator implementations, the thesis further develops and analyzes a novel learning-based operator selection strategy.