Context-Driven Compressed Data Representations in Databases

Database systems are characterized by two important aspects: (1) consistent and permanent data storage and (2) efficient and isolated data processing. To guarantee both, today's database systems have to meet three challenges, which are briefly outlined below.

The first challenge is the efficient use of increasing main memory capacity. To achieve this, numerous database systems rely on a memory-centric architecture in which data compression plays an important role. Because compression reduces the data size, the transfer times between CPU and main memory are reduced as well, which in turn decreases processing time. For this reason, intermediate results are compressed in addition to base data. Different compression and decompression algorithms are suitable for different data characteristics, query types, and hardware capabilities. All of them, however, offer the opportunity to process the data directly in its compressed form.
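To illustrate what processing compressed data can look like, the following minimal sketch uses run-length encoding, chosen here only as one simple example of a compression scheme. The function names `rle_encode`, `rle_sum`, and `rle_count_equal` are hypothetical and not taken from any particular system.

```python
from itertools import groupby


def rle_encode(values):
    """Compress a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


def rle_sum(runs):
    """Sum all values without decompressing: each run contributes value * length."""
    return sum(value * length for value, length in runs)


def rle_count_equal(runs, needle):
    """Count the rows equal to `needle` by inspecting one entry per run."""
    return sum(length for value, length in runs if value == needle)


# Illustrative column; the data is made up for this sketch.
column = [7, 7, 7, 7, 3, 3, 9, 9, 9]
runs = rle_encode(column)

total = rle_sum(runs)
nines = rle_count_equal(runs, 9)
```

Both aggregates touch one entry per run instead of one entry per row, which is where the saving in memory traffic comes from when the data contains long runs.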

The second challenge is the protection against data misuse. Encryption and anonymization can be used for this purpose. Depending on data characteristics and query types, different property-preserving encryption schemes, such as order-preserving or homomorphic encryption, are suitable. Property-preserving encryption enables efficient query processing on encrypted data.
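The idea behind order preservation can be sketched with a toy mapping over a small integer domain. This is an assumption-laden illustration only: the keyed table below is not a cryptographically secure scheme (Python's `random` module is not suitable for security purposes), and the names `build_ope_table`, `encrypt`, and `range_query` are hypothetical.

```python
import random


def build_ope_table(domain_size, key, max_gap=1000):
    """Build a strictly increasing table: plaintext i maps to table[i].

    Toy construction for illustration; NOT a secure encryption scheme.
    """
    rng = random.Random(key)  # deterministic for a given key
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, max_gap)  # gap >= 1 keeps the order strict
        table.append(current)
    return table


def encrypt(table, plaintext):
    """Map a plaintext from the domain to its order-preserving code."""
    return table[plaintext]


def range_query(ciphertexts, table, low, high):
    """Evaluate low <= x <= high by comparing encoded values only."""
    low_c, high_c = encrypt(table, low), encrypt(table, high)
    return [c for c in ciphertexts if low_c <= c <= high_c]


# Illustrative data; key and values are made up for this sketch.
table = build_ope_table(domain_size=256, key=2024)
plain = [17, 200, 42, 99, 5]
cipher = [encrypt(table, p) for p in plain]
hits = range_query(cipher, table, 10, 100)
```

Because the mapping is strictly increasing, the range predicate yields the same rows on the encoded values as it would on the plaintexts, so the filter itself never needs to decode the stored data.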

The third challenge is fault tolerance on unreliable hardware. As transistor feature sizes shrink, more transistors fit on a single chip, but the error rate of these chips increases. This manifests itself in transient bit flips, which occur in main memory and in the CPU as well as during data transmission. With the focus on data consistency, different kinds of error-detecting and error-correcting codes are suitable for handling this challenge.
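As a minimal sketch of an error-detecting code, the example below uses AN coding, an arithmetic code in which every value is multiplied by a constant `A` and only multiples of `A` are valid code words. AN coding is chosen here purely for illustration, since the text does not name a specific code. The constant `A = 641` and all function names are assumptions.

```python
A = 641  # hypothetical odd constant; any odd A > 1 detects single bit flips


def an_encode(value):
    """Encode an integer as a multiple of A."""
    return value * A


def an_is_valid(code_word):
    """A code word is valid only if it is divisible by A."""
    return code_word % A == 0


def an_decode(code_word):
    """Decode a code word, raising an error if corruption is detected."""
    if not an_is_valid(code_word):
        raise ValueError("corrupted code word detected")
    return code_word // A


def flip_bit(code_word, position):
    """Simulate a transient bit flip at the given bit position."""
    return code_word ^ (1 << position)


# A single bit flip changes the code word by +/- 2**k, which is never
# divisible by an odd A > 1, so the corrupted word fails the validity check.
stored = an_encode(42)
corrupted = flip_bit(stored, 3)
detected = not an_is_valid(corrupted)

# Addition can be carried out on code words directly: A*x + A*y == A*(x + y).
encoded_sum = an_encode(5) + an_encode(7)
```

The sketch only detects errors; correcting them would require additional redundancy or a different code.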

In summary, data storage and processing require an individual data representation that is tailored to data characteristics, query types, hardware capabilities, and the relative weighting of processing efficiency, confidentiality, and fault tolerance. With a view to the GRK \emph{RoSI}, these framework conditions constitute a context, so that role modeling appears to be an adequate approach. This hypothesis is to be examined. Moreover, it shall be analyzed how role modeling and a corresponding model-driven approach can be used to integrate the different encoding algorithms. Manual integration of such algorithms is possible, but time-consuming and error-prone. Furthermore, with manual integration, the configurability of data representations and their adaptivity with respect to the three challenges are feasible only to a limited extent.