Title: Adaptive Query Processing on GPUs

Data-level parallelism (DLP) is a widely used hardware-driven parallelization
technique for optimizing analytical query processing, especially in in-memory
column stores. This kind of parallelism is characterized by executing
essentially the same operation on different data elements simultaneously. While
x86 processors support several Single Instruction Multiple Data (SIMD)
extensions, GPUs implement a different DLP model called Single Instruction
Multiple Threads (SIMT). As a consequence, vectorized implementations based on
CPU SIMD extensions cannot currently be ported to GPUs. To bridge this gap, we
present our vision of virtualizing GPUs as virtual vector engines with
software-defined SIMD instructions and of specializing hardware-oblivious
vectorized operators to GPUs using our Template Vector Library (TVL). TVL
allows programming against a hardware-agnostic vectorized interface of
primitive operations, and the same computation code can be mapped to different
instruction sets through C++ template metaprogramming.
TVL thereby enables context-sensitive query execution: the context of a TVL
query is a set of properties describing the available hardware, the query
itself, and possibly additional constraints. A code generator (TVL-GEN) that
produces different TVL variants for specific instruction sets enables
compile-time adaptation of execution paths for a given query.