Abstract:
|
Aggregations are almost always performed at the top of the operator tree, after all selections
and joins in a SQL query. However, they can also be performed before joins, which can make the
later joins much cheaper when applied properly. Although some enumeration algorithms
considering eager aggregation have been proposed, existing evaluations are not sufficient
to guide the adoption of this technique in practice. Moreover, no evaluation has been done
on real data sets and real queries with estimated cardinalities. It is therefore not
known how eager aggregation performs in the real world.
In this thesis, we propose and evaluate a new estimation method for group-by and join
operations that combines traditional estimation with index-based join sampling.
Two enumeration algorithms that consider eager aggregation are implemented and
compared using estimated cardinalities. We find that the new estimation
method works well with little overhead and that, under certain conditions, eager
aggregation can dramatically accelerate queries. |