Title:
|
Adaptive sampling methods for scaling up knowledge discovery algorithms
|
Author(s):
|
Domingo Soriano, Carlos; Gavaldà Mestre, Ricard; Watanabe, Osamu
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació |
Abstract:
|
One of the biggest research challenges in KDD and data mining is to develop methods that scale well to large amounts of data. One approach to scalability is to take a random sample and run the data mining task on it. In this paper, we propose an adaptive sampling method for a variety of data mining tasks that arise in practice on very large data sets. Our algorithms are adaptive in the sense that they determine from the data whether they have already seen enough data to reach a reliable conclusion. We prove the correctness of our method, estimate its efficiency theoretically, and demonstrate its efficiency experimentally on a concrete task requiring sampling. |
Subject(s):
|
-Àrees temàtiques de la UPC::Informàtica -Scalability -KDD -Data mining -Knowledge discovery algorithms |
Rights:
|
|
Document type:
|
Article - Draft Report |