Abstract
Data analysts often explore a large database to identify data of interest but may be unable to specify the exact query to send to the database. Manual data exploration is labor-intensive and time-consuming. In the new paradigm of system-aided interactive data exploration, the database management system presents samples to the user and engages the user in an interactive exploration process to identify the user's interest. In this article, we examine a number of initial sampling techniques for identifying at least one positive (i.e., interesting) sample and compare them both theoretically and empirically.
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 3820-3837 |
| Number of pages | 18 |
| Journal | Communications in Statistics - Theory and Methods |
| Volume | 47 |
| Issue number | 16 |
| Publication status | Published - 18 Aug 2018 |
| Externally published | Yes |
Keywords
- Databases
- Interactive data exploration
- Query-agnostic sampling