ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

On the Relative Cost of Sampling for Join Selectivity Estimation.

Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami: On the Relative Cost of Sampling for Join Selectivity Estimation. PODS 1994: 14-24
@inproceedings{DBLP:conf/pods/HaasNS94,
  author    = {Peter J. Haas and
               Jeffrey F. Naughton and
               Arun N. Swami},
  title     = {On the Relative Cost of Sampling for Join Selectivity Estimation},
  booktitle = {Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium
               on Principles of Database Systems, May 24-26, 1994, Minneapolis,
               Minnesota},
  publisher = {ACM Press},
  year      = {1994},
  isbn      = {0-89791-642-5},
  pages     = {14-24},
  ee        = {http://doi.acm.org/10.1145/182591.182594, db/conf/pods/pods94-14.html},
  crossref  = {DBLP:conf/pods/94},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We compare the cost of estimating the selectivity of a "star join" using sampling procedure t_cross to the cost of simply computing the join and obtaining the exact answer. Our bounds and approximations for the relative cost of sampling show how this cost depends on the size of the input relations, the number of input relations, and the precision criterion used by the estimation procedure. We also demonstrate the deleterious effect of dangling tuples and the mixed effect of data skew on the relative cost of sampling. These results provide insight into when sampling should or should not be used for join selectivity estimation.

Copyright © 1994 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98. and ... Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings. and ...

Printed Edition

Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 24-26, 1994, Minneapolis, Minnesota. ACM Press 1994, ISBN 0-89791-642-5
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Online Edition: ACM Digital Library

[Abstract and Index Terms]
[Full Text in PDF Format, 924 KB]

References

[DNS92]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HN+93a]
Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami: Fixed-Precision Estimation of Join Selectivity. PODS 1993: 190-201 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HN+93b]
...
[HS92a]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HS92b]
...
[HOT88]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Statistical Estimators for Relational Algebra Expressions. PODS 1988: 276-287 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HOT89]
Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja: Processing Aggregate Relational Queries with Hard Time Constraints. SIGMOD Conference 1989: 68-77 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[HOD91]
Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu: Error-Constraint COUNT Query Evaluation in Relational Databases. SIGMOD Conference 1991: 278-287 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[LN89]
Richard J. Lipton, Jeffrey F. Naughton: Estimating the Size of Generalized Transitive Closures. VLDB 1989: 165-171 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[LN90]
Richard J. Lipton, Jeffrey F. Naughton: Query Size Estimation by Adaptive Sampling. PODS 1990: 40-46 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[LNS90]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[OR86]
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[OR89]
Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[ORX90]
Frank Olken, Doron Rotem, Ping Xu: Random Sampling from Hash Files. SIGMOD Conference 1990: 375-386 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Olk93]
...
[Ses92]
...

Copyright © Fri Mar 12 17:19:57 2010 by Michael Ley (ley@uni-trier.de)