A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins.

Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548
@inproceedings{DBLP:conf/vldb/WaltonDJ91,
  author    = {Christopher B. Walton and
               Alfred G. Dale and
               Roy M. Jenevein},
  editor    = {Guy M. Lohman and
               Am\'{\i}lcar Sernadas and
               Rafael Camps},
  title     = {A Taxonomy and Performance Model of Data Skew Effects in Parallel
               Joins},
  booktitle = {17th International Conference on Very Large Data Bases, September
               3-6, 1991, Barcelona, Catalonia, Spain, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1991},
  isbn      = {1-55860-150-3},
  pages     = {537-548},
  ee        = {db/conf/vldb/WaltonDJ91.html},
  crossref  = {DBLP:conf/vldb/91},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Recent work on parallel joins and data skew has concentrated on algorithm design without considering the causes and characteristics of data skew itself. Existing analytic models of skew do not contain enough information to fully describe data skew in parallel implementations. Because the assumptions made about the nature of skew vary between authors, it is almost impossible to make valid comparisons of parallel algorithms. In this paper, a taxonomy of skew effects is developed, and a new performance model is introduced. The model is used to compare the performance of two parallel join algorithms.

Copyright © 1991 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Printed Edition

Guy M. Lohman, Amílcar Sernadas, Rafael Camps (Eds.): 17th International Conference on Very Large Data Bases, September 3-6, 1991, Barcelona, Catalonia, Spain, Proceedings. Morgan Kaufmann 1991, ISBN 1-55860-150-3

References

[Baru & Frieder 1989]
Chaitanya K. Baru, Ophir Frieder: Database Operations in a Cube-Connected Multicomputer System. IEEE Trans. Computers 38(6): 920-927 (1989)
[Baru et al. 1987]
Chaitanya K. Baru, Ophir Frieder, Dilip D. Kandlur, Mark E. Segal: Join on a Cube: Analysis, Simulation, and Implementation. IWDM 1987: 61-74
[Boral 1988]
Haran Boral: Parallelism and Data Management. JCDKB 1988: 362-373
[Christodoulakis 1983]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115 (1983)
[Copeland et al. 1988]
George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108
[DeWitt 1986]
David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna: GAMMA - A High Performance Dataflow Database Machine. VLDB 1986: 228-237
[DeWitt et al. 1988]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider: A Performance Analysis of the Gamma Database Machine. SIGMOD Conference 1988: 350-360
[DeWitt 1990]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62 (1990)
[DeWitt et al. 1984]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8
[Frieder 1990]
Ophir Frieder: Multiprocessor Algorithms for Relational-Database Operators on Hypercube Systems. IEEE Computer 23(11): 13-28 (1990)
[Gerber 1986]
...
[Gerber & DeWitt 1987]
...
[Hu & Muntz 1989]
...
[Kitsuregawa et al. 1983]
...
[Lakshmi & Yu 1988]
M. Seetha Lakshmi, Philip S. Yu: Effect of Skew on Join Performance in Parallel Architectures. DPDS 1988: 107-120
[Lakshmi & Yu 1989]
M. Seetha Lakshmi, Philip S. Yu: Limiting Factors of Join Performance on Parallel Processors. ICDE 1989: 488-496
[Lynch 1988]
Clifford A. Lynch: Selectivity Estimation and Query Optimization in Large Databases with Highly Skewed Distribution of Column Values. VLDB 1988: 240-251
[Montgomery et al. 1983]
...
[Omiecinski & Liu 1989]
...
[Richardson et al. 1987]
James P. Richardson, Hongjun Lu, Krishna P. Mikkilineni: Design and Evaluation of Parallel Pipelined Join Algorithms. SIGMOD Conference 1987: 399-409
[Schneider 1990]
Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480
[Schneider & DeWitt 1989]
Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121
[Stonebraker 1986]
Michael Stonebraker: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1): 4-9 (1986)
[Teradata 1983]
...
[Walton et al. 1990]
...
[Wolf et al. 1990]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209
[Wolf et al. 1990]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu: An Effective Algorithm for Parallelizing Sort Merge in the Presence of Data Skew. DPDS 1990: 103-115

Copyright © Tue Mar 16 02:22:02 2010 by Michael Ley (ley@uni-trier.de)