Data Placement in Shared-Nothing Parallel Database Systems.

Manish Mehta, David J. DeWitt: Data Placement in Shared-Nothing Parallel Database Systems. VLDB J. 6(1): 53-72 (1997)
@article{DBLP:journals/vldb/MehtaD97,
  author    = {Manish Mehta 0002 and
               David J. DeWitt},
  title     = {Data Placement in Shared-Nothing Parallel Database Systems},
  journal   = {VLDB J.},
  volume    = {6},
  number    = {1},
  year      = {1997},
  pages     = {53-72},
  ee        = {db/journals/vldb/MehtaD97.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Data placement in shared-nothing database systems has been studied extensively in the past and various placement algorithms have been proposed. However, there is no consensus on the most efficient data placement algorithm and placement is still performed manually by a database administrator with periodic reorganization to correct mistakes. This paper presents the first comprehensive simulation study of data placement issues in a shared-nothing system. The results show that current hardware technology trends have significantly changed the performance tradeoffs considered in past studies. A simplistic data placement strategy based on the new results is developed and shown to perform well for a variety of workloads.

Key Words

Declustering, Disk allocation, Resource allocation, Resource scheduling

Copyright © 1997 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.


References

[Bitt88]
Dina Bitton, Jim Gray: Disk Shadowing. VLDB 1988: 331-338
[Bora90]
Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24 (1990)
[Brow92]
...
[Brow93]
Kurt P. Brown, Michael J. Carey, Miron Livny: Managing Memory to Meet Multiclass Workload Response Time Goals. VLDB 1993: 328-341
[Brow94]
Kurt P. Brown, Manish Mehta, Michael J. Carey, Miron Livny: Towards Automated Performance Tuning for Complex Workloads. VLDB 1994: 72-84
[Ceri84]
Stefano Ceri, Giuseppe Pelagatti: Distributed Databases: Principles and Systems. McGraw-Hill Book Company 1984, ISBN 0-07-010829-3
[Chen92a]
Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins. VLDB 1992: 15-26
[Chen92b]
Ming-Syan Chen, Philip S. Yu, Kun-Lung Wu: Scheduling and Processor Allocation for Parallel Execution of Multi-Join Queries. ICDE 1992: 58-67
[Cope85]
George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD Conference 1985: 268-279
[Cope88]
George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108
[DeWi84]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8
[DeWi90]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62 (1990)
[DeWi92a]
David J. DeWitt, Jim Gray: Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35(6): 85-98 (1992)
[DeWi92b]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40
[Dowd82]
Lawrence W. Dowdy, Derrell V. Foster: Comparative Models of the File Assignment Problem. ACM Comput. Surv. 14(2): 287-313 (1982)
[Engl91]
...
[Falo93]
Christos Faloutsos, Pravin Bhagwat: Declustering Using Fractals. PDIS 1993: 18-25
[Gerb85]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164
[Ghan90]
...
[Ghan92]
Shahram Ghandeharizadeh, David J. DeWitt, Waheed Qureshi: A Performance Analysis of Alternative Multi-Attribute Declustering Strategies. SIGMOD Conference 1992: 29-38
[Grae89]
...
[Gray87]
Jim Gray, Gianfranco R. Putzolu: The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD Conference 1987: 395-398
[Haas90]
Laura M. Haas, Walter Chang, Guy M. Lohman, John McPherson, Paul F. Wilms, George Lapis, Bruce G. Lindsay, Hamid Pirahesh, Michael J. Carey, Eugene J. Shekita: Starburst Mid-Flight: As the Dust Clears. IEEE Trans. Knowl. Data Eng. 2(1): 143-160 (1990)
[Hua90]
Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506
[Hua91]
Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535
[IBM93]
...
[Kits91]
Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221
[Livn87]
Miron Livny, Setrag Khoshafian, Haran Boral: Multi-Disk Management Algorithms. SIGMETRICS 1987: 69-77
[Meht93]
Manish Mehta, David J. DeWitt: Dynamic Memory Allocation for Multiple-Query Workloads. VLDB 1993: 354-367
[Meht94]
...
[Nava89]
Shamkant B. Navathe, Minyoung Ra: Vertical Partitioning for Database Design: A Graphical Algorithm. SIGMOD Conference 1989: 440-450
[Ng91]
Raymond T. Ng, Christos Faloutsos, Timos K. Sellis: Flexible Buffer Allocation Based on Marginal Gains. SIGMOD Conference 1991: 387-396
[Omie91]
Edward Omiecinski: Performance Analysis of a Load Balancing Hash-Join Algorithm for a Shared Memory Multiprocessor. VLDB 1991: 375-385
[Oszu90]
M. Tamer Özsu, Patrick Valduriez: Principles of Distributed Database Systems. Prentice-Hall 1991, ISBN 0-13-715681-2
[Padm92]
...
[Para93]
...
[Rahm93a]
Erhard Rahm, Robert Marek: Analysis of Dynamic Load Balancing Strategies for Parallel Shared Nothing Database Systems. VLDB 1993: 182-193
[Rahm93b]
Erhard Rahm: Parallel Query Processing in Shared Disk Database Systems. HPTS 1993: 0-
[Ries78]
...
[Schn90]
Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480
[Schw90]
...
[Seli93]
Patricia G. Selinger: Predictions and Challenges for Database Systems in the Year 2000. VLDB 1993: 667-675
[Sell88]
Timos K. Sellis: Multiple-Query Optimization. ACM Trans. Database Syst. 13(1): 23-52 (1988)
[Shat93]
Ambuj Shatdal, Jeffrey F. Naughton: Using Shared Virtual Memory for Parallel Join Processing. SIGMOD Conference 1993: 119-128
[Stel93]
...
[Tand88]
The Tandem Performance Group: A Benchmark of NonStop SQL on the Debit Credit Transaction (Invited Paper). SIGMOD Conference 1988: 337-341
[Tera85]
...
[Walt91]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548
[Weik91]
Gerhard Weikum, Peter Zabback, Peter Scheuermann: Dynamic File Allocation in Disk Arrays. SIGMOD Conference 1991: 406-415
[Weik92]
Gerhard Weikum, Peter Zabback: Tuning of Striping Units in Disk-Array-Based File Systems. RIDE-TQP 1992: 80-87
[Wils92]
Annita N. Wilschut, Jan Flokstra, Peter M. G. Apers: Parallelism in a Main-Memory DBMS: The Performance of PRISMA/DB. VLDB 1992: 521-532
[Wolf89]
Joel L. Wolf: The Placement Optimization Program: A Practical Solution to the Disk File Assignment Problem. SIGMETRICS 1989: 1-10
[Wolf90]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209
[Youn92]
...
[Yu93]
Philip S. Yu, Douglas W. Cornell: Buffer Management Based on Return on Consumption in a Multi-Query Environment. VLDB J. 2(1): 1-37 (1993)
[Zipf49]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949

Copyright © Mon Mar 15 04:08:31 2010 by Michael Ley (ley@uni-trier.de)