ACM SIGMOD Anthology VLDB dblp.uni-trier.de

Data Partitioning and Load Balancing in Parallel Disk Systems.

Peter Scheuermann, Gerhard Weikum, Peter Zabback: Data Partitioning and Load Balancing in Parallel Disk Systems. VLDB J. 7(1): 48-66(1998)
@article{DBLP:journals/vldb/ScheuermannWZ98,
  author    = {Peter Scheuermann and
               Gerhard Weikum and
               Peter Zabback},
  title     = {Data Partitioning and Load Balancing in Parallel Disk Systems},
  journal   = {VLDB J.},
  volume    = {7},
  number    = {1},
  year      = {1998},
  pages     = {48-66},
  ee        = {db/journals/vldb/ScheuermannWZ98.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper, we discuss the main issues in performance tuning of such systems, namely striping and load balancing, and show their relationship to response time and throughput. We outline the main components of an intelligent, self-reliant file system that aims to optimize striping by taking into account the requirements of the applications, and performs load balancing by judicious file allocation and dynamic redistributions of the data when access patterns change. Our system uses simple but effective heuristics that incur only little overhead. We present performance experiments based on synthetic workloads and real-life traces.

Key Words

Parallel disk systems - Performance tuning - File striping - Data allocation - Load balancing - Disk cooling

Copyright © 1998 by Springer, Berlin, Heidelberg. Permission to make digital or hard copies of the abstract is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice along with the full citation.


Online Edition (Springer)

Citation Page

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 5 Issue 2, JACM, VLDB-J, POS, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...

References

[1]
Thomas E. Anderson, David E. Culler, David A. Patterson: A Case for NOW (Networks Of Workstations). IEEE Micro 15(1): 54-64(1995) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[2]
Baruch Awerbuch, Yair Bartal, Amos Fiat: Competitive distributed file allocation. STOC 1993: 164-173 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Yossi Azar, Joseph Naor, Raphael Rom: The Competitiveness of On-Line Assignments. SODA 1992: 203-210 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
Sedat Akyürek, Kenneth Salem: Adaptive Block Rearrangement. ACM Trans. Comput. Syst. 13(2): 89-121(1995) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[5]
Yair Bartal, Amos Fiat, Howard J. Karloff, Rakesh Vohra: New Algorithms for an Ancient Scheduling Problem. STOC 1992: 51-58 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
Yair Bartal, Amos Fiat, Yuval Rabani: Competitive Algorithms for Distributed Data Management (Extended Abstract). STOC 1992: 39-50 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[7]
...
[8]
Dina Bitton, Jim Gray: Disk Shadowing. VLDB 1988: 331-338 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[9]
Donald D. Chamberlin, Frank B. Schmuck: Dynamic Data Distribution (D3) in a Shared-Nothing Multiprocessor Data Store. VLDB 1992: 163-174 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[10]
Peter M. Chen, Edward L. Lee, Garth A. Gibson, Randy H. Katz, David A. Patterson: RAID: High-Performance, Reliable Secondary Storage. ACM Comput. Surv. 26(2): 145-185(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[11]
Peter M. Chen, Edward K. Lee: Striping in a RAID Level 5 Disk Array. SIGMETRICS 1995: 136-145 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[12]
Peter M. Chen, David A. Patterson: Maximizing Performance in a Striped Disk Array. ISCA 1990: 322-331 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[13]
...
[14]
George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[15]
George P. Copeland, Tom W. Keller: A Comparison Of High-Availability Media Recovery Techniques. SIGMOD Conference 1989: 98-109 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[16]
...
[17]
Edward G. Coffman Jr., M. R. Garey, David S. Johnson: An Application of Bin-Packing to Multiprocessor Scheduling. SIAM J. Comput. 7(1): 1-17(1978) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[18]
David J. DeWitt, Jim Gray: Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35(6): 85-98(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[19]
Lawrence W. Dowdy, Derrell V. Foster: Comparative Models of the File Assignment Problem. ACM Comput. Surv. 14(2): 287-313(1982) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[20]
David Hung-Chang Du, J. S. Sobolewski: Disk Allocation for Cartesian Product Files on Multiple-Disk Systems. ACM Trans. Database Syst. 7(1): 82-101(1982) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[21]
Christos Faloutsos, Dimitris N. Metaxas: Disk Allocation Methods Using Error Correcting Codes. IEEE Trans. Computers 40(8): 907-914(1991) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[22]
...
[23]
...
[24]
...
[25]
Gregory R. Ganger, Bruce L. Worthington, Robert Y. Hou, Yale N. Patt: Disk Arrays: High-Performance, High-Reliability Storage Subsystems. IEEE Computer 27(3): 30-36(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[26]
M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[27]
Shahram Ghandeharizadeh, David J. DeWitt: A Multiuser Performance Analysis of Alternative Declustering Strategies. ICDE 1990: 466-475 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[28]
Shahram Ghandeharizadeh, David J. DeWitt: Hybrid-Range Partitioning Strategy: A New Declustering Strategy for Multiprocessor Database Machines. VLDB 1990: 481-492 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[29]
...
[30]
Garth A. Gibson, Lisa Hellerstein, Richard M. Karp, Randy H. Katz, David A. Patterson: Failure Correction Techniques for Large Disk Arrays. ASPLOS 1989: 123-132 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[31]
...
[32]
...
[33]
Jim Gray, Bob Horst, Mark Walker: Parity Striping of Disk Arrays: Low-Cost Reliable Storage with Acceptable Throughput. VLDB 1990: 148-161 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[34]
Jim Gray, Andreas Reuter: Transaction Processing: Concepts and Techniques. Morgan Kaufmann 1993, ISBN 1-55860-190-2
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[35]
Hui-I Hsiao, David J. DeWitt: A Performance Study of Three High Available Data Replication Strategies. Distributed and Parallel Databases 1(1): 53-80(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[36]
Mark Holland, Garth A. Gibson: Parity Declustering for Continuous Operation in Redundant Disk Arrays. ASPLOS 1992: 23-35 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[37]
Mark Holland, Garth A. Gibson, Daniel P. Siewiorek: Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays. Distributed and Parallel Databases 2(3): 295-335(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[38]
Kien A. Hua, Chiang Lee, Honesty C. Young: Data partitioning for multicomputer datbase systems: a cell-based approach. Inf. Syst. 18(5): 329-342(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[39]
...
[40]
David R. Karger, Steven J. Phillips, Eric Torng: A Better Algorithm for an Ancient Scheduling Problem. SODA 1994: 132-140 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[41]
...
[42]
Randy H. Katz, Wei Hong: The Performance of Disk Arrays in Shared-Memory Database Machines. Distributed and Parallel Databases 1(2): 167-198(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[43]
Michelle Y. Kim: Synchronized Disk Interleaving. IEEE Trans. Computers 35(11): 978-988(1986) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[44]
Myoung-Ho Kim, Sakti Pramanik: Optimal File Distribution For Partial Match Retrieval. SIGMOD Conference 1988: 173-182 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[45]
Michelle Y. Kim, Asser N. Tantawi: Asynchronous Disk Interleaving: Approximating Access Delays. IEEE Trans. Computers 40(7): 801-810(1991) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[46]
...
[47]
Donald E. Knuth: The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley 1973, ISBN 0-201-03803-X
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[48]
Edward K. Lee, Randy H. Katz: An Analytic Performance Model of Disk Arrays. SIGMETRICS 1993: 98-109 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[49]
...
[50]
Witold Litwin, Marie-Anne Neimat, Donovan A. Schneider: LH* - Linear Hashing for Distributed Files. SIGMOD Conference 1993: 327-336 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[51]
Miron Livny, Setrag Khoshafian, Haran Boral: Multi-Disk Management Algorithms. SIGMETRICS 1987: 69-77 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[52]
...
[53]
Jai Menon: Performance of RAID5 Disk Arrays with Read and Write Caching. Distributed and Parallel Databases 2(3): 261-293(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[54]
Jai Menon, Jim Cortney: The Architecture of a Fault-Tolerant Cached RAID Controller. ISCA 1993: 76-86 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[55]
C. Mohan, Hamid Pirahesh, W. Grace Tang, Yun Wang: Parallelism in Relational Database Management Systems. IBM Systems Journal 33(2): 349-371(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[56]
...
[57]
Richard R. Muntz, John C. S. Lui: Performance Analysis of Disk Arrays under Failure. VLDB 1990: 162-173 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[58]
Arif Merchant, Philip S. Yu: Design and Modeling of Clustered RAID. FTCS 1992: 140-149 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[59]
Arif Merchant, Philip S. Yu: Analytic Modeling and Comparisons of Striping Strategies for Replicated Disk Arrays. IEEE Trans. Computers 44(3): 419-433(1995) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[60]
Kazuhiko Mogi, Masaru Kitsuregawa: Dynamic Parity Stripe Reorganizations for RAID5 Disk Arrays. PDIS 1994: 17-26 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[61]
Kazuhiko Mogi, Masaru Kitsuregawa: Hot Block Clustering for Disk Arrays with Dynamic Striping. VLDB 1995: 90-99 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[62]
Randolf D. Nelson, Asser N. Tantawi: Approximate Analysis of Fork/Join Synchronization in Parallel Queues. IEEE Trans. Computers 37(6): 739-743(1988) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[63]
Yale N. Patt: The I/O Subsystem - A Candidate for Improvement: Guest Editor's Introduction. IEEE Computer 27(3): 15-16(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[64]
Christos A. Polyzois, Anupam Bhide, Daniel M. Dias: Disk Mirroring with Alternating Deferred Updates. VLDB 1993: 604-617 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[65]
David A. Patterson, Garth A. Gibson, Randy H. Katz: A Case for Redundant Arrays of Inexpensive Disks (RAID). SIGMOD Conference 1988: 109-116 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[66]
...
[67]
Chris Ruemmler, John Wilkes: An Introduction to Disk Drive Modeling. IEEE Computer 27(3): 17-28(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[68]
Kenneth Salem, Hector Garcia-Molina: Disk Striping. ICDE 1986: 336-342 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[69]
...
[70]
Daniel Stodolsky, Garth A. Gibson, Mark Holland: Parity Logging Overcoming the Small Write Problem in Redundant Disk Arrays. ISCA 1993: 64-75 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[71]
...
[72]
Peter Scheuermann, Gerhard Weikum, Peter Zabback: Adaptive Load Balancing in Disk Arrays. FODO 1993: 345-360 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[73]
...
[74]
Jon A. Solworth, Cyril U. Orji: Distored Mapping Techniques to Achieve High Performance in Mirrored Disk Systems. Distributed and Parallel Databases 1(1): 81-102(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[75]
Michael Stonebraker, Paul M. Aoki, Witold Litwin, Avi Pfeffer, Adam Sah, Jeff Sidell, Carl Staelin, Andrew Yu: Mariposa: A Wide-Area Distributed Database System. VLDB J. 5(1): 48-63(1996) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[76]
Radek Vingralek, Yuri Breitbart, Gerhard Weikum: Distributed File Organization with Scalable Cost/Performance. SIGMOD Conference 1994: 253-264 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[77]
Radek Vingralek, Yuri Breitbart, Gerhard Weikum: Snowball: Scalable Storage on Networks of Workstations with Balanced Load. Distributed and Parallel Databases 6(2): 117-156(1998) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[78]
Gerhard Weikum, Christof Hasse, Alex Moenkeberg, Peter Zabback: The COMFORT Automatic Tuning Project, Invited Project Review. Inf. Syst. 19(5): 381-432(1994) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[79]
Gerhard Weikum, Peter Zabback, Peter Scheuermann: Dynamic File Allocation in Disk Arrays. SIGMOD Conference 1991: 406-415 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[80]
Gerhard Weikum, Peter Zabback: Tuning of Striping Units in Disk-Array-Based File Systems. RIDE-TQP 1992: 80-87 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[81]
John Wilkes, Richard A. Golding, Carl Staelin, Tim Sullivan: The HP AutoRAID Hierarchical Storage System. SOSP 1995: 96-108 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[82]
Joel L. Wolf: The Placement Optimization Program: A Practical Solution to the Disk File Assignment Problem. SIGMETRICS 1989: 1-10 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[83]
Ouri Wolfson, Sushil Jajodia: Distributed Algorithms for Dynamic Replication of Data. PODS 1992: 149-163 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[84]
...

Copyright © Fri Mar 12 17:34:26 2010 by Michael Ley (ley@uni-trier.de)