eProceedings
of the
Thirtieth
International Conference
on
Very Large Data Bases
Toronto, Canada
August 31 - September 3, 2004
 

 

Research Sessions

 

 

Research Session 1: Compression & Indexing

Compressing Large Boolean Matrices using Reordering Techniques
David S. Johnson, Shankar Krishnan (AT&T Labs-Research), Jatin Chhugani, Subodh Kumar (Johns Hopkins Univ.), Suresh Venkatasubramanian (AT&T Labs-Research)

On the Performance of Bitmap Indices for High Cardinality Attributes
Kesheng Wu, Ekow Otoo, Arie Shoshani (Lawrence Berkeley National Laboratory)

Practical Suffix Tree Construction
Sandeep Tata, Richard A. Hankins, Jignesh M. Patel (Univ. of Michigan)

 

Research Session 2: XML Views and Schemas

Answering XPath Queries over Networks by Sending Minimal Views
Keishi Tajima, Yoshiki Fukui (JAIST)

A Framework for Using Materialized XPath Views in XML Query Processing
Andrey Balmin, Fatma Özcan, Kevin Beyer, Roberta Cochrane, Hamid Pirahesh (IBM Almaden)

Schema-Free XQuery
Yunyao Li, Cong Yu, H. V. Jagadish (Univ. of Michigan)

 

Research Session 3: Controlling Access

Client-Based Access Control Management for XML Documents
Luc Bouganim (INRIA Rocquencourt), François Dang Ngoc, Philippe Pucheral (PRiSM Laboratory)

Secure XML Publishing without Information Leakage in the Presence of Data Inference
Xiaochun Yang (Northeastern Univ.), Chen Li (Univ. of California, Irvine)

Limiting Disclosure in Hippocratic Databases
Kristen LeFevre (IBM Almaden and Univ. of Wisconsin-Madison), Rakesh Agrawal (IBM Almaden), Vuk Ercegovac, Raghu Ramakrishnan (Univ. of Wisconsin-Madison), Yirong Xu (IBM Almaden), David DeWitt (Univ. of Wisconsin-Madison)

 

Research Session 4: XML (I)

On Testing Satisfiability of Tree Pattern Queries
Laks V. S. Lakshmanan, Ganesh Ramesh, Hui (Wendy) Wang, Zheng (Jessica) Zhao (Univ. of British Columbia)

Containment of Nested XML Queries
Xin Dong, Alon Halevy, Igor Tatarinov (Univ. of Washington)

Efficient XML-to-SQL Query Translation: Where to Add the Intelligence?
Rajasekar Krishnamurthy (IBM Almaden), Raghav Kaushik (Microsoft Research), Jeffrey Naughton (Univ. of Wisconsin-Madison)

Taming XPath Queries by Minimizing Wildcard Steps
Chee-Yong Chan (National Univ. of Singapore), Wenfei Fan (Univ. of Edinburgh and Bell Laboratories), Yiming Zeng (National Univ. of Singapore)

The NEXT Framework for Logical XQuery Optimization
Alin Deutsch, Yannis Papakonstantinou, Yu Xu (Univ. of California, San Diego)

 

Research Session 5: Stream Mining

Detecting Change in Data Streams
Daniel Kifer, Shai Ben-David, Johannes Gehrke (Cornell Univ.)

Stochastic Consistency, and Scalable Pull-Based Caching for Erratic Data Stream Sources
Shanzhong Zhu, Chinya V. Ravishankar (Univ. of California, Riverside)

False Positive or False Negative: Mining Frequent Itemsets from High Speed Transactional Data Streams
Jeffrey Xu Yu (The Chinese Univ. of Hong Kong), Zhihong Chong (Fudan Univ.), Hongjun Lu (The Hong Kong Univ. of Science and Technology), Aoying Zhou (Fudan Univ.)

 

Research Session 6: XML (II)

Indexing Temporal XML Documents
Alberto Mendelzon, Flavio Rizzolo (Univ. of Toronto), Alejandro Vaisman (Univ. of Buenos Aires)

Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams
Christoph Koch, Stefanie Scherzinger (Technische Univ. Wien), Nicole Schweikardt (Humboldt Univ. Berlin), Bernhard Stegmaier (Technische Univ. München)

Bloom Histogram: Path Selectivity Estimation for XML Data with Updates
Wei Wang (Univ. of NSW), Haifeng Jiang, Hongjun Lu (Hong Kong Univ. of Science and Technology), Jeffrey Xu Yu (The Chinese Univ. of Hong Kong)

 

Research Session 7: XML and Relations

XQuery on SQL Hosts
Torsten Grust, Sherif Sakr, Jens Teubner (Univ. of Konstanz)

ROX: Relational Over XML
Alan Halverson (Univ. of Wisconsin-Madison), Vanja Josifovski, Guy Lohman, Hamid Pirahesh (IBM Almaden), Mathias Mörschel

From XML View Updates to Relational View Updates: Old Solutions to a New Problem
Vanessa Braganholo (Universidade Federal do Rio Grande do Sul), Susan Davidson (Univ. of Pennsylvania and INRIA-FUTURS), Carlos Heuser (Universidade Federal do Rio Grande do Sul)

 

Research Session 8: Stream Mining (II)

XWAVE: Optimal and Approximate Extended Wavelets for Streaming Data
Sudipto Guha (Univ. of Pennsylvania), Chulyun Kim, Kyuseok Shim (Seoul National Univ.)

REHIST: Relative Error Histogram Construction Algorithms
Sudipto Guha (Univ. of Pennsylvania), Kyuseok Shim, Jungchul Woo (Seoul National Univ.)

Distributed Set-Expression Cardinality Estimation
Abhinandan Das (Cornell Univ.), Sumit Ganguly (IIT Kanpur), Minos Garofalakis, Rajeev Rastogi (Bell Labs, Lucent Technologies)

 

Research Session 9: Stream Query Processing

Memory-Limited Execution of Windowed Stream Joins
Utkarsh Srivastava, Jennifer Widom (Stanford Univ.)

Resource Sharing in Continuous Sliding-Window Aggregates
Arvind Arasu, Jennifer Widom (Stanford Univ.)

Remembrance of Streams Past: Overload-Sensitive Management of Archived Streams
Sirish Chandrasekaran, Michael J. Franklin (UC Berkeley)

WIC: A General-Purpose Algorithm for Monitoring Web Information Sources
Sandeep Pandey, Kedar Dhamdhere, Christopher Olston (Carnegie Mellon Univ.)

 

Research Session 10: Managing Web Information Sources

Similarity Search for Web Services
Xin Dong, Alon Halevy, Jayant Madhavan, Ema Nemes, Jun Zhang (Univ. of Washington)

AWESOME: A Data Warehouse-based System for Adaptive Website Recommendations
Andreas Thor, Erhard Rahm (Univ. of Leipzig)

Accurate and Efficient Crawling for Relevant Websites
Martin Ester (Simon Fraser Univ.), Hans-Peter Kriegel, Matthias Schubert (Univ. of Munich)

Instance-based Schema Matching for Web Databases by Domain-specific Query Probing
Jiying Wang (Hong Kong Univ. of Science and Technology), Ji-Rong Wen (Microsoft Research Asia), Fred Lochovsky (Hong Kong Univ. of Science and Technology), Wei-Ying Ma (Microsoft Research Asia)

 

Research Session 11: Distributed Search and Query Processing

Computing PageRank in a Distributed Internet Search System
Yuan Wang, David DeWitt (Univ. of Wisconsin-Madison)

Enhancing P2P File-Sharing with an Internet-Scale Query Processor
Boon Thau Loo (UC Berkeley), Joseph M. Hellerstein (UC Berkeley and Intel Research Berkeley), Ryan Huebsch (UC Berkeley), Scott Shenker (UC Berkeley and International Computer Science Institute), Ion Stoica (UC Berkeley)

Online Balancing of Range-Partitioned Data with Applications to Peer-to-Peer Systems
Prasanna Ganesan, Mayank Bawa, Hector Garcia-Molina (Stanford Univ.)

Network-Aware Query Processing for Stream-based Applications
Yanif Ahmad, Uğur Çetintemel (Brown Univ.)

Data Sharing Through Query Translation in Autonomous Sources
Anastasios Kementsietsidis, Marcelo Arenas (Univ. of Toronto)

 

Research Session 12: Stream Data Management Systems

Linear Road: A Stream Data Management Benchmark
Arvind Arasu (Stanford Univ.), Mitch Cherniack, Eduardo Galvez (Brandeis Univ.), David Maier (Oregon Health & Science Univ.), Anurag Maskey, Esther Ryvkina (Brandeis Univ.), Michael Stonebraker, Richard Tibbetts (MIT)

Query Languages and Data Models for Database Sequences and Data Streams
Yan-Nei Law (UCLA), Haixun Wang (IBM T. J. Watson Res. Ctr.), Carlo Zaniolo (UCLA)

 

Research Session 13: Auditing

Tamper Detection in Audit Logs
Richard Snodgrass, Shilong Stanley Yao, Christian Collberg (Univ. of Arizona)

Auditing Compliance with a Hippocratic Database
Rakesh Agrawal, Roberto Bayardo, Christos Faloutsos, Jerry Kiernan, Ralf Rantzau, Ramakrishnan Srikant (IBM Almaden)

 

Research Session 14: Data Warehousing

High-Dimensional OLAP: A Minimal Cubing Approach
Xiaolei Li, Jiawei Han, Hector Gonzalez (Univ. of Illinois at Urbana-Champaign)

The Polynomial Complexity of Fully Materialized Coalesced Cubes
Yannis Sismanis, Nick Roussopoulos (Univ. of Maryland at College Park)

 

Research Session 15: Link Analysis

Relational Link-based Ranking
Floris Geerts (Univ. of Edinburgh), Heikki Mannila, Evimaria Terzi (Univ. of Helsinki)

ObjectRank: Authority-Based Keyword Search in Databases
Andrey Balmin (IBM Almaden), Vagelis Hristidis (Florida International Univ.), Yannis Papakonstantinou (UC San Diego)

Combating Web Spam with TrustRank
Zoltan Gyöngyi, Hector Garcia-Molina (Stanford Univ.), Jan Pedersen (Yahoo! Inc.)

 

Research Session 16: Sensors, Grid, Pub/Sub

Model-Driven Data Acquisition in Sensor Networks
Amol Deshpande (UC Berkeley), Carlos Guestrin (Intel Research Berkeley), Samuel Madden (MIT and Intel Research Berkeley), Joseph Hellerstein (UC Berkeley and Intel Research Berkeley), Wei Hong (Intel Research Berkeley)

GridDB: A Data-Centric Overlay for Scientific Grids
David Liu, Michael J. Franklin (UC Berkeley)

Towards an Internet-Scale XML Dissemination Service
Yanlei Diao, Shariq Rizvi, Michael J. Franklin (UC Berkeley)

 

Research Session 17: Top-K Ranking

Efficiency-Quality Tradeoffs for Vector Score Aggregation
Pavan Kumar C Singitham, Mahathi Mahabhashyam (Stanford Univ.), Prabhakar Raghavan (Verity Inc.)

Merging the Results of Approximate Match Operations
Sudipto Guha (Univ. of Pennsylvania), Nick Koudas, Amit Marathe, Divesh Srivastava (AT&T Labs-Research)

Top-k Query Evaluation with Probabilistic Guarantees
Martin Theobald, Gerhard Weikum, Ralf Schenkel (Max-Planck Institute of Computer Science)

 

Research Session 18: DBMS Architecture and Performance

STEPS Towards Cache-resident Transaction Processing
Stavros Harizopoulos, Anastassia Ailamaki (Carnegie Mellon Univ.)

Write-Optimized B-Trees
Goetz Graefe (Microsoft Corp.)

Cache-Conscious Radix-Decluster Projections
Stefan Manegold, Peter Boncz, Niels Nes, Martin Kersten (CWI)

Clotho: Decoupling Memory Page Layout from Storage Organization
Minglong Shao, Jiri Schindler, Steven Schlosser, Anastassia Ailamaki, Gregory R. Ganger (Carnegie Mellon Univ.)

 

Research Session 19: Privacy

Vision Paper: Enabling Privacy for the Paranoids
Gagan Aggarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Krishnaram Kenthapadi, Nina Mishra, Rajeev Motwani, Utkarsh Srivastava, Dilys Thomas, Jennifer Widom, Ying Xu (Stanford Univ.)

A Privacy-Preserving Index for Range Queries
Bijit Hore, Sharad Mehrotra, Gene Tsudik (Univ. of California, Irvine)

Resilient Rights Protection for Sensor Streams
Radu Sion, Mikhail Atallah, Sunil Prabhakar (Purdue Univ.)

 

Research Session 20: Nearest Neighbor Search

Reverse kNN Search in Arbitrary Dimensionality
Yufei Tao (City Univ. of Hong Kong), Dimitris Papadias, Xiang Lian (Hong Kong Univ. of Science and Technology)

GORDER: An Efficient Method for KNN Join Processing
Chenyi Xia (National Univ. of Singapore), Hongjun Lu (Hong Kong Univ. of Science and Technology), Beng Chin Ooi, Jing Hu (National Univ. of Singapore)

Query and Update Efficient B+-Tree Based Indexing of Moving Objects
Christian S. Jensen (Aalborg Univ.), Dan Lin, Beng Chin Ooi (National Univ. of Singapore)

 

Research Session 21: Similarity Search and Applications

Indexing Large Human-Motion Databases
Eamonn Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos (Univ. of California, Riverside), Marc Cardle (Univ. of Cambridge)

On The Marriage of Lp-norms and Edit Distance
Lei Chen (Univ. of Waterloo), Raymond Ng (Univ. of British Columbia)

Approximate NN queries on Streams with Guaranteed Error/performance Bounds
Nick Koudas (AT&T Labs-Research), Beng Chin Ooi, Kian-Lee Tan, Rui Zhang (National Univ. of Singapore)

Object Fusion in Geographic Information Systems
Catriel Beeri, Yaron Kanza, Eliyahu Safra, Yehoshua Sagiv (The Hebrew Univ.)

Maintenance of Spatial Semijoin Queries on Moving Points
Glenn Iwerks, Hanan Samet (Univ. of Maryland at College Park), Kenneth Smith (MITRE Corp.)

Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases
Mohammad R. Kolahdouzan, Cyrus Shahabi (Univ. of Southern California)

A Framework for Projected Clustering of High Dimensional Data Streams
Charu Aggarwal (IBM T. J. Watson Res. Ctr.), Jiawei Han, Jianyong Wang (Univ. of Illinois at Urbana-Champaign), Philip Yu (IBM T. J. Watson Res. Ctr.)

 

Research Session 22: Query Processing

Efficient Query Evaluation on Probabilistic Databases
Nilesh Dalvi, Dan Suciu (Univ. of Washington)

Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data
Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey S. Vitter (Purdue Univ.)

Probabilistic Ranking of Database Query Results
Surajit Chaudhuri, Gautam Das (Microsoft Research), Vagelis Hristidis (Florida International Univ.), Gerhard Weikum (MPI Informatik)

 

Research Session 23: Novel Models

An Annotation Management System for Relational Databases
Deepavali Bhagwat, Laura Chiticariu, Wang-Chiew Tan, Gaurav Vijayvargiya (Univ. of California, Santa Cruz)

Symmetric Relations and Cardinality-Bounded Multisets in Database Systems
Kenneth Ross, Julia Stoyanovich (Columbia Univ.)

Algebraic Manipulation of Scientific Datasets
Bill Howe, David Maier (Oregon Health & Science Univ.)

 

Research Session 24: Query Processing and Optimization

Multi-objective Query Processing for Database Systems
Wolf-Tilo Balke (Univ. of California, Berkeley), Ulrich Güntzer (University of Tübingen)

Lifting the Burden of History from Adaptive Query Processing
Amol Deshpande (Univ. of California, Berkeley), Joseph M. Hellerstein (Univ. of California, Berkeley and Intel Research Berkeley)

A Combined Framework for Grouping and Order Optimization
Thomas Neumann, Guido Moerkotte (Univ. of Mannheim)

The Case for Precision Sharing
Sailesh Krishnamurthy, Michael J. Franklin (UC Berkeley), Joseph M. Hellerstein (UC Berkeley and Intel Research Berkeley), Garrett Jacobson (UC Berkeley)