Volume 12, 2018-2019

Editors-in-Chief:
Lei Chen and Fatma Özcan
Founding Editor-in-Chief:
H. V. Jagadish
Managing Editor:
Divesh Srivastava
Advisory Committee:
Peter Boncz, Xin Luna Dong, Juliana Freire, Jayant Haritsa, Wolfgang Lehner, Renée J. Miller, Tova Milo, M. Tamer Özsu
Associate Editors:
Azza Abouzied, Selcuk Candan, Surajit Chaudhuri, Amol Deshpande, Johann-Christoph Freytag, Rainer Gemulla, Nick Koudas, Georgia Koutrika, Yunyao Li, Alexandra Meliou, Arnab Nandi, M. Tamer Özsu, Themis Palpanas, Alkis Polyzotis, Kyuseok Shim, Xiaokui Xiao, Meihui Zhang
Review Board:

Volume 12, No. 1

Lei Chen and Fatma Özcan: Front Matter i - vi

1 - 13

List Intersection for Web Search: Algorithms, Cost Models, and Optimizations

Sunghwan Kim, Taesung Lee, Seungwon Hwang, Sameh Elnikety

14 - 27

Interactive Checks for Coordination Avoidance

Michael Whittaker, Joseph M. Hellerstein

28 - 42

Pigeonring: A Principle for Faster Thresholded Similarity Search

Jianbin Qin, Chuan Xiao

43 - 56

Local Algorithms for Hierarchical Dense Subgraph Discovery

Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar

57 - 70

Cost-Effective Data Annotation using Game-Based Crowdsourcing

Jingru Yang, Ju Fan, Zhewei Wei, Guoliang Li, Tongyu Liu, Xiaoyong Du

71 - 84

Optimization for Active Learning-based Interactive Database Exploration

Enhui Huang, Liping Peng, Luciano Di Palma, Ahmed Abdelkafi, Anna Liu, Yanlei Diao

Volume 12, No. 2

Lei Chen and Fatma Özcan: Front Matter i - vi

85 - 98

Exploring Change - A New Dimension of Data Analytics

Tobias Bleifuß, Leon Bornemann, Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, Divesh Srivastava

99 - 111

The Flexible Socio Spatial Group Queries

Bishwamittra Ghosh, Mohammed Eunus Ali, Farhana M. Choudhury, Sajid Hasan, Timos Sellis, Jianxin Li

112 - 127

The Lernaean Hydra of Data Series Similarity Search: An Experimental Evaluation of the State of the Art

Karima Echihabi, Kostas Zoumpatianos, Themis Palpanas, Houda Benbrahim

128 - 140

Rafiki: Machine Learning as an Analytics Service System

Wei Wang, Sheng Wang, Jinyang Gao, Meihui Zhang, Gang Chen, Teck Khim Ng, Beng Chin Ooi, Jie Shao

141 - 153

Automatic Index Selection for Large-Scale Datalog Computation

Pavle Subotic, Herbert Jordan, Lijun Chang, Alan Fekete, Bernhard Scholz

154 - 168

Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction

Shuang Song, Xu Liu, Qinzhe Wu, Andreas Gerstlauer, Tao Li, Lizy K. John

169 - 182

Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering

Bailu Ding, Lucja Kot, Johannes Gehrke

Volume 12, No. 3

Lei Chen and Fatma Özcan: Front Matter i - vi

183 - 196

Query Log Compression for Workload Analytics

Ting Xie, Varun Chandola, Oliver Kennedy

197 - 209

The Maximum Trajectory Coverage Query in Spatial Databases

Mohammed Eunus Ali, Shadman Saqib Eusuf, Kaysar Abdullah, Farhana M. Choudhury, J. Shane Culpepper, Timos Sellis

210 - 222

Towards a Learning Optimizer for Shared Clouds

Chenggang Wu, Alekh Jindal, Saeed Amizadeh, Hiren Patel, Wangchao Le, Shi Qiao, Sriram Rao

223 - 236

Snuba: Automating Weak Supervision to Label Training Data

Paroma Varma, Christopher Ré

237 - 250

On Obtaining Stable Rankings

Abolfazl Asudeh, H. V. Jagadish, Gerome Miklau, Julia Stoyanovich

251 - 264

PS-Tree-based Efficient Boolean Expression Matching for High Dimensional and Dense Workloads

Shuping Ji, Hans-Arno Jacobsen

265 - 277

SWIFT: Mining Representative Patterns from Large Event Streams

Yizhou Yan, Lei Cao, Samuel Madden, Elke Rundensteiner

278 - 291

Smurf: Self-Service String Matching Using Random Forests

Paul Suganthan G. C., Adel Ardalan, AnHai Doan, Aditya Akella

292 - 306

Chasing Similarity: Distribution-aware Aggregation Scheduling

Feilong Liu, Ario Salmasi, Spyros Blanas, Anastasios Sidiropoulos

307 - 320

ShrinkWrap: Efficient SQL Query Processing in Differentially Private Data Federations

Johes Bater, Xi He, William Ehrich, Ashwin Machanavajjhala, Jennie Rogers

Volume 12, No. 4

Lei Chen and Fatma Özcan: Front Matter i - vi

321 - 334

A Study of Partitioning Policies for Graph Analytics on Large-scale Distributed Platforms

Gurbinder Gill, Roshan Dathathri, Loc Hoang, Keshav Pingali

335 - 347

Utility-Driven Graph Summarization

K. Ashwin Kumar, Petros Efstathopoulos

348 - 361

ColumnML: Column-Store Machine Learning with On-The-Fly Data Transformation

Kaan Kara, Ken Eguro, Ce Zhang, Gustavo Alonso

362 - 375

Cost-efficient Data Acquisition on Online Data Marketplaces for Correlation Analysis

Yanying Li, Haipei Sun, Boxiang Dong, Hui (Wendy) Wang

376 - 389

Cleaning Crowdsourced Labels Using Oracles For Statistical Classification

Mohamad Dolatshah, Mathew Teoh, Jiannan Wang, Jian Pei

390 - 403

Beyond Macrobenchmarks: Microbenchmark-based Graph Database Evaluation

Matteo Lissandrini, Martin Brugnara, Yannis Velegrakis

404 - 418

IPA: Invariant-preserving Applications for Weakly consistent Replicated Databases

Valter Balegas, Sérgio Duarte, Carla Ferreira, Rodrigo Rodrigues, Nuno Preguiça

419 - 432

DIFF: A Relational Interface for Large-Scale Data Explanation

Firas Abuzaid, Peter Kraft, Sahaana Suri, Edward Gan, Eric Xu, Atul Shenoy, Asvin Ananthanarayan, John Sheu, Erik Meijer, Xi Wu, Jeff Naughton, Peter Bailis, Matei Zaharia

433 - 445

Stream Frequency Over Interval Queries

Ran Ben Basat, Roy Friedman, Rana Shahout

446 - 460

Helix: Holistic Optimization for Accelerating Iterative Machine Learning

Doris Xin, Stephen Macke, Litian Ma, Jialin Liu, Shuchen Song, Aditya Parameswaran

Volume 12, No. 5

Lei Chen and Fatma Özcan: Front Matter i - vi

461 - 474

Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph

Cong Fu, Chao Xiang, Changxu Wang, Deng Cai

475 - 487

Document Reordering for Faster Intersection

Qi Wang, Torsten Suel

488 - 501

Correlation Constraint Shortest Path over Large Multi-Relation Graphs

Xiaofei Zhang, M. Tamer Özsu

502 - 515

Performance-Optimal Filtering: Bloom overtakes Cuckoo at High-Throughput

Harald Lang, Thomas Neumann, Alfons Kemper, Peter Boncz

516 - 530

Analyzing Efficient Stream Processing on Modern Hardware

Steffen Zeuch, Sebastian Breß, Tilmann Rabl, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Volker Markl

531 - 543

Efficient Data Ingestion and Query Processing for LSM-Based Storage Systems

Chen Luo, Michael Carey

544 - 556

HetExchange: Encapsulating heterogeneous CPU–GPU parallelism in JIT compiled engines

Periklis Chrysogelos, Manos Karpathiotakis, Raja Appuswamy, Anastasia Ailamaki

557 - 569

Meta-Mappings for Schema Mapping Reuse

Paolo Atzeni, Luigi Bellomarini, Paolo Papotti, Riccardo Torlone

570 - 583

An Experimental Evaluation of Garbage Collectors on Big Data Applications

Lijie Xu, Tian Guo, Wensheng Dou, Wei Wang, Jun Wei

584 - 596

Adaptive Optimistic Concurrency Control for Heterogeneous Workloads

Jinwei Guo, Peng Cai, Jiahao Wang, Weining Qian, Aoying Zhou

597 - 610

MgCrab: Transaction Crabbing for Live Migration in Deterministic Database Systems

Yu-Shan Lin, Shao-Kan Pi, Meng-Kai Liao, Ching Tsai, Aaron Elmore, Shan-Hung Wu

611 - 623

Unifying Consensus and Atomic Commitment for Effective Cloud Data Management

Sujaya Maiyya, Faisal Nawab, Divy Agrawal, Amr El Abbadi

Volume 12, No. 6

Lei Chen and Fatma Özcan: Front Matter i - vi

624 - 638

Autoscaling Tiered Cloud Storage in Anna

Chenggang Wu, Vikram Sreekanti, Joseph Hellerstein

639 - 652

Snapshot Semantics for Temporal Multiset Relations

Anton Dignös, Boris Glavic, Xing Niu, Johann Gamper, Michael Böhlen

653 - 666

Certus: An Effective Entity Resolution Approach with Graph Differential Dependencies (GDDs)

Selasi Kwashie, Jixue Liu, Jiuyong Li, Lin Liu, Markus Stumptner, Lujing Yang

667 - 680

Efficient and Effective Algorithms for Clustering Uncertain Graphs

Kai Han, Fei Gui, Xiaokui Xiao, Jing Tang, Yuntian He, Zongmai Cao, He Huang

681 - 694

Pangea: Monolithic Distributed Storage for Data Analytics

Jia Zou, Arun Iyengar, Chris Jermaine

695 - 708

Scaling-Up In-Memory Datalog Processing: Observations and Techniques

Zhiwei Fan, Jianqiao Zhu, Zuyu Zhang, Aws Albarghouthi, Paraschos Koutris, Jignesh Patel

709 - 723

Cache-aware load balancing of data center applications

Aaron Archer, Kevin Aydin, Mohammadhossein Bateni, Vahab Mirrokni, Aaron Schild, Ray Yang, Richard Zhuang

Volume 12, No. 7

Lei Chen and Fatma Özcan: Front Matter i - vi

724 - 737

Minimizing Cost by Reducing Scaling Operations in Distributed Stream Processing

Michael Borkowski, Christoph Hochreiner, Stefan Schulte

738 - 751

ProvCite: Provenance-based Data Citation

Yinjun Wu, Abdussalam Alawini, Daniel Deutch, Tova Milo, Susan Davidson

752 - 765

Deducing Certain Fixes to Graphs

Wenfei Fan, Ping Lu, Chao Tian, Jingren Zhou

766 - 778

Solving k-center Clustering (with Outliers) in MapReduce and Streaming, almost as Accurately as Sequentially

Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci

779 - 792

Explain3D: Explaining Disagreements in Disjoint Datasets

Xiaolan Wang, Alexandra Meliou

793 - 806

DASH: Database Shadowing for Mobile DBMS

Youjip Won, Sundoo Kim, Juseong Yun, Damquang Tuan, Jiwon Seo

807 - 821

Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-precision Learning

Zeke Wang, Kaan Kara, Hantian Zhang, Gustavo Alonso, Ce Zhang, Onur Mutlu

822 - 835

Declarative Recursive Computation on an RDBMS, or, Why You Should Use a Database For Distributed Machine Learning

Dimitrije Jankov, Shangyu Luo, Binhang Yuan, Zhuhua Cai, Jia Zou, Chris Jermaine, Zekai Gao

Volume 12, No. 8

Lei Chen and Fatma Özcan: Front Matter i - vi

836 - 849

Design, Implementation, and Evaluation of Write-Back Policy with Cache Augmented Data Stores

Shahram Ghandeharizadeh, Hieu Nguyen

850 - 863

User Guidance for Efficient Fact Checking

Thanh Tam Nguyen, Hongzhi Yin, Matthias Weidlich, Bolong Zheng, Quoc Viet Hung Nguyen, Bela Stantic

864 - 876

An In-Depth Comparison of s-t Reliability Algorithms over Uncertain Graphs

Xiangyu Ke, Arijit Khan, Leroy Lim

877 - 890

Dynamic Scaling for Parallel Graph Computations

Wenfei Fan, Chunming Hu, Muyang Liu, Ping Lu, Qiang Yin, Jingren Zhou

891 - 905

TopoX: Topology Refactorization for Efficient Graph Partitioning and Processing

Yiming Zhang, Dongsheng Li, Jinyan Wang, Kian-Lee Tan

906 - 919

Multi-Dimensional Balanced Graph Partitioning via Projected Gradient Descent

Dmitrii Avdiukhin, Sergey Pupyrev, Grigory Yaroslavtsev

920 - 932

Efficient Discovery of Sequence Outlier Patterns

Lei Cao, Yizhou Yan, Samuel Madden, Elke Rundensteiner, Mathan Gopalsamy

933 - 947

A Comparative Evaluation of Order-Revealing Encryption Schemes and Secure Range-Query Protocols

Dmytro Bogatov, George Kollios, Leo Reyzin

Volume 12, No. 9

Lei Chen and Fatma Özcan: Front Matter i - vi

948 - 960

k/2-hop: Fast Mining of Convoy Patterns With Effective Pruning

Faisal Orakzai, Toon Calders, Torben Pedersen

961 - 974

Balance-Aware Distributed String Similarity-Based Query Processing System

Ji Sun, Zeyuan Shang, Guoliang Li, Zhifeng Bao, Dong Deng

975 - 988

Fine-Grained, Secure and Efficient Data Provenance on Blockchain Systems

Pingcheng Ruan, Gang Chen, Anh Dinh, Qian Lin, Beng Chin Ooi, Meihui Zhang

989 - 1001

Progressive Top-k Subarray Query Processing in Array Databases

Dalsu Choi, Chang-Sup Park, Yon Dohn Chung

1002 - 1015

Megaphone: Latency-conscious state migration for distributed streaming dataflows

Moritz Hoffmann, Andrea Lattuada, Frank McSherry, Vasiliki Kalavri, John Liagouris, Timothy Roscoe

1016 - 1029

From Anomaly Detection to Rumour Detection using Data Streams of Social Platforms

Thanh Tam Nguyen, Matthias Weidlich, Bolong Zheng, Hongzhi Yin, Quoc Viet Hung Nguyen, Bela Stantic

1030 - 1043

Obscure: Information-Theoretic Oblivious and Verifiable Aggregation Queries

Peeush Gupta, Yin Li, Sharad Mehrotra, Nisha Panwar, Shantanu Sharma, Sumaya Almanee

1044 - 1057

Selectivity Estimation for Range Predicates using Lightweight Models

Anshuman Dutt, Chi Wang, Azade Nazi, Srikanth Kandula, Vivek Narasayya, Surajit Chaudhuri

Volume 12, No. 10

Lei Chen and Fatma Özcan: Front Matter i - vi

1058 - 1070

Constrained Shortest Path Query in a Large Time-Dependent Graph

Ye Yuan, Xiang Lian, Guoren Wang, Yuliang Ma, Yishu Wang

1071 - 1084

Finding Theme Communities from Database Networks

Lingyang Chu, Zhefeng Wang, Jian Pei, Yanyan Zhang, Yu Yang, Enhong Chen

1085 - 1098

Ridesharing: Simulator, Benchmark, and Evaluation

James Pan, Guoliang Li, Juntao Hu

1099 - 1112

Distributed Subgraph Matching on Timely Dataflow

Longbin Lai, Zhu Qing, Zhengyi Yang, Xin Jin, Zhengmin Lai, Ran Wang, Kongzhang Hao, Xuemin Lin, Lu Qin, Wenjie Zhang, Ying Zhang, Zhengping Qian, Jingren Zhou

1113 - 1125

Hyper Dimension Shuffle: Efficient Data Repartition at Petabyte Scale in Scope

Shi Qiao, Adrian Nicoara, Jin Sun, Marc Friedman, Hiren Patel, Jaliya Ekanayake

1126 - 1138

Answering Range Queries Under Local Differential Privacy

Graham Cormode, Tejas Kulkarni, Divesh Srivastava

1139 - 1152

Vertex Priority Based Butterfly Counting for Large-scale Bipartite Networks

Kai Wang, Xuemin Lin, Lu Qin, Wenjie Zhang, Ying Zhang

1153 - 1166

Block as a Value for SQL over NoSQL

Yang Cao, Wenfei Fan, Tengfei Yuan

1167 - 1180

Optimal and General Out-of-Order Sliding-Window Aggregation

Kanat Tangwongsan, Martin Hirzel, Scott Schneider

1181 - 1194

Creating Top Ranking Options in the Continuous Option and Preference Space

Bo Tang, Kyriakos Mouratidis, Man Lung Yiu, Zhenyu Chen

1195 - 1207

Ontology-based Entity Matching in Attributed Graphs

Hanchao Ma, Morteza Alipourlangouri, Yinghui Wu, Fei Chiang, Jiaxing Pi

1208 - 1220

Real-time Distributed Co-Movement Pattern Detection on Streaming Trajectories

Lu Chen, Yunjun Gao, Ziquan Fang, Xiaoye Miao, Christian Jensen, Chenjuan Guo

1221 - 1234

iBTune: Individualized Buffer Tuning for Large-scale Cloud Databases

Jian Tan, Tieying Zhang, Feifei Li, Jie Chen, Qixing Zheng, Ping Zhang, Honglin Qiao, Yue Shi, Wei Cao, Rui Zhang

Volume 12, No. 11

Lei Chen and Fatma Özcan: Front Matter i - viii

1235 - 1248

Online Template Induction for Machine-Generated Emails

Michael J. Whittaker, Nick Edmonds, Sandeep Tata, James B. Wendt, Marc Najork

1249 - 1261

Querying Shortest Paths on Time Dependent Road Networks

Yong Wang, Guoliang Li, Nan Tang

1262 - 1275

Example-Driven Query Intent Discovery: Abductive Reasoning using Semantic Similarity

Anna Fariha, Alexandra Meliou

1276 - 1288

Automated Verification of Query Equivalence Using Satisfiability Modulo Theories

Qi Zhou, Joy Arulraj, Shamkant Navathe, William Harris, Dong Xu

1289 - 1302

Towards a Unified Framework for String Similarity Joins

Pengfei Xu, Jiaheng Lu

1303 - 1315

NETS: Extremely Fast Outlier Detection from a Data Stream via Set-Based Processing

Susik Yoon, Jae-Gil Lee, Byung Suk Lee

1316 - 1329

STAR: Scaling Transactions through Asymmetric Replication

Yi Lu, Xiangyao Yu, Samuel Madden

1330 - 1343

Subjective Databases

Yuliang Li, Aaron Feng, Jinfeng Li, Saran Mumick, Alon Halevy, Vivian Li, Wang-Chiew Tan

1344 - 1356

Fast and Robust Distributed Subgraph Enumeration

Xuguang Ren, Junhu Wang, Wook-Shin Han, Jeffrey Xu Yu

1357 - 1370

An Experimental Evaluation of Large Scale GBDT Systems

Fangcheng Fu, Jiawei Jiang, Yingxia Shao, Bin Cui

1371 - 1384

PrivateSQL: A Differentially Private SQL Query Engine

Ios Kotsogiannis, Yuchao Tao, Xi He, Maryam Fanaeepour, Ashwin Machanavajjhala, Michael Hay, Gerome Miklau

1385 - 1398

CAPER: A Cross-Application Permissioned Blockchain

Mohammad Javad Amiri, Divyakant Agrawal, Amr El Abbadi

1399 - 1413

Crossbow: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers

Alexandros Koliousis, Pijika Watcharapichat, Matthias Weidlich, Luo Mai, Paolo Costa, Peter Pietzuch

1414 - 1426

Finding Attribute-Aware Similar Region for Data Analysis

Kaiyu Feng, Gao Cong, Christian S. Jensen, Tao Guo

1427 - 1441

Intermittent Query Processing

Dixin Tang, Zechao Shang, Aaron J. Elmore, Sanjay Krishnan, Michael J. Franklin

1442 - 1457

Hillview: A trillion-cell spreadsheet for big data

Mihai Budiu, Parikshit Gopalan, Lalith Suresh, Udi Wieder, Han Kruiger, Marcos K. Aguilera

1458 - 1470

Embedded Functional Dependencies and Data-completeness Tailored Database Design

Ziheng Wei, Sebastian Link

1471 - 1484

Ocean Vista: Gossip-Based Visibility Control for Speedy Geo-Distributed Transactions

Hua Fan, Wojciech Golab

1485 - 1498

An IDEA: An Ingestion Framework for Data Enrichment in AsterixDB

Xikui Wang, Michael Carey

1499 - 1512

DimmStore: Memory Power Optimization for Database Systems

Alexey Karyakin, Kenneth Salem

1513 - 1525

Generating Application-specific Data Layouts for In-memory Databases

Cong Yan, Alvin Cheung

1526 - 1538

Rewriting of Plain SO Tgds into Nested Tgds

Rihan Hai, Christoph Quix

1539 - 1552

Blockchain Meets Database: Design and Implementation of a Blockchain Relational Database

Senthil Nathan, Chander Govindarajan, Adarsh Saraf, Manish Sethi, Praveen Jayachandran

1553 - 1567

An Intermediate Representation for Optimizing Machine Learning Pipelines

Andreas Kunft, Asterios Katsifodimos, Sebastian Schelter, Sebastian Breß, Tilmann Rabl, Volker Markl

1568 - 1582

Accelerating Raw Data Analysis with the ACCORDA Software and Hardware Architecture

Yuanwei Fang, Chen Zou, Andrew Chien

1583 - 1596

Comparing Synopsis Techniques for Approximate Spatial Data Analysis

A. B. Siddique, Ahmed Eldawy, Vagelis Hristidis

1597 - 1609

BlockchainDB - A Shared Database on Blockchains

Muhammad El-Hindi, Carsten Binnig, Arvind Arasu, Donald Kossmann, Ravi Ramamurthy

1610 - 1623

Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms

Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gürel, Bo Li, Ce Zhang, Costas J. Spanos, Dawn Song

1624 - 1636

Distributed Implementations of Dependency Discovery Algorithms

Hemant Saxena, Lukasz Golab, Ihab F. Ilyas

1637 - 1650

Rethinking Database High Availability with RDMA Networks

Erfan Zamanian, Xiangyao Yu, Michael Stonebraker, Tim Kraska

1651 - 1663

Motivo: Fast Motif Counting via Succinct Color Coding and Adaptive Sampling

Marco Bressan, Stefano Leucci, Alessandro Panconesi

1664 - 1678

Arx: An Encrypted Database using Semantically Secure Encryption

Rishabh Poddar, Tobias Boelter, Raluca Ada Popa

1679 - 1691

Efficient Knowledge Graph Accuracy Evaluation

Junyang Gao, Xian Li, Yifan Ethan Xu, Bunyamin Sisman, Xin Luna Dong, Jun Yang

1692 - 1704

Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins

Amine Mhedhbi, Semih Salihoglu

1705 - 1718

Neo: A Learned Query Optimizer

Ryan C. Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul

1719 - 1732

Efficient Algorithms for Densest Subgraph Discovery

Yixiang Fang, Kaiqiang Yu, Reynold Cheng, Laks V.S. Lakshmanan, Xuemin Lin

1733 - 1746

Plan-Structured Deep Neural Network Models for Query Performance Prediction

Ryan C. Marcus, Olga Papaemmanouil

1747 - 1761

SLOG: Serializable, Low-latency, Geo-replicated Transactions

Kun Ren, Dennis Li, Daniel J. Abadi

1762 - 1777

GRAIL: Efficient Time-Series Representation Learning

John Paparrizos, Michael Franklin

Volume 12, No. 12

Lei Chen and Fatma Özcan: Front Matter i - xiii

1778 - 1781

GALO: Guided Automated Learning for re-Optimization

Guilherme Damasio, Spencer Bryson, Vincent Corvinelli, Parke Godfrey, Piotr Mierzejewski, Jaroslaw Szlichta, Calisto Zuzarte

1782 - 1785

Synergistic Graph and SQL Analytics Inside IBM Db2

Yuanyuan Tian, Sui Jun Tong, Mir Hamid Pirahesh, Wen Sun, En Liang Xu, Wei Zhao

1786 - 1789

Cleanits: A Data Cleaning System for Industrial Time Series

Xiaoou Ding, Hongzhi Wang, Jiaxuan Su, Zijue Li, Jianzhong Li, Hong Gao

1790 - 1793

ITAA: An Intelligent Trajectory-driven Outdoor Advertising Deployment Assistant

Yipeng Zhang, Zhifeng Bao, Songsong Mo, Yuchen Li, Yanghao Zhou

1794 - 1797

SystemER: A Human-in-the-loop System for Explainable Entity Resolution

Kun Qian, Lucian Popa, Prithviraj Sen

1798 - 1801

Buckle: Evaluating Fact Checking Algorithms Built on Knowledge Bases

Viet-Phi Huynh, Paolo Papotti

1802 - 1805

A Query System for Efficiently Investigating Complex Attack Behaviors for Enterprise Security

Peng Gao, Xusheng Xiao, Zhichun Li, Kangkook Jee, Fengyuan Xu, Sanjeev R. Kulkarni, Prateek Mittal

1806 - 1809

CAPE: Explaining Outliers by Counterbalancing

Zhengjie Miao, Qitian Zeng, Chenjie Li, Boris Glavic, Oliver Kennedy, Sudeepa Roy

1810 - 1813

BlackMagic: Automatic Inlining of Scalar UDFs into SQL Queries with Froid

Karthik Ramachandra, Kwanghyun Park

1814 - 1817

ProgressiveDB - Progressive Data Analytics as a Middleware

Lukas Berg, Tobias Ziegler, Carsten Binnig, Uwe Röhm

1818 - 1821

doppioDB 2.0: Hardware Techniques for Improved Integration of Machine Learning into Databases

Kaan Kara, Zeke Wang, Ce Zhang, Gustavo Alonso

1822 - 1825

COVIZ: A System for Visual Formation and Exploration of Patient Cohorts

Cícero A. L. Pahins, Behrooz Omidvar-Tehrani, Sihem Amer-Yahia, Valérie Siroux, Jean-Louis Pepin, Jean-Christian Borel, João Comba

1826 - 1829

PRIMAT: A Toolbox for Fast Privacy-preserving Matching

Martin Franke, Ziad Sehili, Erhard Rahm

1830 - 1833

NashDB: Fragmentation, Replication, and Provisioning using Economic Methods

Ryan Marcus, Chi Zhang, Shuai Yu, Geoffrey Kao, Olga Papaemmanouil

1834 - 1837

Flash in Action: Scalable Spatial Data Analysis Using Markov Logic Networks

Ibrahim Sabek, Mashaal Musleh, Mohamed F. Mokbel

1838 - 1841

I Can't Believe It's Not (Only) Software! Bionic Distributed Storage for Parquet Files

Lucas Kuhring, Zsolt István

1842 - 1845

VISE: Vehicle Image Search Engine with Traffic Camera

Hyewon Choi, Erkang Zhu, Arsala Bangash, Renée J. Miller

1846 - 1849

WiClean: A System for Fixing Wikipedia Interlinks Using Revision History Patterns

Stephan Goldberg, Tova Milo, Slava Novgorodov, Kathy Razmadze

1850 - 1853

SparkCruise: Handsfree Computation Reuse in Spark

Abhishek Roy, Alekh Jindal, Hiren Patel, Ashit Gosalia, Subru Krishnan, Carlo Curino

1854 - 1857

In-database Distributed Machine Learning: Demonstration using Teradata SQL Engine

Sandeep Singh Sandha, Wellington Cabrera, Mohammed Al-Kateb, Sanjay Nair, Mani Srivastava

1858 - 1861

SHOAL: Large-scale Hierarchical Taxonomy via Graph-based Query Coalition in E-commerce

Zhao Li, Xia Chen, Xuming Pan, Pengcheng Zou, Yuchen Li, Guoxian Yu

1862 - 1865

DPSAaS: Multi-Dimensional Data Sharing and Analytics as Services under Local Differential Privacy

Min Xu, Tianhao Wang, Bolin Ding, Jingren Zhou, Cheng Hong, Zhicong Huang

1866 - 1869

PriSTE: Protecting Spatiotemporal Event Privacy in Continuous Location-Based Services

Yang Cao, Yonghui Xiao, Li Xiong, Liquan Bai, Masatoshi Yoshikawa

1870 - 1873

Datalignment: Ontology Schema Alignment Through Datalog Containment

Daniel Deutch, Evgeny Marants, Yuval Moskovitch

1874 - 1877

IHCS: An Integrated Hybrid Cleaning System

Congcong Ge, Yunjun Gao, Xiaoye Miao, Lu Chen, Christian S. Jensen, Ziyuan Zhu

1878 - 1881

CAPRIO: Graph-based Integration of Indoor and Outdoor Data for Path Discovery

Constantinos Costa, Xiaoyu Ge, Panos K. Chrysanthis

1882 - 1885

HERMIT in Action: Succinct Secondary Indexing Mechanism via Correlation Exploration

Yingjun Wu, Jia Yu, Yuanyuan Tian, Richard Sidle, Ronald Barber

1886 - 1889

DISPERS: Securing Highly Distributed Queries on Personal Data Management Systems

Julien Loudet, Iulian Sandu-Popa, Luc Bouganim

1890 - 1893

Stateful Functions as a Service in Action

Adil Akhter, Marios Fragkoulis, Asterios Katsifodimos

1894 - 1897

Demonstration of Krypton: Optimized CNN Inference for Occlusion-based Deep CNN Explanations

Allen Ordookhanians, Xin Li, Supun Nakandala, Arun Kumar

1898 - 1901

LensXPlain: Visualizing and Explaining Contributing Subsets for Aggregate Query Answers

Zhengjie Miao, Andrew Lee, Sudeepa Roy

1902 - 1905

Juneau: Data Lake Management for Jupyter

Yi Zhang, Zachary G. Ives

1906 - 1909

ApproxML: Efficient Approximate Ad-Hoc ML Models Through Materialization and Reuse

Sona Hasani, Faezeh Ghaderi, Shohedul Hasan, Saravanan Thirumuruganathan, Abolfazl Asudeh, Nick Koudas, Gautam Das

1910 - 1913

Flare & Lantern: Efficiently Swapping Horses Midstream

Grégory Essertel, Ruby Y. Tahboub, Fei Wang, James Decker, Tiark Rompf

1914 - 1917

Trinity: An Extensible Synthesis Framework for Data Science

Ruben Martins, Jia Chen, Yanju Chen, Yu Feng, Isil Dillig

1918 - 1921

PSynDB: Accurate and Accessible Private Data Generation

Zhiqi Huang, Ryan McKenna, George Bissias, Gerome Miklau, Michael Hay, Ashwin Machanavajjhala

1922 - 1925

FishStore: Fast Ingestion and Indexing of Raw Data

Badrish Chandramouli, Dong Xie, Yinan Li, Donald Kossmann

1926 - 1929

Spade: A Modular Framework for Analytical Exploration of RDF Graphs

Yanlei Diao, Pawel Guzewicz, Ioana Manolescu, Mirjana Mazuran

1930 - 1933

Making an RDBMS Data Scientist Friendly: Advanced In-database Interactive Analytics with Visualization Support

Joseph Vinish D’Silva, Florestan De Moor, Bettina Kemme

1934 - 1937

UDAO: A Next-Generation Unified Data Analytics Optimizer

Khaled Zaouk, Fei Song, Chenghao Lyu, Arnab Sinha, Yanlei Diao, Prashant Shenoy

1938 - 1941

AggChecker: A Fact-Checking System for Text Summaries of Relational Data Sets

Saehan Jo, Immanuel Trummer, Weicheng Yu, Xuezhi Wang, Cong Yu, Daniel Liu, Niyati Mehta

1942 - 1945

GRANO: Interactive Graph-based Root Cause Analysis for Cloud-Native Distributed Data Platform

Hanzhang Wang, Phuong Nguyen, Jun Li, Selcuk Kopru, Gene Zhang, Sanjeev Katariya, Sami Ben-Romdhane

1946 - 1949

Dietcoin: Hardening Bitcoin Transaction Verification Process For Mobile Devices

Davide Frey, Marc X. Makkes, Pierre-Louis Roman, François Taïani, Spyros Voulgaris

1950 - 1953

Raptor: Large Scale Analysis of Big Raster and Vector Data

Samriddhi Singla, Ahmed Eldawy, Rami Alghamdi, Mohamed F. Mokbel

1954 - 1957

Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics

El Kindi Rezig, Lei Cao, Michael Stonebraker, Giovanni Simonini, Wenbo Tao, Samuel Madden, Mourad Ouzzani, Nan Tang, Ahmed K. Elmagarmid

1958 - 1961

Tuplex: Robust, Efficient Analytics When Python Rules

Leonhard F. Spiegelberg, Tim Kraska

1962 - 1965

Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization

Cedric Renggli, Frances Ann Hubis, Bojan Karlaš, Kevin Schawinski, Wentao Wu, Ce Zhang

1966 - 1969

PivotE: Revealing and Visualizing the Underlying Entity Structures for Exploration

Han Xueran, Jun Chen, Jiaheng Lu, Yueguo Chen, Xiaoyong Du

1970 - 1973

Speedup Your Analytics: Automatic Parameter Tuning for Databases and Big Data Systems

Jiaheng Lu, Yuxing Chen, Herodotos Herodotou, Shivnath Babu

1974 - 1977

TextCube: Automated Construction and Multidimensional Exploration

Yu Meng, Jiaxin Huang, Jingbo Shang, Jiawei Han

1978 - 1981

The Ever Evolving Online Labor Market: Overview, Challenges and Opportunities

Sihem Amer-Yahia, Senjuti Basu Roy

1982 - 1985

Machine Learning Meets Big Spatial Data

Ibrahim Sabek, Mohamed F. Mokbel

1986 - 1989

Data Lake Management: Challenges and Opportunities

Fatemeh Nargesian, Erkang Zhu, Renée J. Miller, Ken Pu, Patricia C. Arocena

1990 - 1993

Combating Fake News: A Data Management and Mining Perspective

Laks V.S. Lakshmanan, Michael Simpson, Saravanan Thirumuruganathan

1994 - 1997

Personal Database Security and Trusted Execution Environments: A Tutorial at the Crossroads

Nicolas Anciaux, Luc Bouganim, Philippe Pucheral, Iulian Sandu Popa, Guillaume Scerri

1998 - 2009

SAP HANA goes private - From Privacy Research to Privacy Aware Enterprise Analytics

Stephan Kessler, Jens Hoff, Johann-Christoph Freytag

2010 - 2021

Guided automated learning for query workload re-optimization

Guilherme Damasio, Vincent Corvinelli, Parke Godfrey, Piotr Mierzejewski, Alex Mihaylov, Jaroslaw Szlichta, Calisto Zuzarte

2022 - 2034

Procella: Unifying serving and analytical data at YouTube

Biswapesh Chattopadhyay, Priyam Dutta, Weiran Liu, Ott Tinn, Andrew McCormick, Aniket Mokashi, Paul Harvey, Hector Gonzalez, David Lomax, Sagar Mittal, Roee Ebenstein, Nikita Mikhaylin, Hung-Ching Lee, Xiaoyan Zhao, Tony Xu, Luis Perez, Farhad Shahmohammadi, Tran Bui, Neil McKay, Selcuk Aya, Vera Lychagina, Brett Elliott

2035 - 2046

A Lightweight and Efficient Temporal Database Management System in TDSQL

Wei Lu, Zhanhao Zhao, Xiaoyu Wang, Haixiang Li, Zhenmiao Zhang, Zhiyu Shui, Sheng Ye, Anqun Pan, Xiaoyong Du

2047 - 2058

Native Store Extension for SAP HANA

Reza Sherkat, Colin Florendo, Mihnea Andrei, Rolando Blanco, Adrian Dragusanu, Amit Pathak, Pushkar Khadilkar, Neeraj Kulkarni, Christian Lemke, Sebastian Seifert, Sarika Iyer, Sasikanth Gottapu, Robert Schulze, Chaitanya Gottipati, Nirvik Basak, Yanhong Wang, Vivek Kandiyanallur, Santosh Pendap, Dheren Gala, Rajesh Almeida, Prasanta Ghosh

2059 - 2070

AnalyticDB: Real-time OLAP Database System at Alibaba Cloud

Chaoqun Zhan, Maomeng Su, Chuangxian Wei, Xiaoqiang Peng, Liang Lin, Sheng Wang, Zhe Chen, Feifei Li, Yue Pan, Fang Zheng, Chengliang Chai

2071 - 2081

Tunable Consistency in MongoDB

William Schultz, Tess Avitabile, Alyson Cabral

2082 - 2093

TitAnt: Online Real-time Transaction Fraud Detection in Ant Financial

Shaosheng Cao, Xinxing Yang, Cen Chen, Jun Zhou, Xiaolong Li, Yuan Qi

2094 - 2105

AliGraph: A Comprehensive Graph Neural Network Platform

Rong Zhu, Kun Zhao, Hongxia Yang, Wei Lin, Chang Zhou, Baole Ai, Yong Li, Jingren Zhou

2106 - 2117

Customizable and Scalable Fuzzy Join for Big Data

Zhimin Chen, Yue Wang, Vivek Narasayya, Surajit Chaudhuri

2118 - 2130

QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning

Guoliang Li, Xuanhe Zhou, Shifu Li, Bo Gao

2131 - 2142

Experiences with Approximating Queries in Microsoft's Production Big-Data Clusters

Srikanth Kandula, Kukjin Lee, Surajit Chaudhuri, Marc Friedman

2143 - 2154

Constant Time Recovery in Azure SQL Database

Panagiotis Antonopoulos, Peter Byrne, Wayne Chen, Cristian Diaconu, Raghavendra Thallam Kodandaramaih, Hanuma Kodavalla, Prashanth Purnananda, Adrian-Leonard Radu, Chaitanya Sreenivas Ravella, Girish Mittur Venkataramanappa

2155 - 2169

Yugong: Geo-Distributed Data and Job Placement at Scale

Yuzhen Huang, Yingjie Shi, Zheng Zhong, Yihui Feng, James Cheng, Jiwei Li, Haochuan Fan, Chao Li, Tao Guan, Jingren Zhou

2170 - 2182

Choosing A Cloud DBMS: Architectures and Tradeoffs

Junjay Tan, Thanaa Ghanem, Matthew Perron, Xiangyao Yu, Michael Stonebraker, David DeWitt, Marco Serafini, Ashraf Aboulnaga, Tim Kraska

2183 - 2194

S3: A Scalable In-memory Skip-List Index for Key-Value Store

Jingtian Zhang, Sai Wu, Zeyuan Tan, Gang Chen, Zhushi Cheng, Wei Cao, Yusong Gao, Xiaojie Feng

2195 - 2205

DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees

Charles Masson, Jee E. Rim, Homin K. Lee

2206 - 2217

A Distributed System for Large-scale n-gram Language Models at Tencent

Qiang Long, Wei Wang, Jinfu Deng, Song Liu, Wenhao Huang, Fangying Chen, Sifan Liu

2218 - 2229

A Morsel-Driven Query Execution Engine for Heterogeneous Multi-Cores

Kayhan Dursun, Carsten Binnig, Ugur Cetintemel, Garret Swart, Weiwei Gong

2230 - 2241

Smile: A System to Support Machine Learning on EEG Data at Scale

Lei Cao, Wenbo Tao, Sungtae An, Jing Jin, Yizhou Yan, Xiaoyu Liu, Wendong Ge, Adam Sah, Leilani Battle, Jimeng Sun, Remco Chang, Brandon Westover, Samuel Madden, Michael Stonebraker

2242 - 2253

Updating Graph Databases with Cypher

Alastair Green, Paolo Guagliardo, Leonid Libkin, Tobias Lindaaker, Victor Marsault, Stefan Plantikow, Martin Schuster, Petra Selmer, Hannes Voigt

2254 - 2262

Adapting TPC-C Benchmark to Measure Performance of Multi-Document Transactions in MongoDB

Asya Kamsky

2263 - 2272

Cloud native database systems at Alibaba: Opportunities and Challenges

Feifei Li

2273 - 2274

In-Memory for the masses: Enabling cost-efficient deployments of in-memory data management platforms for business applications

Alexander Boehm

2275 - 2286

Couchbase Analytics: NoETL for Scalable NoSQL Data Analysis

Murtadha Al Hubail, Ali Alsuliman, Michael Blow, Michael Carey, Dmitry Lychagin, Ian Maxon, Till Westmann

2287 - 2289

Performance in the spotlight

Adrian Colyer

2290 - 2299

Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology

Azza Abouzied, Daniel J. Abadi, Kamil Bajda-Pawlikowski, Avi Silberschatz

2300 - 2307

PNUTS to Sherpa: Lessons from Yahoo!’s Cloud Database

Brian F. Cooper, P.P.S. Narayan, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni

2308 - 2308

What I probably did right and what I think I could have done better

Wang-Chiew Tan

2309 - 2322

Enabling Data Science for the Majority

Aditya Parameswaran

2323 - 2324

Opportunities for Data Management Research in the Era of Horizontal AI/ML

Theodoros Rekatsinas, Sudeepa Roy, Manasi Vartak, Ce Zhang, Neoklis Polyzotis

Volume 12, No. 13

Lei Chen and Fatma Özcan: Front Matter i - vi

2325 - 2338

Strong consistency is not hard to get: Two-Phase Locking and Two-Phase Commit on Thousands of Cores

Claude Barthels, Ingo Müller, Konstantin Taranov, Gustavo Alonso, Torsten Hoefler

2339 - 2352

Discovery and Ranking of Embedded Uniqueness Constraints

Ziheng Wei, Uwe Leck, Sebastian Link

2353 - 2365

Online Density Bursting Subgraph Detection from Temporal Graphs

Lingyang Chu, Yanyan Zhang, Yu Yang, Lanjun Wang, Jian Pei

2366 - 2378

Progressive Indexes: Indexing for Interactive Data Analysis

Pedro Holanda, Stefan Manegold, Hannes Mühleisen, Mark Raasveldt

2379 - 2392

Distributed Edge Partitioning for Trillion-edge Graphs

Masatoshi Hanai, Toyotaro Suzumura, Wen Jun Tan, Elvis Liu, Georgios Theodoropoulos, Wentong Cai

2393 - 2407

Optimal Column Layout for Hybrid Workloads

Manos Athanassoulis, Kenneth Bøgh, Stratos Idreos

2408 - 2421

Selecting Data to Clean for Fact Checking: Minimizing Uncertainty vs. Maximizing Surprise

Stavros Sintos, Pankaj K. Agarwal, Jun Yang

PVLDB is part of the VLDB Endowment Inc.
