
Using Probabilistic Information in Data Integration.

Daniela Florescu, Daphne Koller, Alon Y. Levy: Using Probabilistic Information in Data Integration. VLDB 1997: 216-225
@inproceedings{DBLP:conf/vldb/FlorescuKLP97,
  author    = {Daniela Florescu and
               Daphne Koller and
               Alon Y. Levy},
  editor    = {Matthias Jarke and
               Michael J. Carey and
               Klaus R. Dittrich and
               Frederick H. Lochovsky and
               Pericles Loucopoulos and
               Manfred A. Jeusfeld},
  title     = {Using Probabilistic Information in Data Integration},
  booktitle = {VLDB'97, Proceedings of 23rd International Conference on Very
               Large Data Bases, August 25-29, 1997, Athens, Greece},
  publisher = {Morgan Kaufmann},
  year      = {1997},
  isbn      = {1-55860-470-7},
  pages     = {216-225},
  ee        = {db/conf/vldb/FlorescuKLP97.html},
  crossref  = {DBLP:conf/vldb/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The goal of a mediator system is to provide users a uniform interface to the multitude of information sources. To translate user queries, given in a mediated schema, to queries on the data sources, mediators rely on explicit mappings between the contents of the data sources and the meanings of the relations in the mediated schema.

Thus far, contents of data sources were described qualitatively. In this paper we describe the use of quantitative information in the form of probabilistic knowledge in mediator systems. We consider several kinds of probabilistic information: information about overlap between collections in the mediated schema, coverage of the information sources, and degrees of overlap between information sources. We address the problem of ordering accesses to multiple information sources, in order to maximize the likelihood of obtaining answers as early as possible. We describe a declarative formalism for specifying these kinds of probabilistic information, and we propose algorithms for ordering the information sources. Finally, we discuss a preliminary experimental evaluation of these algorithms on the domain of bibliographic sources available on the WWW.
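
As a rough illustration of the source-ordering problem described in the abstract, the following Python sketch orders sources greedily by how many not-yet-seen answers each one is expected to add. It is not the algorithm from the paper. The function name greedy_source_order, the idea of estimating coverage and overlap from a sample of known answers, and the example sources A, B, and C are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_source_order(
    sample: Set[str], contents: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Order sources by estimated marginal coverage of a sample of answers.

    Returns (source, marginal_gain) pairs, where marginal_gain is the
    fraction of the sample that is first reached by that source.
    This is a hypothetical greedy heuristic, not the paper's method.
    """
    if not sample:
        raise ValueError("sample must be non-empty")
    remaining = set(sample)   # sample answers not yet covered
    unused = dict(contents)   # sources not yet placed in the order
    order: List[Tuple[str, float]] = []
    while unused:
        # Pick the source that adds the most uncovered sample items.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(unused), key=lambda s: len(unused[s] & remaining))
        gained = unused.pop(best) & remaining
        order.append((best, len(gained) / len(sample)))
        remaining -= gained
    return order


# Toy input (assumed): ten sample answers and three overlapping sources.
sample = {f"p{i}" for i in range(1, 11)}
contents = {
    "A": {f"p{i}" for i in range(1, 7)},   # p1..p6, coverage 0.6
    "B": {f"p{i}" for i in range(5, 10)},  # p5..p9, overlaps A on p5, p6
    "C": {"p1", "p2", "p10"},              # mostly overlaps A
}
print(greedy_source_order(sample, contents))
# prints [('A', 0.6), ('B', 0.3), ('C', 0.1)]
```

In this toy input, source B covers half of the sample on its own, but two of its items overlap with A, so its marginal contribution after A is only 0.3. That is the sense in which overlap information, and not coverage alone, affects the order.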

Copyright © 1997 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.



Printed Edition

Matthias Jarke, Michael J. Carey, Klaus R. Dittrich, Frederick H. Lochovsky, Pericles Loucopoulos, Manfred A. Jeusfeld (Eds.): VLDB'97, Proceedings of 23rd International Conference on Very Large Data Bases, August 25-29, 1997, Athens, Greece. Morgan Kaufmann 1997, ISBN 1-55860-470-7

References

[ACPS96]
Sibel Adali, K. Selçuk Candan, Yannis Papakonstantinou, V. S. Subrahmanian: Query Caching and Optimization in Distributed Mediator Systems. SIGMOD Conference 1996: 137-148
[AKS96]
Yigal Arens, Craig A. Knoblock, Wei-Min Shen: Query Reformulation for Dynamic Information Integration. J. Intell. Inf. Syst. 6(2/3): 99-130 (1996)
[BDMS94]
C. Mic Bowman, Peter B. Danzig, Udi Manber, Michael F. Schwartz: Scalable Internet Discovery: Research Problems and Approaches. Commun. ACM 37(8): 98-107 (1994)
[Buc85]
...
[CGMH+94]
Sudarshan S. Chawathe, Hector Garcia-Molina, Joachim Hammer, Kelly Ireland, Yannis Papakonstantinou, Jeffrey D. Ullman, Jennifer Widom: The TSIMMIS Project: Integration of Heterogeneous Information Sources. IPSJ 1994: 7-18
[EW94]
Oren Etzioni, Daniel S. Weld: A Softbot-Based Interface to the Internet. Commun. ACM 37(7): 72-76 (1994)
[FRV96]
...
[Hec96]
...
[IC93]
Yannis E. Ioannidis, Stavros Christodoulakis: Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results. ACM Trans. Database Syst. 18(4): 709-748 (1993)
[KLP97]
Daphne Koller, Alon Y. Levy, Avi Pfeffer: P-CLASSIC: A Tractable Probablistic Description Logic. AAAI/IAAI 1997: 390-397
[KW96]
Chung T. Kwok, Daniel S. Weld: Planning to Gather Information. AAAI/IAAI, Vol. 1 1996: 32-39
[LRO96]
Alon Y. Levy, Anand Rajaraman, Joann J. Ordille: Querying Heterogeneous Information Sources Using Source Descriptions. VLDB 1996: 251-262
[Pea88]
...
[TRV97]
...
[Ull97]
Jeffrey D. Ullman: Information Integration Using Logical Views. ICDT 1997: 19-40

Copyright © Fri Mar 12 17:22:55 2010 by Michael Ley (ley@uni-trier.de)