| Abstract | | When information sources are unreliable, information networks
have been used in the data mining literature to uncover facts from
large numbers of complex relations between noisy variables. The
approach relies on topology analysis of graphs, where nodes
represent pieces of (unreliable) information and links represent
abstract relations. Such topology analysis has often been shown
empirically to be quite powerful in extracting useful conclusions from
large amounts of poor-quality information. However, no
systematic analysis has been proposed for quantifying the accuracy of
such conclusions. In this paper, we present, for the first time, a
Bayesian interpretation of the basic mechanism used in fact-finding
from information networks. This interpretation leads to a direct
quantification of the accuracy of conclusions obtained from
information network analysis. Hence, we provide a general
foundation for using information network analysis not only to
heuristically extract likely facts, but also to quantify, in an
analytically founded manner, the probability that each fact or
source is correct. This probability constitutes a measure of
quality of information (QoI). Hence, the paper presents a new
foundation for QoI analysis in information networks that is of
great value in deriving information from unreliable sources. The
framework is applied to a representative fact-finding problem and
is validated by extensive simulation, in which the analysis shows
significant improvement over past work and close correspondence with
ground truth. |
| Authors | | Dong Wang, Tarek Abdelzaher, Hossein Ahmadi, Jeff Pasternack, Dan Roth, Manish Gupta, Jiawei Han, Omid Fatemieh, and Hieu Le |