Eur. Phys. J. B 38, 211-221 (2004)
DOI: 10.1140/epjb/e2004-00114-1
Correlated topologies in citation networks and the Web
F. Menczer
School of Informatics and Departments of Computer Science and Physics, Indiana University, Bloomington, IN 47408, USA
fil@indiana.edu
(Received 5 November 2003 / Received in final form 26 February 2004 / Published online 14 May 2004)
Abstract
Information networks such as the scientific literature and
the Web have been studied extensively by different communities
focusing on alternative topological properties induced by citation
links, textual content, and semantic relationships. This paper reviews
work that brings such different perspectives together in order to build
better search tools and to understand how the Web's scale-free topology
emerges from author behavior. I describe three topologies induced by
different classes of similarity measures, and outline empirical data
that allows us to quantify and map their correlations. The data is
also used to study a power-law relationship between the content
similarity of two documents and the probability that they are
connected by citations or hyperlinks. This
finding has led to a remarkably powerful growth model for information
networks, which simultaneously predicts the distribution of degree and
the distribution of content similarity across pairs of documents -
Web pages connected by links and scientific articles connected by
citations.
PACS 89.20.Hh - World Wide Web, Internet.
PACS 89.75.-k - Complex systems.
© EDP Sciences, Società Italiana di Fisica, Springer-Verlag 2004



