Eur. Phys. J. B 38, 211-221 (2004)
DOI: 10.1140/epjb/e2004-00114-1

Correlated topologies in citation networks and the Web

F. Menczer

School of Informatics and Departments of Computer Science and Physics, Indiana University, Bloomington, IN 47408, USA

fil@indiana.edu

(Received 5 November 2003 / Received in final form 26 February 2004 / Published online 14 May 2004)

Abstract
Information networks such as the scientific literature and the Web have been studied extensively by different communities focusing on alternative topological properties induced by citation links, textual content, and semantic relationships. This paper reviews work that brings these different perspectives together in order to build better search tools and to understand how the Web's scale-free topology emerges from author behavior. I describe three topologies induced by different classes of similarity measures, and outline empirical data that allows us to quantify and map their correlations. The data is also used to study a power-law relationship between the content similarity of two documents and the probability that they are connected by citations or hyperlinks. This finding has led to a remarkably powerful growth model for information networks, which simultaneously predicts the distribution of degree and the distribution of content similarity across pairs of documents: Web pages connected by links and scientific articles connected by citations.

PACS
89.20.Hh - World Wide Web, Internet.
89.75.-k - Complex systems.

© EDP Sciences, Società Italiana di Fisica, Springer-Verlag 2004