Topics in the structure and search of large networks

Student thesis: Doctoral Thesis, Doctor of Philosophy

Abstract

Through the development of Web Science, there has been interest in the
evolution of networks such as the WWW and online social networks, almost as ecospheres
in a biological sense. However, much of the value of the web comes from our ability to
search and index it rapidly, through the development of ranking and retrieval algorithms
such as those offered by Google. Our thesis examines properties of online social networks,
and in particular Twitter, whose properties have not been examined to date. However,
one major problem is the very large size of these networks and the limited amount of
resources that are usually available when accessing them for research purposes.
We show that through the use of random walks, we are able to quickly discover important
portions of such networks and estimate interesting properties, while keeping
computational costs low.
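
As an illustration of the idea, and not the procedure developed in the thesis, the following minimal sketch estimates the average degree of a graph from a simple random walk. A random walk visits vertices in proportion to their degree, so a plain average over visited vertices is biased upwards; weighting each visit by the inverse of the degree is one standard correction. The toy graph, the function names and the step count are all assumptions made for this example.

```python
import random


def random_walk(adj, start, steps, rng):
    """Return the vertices visited by a simple random walk of `steps` moves."""
    current = start
    visited = []
    for _ in range(steps):
        # Move to a uniformly chosen neighbour of the current vertex.
        current = rng.choice(adj[current])
        visited.append(current)
    return visited


def estimate_average_degree(adj, visited):
    """Harmonic-mean estimator: each visit is weighted by 1/degree,
    which cancels the walk's bias towards high-degree vertices."""
    return len(visited) / sum(1.0 / len(adj[v]) for v in visited)


# Hypothetical toy graph (undirected, connected, contains a triangle 0-1-2).
adj = {
    0: [1, 2, 3, 4],
    1: [0, 2],
    2: [0, 1],
    3: [0],
    4: [0, 5],
    5: [4, 6],
    6: [5],
}

rng = random.Random(0)  # fixed seed so the run is repeatable
visited = random_walk(adj, start=0, steps=20000, rng=rng)

true_average = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
naive = sum(len(adj[v]) for v in visited) / len(visited)  # biased upwards
estimate = estimate_average_degree(adj, visited)

print("true average degree:", true_average)
print("naive walk average: ", round(naive, 3))
print("reweighted estimate:", round(estimate, 3))
```

Only the neighbours of the current vertex are needed at each step, which is what makes this style of sampling attractive when the full network cannot be downloaded.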
Our thesis focuses on the following:
1. To study how to crawl massive social networks representatively with limited resources.
This will allow users to get a meaningful snapshot of the network. Our
methodology for this is:
(a) To investigate this problem experimentally and theoretically on artificial networks
simulated in a controlled environment.
(b) To investigate it on real networks by comparing our limited, designed crawls
with data we have obtained giving the complete structure of, e.g., the Twitter
network.
2. To investigate the graph theoretic structure of the networks, to the following ends:
(a) To devise representative methods of generating artificial networks as given in
1(a), both experimentally, and supported by theory.
(b) To relate this structure to improving the design of algorithms for information
retrieval, ranking home pages, viral advertising etc.
3. To investigate the existence of patterns in user behavior in social networks in order
to:
(a) Detect user groups with similar interests/behaviors.
(b) Recommend activities to users based on activities of similar users.
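
To make aim 3(b) concrete, here is a minimal sketch of recommending activities from similar users. It is an assumption-laden illustration only: the abstract does not say which similarity measure or recommendation rule the thesis uses, so Jaccard similarity over sets of activities is swapped in as a simple stand-in, and the users, activities and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(activities, user, k=1):
    """Suggest activities of the k most similar users that `user` has not done yet."""
    others = [u for u in activities if u != user]
    # Rank the other users by similarity to `user`, most similar first.
    others.sort(key=lambda u: jaccard(activities[user], activities[u]), reverse=True)
    suggestions = []
    for u in others[:k]:
        # sorted() keeps the output order deterministic.
        for activity in sorted(activities[u] - activities[user]):
            if activity not in suggestions:
                suggestions.append(activity)
    return suggestions


# Hypothetical data: each user maps to the set of activities they engaged in.
activities = {
    "alice": {"cycling", "chess", "cooking"},
    "bob": {"cycling", "chess", "hiking"},
    "carol": {"opera", "cooking"},
}

print(recommend(activities, "alice"))
```

Here bob shares two of alice's three activities and carol only one, so bob is the nearest neighbour and his remaining activity is the one suggested.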
Date of Award: 2014
Original language: English
Awarding Institution: King's College London
Supervisors: Colin Cooper & Tomasz Radzik
