Abstract
Through the development of the topic of Web Science there has been interest in the evolution
of networks such as the WWW and online social networks, viewed almost as ecospheres
in a biological sense. However, much of the value of the web comes from our ability to
search and index it rapidly through the development of ranking and retrieval algorithms
such as those offered by Google. Our thesis examines properties of online social networks,
and in particular Twitter, whose properties have not been examined to date. However,
one major problem is the very large size of these networks, and the limited amount of
resources that are usually available when accessing them for research purposes.
We show that through the use of random walks, we are able to quickly discover important
portions of such networks and estimate interesting properties, while keeping
computational costs low.
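As an illustration of the idea, the following minimal Python sketch runs a simple random walk on a small undirected graph and estimates the mean degree from the visited nodes. It is not taken from the thesis: the toy graph, the function names `random_walk` and `estimate_mean_degree`, and the choice of mean degree as the property of interest are all assumptions made for this example. Because a simple random walk visits nodes roughly in proportion to their degree, the sketch uses a harmonic-mean correction, which is one standard way to remove that bias.

```python
import random


def random_walk(neighbors, start, steps, rng):
    """Return the nodes visited by a simple random walk of the given length."""
    node = start
    visited = []
    for _ in range(steps):
        # Move to a uniformly chosen neighbour of the current node.
        node = rng.choice(neighbors[node])
        visited.append(node)
    return visited


def estimate_mean_degree(neighbors, visited):
    """Estimate the mean degree from walk samples.

    The walk over-samples high-degree nodes, so each sample is
    weighted by 1/degree (a harmonic mean of the sampled degrees).
    """
    inverse_degree_sum = sum(1.0 / len(neighbors[v]) for v in visited)
    return len(visited) / inverse_degree_sum


# Hypothetical toy graph (adjacency lists): a star centred on node 0,
# plus the edge 1-2 so that the graph is not bipartite.
toy_graph = {
    0: [1, 2, 3, 4],
    1: [0, 2],
    2: [0, 1],
    3: [0],
    4: [0],
}

rng = random.Random(0)  # fixed seed so the run is repeatable
walk = random_walk(toy_graph, start=0, steps=20000, rng=rng)
estimate = estimate_mean_degree(toy_graph, walk)
true_mean = sum(len(adj) for adj in toy_graph.values()) / len(toy_graph)
```

On a real network the adjacency lists would not be held in memory; each step would instead query the neighbours of the current node, which is why the cost of the walk depends on its length rather than on the size of the network.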
Our thesis focuses on the following:
1. To study how to crawl massive social networks representatively with limited resources.
This will allow users to get a meaningful snapshot of the network. Our
methodology for this is:
(a) To investigate this problem experimentally and theoretically on artificial networks
simulated in a controlled environment.
(b) To investigate it on real networks by comparing our limited, designed crawls
with data we have obtained giving the complete structure of, e.g., the Twitter
network.
2. To investigate the graph theoretic structure of the networks, to the following ends:
(a) To devise representative methods of generating artificial networks as given in
1(a), both experimentally, and supported by theory.
(b) To relate this structure to improving the design of algorithms for information
retrieval, ranking home pages, viral advertising etc.
3. To investigate the existence of patterns in user behavior in social networks in order
to:
(a) Detect user groups with similar interests/behaviors.
(b) Recommend activities to users based on activities of similar users.
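Item 3 can be illustrated with a small Python sketch that groups users by the overlap of their activities and recommends what the most similar user does. This is only a sketch under stated assumptions: the Jaccard similarity measure, the function names `jaccard` and `recommend`, and the sample users and hashtags are hypothetical and are not the method or data of the thesis.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(activities, user, k=1):
    """Suggest activities of the k most similar users that `user` lacks."""
    others = [u for u in activities if u != user]
    # Rank the other users by similarity to `user`, most similar first.
    ranked = sorted(
        others,
        key=lambda u: jaccard(activities[user], activities[u]),
        reverse=True,
    )
    suggestions = set()
    for u in ranked[:k]:
        suggestions |= activities[u] - activities[user]
    return suggestions


# Hypothetical activity data: the hashtags each user has posted about.
activities = {
    "alice": {"#python", "#graphs", "#webscience"},
    "bob": {"#python", "#graphs", "#randomwalks"},
    "carol": {"#football", "#cooking"},
}

suggested = recommend(activities, "alice")
```

Here `alice` and `bob` share two of four distinct hashtags, so `bob` is ranked as the most similar user and his remaining hashtag is the suggestion for `alice`.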
Date of Award | 2014 |
---|---|
Original language | English |
Awarding Institution | |
Supervisor | Colin Cooper (Supervisor) & Tomasz Radzik (Supervisor) |