Online social networks enable us to express ourselves and reach out in an
inexpensive and extremely convenient manner. Twitter recently announced a
staggering 200 million active users! Facebook claims to have 1 billion. Such
widespread use of online social networks provides researchers with a firehose
of well-organized data. Now, what does a researcher do with data? No points for
guessing: they analyse it!
Some of the research I will talk about analyses this data to make predictions
about various things: movie box-office revenue, stock prices, trending topics,
election results, etc.
However, before we discuss it, I would like to point out that all of this
research assumes that buzz on the Web reflects popularity and buzz in the real
world. To what extent this premise is valid is another interesting research
topic, but let's not go into it right now.
Let's get back to the topic of predicting the future. [1] describes a procedure
to identify trends through semantic social network analysis.
A "Web Buzz Index" is calculated for "concepts", which are input in the form of
phrases. "Concepts" may be names of politicians, brands, or general topics of
interest. To do this, the authors extend the notion of betweenness centrality
of actors in social networks to semantic networks of concepts. Betweenness
centrality counts how often a node lies on the shortest paths between other
nodes. Trends are measured by tracking a concept’s relative importance in a
relevant information sphere: the Web, blogs, or online forums. The betweenness
centrality of a concept is used as a proxy for its relative importance in the
information sphere.
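
To make betweenness centrality concrete, here is a minimal sketch in Python. It uses Brandes' algorithm on a tiny, made-up concept network. The graph, the concept names, and the function name `betweenness_centrality` are my own assumptions for illustration. This is not the implementation used in [1], and it ignores edge weights and the time dimension.

```python
from collections import deque

def betweenness_centrality(graph):
    """Brandes' algorithm for an undirected, unweighted graph.

    graph: dict mapping each node to the set of its neighbours.
    Returns a dict of unnormalised betweenness scores.
    """
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # first time we reach w
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v is on a shortest path to w
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, farthest nodes first.
        dependency = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Every undirected pair was counted once from each endpoint.
    return {node: score / 2 for node, score in scores.items()}

# Hypothetical concept network (illustrative only, not data from [1]).
concept_graph = {
    "election": {"obama", "mccain", "economy"},
    "obama": {"election"},
    "mccain": {"election"},
    "economy": {"election", "stocks"},
    "stocks": {"economy"},
}

scores = betweenness_centrality(concept_graph)
for concept, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(concept, score)
```

In this toy network "election" sits on the most shortest paths between other concepts, so it gets the highest score, which is the sense in which a concept is "important" here.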
To build the semantic social network in an information sphere, a "degree of
separation search" is used. It works by building a two-mode network map
displaying the linking structure of a list of Web sites or blog posts returned
in response to a search query, or the links among posters responding to an
original post in an online forum.
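
A rough sketch of the idea in Python follows. Everything in it is an assumption on my part: the in-memory tables `SEARCH_RESULTS` and `OUTGOING_LINKS` stand in for a real search engine and crawler, the site names are invented, and `degree_of_separation_network` is a hypothetical helper. [1] does not spell out the procedure at this level, so treat it only as an illustration of following links out to a fixed number of hops from a query.

```python
from collections import deque

# Hypothetical stand-ins for search results and crawled pages.
SEARCH_RESULTS = {
    "gun control": ["siteA.example", "siteB.example"],
}
OUTGOING_LINKS = {
    "siteA.example": ["siteB.example", "siteC.example"],
    "siteB.example": ["siteC.example"],
    "siteC.example": ["siteD.example"],
    "siteD.example": [],
}

def degree_of_separation_network(query, max_degree):
    """Collect (source, target) edges within max_degree hops of the query.

    Sites returned for the query are at degree 1; sites they link to are
    at degree 2, and so on. Links of sites at max_degree are not followed.
    """
    edges = set()
    depth = {}
    queue = deque()
    for site in SEARCH_RESULTS.get(query, []):
        edges.add((query, site))  # query-to-site edge (the two node types)
        if site not in depth:
            depth[site] = 1
            queue.append(site)
    while queue:
        site = queue.popleft()
        if depth[site] >= max_degree:
            continue  # do not expand beyond the requested degree
        for target in OUTGOING_LINKS.get(site, []):
            edges.add((site, target))  # site-to-site link
            if target not in depth:
                depth[target] = depth[site] + 1
                queue.append(target)
    return edges

edges = degree_of_separation_network("gun control", max_degree=2)
for source, target in sorted(edges):
    print(source, "->", target)
```

With `max_degree=2` the sketch keeps the query, the two returned sites, and the pages they link to, but stops before following the links of `siteC.example`.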
Degree of separation search can be employed to compare the relative importance
of various concepts. The figure below shows the comparison of relative
importance of the concepts “gun control”, “abortion”, “gay marriage”, and “Iraq
war”. The idea is that the importance of an individual concept depends on the
linking structure of the temporal network and the betweenness of the other
concepts in the network.
[1] also presents the social network of blog posts right after the US
presidential election of November 4, 2008. The blogs talking about McCain form
a far more compact cluster, at the very bottom, with a tightly interlinked
structure. The Democratic blogs, linking to Obama, are much more widely spread
out and exhibit fewer interconnecting links, reflecting the wider political
interests of the voters supporting Obama.
In [1], the researchers applied their complete procedure to data collected over
213 days (April 1, 2008 until October 30, 2008) on 21 stock titles on Yahoo!
Finance. The results (for stock prices of Goldman Sachs), as we can see below,
show a promising correlation between the web buzz and real-world stock prices.
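
For readers who want to try this kind of comparison themselves, here is a small Python sketch of measuring correlation between a buzz series and a price series. The numbers are made up for illustration and are not the data from [1]. The helper names `pearson` and `lagged_pearson` are my own, and I am assuming a plain Pearson coefficient, which may differ from the exact statistic the authors used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant series has zero variance and would divide by zero.
    return cov / sqrt(var_x * var_y)

def lagged_pearson(buzz, prices, lag):
    """Correlate buzz on day t with the price on day t + lag (lag >= 0)."""
    if lag == 0:
        return pearson(buzz, prices)
    return pearson(buzz[:-lag], prices[lag:])

# Made-up illustrative series, NOT data from [1].
buzz_index = [0.10, 0.12, 0.18, 0.15, 0.22, 0.30, 0.28]
closing_price = [100.0, 101.0, 103.5, 102.0, 106.0, 110.0, 109.0]

same_day = pearson(buzz_index, closing_price)
next_day = lagged_pearson(buzz_index, closing_price, lag=1)
print("same-day correlation:", round(same_day, 3))
print("buzz vs next-day price:", round(next_day, 3))
```

The lagged variant is the more interesting one for prediction: if today's buzz correlates with tomorrow's price, the buzz carries some forward-looking signal. A high correlation alone still does not show that buzz causes price moves.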
Thus, the results of the existing research suggest that social networks can be
used as an indicator of the future. The analysis of sentiments can hint at the
outcomes of people-driven events like the success of a movie, elections, market
fluctuations, and what not!
Further Reading:
1. Gloor, Peter A., et al. "Web Science 2.0: Identifying Trends through Semantic Social Network Analysis." Computational Science and Engineering, 2009. CSE'09. International Conference on. Vol. 4. IEEE, 2009.
2. Asur, Sitaram, and Bernardo A. Huberman. "Predicting the Future With Social Media." arXiv preprint arXiv:1003.5699 (2010).
3. Yu, Sheng, and Subhash Kak. "A Survey of Prediction Using Social Media." arXiv preprint arXiv:1203.1647 (2012).
4. Wasserman, Stanley, and Katherine Faust. Social Network Analysis: Methods and Applications. Vol. 8. Cambridge University Press, 1994.
5. Han, Zhichao. Data and Text Mining of Financial Markets Using News and Social Media (dissertation).