Saturday, 2 February 2013

Influence maximization in a social network

    In this blog post, I will be discussing about influence maximization in social networks. So, first let us consider influence in networks. Let me take an example, suppose in a network the members can have two possible opinions and also it is assumed that we can bribe the members to support our opinion, so the question is which members will be more preferred to be bribed to cause maximum support in our favour. Here comes the concept of maximum influence using which members who have maximum influence on the whole network will be selected  to be bribed.
  Now, social network acts as a medium for spreading influence among its members. By the term influence means using opinions, ideas,etc. Main motives of influence spreading are primarily: Product marketing, innovation spreading. The basic steps for maximizing influence can be grouped in the following way:
a. Some basic models of influence for the network.
b. Obtaining data or influence between nodes known as inter-personal influence( f(S): influence of a set of nodes S).
c. Some algorithmic formulations to maximize the spreading of influence.

 Two basic models of influence:
    i. Linear Threshold Model.
    ii. Independent Cascade Model.
Linear Threshold Model: In Linear threshold model a node u is influenced by each neighbor w according to a weight b(u,w) such that sum[b(u,w)] <=1 for all neighbors w of u. The next steps for this model are:
i.   Each node chooses a threshold uniformly at random from the interval[0,1]. Let us denote this threshold by thres(u). This attribute thres(u) represents the fractions of  neighbors of u that has to be active for u to be active.
ii.  We start with an intial set of active nodes. And in next step it follows that the nodes that were active in the (i-1)th step remains active in the ith step and we activate all nodes u such that
sum[b(u,w)]  >= thres(u) for all active neighbors w of u.
Independent Cascade Model:  Here again we start with an initial set of active nodes and then it does the following steps:
i.   When node u becomes active in ith step, its inactive neighbors in the ith step get a chance to get active. There is a probability p(u,w) that the inactive node w will be active. Node w will be active in (i+1)th step only if the node u suceeds in ith step. The model runs until no more activations are possible.
Algorithm for influence maximization:
    a. Algorithm: Greedy Algorithm
    b. Performance: Slighly better than 63% due to (1-1/e) approximation.  
Approximation Algorithm for influence maximization: 
Domingos-Richardson optimization problem: The influence of a set of nodes A is  f(A) which is the expected number of active nodes at the end of the process where A is the initial active set of nodes. The influence maximization problem is defined as follows:
     Given a paramter k, find a k-node set of maximum influence. For the two influence models described above, it is NP-hard to determine the optimum for influence maximization.
     One solution is that influence maimization can be approximated within a factor of (1- 1/e) in both the models, where e is base of natural logarithm. This performance gurantees slighly better than 63%. This performance is obtained by a greedy hill-climbing strategy. The steps of the greedy algorithm are as follows:
     i. We start with an empty set S.
     ii. While no more activations possible
       Add node v to S that maximizes f(S+v)- f(S)
     This greedy algorithm has (1-1/e) approximation. Its time complexity is O(kmns) where m is the number of edges, n is the number of nodes, and s is the number of times the model is simulated.

No comments:

Post a Comment