Sunday, 3 March 2013

Privacy Leaks and Idea of Social Spheres

As online social networking sites (SNS) grow, people want the freedom to express their opinions and share their views with friends or with everyone. However, as in the real world, SNS come with problems too. People don't want others to use their personal information in ways they find to violate their personal space. For example, an SNS may help advertisers target users based on their profile and activity information, and not every user would agree to let the SNS reveal that information to advertisers. More extreme examples include spam, sexual predators, and stalkers.

Many SNS release sensitive information about users and their ties in anonymized form, for data mining by advertisers, app developers, or researchers. However, de-anonymization algorithms have been developed that use only the anonymized network topology. For example, in [1], a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.

People may not realize what their sharing can lead to. On Twitter, for example, a large number of users unwittingly post sensitive information. Tweets about people's vacation plans may reveal their planned schedule publicly. The z-score of the ratio of vacation tweets over the period of January to September 2010 shows four vacation spikes: at the beginning of January, the end of May, the beginning of July, and the end of August. Here the z-score indicates how many standard deviations an observation lies above or below the mean.
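The z-score described above can be sketched in a few lines of Python. This is only an illustration: the weekly ratios are made-up numbers, not data from the study, and the spike threshold of 2 standard deviations and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return how many standard deviations each value lies from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # All observations are identical, so none deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical weekly ratios (vacation tweets / all tweets); not real data.
weekly_ratio = [0.010, 0.011, 0.009, 0.030, 0.010, 0.012]

scores = z_scores(weekly_ratio)

# Flag weeks whose ratio is more than 2 standard deviations above the mean.
spikes = [week for week, z in enumerate(scores) if z > 2.0]
print(spikes)
```

With these made-up numbers only the fourth week (index 3) stands out as a spike.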

People exercise even less control when tweeting drunk, as the comparison between drunk and sober tweets shows. For drunk tweets, one can perform content analysis to find the private topics that are revealed.


Another category covers people tweeting about the diseases around them, using words like “disease”, “syndrome”, and “disorder”. Such information leaks can be divided into two types: status leaks and conversation leaks. Leaks can further be classified as primary (a user tweeting about herself) or secondary (a user tweeting about another person).
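As a rough illustration of the first step only, candidate tweets can be picked out by keyword. The three keywords come from the text; the matching rule (whole words, optional plural, case-insensitive) and the function name are assumptions, and deciding whether a match is a primary or secondary leak would need further analysis that is not shown here.

```python
import re

# Keywords named in the text; the matching rule itself is an assumption.
DISEASE_KEYWORDS = ("disease", "syndrome", "disorder")

# Match a keyword as a whole word, optionally in plural form, ignoring case.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(DISEASE_KEYWORDS) + r")s?\b", re.IGNORECASE
)


def mentions_disease(tweet):
    """Return True if the tweet contains one of the disease keywords."""
    return _PATTERN.search(tweet) is not None


# Hypothetical example tweets, written for this sketch.
tweets = [
    "My aunt was just diagnosed with a rare heart disease",
    "Great weather for a run today",
    "Reading about sleep disorders at 3am",
]
candidates = [t for t in tweets if mentions_disease(t)]
print(candidates)
```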

The histograms indicate that there are more secondary leaks about cancer or HIV than about diabetes. That is, although people don't want to reveal private information about these two diseases, it gets revealed through their friends or family.

How to visualize and protect against privacy leaks through social spheres?

We can define the profile distance between two users i and j in an N-dimensional profile space P, where every user has a profile vector
p = (e(1), ..., e(N)), with each e(n) a profile parameter. Taking the Euclidean distance, which is the measure referred to later in this post, the profile distance is:
d(i, j) = \sqrt{\sum_{n=1}^{N} \left( e_i(n) - e_j(n) \right)^2}
And the personal sphere of user i will not contain user k if
d(i, k) > r

where r is the radius of the sphere, quantifying the privacy requirement in this model. The N-dimensional vector space can be mapped to a 2-dimensional space using multi-dimensional scaling (MDS), which approximately preserves the distances, for visualization.
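The personal sphere can be sketched in code as follows, assuming the profile distance is the Euclidean distance between profile vectors. The function names, the three-parameter profiles, and the user names are hypothetical and chosen only for illustration.

```python
import math


def profile_distance(p_i, p_j):
    """Euclidean distance between two profile vectors of equal length."""
    if len(p_i) != len(p_j):
        raise ValueError("profile vectors must have the same dimension")
    return math.dist(p_i, p_j)  # requires Python 3.8+


def in_personal_sphere(center, other, r):
    """A user is inside the sphere when the distance does not exceed r."""
    return profile_distance(center, other) <= r


def sphere_members(center_id, profiles, r):
    """Return the sorted ids of all other users inside center_id's sphere."""
    center = profiles[center_id]
    return sorted(
        uid
        for uid, vec in profiles.items()
        if uid != center_id and in_personal_sphere(center, vec, r)
    )


# Hypothetical profiles with N = 3 profile parameters each.
profiles = {
    "alice": (1, 1, 1),
    "bob": (2, 1, 3),    # distance sqrt(5) from alice
    "carol": (5, 4, 1),  # distance 5 from alice
}

members = sphere_members("alice", profiles, r=3)
print(members)
```

With r = 3, bob falls inside alice's sphere and carol falls outside it. The same distance function could be applied to an application vector instead of a second user.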

Now, a user might want to share sensitive information within this sphere without realizing that the information can travel on through their friends' spheres, beyond the user's original sphere. For example, users share their information through Facebook apps in line with their privacy settings, but it's not easy to see exactly what information will be shared with whom.

To see which applications' privacy requirements match a user's privacy profile settings, one can define an application vector and measure its distance from the user profile in the N-dimensional space. For example, applications can be grouped into four clusters (Farmville, Phrases, Texas HoldEm Poker, and other) and, taking r = 3, matched against a typical Facebook-recommended user profile and a restrictive user profile (friends only).

In this comparison, applications of the Are You Interested? type lie far outside the sphere of a restrictive user. This implies that future application developers should set an app's privacy requirements so that it lies within a small Euclidean distance of even restrictive users, since, with growing privacy and security concerns about SNS, the number of users who care deeply about their privacy is likely to increase.

However, a single personal sphere doesn't match the different social spheres of people in real life where we have different circles related to friends, family, or profession.

If we ignore this fact and treat all online friends as equals in a single personal sphere, the result is what's called social tension between the user's actually different social spheres. Users may not realize the actual size of their audience when they share their views or pictures. They may also not notice that, unlike in the real world, posts, tweets, and other shared data are persistent, staying in the SNS for a long time for people to view later.

This kind of social tension arises from the intermixing of strong and weak ties. For example, people might post something meant for their friends but not for their family members, which leads to tension between the family members and the user's friends when the family sees it. Higher tension leads to slower network growth, even when users' activity is otherwise good enough. Some "cautious" users, who care more about staying regular in the network, won't be affected by high or low tension and keep their activity consistently high. But since most users are not cautious, high tension has the overall effect of reducing the network's growth rate.

The idea of creating different social spheres online is being used widely by Google+, which lets users put other users into different social circles: Friends, Family, Acquaintances, Following, and further user-created circles. Facebook, in contrast, automatically creates several lists, namely Close Friends, Acquaintances, Work (based on your employment listing), School (both high school and college), Family, and Your city, and provides the option to send a post to just one particular list. In addition, Facebook lets the user set extended privacy settings or share within secret groups.

References:
[1] De-anonymizing Social Networks by Arvind Narayanan and Vitaly Shmatikov, The University of Texas at Austin
[2] Loose Tweets: An Analysis of Privacy Leaks on Twitter, Huina Mao, Xin Shuai, Apu Kapadia, School of Informatics and Computing, Indiana University
[3] Measuring Profile Distance in Online Social Networks, Niklas Lavesson and Henric Johnson, School of Computing, Blekinge Institute of Technology
[4] Paul Adams @ padday, vtm2010-100701010846-phpapp01
[5] The Problem of Conflicting Social Spheres: Effects of Network Structure on Experienced Tension in Social Network Sites, Jens Binder, Andrew Howes, Alistair Sutcliffe, Manchester Business School, University of Manchester
