Wednesday, March 11, 2015

Privacy in Public Space?

In his novel Dans le café de la jeunesse perdue, Patrick Modiano (Nobel Prize in Literature, 2014) sketches Louki. Never wanting to attract attention at the Café Condé, she always stayed in the background or hid herself among the crowd. But alas! In the rare and random group photos taken in the café, she stands out because of her photogenic face. When she goes missing, Mr Caisley, a detective, is put on the job to trace her. At one point Mr Caisley feels morally bound to respect Louki's right to remain untraced. The story is set in the 1960s, when it was easy to remain hidden from public view. Fast forward to modern times. Do we have the privilege of remaining anonymous? Can we wish to be among the "Forgotten People"? Well, I may be a total stranger to my next-door neighbor, but I leave digital footprints all over. Mobile phone, credit cards, browsing habits: everything leaves a mark. Oh no, not just marks; I generate a pattern of my behavioral traits: restaurants I frequent, favorite holiday spots, reading and browsing habits, favorite cab service... So much so that I often receive helpful prompts: "Here is a book you may like" or "We haven't seen you for so long..." Such personalized messages are just one part; I might have inadvertently checked a tiny box somewhere, sometime, to "be in the loop".

But what about anonymized or ghost data? We are talking about millions of such digital footprints devoid of users' identification marks. Because such ghost data (or metadata) are easy to collect, consolidate, filter and process, they constitute a rich gold mine for market research, R&D activities and also for meaningful studies in sociology, behavioral science, public health, etc. How harmless are such ghost datasets? Not quite harmless; that is what Anthony Tockar found out. Curious brains can mix, match and superimpose ghost data with other pieces of information freely available in the public domain and breathe life into skeletons. For example, when the New York City Taxi and Limousine Commission put out rather innocuous details of millions of trips for the year 2013 on the web (of course without passenger details), it was not at all difficult for Anthony Tockar to couple the trip information with freely available public domain data and thus de-anonymize the riders. As simple as putting 2 and 2 together, but in a rather complex way.
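To make "putting 2 and 2 together" concrete, here is a minimal sketch in Python of how such a linkage could work. Everything in it is an assumption for illustration: the trip records, the sighting, the field names and the function link_sightings are made up, and this is not Tockar's actual method or the Commission's data format. The idea is simply that a trip with no passenger name can still be pinned to a person if a public source says who was at that place at that time.

```python
from datetime import datetime, timedelta

# Made-up "anonymized" trip records: no passenger names anywhere.
trips = [
    {"trip_id": "T1", "pickup_time": datetime(2013, 7, 8, 23, 40),
     "pickup_place": "Club A", "dropoff_place": "12 Elm St", "tip": 0.0},
    {"trip_id": "T2", "pickup_time": datetime(2013, 7, 8, 21, 5),
     "pickup_place": "Club A", "dropoff_place": "Airport", "tip": 4.0},
    {"trip_id": "T3", "pickup_time": datetime(2013, 7, 8, 23, 41),
     "pickup_place": "Diner B", "dropoff_place": "9 Oak Ave", "tip": 2.5},
]

# Made-up public information, e.g. a dated photo caption.
sightings = [
    {"person": "Celebrity X", "seen_time": datetime(2013, 7, 8, 23, 42),
     "seen_place": "Club A"},
]


def link_sightings(trips, sightings, window=timedelta(minutes=5)):
    """Pair each sighting with a trip when exactly one trip starts at the
    same place within `window` of the sighting time."""
    matches = []
    for sighting in sightings:
        candidates = [
            trip for trip in trips
            if trip["pickup_place"] == sighting["seen_place"]
            and abs(trip["pickup_time"] - sighting["seen_time"]) <= window
        ]
        # Only an unambiguous match counts as a re-identification.
        if len(candidates) == 1:
            matches.append((sighting["person"], candidates[0]))
    return matches


for person, trip in link_sightings(trips, sightings):
    print(person, "took trip", trip["trip_id"], "to", trip["dropoff_place"])
```

In this toy example the single sighting is enough to attach a name to trip T1, and with it the drop-off address and the tip. A wider time window or a busier pickup spot would produce more candidates and therefore fewer unambiguous matches.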

Yves-Alexandre de Montjoye, who works on computational privacy at the Massachusetts Institute of Technology, prefers to describe the process as a correlation attack. De Montjoye and his team decided to analyze credit card data in the same way. They collected ghost data for credit card transactions over a span of 3 months for 1.1 million people and 10,000 shops, sans user name, sans card number, sans shop name and sans time of transaction. Armed with just the amount spent, the shop type, a date stamp and a code for each person, the team demonstrated that "4 spatiotemporal points are enough to uniquely reidentify 90% of individuals" and that "women are more reidentifiable than men in credit card metadata."
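The claim about a handful of points can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the three shoppers, their (shop, day) points and the helper names matching_people and unicity are all invented here, and the study's real procedure and data are not reproduced. The sketch asks one question: if an outsider knows k of a person's purchases, how many people in the dataset fit all of them?

```python
import random

# Made-up traces: person code -> set of (shop, day) points.
traces = {
    "u1": {("bakery", 1), ("cafe", 2), ("books", 3)},
    "u2": {("bakery", 1), ("cafe", 2), ("shoes", 4)},
    "u3": {("bakery", 1), ("gym", 5), ("books", 3)},
}


def matching_people(traces, known_points):
    """Return the person codes whose trace contains every known point."""
    known = set(known_points)
    return {person for person, trace in traces.items() if known <= trace}


def unicity(traces, k, seed=0):
    """Fraction of people singled out by k points drawn at random from
    their own trace (people with fewer than k points are skipped)."""
    rng = random.Random(seed)
    eligible = [p for p, trace in traces.items() if len(trace) >= k]
    if not eligible:
        return 0.0
    unique = 0
    for person in eligible:
        # sorted() gives random.sample the sequence it requires.
        points = rng.sample(sorted(traces[person]), k)
        if matching_people(traces, points) == {person}:
            unique += 1
    return unique / len(eligible)


print(matching_people(traces, [("bakery", 1)]))
print(matching_people(traces, [("bakery", 1), ("cafe", 2)]))
print(unicity(traces, k=3))
```

Everybody bought bread on day 1, so that single point narrows nothing down; adding the café visit leaves two candidates; all three points single out each shopper. The same narrowing, on a far larger scale, is what lets a few points re-identify most people in a real dataset.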

Efforts are on to fix such loopholes. The anonymous search engine DuckDuckGo doesn't capture users' IP addresses or store search history. TrackMeNot, an add-on for Firefox or Chrome, throws the tracker off the track and into a Daedalian labyrinth with no Ariadne around to help. Researchers at Duke University have developed CacheCloak, a program which camouflages mobile location. Are we getting paranoid about our privacy? Not necessarily. In my mobile contact list there are very few without a face; some have uploaded their entire family! More interestingly, there are monthly if not weekly updates. Facebook too tells the same story; people are eager to talk about themselves and share their experiences: holidays, parties, workplace, constant updates... Taking advantage of the millions of authenticated photos in its treasury, Facebook is all set to roll out DeepFace, a face recognition program, which can automatically tag a face with 97.25% accuracy.

Perhaps the Digital Age has precipitated our eternal paradox: to be unique and yet ordinary.

Tailpiece:  
Stars, hide your fires;
Let not light see my black and deep desires
(Macbeth, Act 1, Scene 4)

References:
1. Dans le café de la jeunesse perdue, Patrick Modiano, Gallimard, 2007
2. Differential Privacy: The Basics
3. Riding with the Stars: Passenger Privacy in the NYC Taxicab Dataset
4. Unique in the shopping mall: On the reidentifiability of credit card metadata,
    de Montjoye et al., Science, 30 Jan. 2015, Vol. 347, Issue 6221, pp. 536-539
5. Hiding in plain sight & Camouflaging searches in a sea of fake queries,
    Jia You, ibid., pp. 500 & 502
6. Facebook will soon be able to ID you in any photo
7. DeepFace: Closing the Gap to Human-Level Performance in Face Verification