
Multiple data owners: who’s doing what with your data?

2013

The Future Internet offers increasing opportunities for participation by private individuals, natural persons in legal terms.1 Personal access devices have not been confined to office-based personal computers for some time, and continue to evolve: computer systems grew smaller and more compact with a demand for increased portability, and personal communication devices (mobile phones) grew in storage and processing capacity as well as going beyond telecommunications to the web (smart phones), with the two converging in tablet-type devices. On the one hand, this allows for extensive and pervasive connectivity all day, every day: for access to data and information systems, to communicate with friends, with colleagues and with businesses and government, and to share with the world what is going on for the individual, or the individual's reaction to events or to others: the social network. On the other, this poses increasing challenges for personal privacy as well as freedom. Personal data associated with individuals should, it can be assumed, be treated with care; but what happens when the data subjects themselves release such data via social networking sites (SNS)? In this report, relevant legislation surrounding the treatment of personal data is presented and reviewed. Interactions of individuals (data subjects) with online services are described against the legislative background, and summary conclusions and recommendations are made directed at FI users, FI providers, and service and application developers. The report is divided into the following sections: Background: the legal perspective on protecting personal data outlines the legal framework in Europe for the protection of personal data, summarising the various sections of the Data Protection Directive for how such data should be handled. The reality: should we be nervous? discusses how legislation is implemented and lists areas such as unauthorised disclosure and sharing in terms of particular cases against well-known service providers. User perceptions: trust briefly reviews user attitudes to online services and how their personal data are protected. User confidence: the public domain outlines the legal basis for treating data which have been made public (such as by sharing on public websites); and finally User profiles and data mining: derivative works looks at how personal data shared via social networking sites, along with records of online activity and behaviours, can be used to build up profiles of end users which could well provide an unwanted perspective on a given individual. The intention in this overview is to bring together legislative, subjective and service-oriented aspects of personal data usage as it stands today, with some indicators of the challenges for those building as well as using the Future Internet.

© 2013 University of Southampton IT Innovation Centre
Brian Pickering, Callum Beamish, Clare Hooper, Mike Surridge {jbp|cab|cjh|[email protected]}

1 A living person as is commonly understood; the yet unborn and dead are not afforded the same legal status.
Background: the legal perspective2 on protecting personal data

The FI offers unprecedented access to data sources, with smart devices and sensors able to collect information dynamically to be transferred on for aggregation and interrogation, as well as services and applications dedicated to interactions between users and whole communities, both locally and across whole regions and continents. In such an environment, the issue of personal data – not just personal identifiers such as name, address and so forth, but also location, online activity and behaviours – becomes increasingly relevant. How the data should be dealt with must first consider existing legislation and its implications for service developers, infrastructure providers and, of course, end users themselves.

The Data Protection Directive (DPD)3 establishes a framework which aims to protect the "fundamental rights and freedoms of natural persons, and in particular their right to privacy with respect to the processing of personal data" [our italics]. Under Article 2(a) the concept of "personal data" is defined as "any information relating to an identified or identifiable natural person ('data subject'); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity". The processing of such personal data is a major focus of the DPD and is described at Article 2(b) as "any operation or set of operations which is performed upon personal data, whether or not by automatic means, such as collection, recording, organization, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, blocking, erasure or destruction". This is particularly relevant to all FI ecosystem stakeholders, and especially to the service and application developers and providers, as well as the infrastructure owners who provide the resource to run those services and applications.

With a relational framework established between the data subject (as defined above), the data controller (the person or organisation deciding how and why data should be processed) and the data processor (the person or organisation actually carrying out any such processing), the DPD stipulates particular principles that must be adhered to when data are being processed, as found under Article 6, which require that personal data must be:

(i) "processed fairly and lawfully;
(ii) collected for specified, explicit and legitimate purposes and not further processed in a way incompatible with those purposes […];
(iii) adequate, relevant and not excessive in relation to the purposes for which they are collected and/or further processed;
(iv) accurate and, where necessary, kept up to date; every reasonable step must be taken to ensure that data which are inaccurate or incomplete, having regard to the purposes for which they were collected or for which they are further processed, are erased or rectified;
(v) kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the data were collected or for which they are further processed […]".

And it is on those terms that consent should usually be sought: the data subject needs to know what the data will be used for, and have a right to access the data to check validity and currency.

At the EU level, the data protection framework provided for in the DPD does not distinguish personal data that is in the public domain from other personal data until that data falls under Article 8 as "special categories of data". Article 8(1) establishes that no processing of personal data which reveals "racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership" shall occur, nor shall there be "processing of data concerning health or sex life". This prohibition is then subject to certain exclusions listed in Article 8(2):

"(a) the data subject has given his explicit consent to the processing of those data, except where the laws of the Member State provide that the prohibition referred to in paragraph 1 may not be lifted by the data subject's giving his consent; or
(b) processing is necessary for the purposes of carrying out the obligations and specific rights of the controller in the field of employment law in so far as it is authorized by national law providing for adequate safeguards; or
(c) processing is necessary to protect the vital interests of the data subject or of another person where the data subject is physically or legally incapable of giving his consent; or
(d) processing is carried out in the course of its legitimate activities with appropriate guarantees by a foundation, association or any other non-profit-seeking body with a political, philosophical, religious or trade-union aim and on condition that the processing relates solely to the members of the body or to persons who have regular contact with it in connection with its purposes and that the data are not disclosed to a third party without the consent of the data subjects; or
(e) the processing relates to data which are manifestly made public by the data subject or is necessary for the establishment, exercise or defence of legal claims." [our italics]

This is the only part of the Directive which distinguishes between data not in the public domain and data which is. There is no explicit EU law on what "manifestly made public by the data subject" means, but it is generally understood as requiring "a deliberate act by the data subject, disclosing the data to the public". Being captured by video surveillance, for example, would not be considered a conscious action to disclose information, but an interview given to the media, or publication on a public internet page, would make data public.4
Given the absence of any discussion of the public domain throughout the rest of the Directive, it is clear that where personal data exists in the public domain and is subsequently processed, this processing is still subject to the Directive's principles and requirements. Indeed, the UK Information Commissioner's Office (ICO) noted in its online code of practice5 that:

- "People may post their personal details in such a way that they become publicly visible – for example through a social networking or recruitment site. Wherever the personal data originates, you still have an overarching duty to handle it fairly and to comply with the rules of data protection […]
- If you collect information from the internet and use it in a way that's unfair or breaches the other data protection principles, you could still be subject to enforcement action under the DPA even though the information was obtained from a publicly available source.
- It is good practice to only use publicly available information in a way that is unlikely to cause embarrassment, distress or anxiety to the individual concerned. You should only use their information in a way they are likely to expect and to be comfortable with. If in doubt about this, and you are unable to ask permission, you should not collect their information in the first place." [our italics]

Thus, if personal data collected from the public domain is being processed, the processing must still adhere to the data protection principles6 (see (i) to (v) above). In essence, therefore, the final point of the ICO guidance is most important: the use must be "unlikely to cause embarrassment, distress or anxiety to the individual concerned" and must use "their information in a way they are likely to expect and to be comfortable with". This guidance implies that an element of consent is granted through a data subject making the data publicly available, making the processing "lawful"; the controller must then ensure that the use does not cause embarrassment, distress or anxiety, which meets the "fairness" requirement.

So nominally, FI users/participants should be adequately protected from embarrassment and the misuse of their data, not least since the natural or legal person7 responsible for controlling and processing any data collected by a device and then manipulated in any specific way must gain consent and may only use the data for an appropriate and identified purpose. Any additional processing is subject to strict control and possibly legal sanction.

2 Throughout this document, and for the purposes of simplicity, we have concentrated principally on EU law, or that of its member states.
3 Council Directive 95/46/EC of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data (1995) OJ L 281/31.
4 Kotschy in Alfred Büllesbach et al., Concise European IT Law (2nd edn, Kluwer Law International 2010) 62.
5 UK ICO, 'Personal information online code of practice' (July 2010) <http://www.ico.org.uk/~/media/documents/library/Data_Protection/Detailed_specialist_guides/personal_information_online_cop.pdf> accessed 15 July 2013.
6 DPD (n 3) Article 6(1).
7 By contrast to a natural person, a legal person is an entity that has acquired the status of a person through law. This is useful, for example, as it enables companies to enter into contracts; in a business-to-consumer contract, the business will be a legal person.

CONCLUDING REMARKS AND RECOMMENDATIONS

Users: There is sufficient legislation in place to protect personal data. Data collection should be with consent, appropriate to the intended processing, and protected from inaccuracy as well as disclosure.8
Providers: The Data Protection Framework would class application and service providers either as data controllers or as data processors. Either way, they are expected to act according to their responsibilities in respect of personal data. Specifically, they should seek consent from the data subjects, and retain and process data only within the limits of the original purpose.

FI Service and Application Developers:
1) Be clear on your rôle as data processor or controller, and make users aware of it.
2) Ensure explicit consent is requested for any data processing undertaken.
3) Treat any consent forms in much the same way as a provider might manage SLAs: there should be dynamic compliance checking against the terms agreed to under data subject consent as an integral part of the execution of the application or service you run (see the sketch below).

8 Although legislation does exist to protect user data, it is far from unusual for providers to fall foul of such legislation.
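As an illustration of recommendation 3, here is a minimal sketch, in Python, of consent treated as an SLA-like artefact: the consented purposes are recorded in machine-checkable form and verified before every processing operation. All names here (ConsentRecord, check_consent, process) are hypothetical and not drawn from any real framework.

```python
# Hypothetical sketch: a data subject's consent treated like an SLA that is
# checked dynamically before every operation on their personal data.

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ConsentRecord:
    """What the data subject agreed to, captured at collection time."""
    subject_id: str
    purposes: frozenset          # e.g. frozenset({"service-operation"})
    granted_at: datetime = field(default_factory=datetime.utcnow)
    revoked: bool = False


class ConsentError(Exception):
    """Raised when processing would exceed the consent given."""


def check_consent(record: ConsentRecord, purpose: str) -> None:
    """Refuse any processing purpose not covered by live consent."""
    if record.revoked:
        raise ConsentError(f"{record.subject_id}: consent has been revoked")
    if purpose not in record.purposes:
        raise ConsentError(
            f"{record.subject_id}: '{purpose}' is outside the consented "
            f"purposes {sorted(record.purposes)}"
        )


def process(record: ConsentRecord, purpose: str, operation):
    """Gate every operation on personal data behind the consent check."""
    check_consent(record, purpose)   # the dynamic compliance check
    return operation()


# Consent covers running the service, but not marketing analytics.
consent = ConsentRecord("subject-42", frozenset({"service-operation"}))
process(consent, "service-operation", lambda: "ok")            # permitted
# process(consent, "marketing-analytics", lambda: "profile")   # ConsentError
```

The design point is that the check runs at each operation, not once at sign-up: if consent is later revoked or the purpose drifts, processing stops.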
The reality: should we be nervous?

Within an FI ecosystem of competing requirements and drivers, the question is whether the legislation is adequate and whether indeed the authorities are willing to prosecute transgression. What is more, is it enough to expect users to agree to long-winded and complicated terms and policies, thereby relieving developers and providers of the burden of appropriate data handling?

The most recent case for Google sees end users under siege again: those who correspond with Gmail users should "have no expectation of privacy".9 The end user licence agreement (EULA) presented to Gmail account holders bounds their own rights to privacy in what they send out. Invoking the "third party doctrine" for non-Gmail users, though, really should be challenged,10 not least on the basis that Google's intrusions are so extensive that implied consent is probably not enough. However, there is a precedent for challenging such terms in the courts, even for Gmail users themselves (see below, and the probation imposed on Twitter).

Notwithstanding specific differences across jurisdictions, there is generally sufficient protection for individuals. Beyond the legal background, though, most providers will request some kind of consent via a licence. This presupposes that users look at the terms and conditions and the privacy policies of the particular service they wish to use. For common utilities,11 their terms and policies may be summarised as follows:

(a) None of them claimed ownership over content submitted to their services, but all required some form of licence to be granted.
(b) None of the tools or utilities imposes terms which are particularly different from the others; however, Google is arguably granted the most wide-reaching rights, requiring rights to be granted not only for operation, as other providers do, but also for promotion, improvement and development.
(c) Some tools and utilities do not explicitly state which rights are granted and under what type of licence, most notably Dropbox and Evernote. This is possibly due to them wanting their terms to be seen as "user friendly" and absent of legalese.12
(d) Generally speaking, data deletion is up to the users: users should manually delete all data from the service/platform before cancelling. Users must also be aware of what data they share, since this may still exist after cancellation of accounts.
(e) Whichever utility or tool is used, users are often required to check terms and policy changes regularly, since some providers will not alert them to any such changes.

Given these terms and the consent requested, and with the possible exception of (b), it would seem reasonable to expect that data are protected. This is not always the case, though. A 2012 study13 revealed some disconcerting cases involving a number of different service and application providers:

UNAUTHORISED DATA COLLECTION

Apple: iPhones were found to have secret files tracking location without the owner's permission or indeed knowledge.14 Apple denied any wrong-doing, insisting that this was for legitimate purposes, to help identify subsets of locations within larger databases to benefit the individual user. They promised to encrypt the data and reduce the time it spent on the device in future.

Carrier IQ:15 In 2011, Carrier IQ was found to be collecting usage and related data without subscribers' knowledge or any ability to opt out.

Google: During the collection of Street View images, the mobile camera cars "inadvertently" picked up data from unsecured wireless networks for some four years. Google claimed that this was an isolated occurrence involving a single engineer acting without company authorisation. This was contested, and the Federal Communications Commission fined Google some $25,000, complaining that Google had obstructed its investigation.

Intel: In 1999, Intel had to disable a feature on the Pentium III chip after public outcry over what was suspected to be a "super cookie" that could effectively track the user's surfing activities indefinitely.

Path: The photo sharing app Path16 was found to be uploading address books from subscribers' devices without permission.

UNAUTHORISED SHARING

AOL: In 1998, a customer service agent released personal information to the Navy about a subscriber's sexual orientation, which led to his discharge from the forces. In 2006, the company published the search history of more than 650,000 users; even though "anonymised", the specific search history of individuals could still be tracked.

facebook: In 2011, the Federal Trade Commission (FTC) claimed that facebook was not even complying with its own rules on data sharing and access; facebook promised to take appropriate steps in future to make things more transparent for subscribers.17

Twitter: Also in 2011, Twitter was put on probation for 20 years by the FTC and forbidden from: "misleading consumers about the extent to which it protects the security, privacy, and confidentiality of nonpublic consumer information, including the measures it takes to prevent unauthorized access to nonpublic information and honor the privacy choices made by consumers." (op cit)

UNEXPECTED VULNERABILITY

Microsoft: In 1999, Microsoft had to black out Hotmail for some 12 hours after discovering that subscriber accounts could be accessed by anyone with a web browser.
Sony: In 2005, copy-protection software (a "rootkit") installed itself when a CD was played, rendering the machine vulnerable to malware.

UNEXPECTED SHARING

Yahoo!: Yahoo! co-operated with the Chinese authorities and released the IDs of political dissidents, who were subsequently imprisoned.

All of these cases, with the possible exception of Yahoo!, should be covered by the appropriate legislation: the data subjects should know what data are being collected, how they are being used, and who has access to them. Clearly, the terms and policies of providers cannot necessarily be taken as the final arbiter in such cases: both Twitter and facebook have been prosecuted for misleading terms or even for failing to comply with their own terms.

9 http://techland.time.com/2013/08/14/google-says-gmail-users-have-no-legitimate-expectation-of-privacy/
10 http://www.theverge.com/2013/8/14/4621474/yes-gmail-users-have-an-expectation-of-privacy
11 We looked at Amazon Web Services, Doodle, Dropbox, Evernote, Google, GoToMeeting, Podio, Prezi, Scribd, SkyDrive, Skype, SlideShare.
12 This arguably leaves some ambiguity that may be detrimental to the user.
13 http://news.cnet.com/2300-1023_3-10012162.html
14 http://news.cnet.com/8301-13579_3-20055885-37.html
15 http://www.carrieriq.com/
16 http://www.path.com
17 http://ftc.gov/opa/2011/11/privacysettlement.shtm

CONCLUDING REMARKS AND RECOMMENDATIONS

Users: Users need to review and maintain currency with the terms and conditions and privacy policies of the providers whose services they depend on.

Providers: Despite their terms and conditions, providers have been successfully prosecuted for a number of different failings, including unauthorised collection, use and disclosure of personal data, and even for deliberately overcomplicating the terms of service they impose.

FI Service and Application Developers:
1) Services should use specific terms and conditions appropriate to their purpose; terms should be simple to understand; and users should be alerted to changes and to what those changes mean for them (see the sketch below).
2) Services should not involve hidden data collection.
3) Services should not disclose or share data without explicit consent.
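One hedged way to satisfy the first recommendation – alerting users to changes rather than relying on "check back regularly" clauses – is sketched below: the published terms are fetched and fingerprinted, and a changed fingerprint triggers a fresh alert and consent prompt. The URL handling and notification here are placeholders, not any real provider's mechanism.

```python
# Illustrative only: detect changes to published terms by fingerprinting the
# document and comparing against the version each user last accepted.

import hashlib
import urllib.request


def fetch_terms(url: str) -> bytes:
    """Download the current terms document (URL is a placeholder)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def terms_fingerprint(document: bytes) -> str:
    """A stable fingerprint of the exact wording the user accepted."""
    return hashlib.sha256(document).hexdigest()


def terms_changed(url: str, accepted_fingerprint: str) -> bool:
    """True if the live terms differ from what the user last agreed to."""
    return terms_fingerprint(fetch_terms(url)) != accepted_fingerprint


# A provider would store the fingerprint at the moment of acceptance and,
# whenever terms_changed(...) is True, alert the user and explain the
# difference before relying on the new terms.
```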
User perceptions: trust

For the FI to deliver on the promise of economic growth and expansion, and for online activity to become the norm and available to all, there is a significant and fundamental question: will the end users, the consumers of the services driving the Digital Agenda's virtuous cycle, be prepared to embrace the new technologies and engage wholeheartedly in that context? For the FI to succeed, users must trust that they will be dealt with appropriately.

So, there is legal protection under the Data Protection Directive, and clear legal precedent that data misuse, as well as obviously misleading terms, will be challenged. That being said, the question is whether or not individuals actually feel they can trust the services they depend on. The 2012 results18 of the annual survey of some ten thousand adults into how they rate organisations' "[commitment] to protecting the privacy of their personal information" reveal some interesting trends. In summary, and ignoring which organisations were mentioned and how they ranked against one another, the following results may be cited:

Attitudes to privacy:
- 78% of respondents continue to see the protection of their personal information as instrumental in building and maintaining trust.19
- Important measures related to trust: 73% security over personal information; 59% no data sharing without consent; 59% the ability to be forgotten [sic]; 55% the right to revoke consent.
- 49% reported receiving one or more data breach notifications within the previous two years, of whom 70% said it reduced their level of trust.

Most trusted sectors: From 25 categories, healthcare, consumer products and banking are the most trusted in terms of privacy; Internet and social media, charities and toys are the least trusted.

Effects of technology:
- 59% believe privacy rights are undermined by disruptive technologies (social media, smart mobile devices, geo-tracking tools).
- 55% claim that privacy has been diminished because of government intrusions.

User control:
- 35% believe they maintain control of their own information; this figure has been falling for some seven years.
- 61% (the highest) believe identity is the main concern related to privacy, while 56% cite an increase in government surveillance.

View of policies: 32% do not rely on policies when making trust judgements, of whom 60% claim the policies are too long or contain too much legalese.

Irrespective of individual conclusions, perhaps the most interesting results in the present context suggest that social media and the Internet in particular lack trust. This clearly does not stop subscribers using them,20 even though long-term engagement via SNS can be significantly less satisfying than direct social interaction;21 nor does it stop institutions and agencies forcing contact online.22 Looking further, though, it is interesting to note:

1) Users are concerned about a loss of control, and yet the DPD, with its explicit requirement for consent associated with specific data handling, should provide such control; and
2) Users bemoan government intrusion.23 However, this is treated as an exception in the DPD.24,25

Dutton and his colleagues have noted that Internet users with increasing experience and familiarity develop appropriate trust levels online, and decide for themselves what they can and should not do: that is, irrespective of the regulatory framework, experienced users will make their own decisions and judgements.26
It is nonetheless true that users are, and should be, nervous about information being collected about them, especially now that the various technologies and services they use readily allow data to be collected and analysed.27 The SESERV project (http://www.seserv.org) concluded with a specific recommendation on this point: service and application providers, as well as government and other agencies, should not "let the ease of collecting user data be done to such an extent as to let the user feel under surveillance or threat (or more simply put off)".28 The problem is exacerbated, though, by the fact that the data collected are not necessarily just the personal information covered by the DPD: they may also include online activity (searches and so forth), which can easily be cross-correlated with the personal data available through SNS.29

Finally in this section, consider the provision of the DPD on retaining data. Article 6, point (iv) above, talks about data being kept "up to date" and goes on to stipulate that inaccurate or incomplete data should be erased or rectified. From a user perspective, it would be tempting to assume that this means that, as the data subjects, they have the right to remove those data when they no longer want them.30 However, as an Austrian student discovered, this is not necessarily the case, with facebook retaining all his personal data, even data he thought he had deleted. This led to a request under the Freedom of Information act which revealed the extent of the problem.31

Users do remain uncomfortable about privacy and the protection of their personal data, therefore. In particular, the capabilities of disruptive technologies (smart phones, SNS, etc.) as well as government surveillance rate high on the list of concerns. Despite legislation which outlines appropriate use and attempts to limit processing, as well as clear indications of a will to prosecute where necessary, users still complain of a loss of control and concerns over intrusion.

18 http://www.ponemon.org/local/upload/file/2012%20MTC%20Report%20FINAL.pdf
19 Nevertheless, 63% admit sharing sensitive personal information with organisations they didn't know or trust, of whom 60% justified it on the basis of convenience (i.e. making a purchase).
20 The current top three include facebook (1,000M subscribers), Twitter (500M) and Google+ (500M). See http://news.cnet.com/8301-1023_3-57525797-93/facebook-hits-1-billion-active-user-milestone/; and also http://news.discovery.com/tech/apps/top-ten-social-networking-sites.htm on site popularity, and http://social-networking-websites-review.toptenreviews.com/ on site rankings, including, interestingly, "security".
21 http://uk.news.yahoo.com/facebook-social-network-linked-unhappiness-215939039.html#m6L7M0L
22 Such as the US immigration authority (http://travel.state.gov/visa/forms/forms_1342.html); the Digital Agenda for Europe is also seeking to encourage online participation (http://ec.europa.eu/digital-agenda/en/scoreboard, and especially Pillars IV, VI and VII).
23 See recent discussion about the NSA (e.g. http://www.theguardian.com/world/2013/jul/25/justice-department-case-nsa-collection) and the case of Prism (http://www.bbc.co.uk/news/technology-23051248).
24 Article 13, for instance, provides for member states to circumvent requirements on data privacy on the basis of national security i.a.
25 The disclosure by Yahoo! cited earlier was generally frowned upon in the US courts, yet highlights some level of ambiguity in government positions on "surveillance" and overriding basic privacy rights. This was also seen in government attitudes to the Arab Spring versus the London riots triggered by the police shooting of Mark Duggan.
26 Dutton, W.H. and Shepherd, A. (2003) "Trust in the Internet: The Social Dynamics of an Experience Technology", The Oxford Internet Institute, available from: http://www.oii.ox.ac.uk/resources/publications/RR3.pdf
27 See the summary results from the Ponemon survey above.
28 Recommendation 9, http://www.scribd.com/doc/105908010/D3-1-2-v2-pdf
29 Krishnamurthy, B. and Wills, C. (2010) "On the Leakage of Personally Identifiable Information Via Online Social Networks", ACM SIGCOMM Computer Communication Review, 40(1), pp. 112–117.
30 See also the "ability to be forgotten" in the Ponemon trust survey.
31 http://europe-v-facebook.org/EN/en.html

CONCLUDING REMARKS AND RECOMMENDATIONS

Users: Users need to monitor their own use of sites and services if they feel under threat. Provision is made for the protection of their private information, but they should be vigilant themselves.

Providers: Care needs to be taken to avoid a perception of snooping. Increased transparency about data handling would help build and maintain trust among users. Government agencies should be particularly mindful of user concerns.

FI Service and Application Developers:
1) Make it easy for users to view their personal data, modify them and/or remove them;
2) Make sure that any such activity is applied across all data you hold;
3) Allow users to view before and after "states" (i.e. to alleviate fears of surveillance); see the sketch below.
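The three recommendations above might be realised along the following lines. This is only a sketch under invented store names: a single administrative surface where a data subject can view, rectify or erase their data across every store the provider holds, with before and after states returned so the effect is visible.

```python
# Sketch with invented store names: view, rectify and erase applied across
# every store the provider holds, returning before/after states.

import copy

# Stand-ins for a provider's separate data stores.
STORES = {
    "profiles": {"subject-42": {"name": "A. Person", "city": "Southampton"}},
    "activity": {"subject-42": {"last_login": "2013-07-15"}},
}


def view(subject_id: str) -> dict:
    """Everything held about the subject, across all stores."""
    return {name: store.get(subject_id, {}) for name, store in STORES.items()}


def rectify(subject_id: str, store_name: str, updates: dict) -> dict:
    """Apply a correction and show the user the before and after states."""
    before = copy.deepcopy(view(subject_id))
    STORES[store_name].setdefault(subject_id, {}).update(updates)
    return {"before": before, "after": view(subject_id)}


def erase(subject_id: str) -> dict:
    """Remove the subject from every store, not just the obvious one."""
    before = copy.deepcopy(view(subject_id))
    for store in STORES.values():
        store.pop(subject_id, None)
    return {"before": before, "after": view(subject_id)}


print(rectify("subject-42", "profiles", {"city": "Oxford"}))
print(erase("subject-42"))
```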
User confidence: the public domain

One of the major new technologies to emerge in the late 1990s, and increasingly through the first decade of this millennium, is social media. End users are now able to interact with others – family, friends, associates, people with similar experiences and interests – via the Internet. Lives are increasingly lived out, in part at least, through social networks, with tweets, for example, keeping anyone and everyone abreast of what is happening to, for and with anyone else. Constant sharing of personal experience in this way begs the question of whether such information and data should not be treated as public property, the processing of which does not require regulation. When is my personal data no longer my property?

The SESERV recommendation referred to previously continues: "[…] citizens have a responsibility to [be] careful about what information they share online, and regulators need to educate the public about and enforce data protection laws",32 putting a logically founded onus on users to take on their fair share. If there are terms and policies, however long-winded and linguistically unfamiliar, they should be read; if users have concerns, they should raise them, and ask for disclosure of what is personal to them.33

There is, however, another angle to this: in disclosing information about myself on an SNS, do I have a right to privacy, or is that information now public? The concept of the "public domain" has been called a "multifaceted and multidimensional concept with no definite definition";34 indeed, its remit changes depending on its subject. The concept itself may refer to:

1) the public domain in relation to data and information, and how this interacts with data protection; and
2) the broad area of intellectual property, which itself has different conceptions of the public domain depending on the type of IPR.

In the present context we will consider (1) only, since we are more concerned with personal data and its disclosure than with issues around the protection of creativity associated with content generated by users.

32 Loc cit.
33 This may not always be straightforward: going through all terms and policies online may be too time-consuming, or, when wishing to use a service for the first time, motivation may be low.
34 Tshimanga Kongolo, 'Intellectual property and misappropriation of the public domain' (2011) 33(12) EIPR 780.

European Union Level

At the EU level, as set out in the Background section above, the DPD does not distinguish personal data in the public domain from other personal data until that data falls under Article 8 as "special categories of data".
Recall from that section that Article 8(1) prohibits the processing of such special categories of personal data; that Article 8(2)(e) lifts the prohibition where the data "are manifestly made public by the data subject" – generally understood as requiring a deliberate act by the data subject disclosing the data to the public; and that the UK ICO's online code of practice makes clear that, wherever personal data originates, there remains an overarching duty to handle it fairly and to comply with the rules of data protection.
Thus, as before, personal data collected from the public domain must still be processed in accordance with the data protection principles (see (i) to (v) above), and the final point of the ICO guidance remains the most important of the three: the use must be "unlikely to cause embarrassment, distress or anxiety to the individual concerned" and must use "their information in a way they are likely to expect and to be comfortable with". That said, all of the guidance should be taken into account when considering the processing of publicly available data.

Individual jurisdictions vary in their interpretation of what constitutes the "public domain", if they recognise it at all. From a UK perspective, however, case law has shown that the determination of "public domain" turns on whether the particular information was "'realistically' accessible to members of the public or only 'in theory'".38 Further, information would not be in the public domain if finding it required unrealistic "specialised knowledge and persistence"; the average member of the public must be able to find the information fairly easily.39

The legislation in this area is not clear cut. Nevertheless, and in summary, where individuals take steps to publish information in publicly accessible areas such that it requires no significant effort to view, they may be assumed to have released it into the public domain. That being said, there is still an obligation on others to process such personal information with care, specifically to avoid embarrassment, and not in a way that would be unexpected by the individual concerned. Anecdotal cases of embarrassment caused by making public responses best kept within the close circle of family and friends appear with continued regularity, including:

- Political gesturing40
- Social insensitivity41
- The long-term effects of past indiscretion42
- Unexpected consequences of heroic intervention43 or of public insensitivity44

38 Mosley v News Group Newspapers Ltd [2008] EWHC 687 (QB).
39 Attorney General v Greater Manchester Newspapers [2001] EWHC QB 451.
But in all such cases, it could be argued that those posting such content should have had an expectation that they were making it public: these were "deliberate act[s] by the data subject, disclosing the data to the public".

There are positives and negatives here. Richard Branson, for instance, stresses the importance of going public in this way: "Embracing social media isn't just a bit of fun, it is a vital way to communicate, keep your ear to the ground and improve your business",45 though this does have knock-on effects. Take employment, for instance:

- prospective candidates are encouraged to avoid specific types of activity (inappropriate comments or photos, dishonesty, and so forth);46
- employers do use sites to screen candidates;47
- facebook, it appears, is particularly important;48

and so on. Indeed, the non-use of social media may even be taken as suspicious.49 So online activity is now part of individual life and will presumably continue to be so.50 This does not, however, mean that all personal details posted on such sites can be used arbitrarily, even though they may be regarded as in the public domain. Any derivative processing may only be done within the constraints of the DPD, and should certainly not cause embarrassment or go beyond what might be expected by the data subjects themselves.

45 http://www.linkedin.com/today/post/article/20121019130632-204068115-why-aren-t-more-business-leaders-online
46 http://blog.reppler.com/2012/06/
47 http://blog.reppler.com/2012/07/
48 http://blog.reppler.com/2012/03/13/can-your-facebook-profile-predict-job-performance/
49 http://blog.reppler.com/2012/08/
50 See also: http://www.seserv.org/Studying-the-Future-Internet/doesyourbosstweet

CONCLUDING REMARKS AND RECOMMENDATIONS

Users: There is no reason not to use social media to share content, and indeed it may be a necessary or expected part of everyday life. Notwithstanding any specific settings or other facilities offered by the site in question, that content becomes public domain. Users must take responsibility for what they share in this way; authorities (and providers) should really help users understand the implications of posting content.

Providers: Users sharing content and personal information, for instance via an SNS, should still be respected under the DPD. Their content and personal information should not be shared, disclosed, or otherwise processed if it might cause embarrassment, or in a way that the data subject could not reasonably have expected.

FI Service and Application Developers:
1) Alert users to the scope of sharing when they post content and personal information (i.e. extend the Are you sure? type warnings to include You are now releasing this information to these people).
2) Restrict access to personal data (see the sketch below):
   a. third parties can only view, but not copy;
   b. APIs should disallow extraction without an alert to the data subjects;
   c. in-house analytics should not be used.51

51 Further processing in this way may well go beyond reasonable and expected additional processing, and as such may well go against the original terms of the consent provided. More important and potentially disturbing, though, is that further analyses may reveal hidden or unknown characteristics of users which they would certainly never have agreed to release.
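Recommendation 2(b) – no extraction without an alert to the data subject – could look something like the sketch below. The audit log and notification transport are placeholders; nothing here reflects an actual SNS API.

```python
# Placeholder sketch: personal data may be viewed, but any copy handed to a
# third party is audited and the data subject is alerted.

from datetime import datetime

AUDIT_LOG = []


def notify_subject(subject_id: str, message: str) -> None:
    """Stand-in for a real channel (email, in-app notification)."""
    print(f"[to {subject_id}] {message}")


def extract_profile(subject_id: str, requester: str, profile: dict) -> dict:
    """Release a copy only together with an audit entry and an alert."""
    AUDIT_LOG.append({
        "subject": subject_id,
        "requester": requester,
        "when": datetime.utcnow().isoformat(),
    })
    notify_subject(subject_id, f"{requester} has extracted your profile data")
    return dict(profile)   # the copy that leaves the platform


extract_profile("subject-42", "third-party-app", {"name": "A. Person"})
```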
User profiles and data mining: derivative works

With the technologies of the FI making ever more data available, and end users increasingly willing or required to share personal data online, the next major challenge for the FI is the regulation of how such data are processed and interrogated to reveal even more about the lives and preferences of those online than they themselves might have wanted to share. What will the status be if we let the power of data analytics loose on data otherwise shared for different purposes?

So far in this report, we have considered issues of privacy around personal data, and user perceptions of how much protection they are given even when they themselves choose to make the data public. Finally, we should consider the additional and perhaps more recent issue of how those personal data are manipulated to generate a derivative set of "metadata". As highlighted previously, it is not difficult to match online activity with personal details29 and thereby potentially gain access to much more information than the data subject might have intended or expected. Irrespective of individual consent, there are two particular problems here:

1) In consenting to the limited use of personal data in different contexts, albeit in some cases embedded within the terms and policies of individual platforms, the user may not have intended the data to appear or be used together;
2) More worryingly, data mining may be used to infer things which are not in fact true, or which would certainly not have been released to anyone by the individual data subject.

In the first case, the data subject retains some level of control: they are still able to give or refuse consent for certain types of information or personal data. Even so, there is little protection against the possibility of jigsaw attacks,52 where a so-called motivated intruder might use legitimately available data sources and match one against the other. There is nothing that can be done against this, unless previously collected personal data have been successfully anonymised prior to release, so as to reduce the possibility of other datasets being linked to them.

The second case is perhaps more disturbing. Take the case of a retailer who, simply through the analysis of sales, established who might be pregnant and targeted marketing material at those customers. When one of them turned out to be a teenage girl who had not yet decided to tell her family of her pregnancy, the whole family suffered.53 There is nothing really in the legislation to protect against this, assuming she had released her name and address during her purchasing and had agreed to the information being used for marketing purposes, a fairly common request especially for online services and retailers.
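To see why jigsaw attacks are so hard to legislate against, consider the following illustration with invented data: neither dataset names anyone on its own, yet a simple join on shared quasi-identifiers (here, postcode and birth year) re-identifies an individual and attaches a sensitive inference to her.

```python
# Invented data: a "jigsaw attack" joining two individually anonymous
# datasets on shared quasi-identifiers.

# "Anonymised" purchase records released by a retailer: no names.
purchases = [
    {"postcode": "SO17", "birth_year": 1990, "item": "pregnancy test"},
    {"postcode": "SO16", "birth_year": 1985, "item": "garden tools"},
]

# Public SNS profiles a motivated intruder can legitimately read.
profiles = [
    {"name": "Alice Example", "postcode": "SO17", "birth_year": 1990},
    {"name": "Bob Example", "postcode": "SO15", "birth_year": 1972},
]

# The attack is just a join on the quasi-identifiers both datasets share.
for purchase in purchases:
    for profile in profiles:
        if (purchase["postcode"], purchase["birth_year"]) == (
            profile["postcode"], profile["birth_year"]
        ):
            print(f"{profile['name']} likely bought: {purchase['item']}")
# -> Alice Example likely bought: pregnancy test
```

Each release was individually legitimate; it is the combination that discloses something the data subject never consented to, which is why anonymisation has to anticipate what other datasets might be linked.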
The basic business model of SNS relies on precisely this kind of analysis: using the personal data that subscribers provide, along with monitoring of their activities – their Likes and Dislikes, who they interact with, and the topics that motivate their participation – the site can provide targeted marketing and charge premium rates to do so. It is one thing to be irritated by such marketing, but quite another to discover that prospective employers, or even government agencies such as the police and intelligence services, can do the same sort of analysis if they so desire. Innocent information from photos, for instance – including not only the subscriber or other members of the social networking site but also other people – could be used to identify the movements of individuals who had never provided consent via the site's end-user licence agreement.

As such, data mining in this way could have a very serious and detrimental effect. On the one hand, it could reveal information that the individual would never have disclosed of their own volition. More disturbingly, if the analysis is incorrect, it will reveal information about an individual which isn't even correct.54,55

In some sense, the creation of additional metadata about an individual is legitimate, if ethically questionable: the analysis could be covered under a usage agreement. In addition, the personal data used could be said to be in the public domain, as previously discussed. However, and as outlined in the DPD, due care has to be exercised where embarrassment might result, and where the user would not have expected such additional analyses to take place. Significantly, though, it could be argued that information about pregnancy, sexual orientation or health which has emerged and been disclosed in this way comes under the category of sensitive data, and therefore may not be released without the data subject's explicit consent.

52 See http://www.ico.org.uk/~/media/documents/library/Corporate/Research_and_reports/anonymisation_cop_draft_consultation.ashx and https://www.ico.org.uk/~/media/documents/library/Data_Protection/Practical_application/anonymisation_code.ashx
53 Kashmir Hill, 2012. "How Target figured out a teen girl was pregnant before her father did", Forbes (16 February), at http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/, cited in Oboler, Welsh and Cruz, "The danger of big data: Social media as computational social science", http://firstmonday.org/ojs/index.php/fm/article/view/3993/3269
54 Current accuracy rates may be no better than 70%: http://www.theguardian.com/news/datablog/2013/jun/10/social-media-analytics-sentiment-analysis
55 http://cacm.acm.org/magazines/2013/5/163753-discrimination-in-online-ad-delivery/fulltext provides a disturbing example of the use of derivative analyses.

CONCLUDING REMARKS AND RECOMMENDATIONS

Users: The encouragement to be careful about what individuals reveal about themselves is all the more acute when we consider that:
i. data from different sources may be able to be cross-matched, revealing more about the users than they originally and separately intended;
ii. metadata can be derived from personal data, along with online activity, that may reveal information about the user that may cause embarrassment or worse, whether or not it is correct. There is also no clear legal protection against this.
Providers: Irrespective of consent and usage licences, providers have an ethical obligation to be sensitive about the data they derive from an original source. Care should be taken to protect the individual even in the face of market pressures.

FI Service and Application Developers:
1) Despite the lack of clarity on the legal status of derived metadata, care should be taken if those data could be construed as sensitive data.
2) Disclosure of personal data, even with consent, should be decided on a case-by-case basis and in consideration of the possibility of jigsaw attacks.
3) Users should be allowed to review and modify any derived metadata as part of the administrative components of any ongoing service or application (see the closing sketch below).
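Finally, as one possible reading of recommendation 3, derived metadata could be stored with its provenance and exposed to the data subject for review and correction. The schema and helper names below are illustrative only, not a real system.

```python
# Illustrative schema only: every inference is stored with its provenance
# and remains open to review and correction by the data subject.

DERIVED = {
    "subject-42": [
        {"inference": "expecting-parent", "source": "purchase history",
         "confidence": 0.7, "subject_confirmed": None},
    ],
}


def review(subject_id: str) -> list:
    """Show the subject every inference drawn about them, with provenance."""
    return DERIVED.get(subject_id, [])


def correct(subject_id: str, inference: str, confirmed: bool) -> None:
    """Let the subject confirm or repudiate an inference; repudiated
    inferences should no longer be used or disclosed."""
    for entry in DERIVED.get(subject_id, []):
        if entry["inference"] == inference:
            entry["subject_confirmed"] = confirmed


correct("subject-42", "expecting-parent", False)
print(review("subject-42"))
```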