Welcome to the new year! After a somewhat longer break than usual, I’m back to editing and writing this blog. Depending on how much time I have available, this post might be the beginning of a longer series of articles on data mining. On the other hand, I’m afraid that my efforts to write a series on the subject will be hindered by the need to comment on current affairs. Anyway, enjoy.
While most attention today seems to be concentrated on government surveillance programs, it is important not to forget the less secret and probably much more prevalent problem of data mining methods used openly by both private enterprises and government institutions. Modern data collection and analysis technology enables mining on a global scale, making the world of online user data borderless. Leaving aside the copyright problems that arise because crawlers have to access and collect data from, for example, web pages, there is a myriad of issues regarding privacy and ‘ownership’ of personal information. As reported by the New York Times and the Wall Street Journal, specific profiles based on data such as estimated salary, purchase history, ads clicked and browsing history are sold almost every second to the highest bidder. Furthermore, it turns out that anonymisation of data is much more complex and less airtight than it seems. In many cases data miners collect data that are not, strictly speaking, personal information but that by their nature, like an e-mail address, are just as good for identification as actual personal data. Research mentioned in the WSJ article suggests that 56% of websites leak that kind of information. What is more, ‘like’ buttons from Facebook or Twitter track website activity even if the user does not click on them.
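To illustrate why anonymisation is less airtight than it seems, here is a minimal Python sketch. It assumes, purely for illustration, that two data holders ‘anonymise’ e-mail addresses by replacing them with a SHA-256 hash; the function names, the sample records and the hashing scheme are hypothetical and are not taken from the reports mentioned above. The hash hides the address itself, but it remains a stable identifier, so profiles can still be joined and, given a list of candidate addresses, tied back to a person.

```python
import hashlib


def pseudonymise(email: str) -> str:
    """Hypothetical 'anonymisation' step: replace an e-mail address with its SHA-256 digest."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def link_profiles(a: dict, b: dict) -> dict:
    """Merge the records of two datasets wherever they share the same digest."""
    return {key: {**a[key], **b[key]} for key in a.keys() & b.keys()}


def reidentify(digest: str, candidates: list):
    """Return the candidate address whose digest matches, or None if there is no match."""
    for candidate in candidates:
        if pseudonymise(candidate) == digest:
            return candidate
    return None


# Made-up records held by two different parties, both keyed by the hashed address.
shop_records = {pseudonymise("Jane.Doe@example.com"): {"purchases": ["running shoes"]}}
ad_records = {pseudonymise("jane.doe@example.com "): {"ads_clicked": ["marathon training"]}}

# The two datasets join on the digest even though neither contains the address.
linked = link_profiles(shop_records, ad_records)

# Anyone holding a list of known addresses can map the digest back to a person.
digest = next(iter(linked))
owner = reidentify(digest, ["john@example.com", "jane.doe@example.com"])
```

In this sketch the ‘anonymised’ key works exactly like the address it replaced, which is why the broad definition of personal data discussed below matters.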
In terms of EU law, there are three main pieces of legislation that are instrumental in ensuring data protection safeguards. First of all, the EU Charter of Fundamental Rights sets out in Articles 7 and 8 the right to private life and the right to protection of personal data. More specialized and more relevant, however, is Directive 95/46/EC, also known as the Data Protection Directive. It aims to set a complete framework for the collection and processing of data. Worth emphasizing is the definition of ‘personal data’ in Article 2, which states that the term covers any information that can directly or indirectly lead to the identification of a natural person. Such a definition covers a wide spectrum of data, but given the concerns mentioned earlier it seems more than necessary. Even more important is the construction of particular articles. Article 7 states that ‘Member States shall provide that personal data may be processed only if:’, which, combined with other provisions, means that data processing is not allowed unless a specific justification and specific requirements are met. First of all, the principle of transparency has to be ensured. Article 7 enumerates the situations in which data can be legally obtained, and the core statement is basically “the data subject has unambiguously given his consent; or”. The alternatives that follow the ‘or’ are concerned mainly with contractual and legal obligations. Still, there is also a provision for carrying out processing in the public interest. Rules of this kind are always problematic because of the extremely broad spectrum they cover. After all, there is hardly any aim of public authorities that cannot be brought under ‘public interest’. No less important is Article 12, the right of access. It enables the data subject to access the collected information at any time and, furthermore, to enforce the erasure, blocking or correction of any data that are incomplete or inaccurate. What is more, the subject should also have access to information on how the data are collected and by what means they are processed.
Proportionality is also emphasized in the directive. Article 6 deals with the quality of data, which have to be, among other things, collected only in the necessary scope, accurate, adequate and relevant to the purpose of the collection, and kept up to date. In the case of especially sensitive information, such as political beliefs, ethnic origin, sex life, trade-union membership and so on, collection is generally forbidden under Article 8. Of course there are various exceptions to this rule. Apart from explicit consent given by the subject, the general trend is similar to the ‘normal’ requirements for data mining: mainly they deal with protecting the vital interests of the subject and with contractual obligations. Another big exception is the collection of data by medical care providers and for the purposes of preventive medicine. There is no reason to comment on all of the articles. The general framework set by the directive seems to be well designed and quite comprehensive. The devil, as always, is in the details: wouldn’t the collection of internet traffic data amount to collecting information on sex life or political affiliation? Knowledge of the web pages visited would certainly provide quite a lot of information about the subjects. These concerns are especially relevant today, when the scale of government surveillance is becoming more and more visible. Another important aspect is consent, which we give away pretty much all the time. Practically every social media provider requires users, in its terms of service, to agree to some form of data collection. But what kind of choice do we have if every available service imposes provisions of this kind?
The final piece of legislation is Regulation (EC) No 45/2001, which is concerned with the collection of data by [European] Community institutions and bodies and with the free movement of such data. Many of its solutions parallel those of the directive, while others are tailored to the specifics of the Community institutions and the exchange of data. In 2001 the position of European Data Protection Supervisor (EDPS) was also created in order to oversee the European institutions and bodies in terms of data collection and processing. Especially interesting is that, in accordance with Article 27(1) of Regulation (EC) No 45/2001, all “processing operations likely to present specific risks to the rights and freedoms of data subjects by virtue of their nature, their scope or their purposes” are subject to prior checking by the EDPS. This, however, is a topic for a post of its own.