Entries by Nicolò Shargool

Scraping and Generative Artificial Intelligence: the Data Protection Authority’s Notice

Automated online data collection, commonly known as web scraping, has become a widespread practice in many sectors for data analysis and the development of applications based on generative artificial intelligence (GenAI). However, this practice raises important legal issues, especially in relation to the protection of personal data. Recently, the Italian data protection authority (Garante per la protezione dei dati personali) issued a notice setting out measures to mitigate the risks associated with web scraping. This article examines the notice in detail, exploring the legal implications and best practices for compliance.

What is Web Scraping?

Web scraping is the process of automatically extracting data from websites using specific software, known as scrapers. These programs can automatically browse web pages, collect structured and unstructured data, and save it for further analysis. Web scraping can be performed through various methods, including:

  • HTML parsing: Parsing the HTML code of web pages to extract specific information.
  • APIs: Use of programming interfaces to access data offered by websites.
  • Bots: Automated programmes that simulate human navigation to collect data.

Risks Associated with Web Scraping

Although web scraping may have legitimate applications, such as collecting information for market analysis, it is often associated with less legitimate uses, such as the theft of personal data for commercial or even fraudulent purposes. The indiscriminate use of web scraping may entail various legal and security risks, such as:

  • Breach of privacy: The collection of personal data without consent may violate privacy regulations, such as the GDPR.
  • Abuse of Terms of Service: Many websites prohibit web scraping in their terms of service, and violating these terms may lead to legal action.
  • Data security: Bulk data collection may expose information to security risks, such as unauthorised access or malicious use of data.

The Authority’s Notice

The Garante per la protezione dei dati personali (Italian Data Protection Authority) has recently published a document providing guidance on how to manage the risks associated with web scraping. The notice focuses on several aspects that revolve around the protection of personal data and compliance with existing regulations. Below are the main recommendations:

  • Creation of Restricted Areas: one of the measures suggested is the creation of restricted areas on websites, accessible only after registration. This practice reduces the availability of personal data to the general public and can act as a barrier against indiscriminate access by bots. This will also make it possible to monitor who accesses the data and to what extent, improving traceability and accountability. On the other hand, it is crucial that the collection of data for registration is proportionate and respects the principle of data minimisation.
  • Clauses in the Terms of Service: the inclusion of specific clauses in the Terms of Service explicitly prohibiting the use of web scraping techniques is another effective tool. These clauses can act as a deterrent and provide a legal basis for taking action against those who violate these conditions.
  • Network Traffic Monitoring: implementing monitoring systems to detect anomalous data flows can help prevent suspicious activities. Adopting measures such as rate limiting makes it possible to limit the number of requests coming from specific IP addresses, helping to reduce the risk of excessive or malicious web scraping.
  • Technical interventions on bots: The document also suggests the use of techniques to limit access to bots, such as implementing CAPTCHAs or periodically modifying the HTML markup of web pages. These interventions, although not decisive, may make scraping more difficult.
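
To make the rate limiting measure mentioned above more concrete, the following is a minimal sketch of a sliding-window limiter that caps the number of requests accepted from each IP address. It is an illustration only and is not taken from the Authority's document: the class name, the thresholds and the injectable `clock` parameter are assumptions made for the example, and a production site would normally rely on its web server, CDN or application firewall for this.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Hypothetical helper: allow at most `max_requests` per `window_seconds` for each IP."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be checked without waiting
        self._hits = defaultdict(deque)  # IP address -> timestamps of accepted requests

    def allow(self, ip):
        """Return True if the request from `ip` is accepted, False if it should be rejected."""
        now = self.clock()
        hits = self._hits[ip]
        # Discard timestamps that have fallen outside the current window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the caller could answer with HTTP 429
        hits.append(now)
        return True
```

With `SlidingWindowRateLimiter(max_requests=3, window_seconds=60)`, a fourth request from the same address within one minute is refused, while requests from other addresses are unaffected. As the notice itself observes of technical measures in general, this makes bulk scraping harder rather than impossible, since a scraper can spread its requests over many addresses.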

Conclusions

The Data Protection Authority’s statement represents a significant step forward in regulating the use of web scraping and the protection of personal data. For operators of websites and online platforms, it is crucial to take the recommended measures to ensure regulatory compliance and protect users’ personal data.

Compliance with data protection regulations is not only a legal obligation, but also a key element in building and maintaining user trust. Companies must be proactive in adopting data protection best practices and monitoring regulatory developments.

Contact us

If you have questions or need legal assistance with regard to web scraping and data protection, our firm is at your disposal. Contact us for a personal consultation and to find out how we can help you navigate the complex landscape of privacy regulations.

Lean patent strategies: doing more with less

In previous articles, we have discussed how in the fast-paced world of start-ups, intellectual property (IP) is a crucial factor for success. Patents, in particular, are often considered the Holy Grail for protecting innovative inventions and gaining a competitive advantage.

However, the reality for start-ups is often more complex, as registering and maintaining patents can require large sums of money, creating a significant hurdle for start-ups with limited resources. Moreover, not all patents translate into real strategic value and only a small percentage achieve significant market impact. The patenting process is often intricate and requires specific legal skills that are not always readily available within a start-up.

In addition to this, concrete information on the actual return on investment (ROI) of patents is often scarce, making it difficult to assess their real value.

It therefore becomes crucial for start-ups to distinguish between ‘strategic’ and less significant patents. Patents with high strategic value support the company’s business plan, enabling it to pursue its key objectives and position itself distinctively in the market. In addition, they generate tangible returns, translating into significant revenue increases or preventing competitors from entering the market. They also provide a solid competitive advantage, distinguishing the company from competitors and protecting the innovations that make it unique.

To overcome challenges and optimise resources, start-ups can adopt a ‘lean’ approach to patenting, based on principles of efficiency and foresight.

This approach involves the targeted protection of inventions with the highest strategic and commercial potential, avoiding dispersing resources on less promising ideas. In addition, a step-by-step approach allows start-ups, where national law permits, to begin with provisional patent applications to obtain initial cost-effective protection, postponing final registration to a later date.

Continuous evaluation and review of the patent strategy in light of market feedback, business developments and the latest industry trends is another key component of the lean approach. In fact, the technology landscape is constantly evolving, with new inventions and patents emerging frequently, and, therefore, continuous evaluation allows one to stay abreast of the latest industry trends and identify new patenting opportunities. Consider a start-up operating in the field of biotechnology: as medical research advances and new therapies are developed, it needs to constantly monitor the latest innovations to be aware of what can and cannot be patented in order to protect its competitive advantage.

Finally, external collaboration with qualified IP experts allows the patenting process to be optimised by utilising their specific knowledge and expertise.

Adopting a lean approach to patenting offers start-ups a number of concrete benefits, including cost reduction, increased agility and value maximisation. By focusing on patents with the highest potential return on investment, start-ups can optimise their limited resources. A flexible approach allows them to adapt their patent strategy to changing market needs and new opportunities. By protecting the inventions most critical to business success, start-ups can increase their value and attract interested investors. Innovative start-ups wishing to exploit the full potential of intellectual property should not let challenges deter them. By adopting a lean patent strategy based on an accurate assessment of strategic value, a focused and flexible approach and efficient use of resources, start-ups can successfully navigate the sea of IP, gain a lasting competitive advantage and maximise value for their stakeholders.

What do investors look for in a start-up’s IP assets?

In the dynamic world of start-ups, venture capital or VC funds (discussed in this article) play a key role in supporting innovation and growth. When they evaluate a start-up, one of the key factors they consider is the portfolio of intellectual property assets the start-up owns and the strategic management of the same.

But what are these investors really looking for when examining a start-up’s IP strategy? Let’s dig deeper to understand their expectations and what drives them to invest.

Protection of Innovations and Attraction of Investments:

First, let us assume that venture capital funds understand the intrinsic value of intellectual property for a start-up. This includes patents, trademarks, copyrights and trade secrets. They know that a sound IP strategy not only protects innovations and brand identity, but can also create a sustainable competitive advantage in the market. When evaluating a start-up, VCs look for signs that demonstrate an awareness of the importance of IP protection.

One of the main objectives of VC funds is to finance start-ups that show significant potential for growth and success in the market. A solid IP strategy is a crucial indicator of this potential. Investors are more inclined to fund start-ups with a solid IP portfolio and a well-delineated Innovation Plan because they are aware that industrial property rights, if the result of sound decisions, almost always lead to a competitive advantage for the target start-up. In addition, a sound IP strategy can attract investment due to the monetisation opportunities offered through licensing or direct sales.

Development of an Organised Intellectual Property Strategy:

Venture capital funds value start-ups that have a clear, well-structured IP strategy. This means identifying innovative assets by listing all the innovations developed by the company, including know-how and brands, and prioritising protection as IP protection can be overly expensive and end up clipping the wings, as well as the cash, of start-ups, especially early-stage ones. A start-up that demonstrates an organised IP strategy also shows a predisposition to manage and protect its key assets without diverting resources from other uses.

Prior Rights Searches, Freedom to Operate and Risk Mitigation:

Venture capital funds are aware of the legal risks that can arise from a poorly managed IP strategy. Therefore, they look for start-ups that have verified that the IP assets they boast are indeed their own to ensure that they do not infringe the IP rights of others. All too often, start-ups enter the market using – unknowingly – already registered trademarks, with disastrous consequences: the need for rebranding, litigation and reputational damage.

A start-up that demonstrates that it has mitigated legal risks through thorough research on its IP portfolio is more attractive to investors, as this reduces the potential for costly litigation.

Monitoring and Continuous Protection of Intellectual Property:

Another key aspect that VCs take into account is the start-up’s protection strategy for its ‘family jewels’. A virtuous start-up should, within its means, enhance its IP portfolio by actively monitoring and protecting its IP over time. This includes regularly monitoring the market for potential infringements and taking the necessary actions to enforce its rights. A start-up that demonstrates an ongoing commitment to the protection of its IP indicates to investors a level of responsibility and care that is critical to long-term success.

Team Training and Legal Advice:

Finally, venture capital funds value start-ups that understand the importance of IP and have a properly trained and informed team on this topic. In addition, consulting an IP lawyer can help start-ups navigate the complex legal landscape and develop effective IP strategies.

In conclusion, to attract the financial backing of venture capital funds, a start-up needs to demonstrate that it has a sound, organised and well-structured IP strategy. Investors look for signs of awareness of the value of IP, protection of innovations, mitigation of legal risks, continuous monitoring and team training. A start-up that meets these expectations will be more attractive to investors and have a better chance of obtaining the financial backing it needs to grow and succeed in the market.

MANAGING INTELLECTUAL PROPERTY IN EARLY STAGE START-UPS

In the dynamic and competitive environment of start-ups, intellectual property plays a key role in securing innovation and protecting investments. These young companies often rely on innovative ideas and cutting-edge technologies to differentiate themselves in the market and gain competitive advantage. In this context, industrial property rights assume crucial importance, allowing start-ups to protect their intellectual and exclusive assets, such as patents, trademarks and designs. This protection not only ensures the value of their creations, but can also be instrumental in attracting investment and strategic partnerships. Thus, the proper management of intellectual property becomes a key element in the growth and success strategy of start-ups.

The main forms of intellectual and industrial property protection:

Intellectual property protection is crucial for start-ups, as it ensures the protection of their innovations and the valorisation of their intangible assets. The most common forms of protection include:

  • Patents: These give the holder the exclusive right to exploit an invention for a fixed period, usually 20 years. Patents can be granted for process, product or design inventions.
  • Copyright: This protects literary, artistic and creative works such as books, music, films, software and other original works. It grants the creator the exclusive right to reproduce, distribute and commercially exploit the work.
  • Trade marks: Trade marks identify and distinguish a company’s products or services from those of its competitors. They can take forms such as distinctive words, logos, slogans or sounds and protect a company’s identity and reputation.
  • Industrial designs: These protect the aesthetic or ornamental appearance of a product, such as shape, colours, contours or texture. They are used to protect the visual appearance of products, such as clothing, furniture or jewellery.
  • Trade secrets: These include confidential information, such as formulas, processes, marketing strategies or customers, that give a company a competitive advantage. Protecting trade secrets involves keeping them confidential and taking security measures, including through confidentiality agreements, to prevent their unauthorised disclosure.

These forms of protection allow start-ups to protect and enhance their creations, ensuring competitive advantage and sustainability in the market.

The importance of the Innovation Plan and how to realise it:

The Innovation Plan is a crucial element in the journey of start-ups as it not only maximises the value of intellectual property, but also effectively manages the risks associated with its infringement. This strategic plan involves a series of methodical and articulated steps:

  1. Identification of innovative assets: The first step is to list all innovations developed by the company, including know-how and brands. Very often, especially in early stage start-ups, founders make use of the services of third parties which, however, are not regularised with contracts containing clauses on the transfer of the intellectual property developed, leaving the ownership of the same in the hands of third parties and not the start-up. The process of identifying assets can also be facilitated, not only through the assistance of consultants experienced in the protection of IP assets, but also through the implementation of IP courses for employees and the establishment of an effective communication path between the people who realise innovation and the decision makers. In addition, it could be useful to set up a reward system of incentives to encourage innovation to go alongside the more classic deterrent system, characterised by the provision of confidentiality agreements (Non Disclosure Agreements or NDAs) to be signed by employees, collaborators and strategic partners.
  2. Establish protection priorities: As IP protection can be costly, it is essential to carefully select the assets to be protected. For each technology or innovation, a decision must be made whether it is better to file a patent application, keep the invention secret and protect it through NDAs or publish it. This decision should be based on a detailed assessment of the following factors:
  • Uniqueness: Assess the likelihood of a patent application being granted through an analysis of prior art and patentability requirements, such as novelty, inventive step and industrial application.
  • Distinctiveness: Check whether infringement can be easily detected, especially in the case of software, to determine whether patenting is the best solution.
  • Possible alternatives on the market: Examine whether the patent protects the best method of making a successful product, making imitation by competitors impossible.
  • Product value: Assess how much the invention contributes to the overall development of the product and how strategic it is for the company’s business.

Overall assessment: The final decision on which assets to protect should be the result of an overall assessment, which is not based on a predefined score, but on a weighted analysis of the factors listed above. The objective is to create an effective plan that allows the start-up to convert intangible assets into IP, maximising value for the company and minimising the risks of infringement and litigation. 

Such an activity, if carried out efficiently and without wasting resources, brings with it the advantage of attracting investors and strategic partnerships that can provide funds and business opportunities to the start-up. We will discuss this issue in one of our next articles.

T&C may appl-AI (v.2.0)
A short guide to the terms of use of generative AI for content creators

Exactly one year ago, as the world was being rocked by the launch of ChatGPT, we published on our blog a brief overview of the intellectual property licensing clauses on content generated by the most widely used generative AI tools. Today, 12 months after that article and in light of the numerous copyright lawsuits that have begun to rage in the world of generative AI, we have decided to update that overview.

In this context, the importance of a thorough understanding of the Terms and Conditions (T&Cs) applicable to generative AIs is becoming increasingly evident. Our short guide will provide an up-to-date analysis of the T&Cs surrounding the use of generative AI, with a particular focus on the legal implications arising from recent lawsuits, such as the one between the New York Times and OpenAI. The fluidity and speed with which these issues evolve requires constant updating of knowledge to avoid possible legal complications.

Continuing where we left off last year, we will explore licensing clauses, creative authorship, liability and the dynamics of collaboration between artificial intelligence and human authors. This guide aims to be an essential tool for navigating the complex legal landscape of generative AI, providing practical advice and points of reflection for those venturing into the world of collaborative content creation with ‘intelligent machines’.

  • OPENAI – DALL-E; CHATGPT (31 January 2024)

These two models developed by the start-up OpenAI probably need no introduction. ChatGPT is a conversational model capable of holding complex conversations, providing information and writing texts using natural language; DALL-E is an artificial intelligence tool capable of generating images from text descriptions.

Content created through these two popular tools is subject to the same licence, issued by OpenAI.

In this case, just as a year ago, the licence states that “the User is the owner of all the Input and, subject to the User’s compliance with these Terms, OpenAI assigns to the User all its right, title and interest in and to the Output.” However, we still find the same exceptions to the exclusivity of this licence, as OpenAI very generically reserves the right to “use the Content as necessary to provide and maintain the Services, comply with applicable law and enforce our policies. You are responsible for the Content, including ensuring that it does not violate any applicable law or these Terms.”

A significant change from last year, which also follows the blocking of ChatGPT in Italy by a measure of the Italian Data Protection Authority, is the introduction of an opt-out that allows users to prevent OpenAI from training its models on content, whether input or output, entered into or generated by the platform.

  • MIDJOURNEY (22 December 2023)

Midjourney is another popular artificial intelligence tool capable of generating images from textual descriptions.

The old terms and conditions of Midjourney stipulated that, according to the licence, the user was the owner of all resources created with the services. However, there was an important exception for non-paid users, who received a Creative Commons Attribution-NonCommercial 4.0 International Licence on the final output. This meant that content could only be used if certain requirements were met, including mentioning the authorship of the work, providing a link to the licence and indicating any modifications. In addition, the use had to be non-commercial.

Midjourney’s new terms and conditions state that the user is the owner of the resources created with the services to the fullest extent permitted by applicable law. However, there are some exceptions: the user’s ownership is subject to contractual obligations and to the rights of third parties. Furthermore, if the user is a company with a turnover of more than USD 1,000,000 per year, it is necessary to subscribe to a ‘Pro’ or ‘Mega’ plan to own the resources created. Finally, if you ‘upscale’ the images of others, they remain the property of the original creators.

These new terms reflect a more specific and detailed approach than the old ones, with a focus on conditions of use for corporate users and respect for the rights of third parties.

Further, in Midjourney’s new terms and conditions, it is specified that by using the Services, you grant Midjourney, its successors and assigns a perpetual, worldwide, non-exclusive, sub-licensable, royalty-free, irrevocable copyright license. This license allows Midjourney to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute the text and image submissions you submit to the Services, as well as any Assets you produce through the Service. Importantly, this licence survives termination of this Agreement by any party for any reason. This update emphasizes the fact that Midjourney acquires broad rights to the assets created by users through the Services, even after termination of the Agreement.

  • STABLE DIFFUSION WEB (20 August 2022)

Stable Diffusion is a deep learning model published in 2022, mainly used to generate detailed images from text descriptions.

In this case, Art. 6 of the Licence merely states that “the Licensor does not claim any rights to the Output generated by the user using the Model. The user is responsible for the generated output and its subsequent use.” The user is therefore free to use the generated content. However, there are some exceptions. In fact, in the next sentence, the licence states that “no use of the output may contravene the provisions of the Licence (Annex A)”, referring to a list of uses of the output that are unlawful because they are potentially harmful to third parties.

In conclusion, it is worth emphasising how crucial it is for content creators to fully understand the legal landscape surrounding the use of generative AI, ensuring a harmonious collaboration that respects the rights of all parties involved, thus avoiding civil liability for copyright infringement of third parties.

Pseudonymisation and anonymisation: the blurred line between personal and non-personal data

In the context of the General Data Protection Regulation (GDPR), Article 4(5) defines pseudonymisation as the processing of personal data in such a way that it can no longer be attributed to a specific data subject without the use of additional information. It is essential to note that this additional information must be stored separately and subject to technical and organizational measures to ensure that such personal data is not attributed to an identified or identifiable natural person.
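
By way of illustration only, the separation described above can be sketched in a few lines of Python. The example replaces a direct identifier with a keyed token and returns the re-identification table as a separate object, standing in for the "additional information" that must be kept apart under technical and organizational measures. The function names, the `email` field and the use of HMAC-SHA256 are assumptions chosen for the sketch; whether a given setup satisfies Article 4(5) is a legal assessment that code alone cannot settle.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable token from an identifier using a secret key (HMAC-SHA256)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability in this sketch


def split_records(records, key, id_field="email"):
    """Return (pseudonymised_records, lookup).

    `pseudonymised_records` no longer contain the direct identifier.
    `lookup` maps each token back to the identifier; it plays the role of the
    additional information and should be stored separately with restricted access.
    """
    pseudonymised, lookup = [], {}
    for record in records:
        token = pseudonymise(record[id_field], key)
        lookup[token] = record[id_field]
        cleaned = {k: v for k, v in record.items() if k != id_field}
        pseudonymised.append({"subject": token, **cleaned})
    return pseudonymised, lookup
```

The same person always receives the same token, so the pseudonymised records can still be analysed as a whole, while attributing a record to a named individual requires the key or the lookup table.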

Contrary to a common perception, pseudonymisation should not be regarded solely as a technological aspect, but rather as an operational and organizational strategy. In fact, the GDPR, in Recital 29, recognises that pseudonymisation measures, whilst allowing general analysis, should be possible within the same controller, provided that the necessary technical and organizational measures are taken and that the additional information for attributing personal data to a specific data subject is stored separately.

Conceptual and Legal Foundations of Pseudonymisation and Anonymisation

A closer conceptual analysis reveals that pseudonymisation is not an isolated concept, but rather an integral part of a coordinated set of measures aimed, on the one hand, at protecting the data of the data subject and, on the other hand, at facilitating the circulation of data while safeguarding compliance with data protection obligations by data controllers.

In this context, distinguishing between pseudonymisation and anonymisation is of crucial importance. In short, while pseudonymisation allows the information to be reconstructed, anonymisation makes the data impossible to reconstruct. This principle is clearly stated in Recital 26, which excludes the application of data protection principles to anonymous information, i.e. information that does not relate to an identified or identifiable natural person, or personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

But how do we determine whether a piece of data is pseudonymous or anonymous? Here again, we are helped by Recital 26 of the GDPR, which states that, to establish the identifiability of a person, account should be taken of all the means, such as singling out, which the controller or a third party may reasonably use to identify that natural person directly or indirectly.

Judgment T-557/20 of the General Court of the European Union on Pseudonymisation and Anonymisation of Data

The recent judgment delivered by the European General Court on 26 April 2023, in Case T-557/20, represents a significant milestone in the legal understanding of anonymisation and pseudonymisation practices. Moving away from the previous orientation of the Article 29 Working Party (now replaced by the European Data Protection Board), which postulated a more restrictive approach, the General Court adopted a more nuanced and relativist perspective.

The Court’s decision emphasized the need to carefully consider the specific circumstances when assessing the identifiability of data. In the present case, concerning the transmission of shareholder and creditor comments by the Single Resolution Board (SRB) to third parties, the General Court rejected the idea that the mere possibility of re-identification automatically qualifies the data as personal. In particular, the General Court concluded that, despite the fact that the SRB had access to additional data for identification purposes, the transmitted comments and alphanumeric codes had to be qualified as anonymous data. In doing so, it applied the principle contained in Recital 26 of the GDPR and Recital 16 of Regulation (EU) 2018/1725: if personal data have been rendered sufficiently anonymous that the data subject cannot or can no longer be identified, data protection principles do not apply.

This change of course represents a significant departure from previous restrictive interpretations, emphasising the need to carefully assess the actual identifiability of data in specific contexts. The European Court’s ruling has significantly influenced the legal landscape with regard to anonymisation and pseudonymisation techniques, raising crucial questions about the practical application of these concepts in the current regulatory context.

Conclusions and Key Role of Pseudonymisation and Anonymisation Techniques

In conclusion, the proper implementation of pseudonymisation and anonymisation techniques is imperative to ensure user privacy, especially in sensitive sectors such as health and finance. The technologies used must comply with legal principles, and the choice between pseudonymisation and anonymisation should be guided by specific needs and the required reversibility. A thorough understanding of these concepts and their accurate implementation are crucial to address the legal and regulatory challenges related to the protection of personal data.

In this context, the ruling of the General Court not only provides a crucial clarification of the distinction between anonymous and pseudonymous data, but also raises important reflections on the future of data protection practices. The decision emphasizes the importance of taking a contextual and circumstantial approach when assessing the anonymisation and pseudonymisation of data. It establishes that, in order to determine whether information constitutes personal data, it is necessary to adopt the perspective of the recipient, assessing whether the possibility of combining the information transmitted with any additional information held by the third party is a reasonably feasible means of identifying data subjects.

This new orientation of the Luxembourg courts may influence the way organizations implement data protection measures. A careful analysis of the specific circumstances therefore becomes crucial to determine whether data can indeed be considered anonymous, even when they are associated with alphanumeric codes or other identifiers.

Compensation for Damages for Unlawful Processing of Personal Data

The judgment of the Court of Cassation, Cass. civ., Sec. I, Ord. 12-05-2023, No. 13073, addresses a case in which a municipality was ordered to compensate damages caused to an employee as a result of unlawful processing of her personal data. This judgment raises important questions regarding compensation for damages resulting from breaches of data protection regulations, in particular Regulation (EU) 2016/679, known as GDPR.

The Case

In the case at hand, the municipality had accidentally published on its institutional website a determination regarding the garnishment for a certain amount of a municipal employee’s salary, thus violating the data protection rules of the GDPR. Upon discovering the error, the municipality had admitted that the disclosure of the data had occurred accidentally, and promptly took steps to remove the data in little more than 24 hours.

Nevertheless, the Court of First Instance had found that the municipality was liable and ordered it to pay damages. The Court of Appeal upheld that judgment, which, in turn, was appealed by the municipality before the Supreme Court.

The Supreme Court’s ruling, rejecting the municipality’s appeal, emphasised that the non-pecuniary damage that can be compensated in cases of personal data breaches is determined by the infringement of the fundamental right to the protection of personal data, enshrined both in the Constitution and in the GDPR. The Court recalled that, under Article 82 of the GDPR, anyone who suffers material or immaterial damage caused by a breach of the provisions of the regulation has the right to obtain compensation for the damage from the data controller or processor.

The Legal Change

Prior to the entry into force of Regulation (EU) 2016/679, in our legal system, the issue of civil liability arising from the unlawful processing of personal data found its regulation in Article 15 of Legislative Decree No. 196 of 30 June 2003 (Personal Data Protection Code). This stipulated that anyone who caused damage to others due to the processing of personal data had to pay compensation pursuant to Article 2050 of the Civil Code. Non-pecuniary damage was also compensable in the event of a breach of Article 11.

With the entry into force of the GDPR, the legislation has changed, introducing more uniform rules for liability in case of unlawful processing of personal data. The new legislation stipulates that anyone who suffers material or immaterial damage caused by a breach of the regulation has the right to obtain compensation from the data controller or processor. However, these entities may be exempted from liability if they prove that the damaging event is not attributable to them “in any way.”

The Liability of the Controller vs. the Processor

The liability of the controller and the liability of the processor arise from different facts. The data controller is the one who determines the purposes and means of the processing and is liable for the damage caused by processing that violates the regulation. Moreover, according to the principle stated by the Court of Cassation, ‘the data controller is always obliged to compensate for the damage caused to a person by a processing that does not comply with the regulation itself, and may be exonerated from liability not simply if he has taken action (as is his duty) to remove the unlawfully exposed data, but only “if he proves that the damaging event is in no way attributable to him”’.

The data processor, on the other hand, processes personal data on behalf of the data controller and is liable only if he has not fulfilled the obligations of the regulation specifically addressed to processors or has acted contrary to the instructions of the data controller.

The Seriousness of the Damage

As regards compensation for non-pecuniary damage resulting from an infringement of the fundamental right to the protection of personal data, the conditions of the seriousness of the injury and the seriousness of the damage must be met. The violation of data protection requirements may be considered unjustifiable, and therefore compensable, only if it has appreciably offended the scope of the right itself. Therefore, the mere violation of the formal prescriptions on the processing of data may not give rise to damage, whereas a violation that concretely offends the actual scope of the right to privacy always leads to compensation.

The burden of proving non-pecuniary damage lies with the injured party, while the data controller must prove that it has taken adequate measures to avoid the damage.

The Principle of Accountability

The entry into force of the GDPR introduced the principle of accountability, which requires the data controller to take responsibility for striking a balance between opposing interests, with full autonomy of judgement. Accountability requires the controller to tailor the concrete implementation of the principles that the legislation sets out in the abstract, and to document how it has implemented the regulatory provisions.

In conclusion, Regulation (EU) 2016/679 has redefined the legal framework for the processing of personal data, introducing more uniform rules on responsibility and accountability. These regulations place significant emphasis on the protection of personal data and compensation for damages in case of breaches. The Supreme Court’s ruling reinforces the importance of these rules and the need for organisations to comply with them in order to avoid litigation and damages. The protection of personal data is a crucial issue in today’s digital society and requires attention and compliance from all actors involved.

The Italian Data Protection Authority sanctions web scraping: the case of the portal Trovanumeri.com

The Garante Privacy recently banned web scraping and sanctioned the portal Trovanumeri.com for harvesting the data of online users in order to create lists. These violations involved as many as 26 million users, causing great concern for the protection of personal data. These concerns culminated in a measure issued on 17 May by the Garante, which prohibited the website operator from creating and disseminating a telephone directory obtained through web scraping, a technique that consists in extracting data from one or more websites using special software programmes.

THE ISSUE

In this particular case, numerous reports were submitted to the Garante Privacy concerning the unauthorised publication of names, addresses and telephone numbers of individuals without their consent. Moreover, according to the reports, in some cases the publication also concerned personal data of persons who had special confidentiality requirements concerning their telephone number and home address: some complainants had in fact stated that they held confidential telephone numbers, i.e. numbers not published in the general telephone directory.

Finally, several complainants reported that no indication of the owner of the site (not even the information required by law) could be found on the website or in the brief privacy policy published there, thus making it impossible to identify the data controller.

THE VIOLATIONS 

Dissemination of personal data in the absence of an appropriate legal basis and processing in breach of the law

The processing consisting in the de facto creation of a telephone directory was deemed by the Data Protection Authority to be in breach of the law, resulting in the dissemination of personal data on the Internet in the absence of a suitable legal basis. It is important to emphasise that it is not legitimate to form a telephone directory, whether online or on paper, with data that are not taken from authorised sources, such as telephone operators’ databases. Only such a source can guarantee the correctness and up-to-dateness of the data, as well as document the willingness of those concerned to make them public.

Investigations revealed that the Trovanumeri.com website also made reverse search available, but did not allow users to give free and specific consent for this functionality. The consent checkbox was in fact pre-selected and could not be changed, thus violating the requirements of the law in force.

It is also important to emphasise that the owner of the site had stated that the data on its websites had been collected through autonomous user input or through web scraping, i.e. through an automated process of searching for personal data on the web. The Data Protection Authority, however, had already held in an earlier ruling that using data collected through web scraping for purposes incompatible with the original purpose is unlawful. Therefore, acquiring and processing data without the consent of the data subjects and without a valid legal basis constitutes a breach of privacy law.

Failure to respect data subjects’ rights, inadequate information and absence of safeguards

The reports received highlighted not only the unauthorised dissemination of data, but also the impossibility for data subjects to exercise their right to erasure and, potentially, other data protection rights. In fact, the website did not contain any information on the data controller and no contact channels with the data controller were available. 

Non-compliance with the processing injunction

Finally, despite the prohibition ordered by the Garante Privacy, the Trovanumeri.com portal continued to operate and to make a large amount of personal data available online. This non-compliance with the ban was challenged as a further breach of the regulator’s provisions.

CONCLUSIONS AND CORRECTIVE MEASURES TAKEN

The processing of personal data by Trovanumeri.com was found to be unlawful in several respects. Even if some of the violations can be corrected, the main violation concerning the absence of an appropriate legal basis is sufficient to invalidate the entire processing. Therefore, the corrective measures taken must address the underlying issue and ensure that personal data are processed in compliance with privacy legislation.

In conclusion, the Trovanumeri.com portal case highlighted the importance of personal data protection and the negative consequences of unauthorised web scraping. The Garante Privacy has adopted sanctioning measures to ensure that users’ rights are respected and that data are processed in compliance with the law. This case is a reminder to companies and websites that process personal data, underlining the importance of regulatory compliance and respect for users’ privacy.

The Potential and Challenges of Copyright Law in the Age of AI

The Role of Text and Data Mining in the Data Economy

As of 12 December 2021, the Copyright Act (L. 22 April 1941 No. 633) has incorporated two specific provisions, implementing Articles 3 and 4 of the Copyright Directive 2019/790/EU, relating to Text and Data Mining (TDM), an automated method of analysing digital content. This practice has become central in multiple sectors of the data economy, from pharmaceutical research to the application of Artificial Intelligence (AI) and Big Data. Let us therefore examine the introduction of the new Articles 70-ter and 70-quater into L. 633/1941 (hereinafter also the ‘Copyright Law’).

Definition and importance of the TDM for the European Union

TDM, defined by Art. 70ter of the Act as “any automated technique aimed at analysing large amounts of text, sound, images, data or metadata in digital format with the purpose of generating information, including patterns, trends and correlations”, is crucial for the advancement of the data economy and, consequently, for the growth of the European Union’s digital single market.

TDM’s interference with copyright

However, automated data mining – a typical TDM activity – may interfere with copyright and related rights. Indeed, the TDM process usually involves the temporary reproduction of the sources used, which could include protected works or significant parts of the databases used. This could be a violation of the exclusive right of reproduction under Section 13 of the Copyright Act and could also contradict a database creator’s right to prohibit the extraction or reuse of the entire database or a substantial part of it.

Copyright reform in the European Union

Despite these challenges, the European Union has decided to reform the sector by introducing exceptions and limitations to copyright that are mandatory for every Member State. These were implemented in Art. 70ter and 70quater of the Copyright Act. These provisions, which closely transpose the content of Art. 3 and 4 of the new Copyright Directive, allow the extraction of data from sources and databases to which one has legal access, without any need for authorisation by the holders of copyright or sui generis rights to the databases.

Differences between Art. 70ter and 70quater

However, the two provisions just mentioned have different scopes of application. Whereas Art. 70ter applies exclusively to extraction for scientific purposes by research organisations and cultural heritage institutions, Art. 70quater allows the extraction of text and data in general, by anyone, even for profit.

Protection of digital database rights

This scenario complicates the protection of exclusive rights to digital databases, with a greater impact on the sui generis right of the database creator than on copyright. However, there are measures that can be taken to protect databases, including limiting access and using the opt-out option provided by Section 70quater of the Copyright Act. Under this option, right holders can expressly reserve the use of their works and materials, so that text and data mining under Section 70quater is permitted only where no such reservation has been made.

Use of the opt-out option

Despite the uncertainty about how to properly exercise the opt-out, there are several tools that can be used. For example, software can technically recognise an opt-out expressed in the terms of use of a site, which could be considered an appropriate way to express the reservation mentioned in Article 70quater. Moreover, the use of IT tools such as a robots.txt file could provide more effective protection for right holders.
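As a rough illustration of the robots.txt route, the sketch below uses Python’s standard urllib.robotparser module to check whether a crawler may fetch a given page. The robots.txt content, the ExampleTdmBot user agent and the example.com URLs are hypothetical, and whether such a file amounts to a valid reservation under Article 70quater remains an open legal question; the code only shows how a mining tool could read the signal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a right holder: one (invented)
# mining crawler is excluded from the whole site, while all other
# crawlers are only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleTdmBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_mine(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # access is needed for this example.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    print(may_mine(ROBOTS_TXT, "ExampleTdmBot", page))
    print(may_mine(ROBOTS_TXT, "OtherBot", page))
```

A mining tool that respects the opt-out would call a check of this kind before each request and skip any page for which it returns False.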

The balance between innovation and copyright protection

In conclusion, while text and data mining represents a huge opportunity for the advancement of research and the development of the data economy, it is important that copyright and related rights are adequately protected. This requires a careful balance between the need to protect intellectual property and the importance of maintaining the competitiveness of the European market. The recently introduced provisions in the Copyright Act are an important step in this direction, but it is crucial that the remaining issues are resolved to ensure the effective protection of copyright in the age of text and data mining.

The block (and unblocking) of ChatGPT in Italy: causes, changes and solutions adopted.

ChatGPT is a language model developed by OpenAI based on the GPT-4 architecture. It is designed to understand and generate text in a similar way to humans, making it possible to create smooth and coherent conversations. However, on 30 March 2023, the use of ChatGPT was blocked in Italy due to concerns about user privacy and data protection. In this article, we will explore the reasons for the block, the changes requested by the Garante Privacy to OpenAI and the solutions that have been implemented to solve the problem and protect the privacy of Italian citizens.

The ChatGPT block in Italy

The blocking of ChatGPT in Italy, imposed by OpenAI itself, had been prompted by a measure of the Garante (Italian Data Protection Authority) that had ordered the platform to temporarily restrict the processing of Italian users’ data until it complied with Italian and European privacy regulations. The Garante, in an emergency measure, had found that the use of ChatGPT could violate privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR), which provides for strict protection of individuals’ personal data.

The reasons for the block

In its decision of 30 March, the Garante per la Protezione dei Dati Personali had identified several reasons for concern regarding the use of ChatGPT in the country. Among these, the main ones were:

  • the lack of information provided to users and to all data subjects whose data are collected by OpenAI;
  • the absence of a legal basis justifying the massive collection and storage of personal data for the purpose of ‘training’ the algorithms underlying the operation of the platform;
  • the incorrect processing of personal data, owing to the inaccurate information about individuals provided by ChatGPT;
  • the absence of any filter for verifying the age of users, which exposed minors to answers that were totally unsuited to their level of development and self-awareness.

Required changes to OpenAI

To address these concerns, the Garante requested OpenAI to make a number of changes and interventions to the platform on which ChatGPT operates in order to ensure greater protection of users’ privacy. Among the main changes, the Garante requested to:

  1. Set up an information notice on the site to explain data processing and the rights of data subjects, including non-users of ChatGPT.
  2. Provide a tool to exercise the right to object to the processing of data for algorithm training.
  3. Allow the correction or deletion of inaccurate personal data through a tool on the site.
  4. Insert a link to the information during registration, visible before completing the process.
  5. Change the legal basis of data processing for algorithm training from contract to consent or legitimate interest.
  6. Provide a means to exercise the right to object to the processing of data for algorithm training, if based on legitimate interest.
  7. Implement an age gate for Italian users, excluding minors.
  8. Submit a plan to the Garante for the adoption of age verification tools by 31 May 2023, with implementation by 30 September 2023.
  9. Promote an information campaign by 15 May 2023, agreed with the Garante, to inform about data collection and the tools available to delete personal data.

Changes implemented by OpenAI

In response to the Garante’s requests, OpenAI implemented a number of changes to ChatGPT to ensure greater privacy protection for Italian users. Among the main changes adopted are:

  1. The provision of information accessible to both European and non-European users and non-users concerning the processing of personal data for algorithm training and the right to object to such processing.
  2. The expansion of the data processing information for users by making it accessible in the registration mask before a user registers for the service.
  3. The extension of the right to object to the processing of personal data for algorithm training to non-users resident in Europe, through an easily accessible online form.
  4. The introduction of a welcome screen when ChatGPT is reactivated in Italy, with references to the new privacy policy and how personal data are processed for algorithm training.
  5. The possibility for data subjects to have any information they consider incorrect deleted. OpenAI, however, declared itself technically unable to correct the errors.
  6. Explaining, in the user information, the legal basis for the processing of personal data for algorithm training and the proper functioning of the service.
  7. The implementation of a form allowing all European users to exercise their right to object to the processing of their personal data and thus to exclude their conversations and history from the training of the algorithms.
  8. The inclusion, in the welcome screen reserved for Italian users who are already registered, of a button through which, in order to re-access the service, they must declare that they are adults or that they are over 13 and have parental consent.
  9. Inclusion of the date of birth request in the service registration mask, with a block on registration for users under 13 years of age and the need to confirm parental consent for users over 13 years of age but under 18.

The above actions were welcomed by the Garante, which suspended the personal data processing restriction order against OpenAI and, at the same time, reopened the platform to Italian users.
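The age-gate rule described in points 8 and 9 above can be sketched as follows. This is only an illustrative reading of the rule as reported here, not OpenAI’s actual implementation: the function and enum names are hypothetical, and treating a user who is exactly 13 as needing parental consent is an assumption.

```python
from datetime import date
from enum import Enum


class AgeGate(Enum):
    BLOCKED = "blocked"                        # under 13: registration refused
    NEEDS_PARENTAL_CONSENT = "needs_consent"   # 13 to 17 without consent
    ALLOWED = "allowed"


def age_on(birth: date, today: date) -> int:
    """Completed years between birth and today."""
    years = today.year - birth.year
    # Subtract one year if this year's birthday has not occurred yet.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


def check_registration(birth: date, today: date,
                       parental_consent: bool = False) -> AgeGate:
    """Apply the (assumed) thresholds: block under 13, consent for 13-17."""
    age = age_on(birth, today)
    if age < 13:
        return AgeGate.BLOCKED
    if age < 18:
        return (AgeGate.ALLOWED if parental_consent
                else AgeGate.NEEDS_PARENTAL_CONSENT)
    return AgeGate.ALLOWED


if __name__ == "__main__":
    today = date(2023, 5, 1)
    print(check_registration(date(2008, 1, 1), today))
    print(check_registration(date(2008, 1, 1), today, parental_consent=True))
```

A self-declared date of birth of this kind is only an age gate; the age verification tools requested by the Garante would require stronger checks than this sketch provides.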