Technology Sovereignty

Global Center for Development and Strategy

Privacy in the public: Analysing the EU framework to outline approaches for regulating AI personal data scraping

Administrator
2025.12.08.

Abstract:

AI models developed using scraped personal data expose data subjects to an inherent risk of en masse shadow profiling, harming their privacy, autonomy, and dignity. This paper argues that protecting public personal data is essential to mitigating AI-scraping risks, and notes that the EU is among the few jurisdictions to confer such protection. The GDPR regulates public and non-public personal data similarly but contains exemptions from notice provisions in the case of legitimate interest-based processing. This exemption contributes to the information asymmetry between the stakeholders who enforce anti-scraping covenants (i.e., data subjects and platforms) and scrapers. Limited supervisory powers and the lack of other mechanisms to address the problems of enforcing privacy laws in public data contribute to the GDPR’s inefficiency in controlling AI harms. The AI Act strives to plug GDPR loopholes by imposing reporting obligations on general-purpose AI providers to disclose the sources of their training data. Other jurisdictions could consider the principles and mechanisms of the EU regime as a guide for regulating public data scraping.

DOI: https://doi.org/10.1016/j.clsr.2025.106150
