The European Union recently introduced the AI Act, a new governance framework that obliges organizations to increase transparency about the training data of their artificial intelligence systems.
If the legislation takes full effect, it could pierce the wall of secrecy that many Silicon Valley companies have built around how their AI systems are developed and deployed.
Since Microsoft-backed OpenAI released ChatGPT to the public 18 months ago, interest and investment in generative AI technologies have surged. These applications, capable of writing text, creating images, and producing audio content at record speed, have attracted enormous attention. But the accompanying surge in AI activity raises a pointed question: how do AI developers actually source the data needed to train their models, and does that data include copyrighted material used without permission?
Application of the AI Act
The EU AI Act, which will be phased in over the next two years, aims to address these issues. A staggered rollout gives regulators time to prepare for enforcement and gives businesses time to adapt to their new obligations. How some of the rules will be applied in practice, however, remains an open question.
One of the most contentious sections of the Act requires organizations developing general-purpose AI models such as ChatGPT to provide “detailed summaries” of the content used to train them. The newly formed Office of Artificial Intelligence plans to release a template for organizations to follow in early 2025, following consultation with stakeholders.
AI companies have strongly resisted disclosing their training data, describing that information as trade secrets that would give competitors an unfair advantage if made public. The level of detail required in these transparency reports will have significant implications for both smaller AI startups and big tech companies like Google and Meta, which have placed AI technology at the heart of their future operations.
In the past year, several leading tech companies, including Google, OpenAI, and Stability AI, have faced lawsuits from creators who claim their content was used without permission to train AI models. Under growing scrutiny, however, some tech companies have in the past couple of years broken ranks and negotiated content-licensing deals with individual media outlets and websites. Some creators and lawmakers remain concerned that these measures are not sufficient.
The divide among European legislators
In Europe, the differences between legislators are stark. Dragos Tudorache, who led the drafting of the AI Act in the European Parliament, argues that AI companies should be forced to open source their datasets. Tudorache emphasizes the importance of transparency so that creators can determine whether their work has been used to train AI algorithms.
In contrast, under the leadership of President Emmanuel Macron, the French government has privately opposed rules that could hamper the competitiveness of European AI startups. French Finance Minister Bruno Le Maire stressed the need for Europe to be a global leader in artificial intelligence, not just a consumer of American and Chinese products.
The AI Act recognizes the need to balance the protection of trade secrets with the facilitation of rights for parties with legitimate interests, including copyright holders. However, achieving this balance remains a significant challenge.
Industry opinion is divided on the matter. Matthieu Riouf, CEO of the image-editing company Photoroom, compares the situation to cooking: the best chefs, he argues, keep part of the recipe secret. His is just one of many scenarios in which such secrecy could become widespread. By contrast, Thomas Wolf, co-founder of Hugging Face, one of the world’s leading AI startups, argues that while there will always be an appetite for transparency, that does not mean the entire industry will adopt a transparency-first approach.
A series of recent controversies has demonstrated just how complicated this is. OpenAI demonstrated the latest version of ChatGPT at a public event, where the company drew heavy criticism for using a synthetic voice that sounded almost identical to that of actress Scarlett Johansson. Such episodes show the potential for AI technologies to violate personal and property rights.
Throughout the development of these regulations, there has been intense debate about their potential impact on future innovation and competitiveness in the world of artificial intelligence. In particular, the French government urged that innovation, not regulation, should be the starting point, given the risks of regulating aspects that are not fully understood.
How the EU regulates AI transparency could have major implications for tech companies, digital creators and the overall digital landscape. Policymakers thus face the challenge of fostering innovation in the dynamic field of artificial intelligence while guiding it toward safe, ethical decisions and preventing intellectual property infringement.
In short, the EU AI Act marks an important step towards greater transparency in AI development. However, the practical implementation of these regulations, and their effects on the industry, could still be a long way off. Moving forward, especially at the dawn of this new regulatory paradigm, the balance between innovation, the development of ethical AI, and the protection of intellectual property will remain a central and contested issue for stakeholders on all sides.