The Next Alexa? Surfing the IP Challenges for Artificial Intelligence


With some users no longer on speaking terms with their personal voice assistants such as Alexa and Siri, who is responsible for work created by Artificial Intelligence systems? If you own the work input into an Artificial Intelligence system do you own the work output too?

Artificial intelligence (AI) covers an increasingly broad range of different technical disciplines, including machine learning, neural networks, natural language processing, speech and audio recognition, computer vision and emotion recognition. Unifying these disciplines is an attempt to combine aspects of human cognition to assist in everyday professional and social tasks.

Through our experience we have found the main issues surrounding the legalities of AI are:

  1. Contractual Issues;
  2. Intellectual Property Issues; and
  3. Infringement.

Contractual issues for Artificial Intelligence businesses

Contracts at the development and testing stages, through to mass marketing and consumer sales, are similar to many other technology and IP licensing agreements. However, one feature that can bring an AI project crashing down if overlooked is training data.

Developers and AI suppliers often collaborate with an organisation that holds the type of data which the AI developer needs. Rather than pay to license the training data to teach the AI system, the developer might grant the organisation access to the developed AI solution.

Training data – can you use it?

Online security and the use of consumers’ data are hot topics for all businesses. They pose a particularly high hurdle for technology companies collaborating on AI systems. Training data that comprises real customer data is subject to data protection requirements in the UK. Under the General Data Protection Regulation (GDPR), the organisation providing the data will almost always be a controller and the AI developer will be the corresponding processor.

Data Protection

To maximise GDPR compliance there are two options for using training data:

  1. Anonymise the data: this means deleting all identifying details that could link back to the original customer of the organisation. Although this is the simplest option, in practice it may mean deleting the very data on which the AI needs to be trained; or
  2. Agree a written database licence: for AI development this should make clear how the data was collected, including how the customers were informed of its proposed use as training data by the third party processor.
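
As a rough illustration of the first option, the sketch below strips identifying fields from customer records before they are passed on as training data. The field names and sample records are hypothetical, chosen only for the example; deciding what actually counts as an identifying detail in a real dataset is a matter for review by the parties, not something this snippet settles.

```python
# Hypothetical list of fields treated as identifying for this example only.
IDENTIFYING_FIELDS = {"customer_id", "name", "email", "phone", "address"}


def strip_identifiers(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the listed identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }


def anonymise(records, identifying_fields=IDENTIFYING_FIELDS):
    """Apply strip_identifiers to every record, leaving the originals untouched."""
    return [strip_identifiers(record, identifying_fields) for record in records]


# Made-up sample data held by the organisation.
customers = [
    {
        "customer_id": 101,
        "name": "A. Example",
        "email": "a@example.com",
        "basket_value": 42.5,
        "product_category": "books",
    },
    {
        "customer_id": 102,
        "name": "B. Example",
        "email": "b@example.com",
        "basket_value": 17.0,
        "product_category": "music",
    },
]

training_rows = anonymise(customers)
```

The trade-off described above shows up directly: if the AI system needed one of the removed fields, such as an address, to learn anything useful, the anonymised rows no longer contain it.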

Intellectual property rights in works created by AI

If the organisation that owns the training data obtains rights in the intellectual property that is developed through the collaboration, or if their background intellectual property is incorporated as part of the finished product, who owns the copyright, database right or other IP rights in this baseline data?

‘You don’t own me’ – ownership of copyright

In the absence of an agreement to the contrary, the default position is that ownership of the copyright in source code, algorithms or other data structures is defined by reference to their human author. However, a fundamental feature of many AI systems is that the underlying logic is developed by the system itself as a result of a training process.

Under copyright law in England, it is not possible for the AI system to be the legal author of this work. Authorship is important as the author is deemed to be the first owner of the work. So how do you establish authorship and ownership of copyright in the code or data that arises as a result of this training process?

Can a human author be determined?

Whilst ownership can be agreed, licensed, assigned, or sold, authorship cannot be altered by agreement – it is a matter of status.

Attributing authorship will depend on a number of factors including implementation and involvement.

Computer-generated works

Where no human author can be identified, the next stage would be to identify who made the arrangements necessary for the creation of the work. In practice, this can also be difficult to attribute, particularly where different parties have contributed through a collaboration.

Is it your ‘own intellectual creation’?

Case law provides that for a work to be original (and copyrightable) it must be its author’s ‘own intellectual creation’. This threshold for human-authored work is relatively low. However, the test is a greater hurdle for a computer program generated as a result of a trained AI system. It may be difficult to prove that the form in which the program is expressed demonstrates human intellectual creation or skill, labour and judgment.

Applying this test in practice, the Next Rembrandt project trained a deep learning algorithm on Rembrandt’s paintings and asked it to produce a new painting replicating the artist’s style and subject matter. Researchers in Australia made similar progress with a sonnet-writing AI trained on Shakespeare’s works. Does this creative output meet the test for originality?

AI is not a sentient being so is unable to exercise skill, labour and judgment or engage in intellectual creation. Any originality in the work must come from a human. Where there is no identifiable original content in the work produced by the AI system per se, there could be originality in the skill, labour and judgment expended in the process of training the AI to create certain types of work, as in the Next Rembrandt project. A good analogy is Naruto, the famous selfie-taking macaque monkey. Although Naruto pressed the shutter button on the camera, the human photographer, David Slater, argued that any originality in the photograph came from the skill, labour and judgment he used when setting up the camera.


AI systems present a unique set of issues for technology companies to overcome. Based on our experience, the most common issues relate to:

Copyright infringement

In the UK, copying for the sole purpose of research for a non-commercial purpose is permitted. The EU similarly allows member states to make exceptions to copyright protection where the use is for the sole purpose of scientific research. Whilst this may offer protection for researchers, using such data as training data will likely cross over from research into commercial endeavours relatively early on in the development process, and will therefore not be protected from being an infringing act.


The ‘black box problem’ arises from the way in which some AI systems store their decision-making algorithms. This is often not in a form that is easily understood by a human. A human would be able to reverse engineer human-created source code to work out why a particular decision was taken, whereas an AI neural network is potentially immune to human scrutiny. It can be impossible to predict how a system would respond to a certain input without putting it into practice. Whilst this clearly creates added hurdles for establishing authorship, it also raises uncertainty over liability. If it is not possible to decide whether the AI system or the humans associated with it made the decision, then attributing liability becomes impossible.

Resolving this question is becoming increasingly important, as advances continue in self-driving vehicles. For some companies, such as Volvo, the answer is accepting full liability for cars in ‘autonomous mode’. Such indemnities may become standard clauses in many AI licences whilst liability remains a legal grey area.