Part 1 of CIPPIC's Fair Dealing Week Series:

In 2017, the Canadian government prioritized establishing Canada as a global leader in artificial intelligence (AI), developing a national strategy bolstered by significant investments in research and the commercialization of AI. This strategy, the first of its kind, aimed to transform Canada’s strong record in AI research into a driver of the Canadian economy by becoming a world leader in AI research, innovation, and policy.

The strategy fostered the development of a robust AI research hub in Canada. However, that initial success has not made Canada the clear leader in AI that the strategy sought. Instead, Canada has fallen further behind other countries that have since adopted aggressive AI strategies. Moreover, the federal government’s efforts are undermined by the current state of Canadian copyright law, which fosters further regulatory uncertainty at a moment when there is a strong global push for further investment and opportunities in the sector.

One cause of legal uncertainty is copyright liability. Whether data mining amounts to copyright infringement remains an open question in Canada. AI systems learn by analyzing large sets of data that contain human-created works such as text, audio, or images. The reliance on data availability for training creates a copyright infringement liability risk for AI developers and companies.

Copyright exemptions for data mining have been established in other jurisdictions to address this uncertainty. Countries such as the United States and Israel have issued guidance extending their fair use provisions to include text and data mining activities. Similarly, jurisdictions including Japan, the European Union, the United Kingdom, and Singapore have created specific exemptions. For instance, the approach adopted by the EU in Article 4 of its DSM Directive requires member states to enact text and data mining exceptions permitting “reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining.” Singapore, for its part, specifically exempts text and data mining in its Copyright Act by exempting copies of works where a “copy is made for the purpose of computational data analysis; or preparing the work or recording for computational data analysis.”

Without a similar exception in Canada, developers may hesitate to take on the risk of liability for copyright infringement. The exception's absence risks a chilling effect on innovation. The current regulatory uncertainty in Canada also creates knock-on effects for AI adoption. First, without access to comprehensive training data, AI algorithms are more likely to produce biased and discriminatory outcomes. Second, Canadian regulatory uncertainty creates favourable conditions for “big-tech” companies with access to larger data sets or those that can afford to bear the risk of infringement. These conditions negatively impact the viability of smaller competitors and AI start-ups.

The current fair dealing framework provides a possible defence to an enforcement action for some unlicensed uses of copyrighted works by AI companies. The Canadian approach to fair dealing is relatively narrow: the dealing at issue (1) must be for one of the allowable purposes set out in section 29 of the Copyright Act, which notably include research, private study, and education; and (2) must be fair, which is a factual determination that looks to the purpose, character, nature, amount, and effect of, as well as alternatives to, the dealing. Because fair dealing is a defence raised in response to an enforcement action, and not a safe harbour, no one can be certain that an activity is free of liability risk until a court so determines. The uncertainty persists because Canadian courts have yet to apply this defence in the context of training AI systems.

To address this uncertainty, a copyright framework that allows text and data mining for commercial use is an important step. A text and data mining exception would create a safe harbour, establishing conditions under which one is free of risk of infringement and ensuring more certainty. Earlier this year, in its submission for the 2024 Consultation on Copyright in the Age of Generative Artificial Intelligence, CIPPIC recommended that where text and data mining is applied to copyrighted material to build a training dataset for generative AI, there should be no claim for copyright infringement if the training data is not reproduced in any resulting generative algorithm. This approach is grounded in the balancing purpose of copyright protection. As the Supreme Court of Canada observed, infringement can occur where a substantial portion of the work is reproduced, with the assessment being whether the defendant reproduced a substantial portion of the work, which includes the author’s expression of skill and judgement. Machine learning does not reproduce a substantial portion of the work; instead, it uses text and data mining for informational purposes. Establishing a text and data mining exception for research and commercial uses thus aligns with Canada’s broader approach to copyright infringement.

This opinion was written by Jordan Geist, a third-year JD Candidate at the University of Ottawa. The opinion is the author's and does not necessarily reflect CIPPIC's policy position.