OpenAI found non-compliant in Canadian ChatGPT privacy probe

Canadian privacy regulators found that parts of OpenAI’s ChatGPT data collection and use did not comply with private-sector privacy laws.

Canada’s federal and provincial privacy regulators have found that aspects of OpenAI’s collection, use, and disclosure of personal information through ChatGPT did not comply with applicable private-sector privacy laws, particularly in relation to model training on publicly accessible online data and user interactions.

The joint investigation was conducted by the Office of the Privacy Commissioner of Canada, the Commission d’accès à l’information du Québec, and the privacy commissioners of British Columbia and Alberta.

It examined OpenAI’s GPT-3.5 and GPT-4 models as used in ChatGPT, focusing on whether the company’s handling of personal information from public internet sources, licensed third-party datasets, and user interactions met legal requirements on appropriate purposes, consent, transparency, accuracy, access, retention, and accountability.

The regulators accepted that OpenAI’s overall purposes for developing and deploying ChatGPT were legitimate and appropriate. However, they found that the company’s initial collection of personal information from publicly accessible websites and licensed third-party sources for model training was overbroad and therefore inappropriate, given the scale, sensitivity, and potential inaccuracy of the data involved, as well as the limits of the mitigation measures in place at the time.

The Offices also found that OpenAI failed to obtain valid consent to collect and use personal information from public internet sources to train its models. They concluded that implied consent was not sufficient because the data could include sensitive personal information and because individuals would not reasonably have expected information about them posted online to be scraped and used for AI model training in this way.

On user interactions with ChatGPT, the regulators accepted that using some chat data for model improvement could serve OpenAI’s legitimate purposes. Still, they found that express consent should have been obtained.

They said OpenAI’s safeguards at the time were not strong enough to ensure that sensitive personal information would not be included in training data, and that many users would not reasonably have understood that their conversations could be used to train models or reviewed by human trainers.

The report also found that OpenAI should have obtained express consent for certain disclosures of personal information through ChatGPT outputs, especially where the information was sensitive or fell outside individuals’ reasonable expectations.

While OpenAI had introduced measures to reduce the risk of sensitive disclosures, the regulators said those measures covered a narrower set of information than the broader categories of personal information protected under the relevant privacy laws.
