September 3, 2024

[Privacy Lawyer – An Explanation of the Public Personal Information Processing Guidelines for AI Development and Services]

Hello. I am Attorney Lee Yeong-kyung from Cheongchul Law Firm.

Today, I would like to walk through the 'Guide to Processing Publicly Available Personal Data for AI Development and Services' announced by the Personal Information Protection Commission in July 2024. The guide serves as an important reference point in striking a balance between the advancement of AI technology and the protection of personal data.


[Question]

Please explain the main contents of the 'Guide to Processing Publicly Available Personal Data for AI Development and Services'.


[Answer]

  1. Overview and Target Audience of the Guide

First, let me explain the nature and target audience of this guide. The guide provides criteria for legal interpretation that can be consulted when processing publicly available personal data for AI development and services; it is not legally binding.

Publicly available personal data means personal data that anyone can lawfully access, and the guide primarily has in mind data sets containing personal data collected through web scraping for AI training. In addition, personal data made public pursuant to law or included in publications, broadcast media, and the like may also fall within the guide's scope, depending on the case.

During the AI training stage, the computer identifies statistical correlations, such as patterns, structures, and arrangements, in vast amounts of data and uses them to generate predictions. In the AI service stage, a user can enter a prompt containing personal data and receive output about an individual, and such prompt inputs and outputs may in turn be reused for AI training.

The target audience of this guide consists of AI developers and service providers who hold the status of personal data processors under the Personal Information Protection Act.


  2. Legal Basis for Processing Publicly Available Personal Data

The main legal basis for processing publicly available personal data for AI training and services is Article 15, Paragraph 1, Item 6 of the Personal Information Protection Act, which concerns 'legitimate interests'. The requirements of this provision are as follows:

- Legitimacy of Purpose: There must be a legitimate interest on the part of the personal data processor.

- Necessity of Processing: Processing the personal data must be necessary to achieve that legitimate interest, and substantial relevance and reasonableness must be recognized.

- Balancing of Interests: The legitimate interest of the personal data processor must clearly outweigh the rights of the data subject.


Legitimate interest can encompass various layers of benefits, including not only the commercial interests of AI developers and service providers but also the social benefits arising therefrom.

Regarding the necessity of processing, the guide notes in particular that developing large language models (LLMs) requires vast amounts of training data, and that utilizing publicly available data on the internet is currently the practical way to meet this need.

In the balancing of interests, when determining whether the processor's legitimate interest outweighs the rights of the data subjects, the potential infringement of those rights must be examined thoroughly.


  3. Criteria for Safety Measures

AI developers and service providers must take various technical and managerial measures to prevent infringement on the rights of data subjects. These measures can be classified as follows:

Technical Measures

- Verification and management of the sources from which training data is collected
- Prevention of personal data exposure
- Safe storage and management of personal data
- Adding safety measures through fine-tuning
- Application of prompt and output filtering
- Deletion of specific data from training results (machine unlearning, etc.)

Managerial Measures

- Establishment of processing standards for training data and their disclosure in the privacy policy
- Consideration of conducting a personal data impact assessment
- Establishment and operation of a '(proposed) AI Privacy Red Team'
- Safety measures tailored to AI development and distribution methods such as open source and APIs

Let's take a closer look at one of these measures. In the case of prompt and output filtering, if a user enters a prompt seeking to profile an individual or to generate responses that may infringe on privacy, the service may need to reject the generation of such responses or provide a pre-determined answer, based on the context and intent of the prompt.


  4. Measures to Ensure the Rights of Data Subjects

AI developers and service providers must implement the following measures to guarantee the rights of data subjects:

First, the transparency of AI training data must be enhanced. It is recommended to disclose the fact that public data sets are collected, their major sources, the purposes of processing, and similar details in privacy policies, technical documents, FAQs, and the like, so as to support data subjects in exercising their rights.

Second, the exercise of rights by data subjects should be supported. AI developers and service providers should strive to guarantee data subjects' rights to access, rectify, and delete their personal data, within a scope that is reasonable in light of time, cost, and available technology.

Looking at practical examples, in the case of ChatGPT, users can ask to review the personal data OpenAI holds about them or request the deletion of exposed personal data. Similarly, in the case of Gemini, if the language model's output contains personal data, users can request its deletion by clicking "Report Legal Issues".


  5. The Role of AI Companies in Responsible AI Development and Utilization

Lastly, AI companies should fulfill the following roles for responsible AI development and utilization:

First, they need to establish an internal management system related to AI privacy. It is advisable to organize and establish an internal management system centered around a '(proposed) AI Privacy Responsible Organization'. The size and composition of this organization can be determined flexibly according to the conditions of the AI company or institution, with a recommendation to center it around the Chief Privacy Officer (CPO).

Next, the culture of AI privacy protection should be shared and disseminated. The Chief Privacy Officer (CPO) should share the results of assessments and improvement actions regarding AI privacy risks with responsible members internally to ensure that AI privacy protection takes root within the organization.


Cheongchul Law Firm is a law firm specializing in corporate fields established by attorneys from the four major law firms, providing comprehensive solutions related to corporate affairs. If you have any further inquiries, please feel free to contact us by email or phone.

403 Teheran-ro, Gangnam-gu, Seoul, Rich Tower, 7th floor

Tel. 02-6959-9936

Fax. 02-6959-9967

cheongchul@cheongchul.com

Privacy Policy

Disclaimer

© 2025. Cheongchul. All rights reserved
