Professor from DEI contributes to the development of a tool for detecting copyright infringement in LLMs

Arlindo Oliveira, professor from DEI, participated in the development of DE-COP (Detecting Copyrighted Content in Language Models' Training Data), a method that enables the detection of copyright infringements in Large Language Models (LLMs).

DE-COP aims to address one of the most relevant and difficult questions in the field of AI ethics and transparency: how can we detect whether copyrighted content was used in a model's training data when that data is not publicly disclosed? To achieve this, DE-COP probes the model with multiple-choice questions in which the correct answer is a verbatim quote from the suspected training content, presented alongside paraphrased versions of the same passage. If the model identifies the verbatim passage significantly more often than chance, this suggests the content was seen during training.
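The probing idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the prompt wording, and the four-option chance baseline are assumptions made here for clarity.

```python
import random

def build_probe(verbatim: str, paraphrases: list[str], seed: int = 0) -> tuple[str, str]:
    """Build one DE-COP-style multiple-choice probe.

    Mixes the verbatim passage with its paraphrases in random order and
    returns the prompt text plus the letter of the verbatim option.
    (Illustrative format only; the paper's exact prompt may differ.)
    """
    options = [verbatim] + paraphrases
    rng = random.Random(seed)
    rng.shuffle(options)
    letters = "ABCD"[: len(options)]
    lines = ["Which of the following is the exact passage from the suspected source?"]
    for letter, text in zip(letters, options):
        lines.append(f"{letter}. {text}")
    answer = letters[options.index(verbatim)]
    return "\n".join(lines), answer

def detection_rate(model_answers: list[str], correct: list[str]) -> float:
    """Fraction of probes where the model picked the verbatim option.

    With four options, random guessing gives 0.25; a rate well above
    that suggests the passage was memorized during training.
    """
    hits = sum(a == c for a, c in zip(model_answers, correct))
    return hits / len(correct)
```

In practice, each probe's prompt would be sent to the LLM under test, and the model's chosen letters would be compared against the recorded answers to estimate the detection rate.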

Original news article HERE.

(image: INESC-ID)
