
IGF 2021 WS #101
AI: Ethical and Technical Challenges for Content Moderation

    Organizer 1: Caroline Burle, Ceweb.br/NIC.br
    Organizer 2: Hartmut Richard Glaser, Brazilian Internet Steering Committee - CGI.br
    Organizer 3: Diogo Cortiz da Silva, Network Information Center (NIC.br)
    Organizer 4: David Duenas-Cid, Kozminski University (Poland) and University of Tartu (Estonia)
    Organizer 5: Bruna Toso de Alcântara, NIC.br/CGI.br

    Speaker 1: Emma Llanso, Civil Society, Western European and Others Group (WEOG)
    Speaker 2: Nathalia Sautchuk Patricio, Technical Community, Latin American and Caribbean Group (GRULAC)
    Speaker 3: Frane Maroevic, Civil Society, Intergovernmental Organization
    Speaker 4: Diogo Cortiz da Silva, Technical Community, Latin American and Caribbean Group (GRULAC)

    Moderator

    David Duenas-Cid, Civil Society, Eastern European Group

    Online Moderator

    Caroline Burle, Civil Society, Latin American and Caribbean Group (GRULAC)

    Rapporteur

    Bruna Toso de Alcântara, Technical Community, Latin American and Caribbean Group (GRULAC)

    Format

    Round Table - U-shape - 90 Min

    Policy Question(s)

    Content moderation and human rights compliance: How to ensure that government regulation, self-regulation and co-regulation approaches to content moderation are compliant with human rights frameworks, are transparent and accountable, and enable a safe, united and inclusive Internet?
    Data governance and trust, globally and locally: What is needed to ensure that existing and future national and international data governance frameworks are effective in mandating the responsible and trustworthy use of data, with respect for privacy and other human rights?

    Additional Policy Questions Information: What are the common practices of AI for content moderation on the Web nowadays? What are the impacts on individuals?
    To what extent, and how, can AI for content moderation threaten complex systems in a society, such as democracy, the economy and healthcare?
    How can technical approaches address those challenges?
    How can we ensure that AI systems used for content moderation do not violate people's basic rights, such as freedom of speech?
    To what extent can the use of data from social media violate privacy?
    To what extent, and how, could Differential Privacy techniques help us use data to train AI models for content moderation while preserving privacy?

    In this workshop we intend to discuss how AI models could be applied to content moderation on the web in order to create a trustworthy Web. Developing AI applications to detect hate speech, cyberbullying and disinformation is an emerging area in academia, government and the private sector. Different companies are creating research projects to deal with these challenges: Facebook, for example, is funding research projects on polarization and disinformation, and the UK government has published a White Paper on online content moderation to introduce and discuss possible strategies to overcome the threat. However, as mentioned before, most of these techniques rely on data, so attempts at content moderation on the web carry a potential risk to privacy. This may seem contradictory, but there are promising techniques that address these challenges and preserve privacy while keeping data useful for AI models.

    Another problem we face is the gap between the two disciplines: AI for content moderation and privacy protection are usually pursued by different people with distinct technical backgrounds. This is an opportunity to bring together experts who are leading AI projects for content moderation on the web with people who are leading privacy projects. Bridging this gap will benefit society, because we will find better strategies to fight online attacks while preserving privacy.

    SDGs

    9.b
    16.6
    16.7
    16.8
    16.a
    16.b


    Targets: Although the Web began as a platform to share documents, since the early 2000s we have been in the era of data on the Web. The development of Internet and Web technologies has therefore facilitated the so-called data revolution.

    In recent years the development of Artificial Intelligence has drawn attention to issues such as privacy and the protection of personal data. Artificial intelligence and privacy are thus two major concerns in the Web ecosystem today.

    Trust is key to promoting an open and healthy online space. However, in recent years we have seen movements emerge that threaten the Web's original principles as an open, collaborative and trustworthy platform. These risks include, but are not limited to, groups that commit cyberbullying and spread hate speech and misinformation, often in a coordinated way.

    A toxic space is being created, a platform on which groups of users can feel attacked and violated while others can be manipulated. This situation jeopardizes the original principles of the Web, and many efforts are being made to combat this threat. The use of AI models seems a promising strategy to deal with this problem, but side effects can arise: attacks on freedom of expression and privacy. In this workshop we will seek a comprehensive view of the topic and discuss state-of-the-art approaches to these issues, such as Differential Privacy.

    Description:

    Artificial intelligence and privacy are two major concerns in the Web ecosystem today. In this workshop we aim to discuss how AI techniques can help us moderate different types of harmful content on the web, such as hate speech, cyberbullying and disinformation, while preserving privacy.
    At first glance, this may seem somewhat contradictory and paradoxical. First, because the most common AI techniques rely on data for their training, and if we are talking about content moderation on the web, we are referring mainly to data collected from the Web as training examples for AI models. It is very common, for example, to collect posts from the main social networks, have them annotated by researchers, and then use them to train AI models. There are also organizations that provide open datasets for training. In both cases, is privacy being treated as an important factor? In trying to moderate content on the web, might we be violating users' privacy?
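    To make this pipeline concrete, the following is a minimal sketch, in Python with scikit-learn, of how annotated posts are commonly turned into a moderation classifier. The posts, labels and model choice are invented for illustration and are not taken from any of the projects mentioned above:

        # Minimal sketch: training a toy content-moderation classifier
        # from researcher-annotated posts. All data here is invented.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Posts collected from social networks and annotated by researchers
        # (1 = abusive, 0 = acceptable). Real datasets hold thousands of examples.
        posts = [
            "you are all wonderful people",
            "I hate you and everyone like you",
            "great article, thanks for sharing",
            "get out of here, you don't belong",
        ]
        labels = [0, 1, 0, 1]

        # TF-IDF features plus a linear classifier: a common, simple baseline.
        model = make_pipeline(TfidfVectorizer(), LogisticRegression())
        model.fit(posts, labels)

        # The trained model can then flag new posts for human review.
        print(model.predict(["I hate this community"]))

    Note that the training data in such a pipeline consists of real users' posts, which is exactly where the privacy question above arises.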
    Today, some approaches and techniques are being developed to assist in this process. Anonymizing data alone does not guarantee privacy: several studies and well-known cases show that users can be re-identified by cross-referencing different databases. One promising technique is Differential Privacy, which "adds noise" to the data or to queries over it. This strategy helps preserve privacy, but may hurt the performance of AI models.
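    To illustrate what "adding noise" means, below is a minimal sketch of the Laplace mechanism, the classic building block of Differential Privacy: a count over the data is released with noise calibrated to the query's sensitivity and a privacy budget epsilon. The function name, data and epsilon value are ours, chosen for illustration:

        import numpy as np

        def dp_count(values, predicate, epsilon):
            """Differentially private count via the Laplace mechanism.

            A counting query has sensitivity 1 (adding or removing one
            person changes the count by at most 1), so Laplace noise with
            scale 1/epsilon gives epsilon-differential privacy.
            """
            true_count = sum(1 for v in values if predicate(v))
            noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
            return true_count + noise

        # Example: how many posts in a collection were flagged as hate speech.
        flags = [0, 1, 0, 1, 1, 0, 0, 1]  # invented annotations
        print(dp_count(flags, lambda f: f == 1, epsilon=0.5))

        # Smaller epsilon -> more noise -> stronger privacy but less accuracy:
        # the privacy/utility trade-off mentioned above.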
    It is therefore opportune to seek greater integration between these two seemingly distant areas.
    In the session, we will bring together experts on content moderation on the web and privacy to discuss how those two disciplines could be integrated to create a trustworthy Web, preventing attacks while preserving privacy.

    Expected Outcomes

    During the session, regarding the Policy Questions, the experts will briefly explore the state of the art of content moderation on the web and how technical approaches (especially AI) can address those challenges. They will discuss to what extent AI models can be used in this scenario while preventing attacks on freedom of speech and privacy.

    Use cases will be discussed among the participants, who will also examine the challenges of content moderation on the web, the role of AI and Differential Privacy in this process over the next few years, and how it will bring significant change to the Web as we know it. Hence, the workshop may produce a roadmap, agreed among workshop participants, to open a global debate on the core challenges of enhancing AI for content moderation on the web while protecting people's rights and privacy. The purpose of the workshop is to reach out to different stakeholders in order to disseminate this roadmap.

    The session will be moderated by an online moderator and an on-site moderator, who is one of the co-organizers of this workshop.
    The speakers will all be online, but one of the co-organizers will be on-site to moderate the interaction between on-site and online participants.

    Online Participation



    Usage of IGF Official Tool.