This repository has been archived by the owner on Nov 26, 2024. It is now read-only.

Code base and Dataset Repo for ACL '23 Findings Paper : It’s not Sexually Suggestive; It’s Educative | Separating Sex Education from Suggestive Content on TikTok videos


enfageorge/SexTok


SexTok

Title: It’s not Sexually Suggestive; It’s Educative | Separating Sex Education from Suggestive Content on TikTok videos

Enfa George, Mihai Surdeanu, ACL Findings 2023 [bib]

Computational Language Understanding Lab, University Of Arizona

Paper

Findings of the Association for Computational Linguistics: ACL 2023

Motivation

The current state of adolescent sex education in the United States is often criticized for being fragmented and inadequate, susceptible to political influence, and lacking comprehensive information. Only a small number of states require contraception education, and even fewer cover important topics like gender diversity and consent. This limited focus hampers the effectiveness of sex education programs, as highlighted by the American Academy of Pediatrics.

Credit: Left - Truthdig Right - Cagle

Meanwhile, TikTok, an app widely popular among adolescents and young people, has become a platform for virtual sex education, offering sexual health information in a convenient, private, and inclusive space. However, these videos are often removed or shadow banned because of inaccurate enforcement of the community guidelines and because of mass reporting, which disproportionately targets creators from marginalized communities.

Credit : Mashable - Why is TikTok removing sex ed videos?

The project's goal is to support better systems for distinguishing sexually suggestive content from sex education, by creating a dataset and establishing a baseline for future research.

Data

You can find the data file as a CSV here. The CSV contains the following fields: Video Link, Data split, Gender Expression, Label, and Notes (if any). The videos are provided as URLs to avoid any potential copyright violation. If any of the videos have been taken down, please contact the author for a copy.
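As a minimal sketch of how the CSV might be inspected, the snippet below counts labels per data split using only the Python standard library. The header strings (`Data split`, `Label`) are assumptions based on the field names listed above and may not match the file's real headers; the label values and the three sample rows are placeholders for illustration, not rows from the dataset.

```python
import csv
import io
from collections import Counter

# Assumed header names: the README lists the fields, not the exact CSV headers.
SPLIT_COL = "Data split"
LABEL_COL = "Label"


def label_counts_by_split(csv_file):
    """Count how many videos carry each label within each data split."""
    counts = {}
    for row in csv.DictReader(csv_file):
        split = row[SPLIT_COL].strip()
        label = row[LABEL_COL].strip()
        counts.setdefault(split, Counter())[label] += 1
    return counts


# Placeholder rows for illustration only; they are not taken from the dataset.
SAMPLE = """Video Link,Data split,Gender Expression,Label,Notes
https://example.com/video/1,train,placeholder,Educative,
https://example.com/video/2,train,placeholder,Suggestive,
https://example.com/video/3,test,placeholder,Neither,
"""

counts = label_counts_by_split(io.StringIO(SAMPLE))
for split, labels in counts.items():
    print(split, dict(labels))
```

To run this against the real file, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the `io.StringIO` sample, and adjust the two column constants to the actual headers.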

Example

Illustrative example

Example

Text

As Description:

  • Educative: Video featuring a man discussing a topic while a prominent illustration of a penis with pearly penile papules serves as the background.
  • Suggestive: The video shows a man holding a pumpkin over his torso while a woman enthusiastically moves her hand inside, exclaiming, "There is so much in there."

As Transcript:

  • Educative: The average banana in the United States is about 5.5 inches long. That’s the perfect size for baking banana bread most of the time because ...
  • Suggestive: You are such a good boy. Daddy’s so proud of you.

Results

| Group | Acc | Micro P | Micro R | Micro F1 | Macro P | Macro R | Macro F1 |
|---|---|---|---|---|---|---|---|
| Majority | 0.60 | 0.00 | 0.00 | 0.00 | 0.20 | 0.33 | 0.25 |
| All Text | 0.68 ± 0.06 | 0.76 ± 0.06 | 0.50 ± 0.06 | 0.60 ± 0.04 | 0.71 ± 0.06 | 0.63 ± 0.03 | 0.64 ± 0.04 |
| Non-empty Text | 0.75 ± 0.02 | 0.78 ± 0.07 | 0.54 ± 0.02 | 0.64 ± 0.02 | 0.74 ± 0.04 | 0.65 ± 0.01 | 0.68 ± 0.00 |
| Video | 0.70 ± 0.04 | 0.61 ± 0.11 | 0.51 ± 0.07 | 0.55 ± 0.05 | 0.68 ± 0.06 | 0.57 ± 0.07 | 0.61 ± 0.01 |
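The Majority row can be read as a worked example of how the micro and macro columns differ. The sketch below is an interpretation, not the authors' evaluation code: it assumes the macro scores average over all three classes and the micro scores are computed over the two non-majority classes only, which is inferred from the numbers in that row. The 60/20/20 class split is synthetic; the only constraint taken from the table is the 0.60 majority accuracy.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def counts_for(label, gold, pred):
    """True positives, false positives, false negatives for one label."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    return tp, fp, fn


def macro_prf(gold, pred, labels):
    """Average the per-class precision, recall, and F1."""
    scores = [prf(*counts_for(label, gold, pred)) for label in labels]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


def micro_prf(gold, pred, labels):
    """Pool the counts over the given labels, then score once."""
    totals = [counts_for(label, gold, pred) for label in labels]
    return prf(*(sum(t[i] for t in totals) for i in range(3)))


# Synthetic gold labels (assumed split) and a predictor that always says "neither".
gold = ["neither"] * 60 + ["educative"] * 20 + ["suggestive"] * 20
pred = ["neither"] * len(gold)

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
macro = macro_prf(gold, pred, ["neither", "educative", "suggestive"])
micro = micro_prf(gold, pred, ["educative", "suggestive"])  # assumed: majority class excluded

print(f"acc={accuracy:.2f}")
print("micro P/R/F1:", " ".join(f"{x:.2f}" for x in micro))
print("macro P/R/F1:", " ".join(f"{x:.2f}" for x in macro))
```

Under these assumptions the majority predictor never retrieves an educative or suggestive video, so its micro scores are zero, while its macro scores stay above zero because the majority class itself contributes a precision of 0.60 and a recall of 1.00 to the three-class average.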

Citation

@inproceedings{george-surdeanu-2023-sexually,  
   title = "It{'}s not Sexually Suggestive; It{'}s Educative | Separating Sex Education from Suggestive Content on {T}ik{T}ok videos",  
   author = "George, Enfa  and  
     Surdeanu, Mihai",  
   booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",  
   month = jul,  
   year = "2023",  
   address = "Toronto, Canada",  
   publisher = "Association for Computational Linguistics",  
   url = "https://aclanthology.org/2023.findings-acl.365",  
   pages = "5904--5915",  
   abstract = "We introduce SexTok, a multi-modal dataset composed of TikTok videos labeled as sexually suggestive (from the annotator{'}s point of view), sex-educational content, or neither. Such a dataset is necessary to address the challenge of distinguishing between sexually suggestive content and virtual sex education videos on TikTok. Children{'}s exposure to sexually suggestive videos has been shown to have adversarial effects on their development (Collins et al. 2017). Meanwhile, virtual sex education, especially on subjects that are more relevant to the LGBTQIA+ community, is very valuable (Mitchell et al. 2014). The platform{'}s current system removes/punishes some of both types of videos, even though they serve different purposes. Our dataset contains video URLs, and it is also audio transcribed. To validate its importance, we explore two transformer-based models for classifying the videos. Our preliminary results suggest that the task of distinguishing between these types of videos is learnable but challenging. These experiments suggest that this dataset is meaningful and invites further study on the subject.",  
}  
