LaclauGPT scripts for multimodal analysis of EP2024 social media videos. Meant to be run as batch jobs on CSC Puhti supercomputer. For research documentation only.


TomiToivio/LaclauGPT-Multimodal-Analysis

LaclauGPT

LaclauGPT is a political science multimodal data collection and analysis pipeline. It is called LaclauGPT as a tribute to Ernesto Laclau.

LaclauGPT is developed by Tomi Toivio for three Helsinki Hub on Emotions, Populism and Polarisation research projects funded by the European Union:

  • CO3 researches the social contract.
  • ENDURE researches the world after the pandemic.
  • PLEDGE researches grievance politics.

The pipeline was used to collect and analyze multimodal social media data related to the 2024 European Parliament elections. Data was collected from TikTok and Instagram. Data collection started on 1 May 2024 and continued until election day on 9 June 2024. Collection was based on the usernames of official election candidates as well as hashtags and search queries related to the elections. Election data was collected for Bulgaria, Croatia, Finland, France, Germany, Hungary, Portugal, Spain and Sweden. The collected and analyzed data cannot be released yet due to GDPR. This open source version uses dummy data.

LaclauGPT Multimodal Data Analysis

These data analysis scripts are published for research documentation. You probably cannot use them without some modification.

The scripts are used with Ollama running on the CSC Puhti supercomputer.
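As a rough illustration of how a script such as puhti_frame.py might hand extracted frames to Ollama, the sketch below builds a request for Ollama's `/api/generate` endpoint with base64-encoded images. It is not taken from the repository: the model name, the prompt, and the helper names `build_frame_payload` and `analyze_frames` are assumptions, and the URL is simply Ollama's default local address.

```python
import base64
import json
import urllib.request
from typing import List

# Ollama's default local endpoint; the address used on Puhti may differ.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Placeholder model name (assumption); the repository may use another model.
MODEL_NAME = "llama3.2-vision"


def build_frame_payload(frames: List[bytes], prompt: str,
                        model: str = MODEL_NAME) -> dict:
    """Build a non-streaming generate request for 1-6 encoded image frames."""
    if not 1 <= len(frames) <= 6:
        raise ValueError("expected between 1 and 6 frames")
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(f).decode("ascii") for f in frames],
        "stream": False,
    }


def analyze_frames(frames: List[bytes], prompt: str,
                   url: str = OLLAMA_URL) -> str:
    """Send the frames to a running Ollama server and return the text answer."""
    payload = build_frame_payload(frames, prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `analyze_frames` requires a running Ollama server with the chosen model pulled. Whether a model accepts several images in one request depends on the model, so a real script may need to send frames one at a time.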

The scripts are submitted as batch jobs in a sequence:

  1. puhti_preprocess.py - This extracts video frames with OpenCV, processes them with EasyOCR and extracts a Whisper transcript of the audio.

  2. puhti_frame.py - This uses Llama to create a multimodal analysis of 1-6 extracted frames.

  3. puhti_summary.py - This creates a Llama summary analysis based on the metadata, Whisper transcript and Llama multimodal analysis results.

  4. puhti_postprocess.py - This creates a structured version of the summary output.

  5. puhti_populism.py - This analyzes the results using the theories of Laclau and Palonen.
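The sequence above could be chained so that each step starts only after the previous one has finished successfully. The following is a minimal sketch, assuming the jobs are submitted through Slurm's `sbatch` (the scheduler used on Puhti) and that each Python script has a batch wrapper with the same base name and a `.sh` extension. The wrapper names and the helper functions are hypothetical and not part of the repository.

```python
import subprocess
from typing import Callable, List, Optional

# Script order taken from the list above.
PIPELINE_STEPS = [
    "puhti_preprocess.py",
    "puhti_frame.py",
    "puhti_summary.py",
    "puhti_postprocess.py",
    "puhti_populism.py",
]


def batch_script_for(step: str) -> str:
    """Hypothetical convention: one .sh batch wrapper per Python script."""
    return step.rsplit(".", 1)[0] + ".sh"


def sbatch_command(step: str, after_job: Optional[str] = None) -> List[str]:
    """Build an sbatch command, depending on a previous job if one is given."""
    cmd = ["sbatch", "--parsable"]
    if after_job is not None:
        # afterok: start only if the previous job exited successfully.
        cmd.append(f"--dependency=afterok:{after_job}")
    cmd.append(batch_script_for(step))
    return cmd


def parse_job_id(sbatch_output: str) -> str:
    """With --parsable, sbatch prints 'jobid' or 'jobid;cluster'."""
    return sbatch_output.strip().split(";")[0]


def run_sbatch(cmd: List[str]) -> str:
    """Run sbatch for real and return its standard output."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def submit_pipeline(run: Callable[[List[str]], str] = run_sbatch) -> List[str]:
    """Submit every step in order, chaining each job to the one before it."""
    job_ids: List[str] = []
    previous: Optional[str] = None
    for step in PIPELINE_STEPS:
        previous = parse_job_id(run(sbatch_command(step, after_job=previous)))
        job_ids.append(previous)
    return job_ids
```

Passing the submit function as a parameter makes it possible to dry-run the chain outside the cluster, for example with a stub that prints each command and returns a made-up job ID.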

Code for the TikTok Scraper used to collect EP2024 data is also available.
