GitHub - pranshu-saraswat/CodeAlpha-Speech-to-Emotion-Task2

▶️ Project Summary

This project is a web application that recognizes human emotions from speech. It uses a deep learning model (CNN) trained on the RAVDESS dataset to predict emotions from .wav audio files or live microphone input. The entire application is built with Python, using TensorFlow/Keras for modeling, Librosa for audio processing, and Streamlit for the interactive user interface.
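As an illustration of how training labels can be derived from RAVDESS, the sketch below reads the emotion from a file name. It relies on the dataset's published naming convention (seven hyphen-separated numeric fields, with the third field holding the emotion code). The function name `emotion_from_filename` is hypothetical, and the repository's own `extract_features.py` may take a different approach.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def emotion_from_filename(path: str) -> str:
    """Return the emotion label encoded in a RAVDESS-style file name."""
    # Hypothetical helper, not taken from the repository's scripts.
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path!r}")
    return RAVDESS_EMOTIONS[fields[2]]


print(emotion_from_filename("Actor_12/03-01-06-01-02-01-12.wav"))  # fearful
```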

✨ Key Features

Live Emotion Prediction: Analyzes audio directly from a microphone.
File-Based Analysis: Predicts emotions from user-uploaded .wav audio files.
Interactive UI: A simple, clean, and user-friendly web interface built with Streamlit.
Deep Learning Model: A Convolutional Neural Network (CNN) built with TensorFlow and Keras for emotion classification.
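Before a .wav file is passed to a model, it can help to confirm that it is readable and to check its sample rate and duration. The sketch below does this with Python's standard `wave` module, which handles uncompressed PCM files only. The helper names `describe_wav` and `write_silence` are hypothetical and are not part of this project's `app.py`.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read basic header information from a PCM .wav file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
        }


def write_silence(path: str, seconds: float = 1.0, rate: int = 16000) -> None:
    """Write a mono, 16-bit silent .wav file, useful for trying describe_wav."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * rate))
```

For example, calling `write_silence("silence.wav")` and then `describe_wav("silence.wav")` reports one channel at 16000 Hz with a duration of one second.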

🚀 Setup and Installation

Follow these steps to set up and run the project on your local machine.

1. Clone the Repository

git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
2. Create and Activate a Virtual Environment
This project requires a virtual environment to manage dependencies correctly.

macOS/Linux:

Bash

python3 -m venv venv
source venv/bin/activate
Windows:

Command Prompt or PowerShell

python -m venv venv
.\venv\Scripts\activate
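
To confirm that the environment is active before installing anything, one option is to ask the interpreter itself. This minimal sketch uses only the standard library; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```
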
3. Install Dependencies
This project requires FFmpeg for audio processing.

Install FFmpeg:

On macOS (using Homebrew): brew install ffmpeg

On Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg

On Windows: Download the executable from the official FFmpeg website and add it to your system's PATH.

Install Python Libraries:
Install all required Python packages using the requirements.txt file.

Bash

pip install -r requirements.txt
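
After installation, a quick check along the following lines can confirm that FFmpeg and the main libraries are discoverable. This is a sketch under assumptions: the module names are taken from the libraries named in the project summary rather than from `requirements.txt`, and `missing_dependencies` is a hypothetical helper.

```python
import importlib.util
import shutil

# Import names assumed from the libraries mentioned in the project summary.
REQUIRED_MODULES = ("tensorflow", "librosa", "streamlit")


def missing_dependencies(modules=REQUIRED_MODULES, executables=("ffmpeg",)):
    """Return the names of modules and executables that cannot be found."""
    # find_spec locates a top-level module without importing it.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which searches the directories listed in PATH.
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


print(missing_dependencies() or "all dependencies found")
```
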
4. Run the Streamlit Application
Once the setup is complete, you can run the application.

Bash

streamlit run app.py
The application will open in a new tab in your web browser.

📊 Model Performance and Limitations
The model achieves a test accuracy of approximately 40% on the 8-class RAVDESS dataset. While significantly better than random guessing (12.5%), there are known limitations:

Class Imbalance: A key issue is the model's weak performance on the 'disgust' emotion. This is a direct result of class imbalance in the RAVDESS dataset, where the 'disgust' category has fewer than half the samples of most other emotions. As a result, the model is less confident and less accurate when predicting this emotion.

Language Dependency: The model is trained exclusively on North American English speakers and is unlikely to perform accurately on other languages.
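
To put the accuracy figure in context, the sketch below computes the random-guess baseline for 8 classes, a majority-class baseline, and inverse-frequency class weights, which are one common way to offset class imbalance. The label counts are hypothetical and chosen only for illustration; they are not the RAVDESS figures, and the function names are not from this repository.

```python
from collections import Counter


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing."""
    return 1.0 / num_classes


def majority_baseline(labels) -> float:
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def balanced_class_weights(labels) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical label counts for illustration only, not the RAVDESS figures.
labels = ["happy"] * 40 + ["sad"] * 40 + ["disgust"] * 20

print(chance_accuracy(8))         # 0.125, the 12.5% quoted above
print(majority_baseline(labels))  # 0.4
print(balanced_class_weights(labels))
```

In this toy example the under-represented 'disgust' class receives twice the weight of the other two classes. Keras `Model.fit` accepts a `class_weight` dictionary keyed by integer class index, so weights like these would first need to be mapped from label names to indices.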

🧹 Cleanup and Privacy Note
Privacy: Please note that any absolute file paths used during the initial data exploration phase have been removed from the scripts for user privacy and to ensure the code is system-agnostic.

Virtual Environment (venv): The venv folder, which contains the project-specific Python environment and libraries, has been intentionally removed from this repository to reduce the project size and ensure a clean setup. You must create your own venv as described in the setup instructions.

Repository files (3 commits in history):

.DS_Store
README.md
app.py
emotion_features.csv
emotion_model.h5
explore_data.py
extract_features.py
label_encoder.pkl
requirements.txt
test_mic.py
train_model.py
