Intelligent incident management system for automated detection, classification, and resolution tracking.
About The Project • Architecture • Key Features • Quick Start • Usage • API Endpoints
Thala is an intelligent incident management system that automatically:
- Detects incidents from Slack messages, Jira tickets, and emails
- Classifies messages and predicts severity, category, and likelihood using an LLM agent
- Tracks resolutions and links them to original incidents
- Searches similar past incidents using semantic similarity
- Extracts text from image attachments using AWS Textract
- Ingestion: Slack/Jira/Email → Connectors → Kafka
- Classification: Llama 3.3 70b LLM classifies messages (incident, resolution, discussion, unrelated)
- Prediction: AWS Bedrock (llama-3.3-70b) agent predicts category & severity
- Attachment Processing: Images → S3 → Textract → Extracted text → Context
- Storage: Flask API → Elasticsearch (with embeddings for semantic search)
- Resolution Tracking: Links resolution messages to original incidents
- UI: Slack bot commands (/thala latest_issue, /thala search)
- Uses LLM from AWS Bedrock (llama-3.3-70b) to classify messages semantically
- No keyword matching; classification relies on the LLM's semantic understanding
- Types: incident_report, resolution, discussion, unrelated
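Because the model replies in free text, its answer still has to be mapped onto one of the four types. The following is a minimal sketch of that step, not the project's actual code: `MessageType` and `parse_message_type` are hypothetical names, and falling back to `unrelated` on an unrecognised reply is an assumption. It inspects only the model's reply, not the original message, so it does not reintroduce keyword matching on user text.

```python
from enum import Enum


class MessageType(str, Enum):
    """The four message types listed above."""
    INCIDENT_REPORT = "incident_report"
    RESOLUTION = "resolution"
    DISCUSSION = "discussion"
    UNRELATED = "unrelated"


def parse_message_type(llm_reply: str) -> MessageType:
    """Map a raw model reply onto one of the four message types.

    Assumption: an unrecognised reply falls back to UNRELATED so that an
    unexpected model output never creates an incident by accident.
    """
    # Normalise casing, spaces, and stray quotes or trailing punctuation.
    cleaned = llm_reply.strip().lower().replace(" ", "_").strip("\"'.")
    for message_type in MessageType:
        if message_type.value in cleaned:
            return message_type
    return MessageType.UNRELATED
```

Deriving the enum from `str` lets the value be written straight into a JSON document without conversion.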
- Links vague resolutions ("auth issue fixed") to correct incidents
- Uses semantic similarity (embeddings) + conversational context
- Automatically marks incidents as "Resolved" in Elasticsearch
- Downloads images from Slack/Jira attachments
- Uploads to S3 bucket (thala-images)
- Extracts text using AWS Textract
- Adds extracted text to message context for classification
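The attachment steps above could be sketched roughly as follows. This is an illustration under assumptions rather than the project's implementation: the function names are hypothetical, and the S3 and Textract clients are passed in (for example `boto3.client("s3")` and `boto3.client("textract")`) so the logic can be exercised without AWS access. The calls used, `put_object` and `detect_document_text`, are standard boto3 operations.

```python
def extract_text_from_image(s3_client, textract_client, image_bytes: bytes,
                            key: str, bucket: str = "thala-images") -> str:
    """Upload an image to S3, run Textract on it, and return the detected lines."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=image_bytes)
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Textract returns PAGE, LINE, and WORD blocks; LINE blocks carry whole lines.
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def build_context(message_text: str, extracted_text: str) -> str:
    """Append extracted image text to the message before classification."""
    if not extracted_text:
        return message_text
    return f"{message_text}\n\n[Text extracted from attachment]\n{extracted_text}"
```

Keeping the clients as parameters also makes it easy to swap in stubs during tests.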
- Category: Database, API, Frontend, Infrastructure, Authentication, etc.
- Severity: Critical, High, Medium, Low
- Likelihood: Likely, Unlikely (for new queries)
- Uses Llama model with few-shot learning
- Finds similar past incidents using vector embeddings
- Prioritizes resolved incidents with complete resolution info
- Returns similarity scores and resolution details
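One way to express the prioritisation described above is a two-part sort key: resolved incidents with resolution text first, then descending similarity. This is a sketch only. The `status` and `resolution_text` fields follow the API examples in this README, while the `score` field name and the function `rank_hits` are assumptions.

```python
def rank_hits(hits: list[dict]) -> list[dict]:
    """Order search hits: complete resolutions first, then by similarity score."""

    def has_full_resolution(hit: dict) -> bool:
        return hit.get("status") == "Resolved" and bool(hit.get("resolution_text"))

    # False sorts before True, so negate the flag to put complete hits first.
    return sorted(hits, key=lambda hit: (not has_full_resolution(hit), -hit["score"]))
```

For example, a resolved incident with a score of 0.71 would be listed ahead of an open incident with a score of 0.90.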
- /thala latest_issue [page] - View ongoing incidents (paginated)
- /thala search <query> - Search similar resolved incidents
- /thala predict <description> - Predict category/severity
- /thala - Show help
- Python 3.12+
- Elasticsearch 9.1.5+ (running)
- Kafka (KRaft mode, optional for real-time)
- AWS account (for S3, Textract, and Bedrock)
Install dependencies:
pip install -r requirements.txt
pip install -r team-thala/src/ui_requirements.txt
Create .env file in the root directory:
GEMINI_API_KEY=
FLASK_API_URL=http://localhost:5000
# Elasticsearch Configuration (if remote, change localhost to your ES host)
ELASTICSEARCH_HOST=https://localhost:9200
SLACK_APP_TOKEN=
JIRA_URL=https://kphotos1803.atlassian.net
JIRA_EMAIL=
JIRA_API_TOKEN=
SLACK_BOT_TOKEN=
SLACK_CHANNEL_ID=
# Kafka Configuration
KAFKA_BOOTSTRAP_SERVERS=localhost:9092
KAFKA_TOPIC_SLACK=thala-slack-events
KAFKA_TOPIC_JIRA=thala-jira-events
# Logging Configuration
LOG_LEVEL=INFO
LOG_FILE=logs/thala_ingestion.log
AWS_LAMBDA_URL=
SEARCH_BACKEND=opensearch_serverless
AWS_REGION=us-east-2
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
AWS_SESSION_TOKEN=""
FunctionUrl= ""
FunctionArn= ""
AWS_BEARER_TOKEN_BEDROCK=""
OPENSEARCH_HOST = ""
KAFKA_BOOTSTRAP_SERVERS=""
REDIS_FALLBACK_ENABLED=true
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
REDIS_LIST_PREFIX=thala:queue:
BEDROCK_LLAMA_MODEL_ID=meta.llama3-3-70b-instruct-v1:0
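For reference, a few of these variables could be read into a typed settings object as sketched below. The `Settings` class and `load_settings` function are hypothetical and not part of the repository; the defaults mirror the sample values above, and treating an empty value as "unset" is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    flask_api_url: str
    elasticsearch_host: str
    kafka_bootstrap_servers: str
    redis_fallback_enabled: bool
    redis_port: int
    log_level: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env

    def value(name: str, default: str) -> str:
        # Empty strings (e.g. KEY="") are treated as unset.
        return env.get(name) or default

    return Settings(
        flask_api_url=value("FLASK_API_URL", "http://localhost:5000"),
        elasticsearch_host=value("ELASTICSEARCH_HOST", "https://localhost:9200"),
        kafka_bootstrap_servers=value("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092"),
        redis_fallback_enabled=value("REDIS_FALLBACK_ENABLED", "false").lower() == "true",
        redis_port=int(value("REDIS_PORT", "6379")),
        log_level=value("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to test.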
- Create Slack app at https://api.slack.com/apps
- Add Bot Token Scopes:
  - channels:history, channels:read
  - chat:write, commands
  - app_mentions:read, im:history
  - files:read (REQUIRED for attachments)
- Install app to workspace
- Copy Bot Token (xoxb-...) to .env
See: team-thala/SLACK_FILES_READ_SETUP.md for detailed setup instructions.
python integrated_main.py
# Terminal 1: Flask API
python new.py
# Terminal 2: Kafka Consumer
python team-thala/src/kafka_consumer_to_flask.py
# Terminal 3: Slack Connector
python team-thala/src/slack_connector_enhanced.py
# Terminal 4: Slack Bot UI
python team-thala/src/slack_bot_ui.py
/thala # Show help and available commands
/thala latest_issue [page] # View ongoing incidents (paginated, 10 per page)
/thala search <query> # Search similar resolved incidents
Slack: "API server is down"
→ LLM from AWS Bedrock (llama-3.3-70b) classifies as "incident_report"
→ It predicts: Category=API, Severity=High
→ Sent to Kafka → Flask → Elasticsearch
→ Tracked in Incident Tracker
→ Available in Slack: /thala latest_issue
Slack: "API issue has been fixed"
→ LLM from AWS Bedrock (llama-3.3-70b) classifies as "resolution"
→ Semantic search finds matching open incident
→ Updates status to "Resolved" in Elasticsearch
→ Logs resolution text, resolved_by, resolved_at
→ Removed from ongoing incidents list
Slack: [Image attachment] "Check this error"
→ Download image from Slack (files_info API)
→ Upload to S3 bucket
→ Extract text using Textract
→ Add extracted text to message context
→ Classify with full context (image + text)
→ Create incident if classified as incident_report
Slack: /thala search "database timeout"
→ Flask API performs semantic search in Elasticsearch
→ Returns similar resolved incidents
→ Prioritizes incidents with complete resolution info
→ Displays in Slack with rich formatting
- Monitors Slack channels for messages
- Classifies messages using LLM from AWS Bedrock (llama-3.3-70b)
- Processes attachments (S3 + Textract)
- Detects resolutions and links to incidents
- Prevents resolution messages from creating new incidents
- Handles vague messages intelligently
- Slack bot with slash commands
- Paginated incident listing
- Semantic search interface
- Rich UI with Slack Block Kit
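The paginated listing (10 incidents per page, as noted in the usage section) can be sketched with simple slicing. `paginate` is a hypothetical helper, and clamping out-of-range page numbers to the nearest valid page is an assumption about the desired behaviour.

```python
import math

PAGE_SIZE = 10  # matches the "10 per page" noted for /thala latest_issue


def paginate(incidents: list, page: int = 1, page_size: int = PAGE_SIZE):
    """Return (items_on_page, effective_page, total_pages)."""
    total_pages = max(1, math.ceil(len(incidents) / page_size))
    page = min(max(page, 1), total_pages)  # clamp to a valid page
    start = (page - 1) * page_size
    return incidents[start:start + page_size], page, total_pages
```

Returning the effective page and the page count gives the bot what it needs to render a "page X of Y" footer.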
- Predicts category & severity
- Uses few-shot learning with training examples
- Caches predictions (24h TTL)
- Downloads attachments from Slack/Jira
- Uploads to S3 bucket
- Extracts text using Textract
- Handles image format conversion (PNG → JPEG)
- /index - Store incidents in Elasticsearch
- /search - Semantic similarity search
- /predict_incident - Predict likelihood
- /update_status - Mark incidents as resolved
- /lookup_incident - Find incident by ID
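A small client helper for these endpoints might look like the sketch below. It assumes each endpoint accepts a POST with a JSON body at the `FLASK_API_URL` base address; the README does not state the HTTP method, so treat that as an assumption. `build_request` is a hypothetical name.

```python
import json
import urllib.request


def build_request(endpoint: str, payload: dict,
                  base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Prepare a JSON POST request for one of the Flask API endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the API expects POST
    )
```

The prepared request can then be sent with `urllib.request.urlopen(build_request("/search", {"query": "database connection timeout", "top_k": 10}))`.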
Store new incident in Elasticsearch
{
"texts": ["API server is down"],
"timestamp": "2025-11-01T10:00:00",
"status": "Open",
"source": "slack",
"category": "API",
"severity": "High"
}
Semantic similarity search
{
"query": "database connection timeout",
"top_k": 10
}
Mark incident as resolved
{
"issue_id": "slack_1234567890",
"status": "Resolved",
"resolution_text": "Fixed connection pool",
"resolved_by": "U08L203J5TK",
"resolved_at": "2025-11-01T10:15:00"
}
Find incident by ID
{
"issue_id": "slack_1234567890"
}
- Bot Token (xoxb-...): Required for Web API calls (files_info, channels, etc.)
- App Token (xapp-...): Only for Socket Mode (not used currently)
- Use Bot Token in SLACK_BOT_TOKEN environment variable
- Slack app must have files:read scope
- AWS credentials must be configured
- S3 bucket must exist (thala-images)
- Textract must be enabled in AWS region
- No keyword matching - pure semantic understanding
- Links resolutions even if ID not mentioned explicitly
- Uses conversational context (recent incidents)
- Fallback to most recent open incident if no match












