### Core Responsibilities
The Data Ingest Thread serves as the agent's primary interface to the streaming data ecosystem. Built on `KafkaConsumerBase`, it maintains a persistent connection to configured Kafka topics and transforms raw streaming messages into structured database records.
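
As a sketch of how the pieces fit together (the import path and class name here are assumptions; only the `KafkaConsumerBase` base class and the `store_message()` override point described below come from this document):

```python
from kafka_consumer_base import KafkaConsumerBase  # assumed module path

class SensorDataIngestThread(KafkaConsumerBase):
    """Consumes configured Kafka topics and writes parsed records to MySQL."""

    def store_message(self, message, topic, partition, offset) -> bool:
        # Domain-specific parsing and storage goes here; the base class
        # handles polling, offsets, and reconnects (see the sections below).
        ...
        return True
```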

### Processing Pipeline
The thread operates through a continuous polling loop that retrieves message batches from Kafka. Each message is parsed to extract sensor readings, timestamps, and metadata. The `AutoencoderDataIngestThread` implementation demonstrates this pattern: it parses JSON messages containing gymnasium environment state data, extracting numeric state values while filtering out non-numeric metadata fields.
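
A minimal sketch of that parsing step, assuming a UTF-8 JSON payload (the field handling is illustrative, not the exact `AutoencoderDataIngestThread` code):

```python
import json

def parse_state_message(raw: bytes) -> dict:
    """Decode a JSON message and keep only numeric state values."""
    record = json.loads(raw.decode("utf-8"))
    # Drop non-numeric metadata fields (ids, labels, booleans, nested objects)
    return {
        key: value
        for key, value in record.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```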

### Data Storage Strategy
Parsed data flows into MySQL through the shared `DBManager` instance. The thread uses the `record_sensor_data()` method to store timestamped sensor readings with proper data type conversion (numpy arrays to binary blobs). This creates a persistent training dataset that accumulates over time, providing the foundation for ML model development.
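
A hedged sketch of the transform-and-store step; the dictionary keys are assumptions, while `record_sensor_data()` taking a data dictionary comes from the implementation steps below:

```python
from datetime import datetime, timezone

import numpy as np

def store_reading(db_manager, values: list) -> None:
    """Convert a parsed reading into the schema format and persist it."""
    data_dict = {
        "timestamp": datetime.now(tz=timezone.utc),              # illustrative key name
        "sensor_values": np.asarray(values, dtype=np.float32),   # readings -> numpy array
    }
    # DBManager persists the record to MySQL (e.g. serializing the array to a blob)
    db_manager.record_sensor_data(data_dict)
```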

### Error Handling and Resilience
The thread implements robust error handling at multiple levels: JSON parsing failures are logged but don't terminate processing, database connection issues trigger retry logic, and malformed messages are skipped with appropriate warnings. This ensures that transient data quality issues don't disrupt the overall data flow.
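
The skip-and-log pattern for malformed messages looks roughly like this (a sketch, assuming a module-level `logger`):

```python
import json
import logging

logger = logging.getLogger(__name__)

def try_parse(message: bytes, topic: str, partition: int, offset: int):
    """Return the parsed record, or None if the message should be skipped."""
    try:
        return json.loads(message.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Log and skip; one bad message must not stop the ingest loop.
        logger.warning("Skipping malformed message %s[%d]@%d: %s",
                       topic, partition, offset, exc)
        return None
```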

### Configuration Integration
Topic subscriptions, parsing rules, and storage parameters are all driven by the central configuration file. This allows agents to be reconfigured for different data sources without code changes, supporting the system's flexibility goals.

## User Implementation Requirements

### Single Required Method
```python
def store_message(self, message, topic, partition, offset) -> bool:
    # Parse message, validate data, store to database
    # Return True for success, False for failure
    ...
```

### Implementation Steps

1. **Parse Message**: Decode bytes to string, parse the JSON (or other message format)
2. **Extract Data**: Pull relevant fields (timestamps, sensor values, metadata)
3. **Validate**: Check data types, handle missing fields, filter invalid data
4. **Transform**: Convert to database schema format (arrays to numpy, timestamps to datetime)
5. **Store**: Call `self.db_manager.record_sensor_data(data_dict)`
6. **Return Status**: `True` if successful, `False` if failed (see the sketch after this list)
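
Putting the six steps together, a minimal sketch of `store_message()` might look like this (the message layout, field names, and `data_dict` keys are assumptions; the signature and the `record_sensor_data()` call come from this document):

```python
import json
import logging
from datetime import datetime, timezone

import numpy as np

logger = logging.getLogger(__name__)

def store_message(self, message, topic, partition, offset) -> bool:
    try:
        # 1. Parse: decode bytes and load JSON
        record = json.loads(message.decode("utf-8"))

        # 2-3. Extract and validate: keep numeric fields, drop metadata
        values = {k: v for k, v in record.items()
                  if isinstance(v, (int, float)) and not isinstance(v, bool)}
        if not values:
            logger.warning("No numeric fields in %s[%d]@%d", topic, partition, offset)
            return False

        # 4. Transform: readings to a numpy array, timestamp to datetime
        data_dict = {
            "timestamp": datetime.now(tz=timezone.utc),  # or parse one from the message
            "sensor_values": np.asarray(list(values.values()), dtype=np.float32),
        }

        # 5. Store through the shared DBManager
        self.db_manager.record_sensor_data(data_dict)
        return True   # 6. Report success
    except Exception as exc:
        logger.error("Failed to store %s[%d]@%d: %s", topic, partition, offset, exc)
        return False
```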

### Configuration Needed
```python
config = {
    'kafka_topics': {'input': 'your-topic-name'}
}
```

### Environment Variables Required
- `KAFKA_BROKER_URL`
- `MYSQL_HOST`, `MYSQL_PORT`, `MYSQL_USER`, `MYSQL_ROOT_PASSWORD`, `MYSQL_DATABASE`
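
The base classes read these variables themselves; a quick startup sanity check is still easy to add (illustrative only):

```python
import os

REQUIRED_ENV = [
    "KAFKA_BROKER_URL",
    "MYSQL_HOST", "MYSQL_PORT", "MYSQL_USER",
    "MYSQL_ROOT_PASSWORD", "MYSQL_DATABASE",
]

# Fail fast with a clear message instead of an opaque connection error later
missing = [name for name in REQUIRED_ENV if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
```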

## What's Handled Automatically

- Kafka consumer setup/teardown
- Topic subscription and polling
- Database connection management
- Thread lifecycle and health monitoring
- Error recovery and restart logic
- Message batching and offset management

## Key Constraints

- **Single-threaded**: Keep `store_message()` fast and efficient
- **Error handling**: Catch exceptions, log errors, return `False` for failures
- **Database schema**: Match the expected format for `record_sensor_data()`
- **Memory management**: Don't accumulate state between messages

The base classes handle all infrastructure complexity; users only implement the domain-specific data transformation logic.