diff --git a/Claude.md b/Claude.md new file mode 100644 index 0000000..3db5a8f --- /dev/null +++ b/Claude.md @@ -0,0 +1,1372 @@ +# PULSE - Pipelined Unified Logic & Server Engine + +## Project Overview + +PULSE is a distributed workflow orchestration platform designed for managing and executing complex multi-step operations across server clusters. It provides a centralized web-based control center for defining, managing, and executing workflows that can span multiple servers, require human interaction, and perform complex automation tasks at scale. + +### Core Objectives +- Orchestrate operations across distributed infrastructure +- Enable interactive workflows with user prompts and conditional logic +- Provide high availability through redundant worker nodes +- Offer real-time monitoring and execution tracking +- Support both simple command execution and complex multi-step workflows + +## Architecture + +### System Components + +#### 1. PULSE Server (Web Server) +**Location:** `10.10.10.65` (LXC Container ID: 122) +**Directory:** `/opt/pulse-server` + +The central orchestration hub that: +- Hosts the web interface for workflow management +- Manages workflow definitions and execution state +- Coordinates task distribution to worker nodes +- Handles user interactions via Authelia SSO +- Provides real-time status updates via WebSocket +- Stores all data in MariaDB database + +**Technology Stack:** +- Node.js 20.x +- Express.js (web framework) +- WebSocket (ws package) for real-time communication +- MySQL2 (database driver for MariaDB) +- Bcryptjs (password hashing, legacy) +- Jsonwebtoken (JWT tokens, legacy) +- Crypto (Node.js built-in for UUIDs) + +**Key Files:** +- `server.js` - Main server application +- `public/index.html` - Web dashboard UI +- `.env` - Environment configuration (NOT in git) +- `package.json` - Dependencies +- `.gitignore` - Excludes secrets and node_modules + +#### 2. 
PULSE Worker Nodes +**Example:** `10.10.10.151` (LXC Container ID: 153, hostname: pulse-worker-01) +**Directory:** `/opt/pulse-worker` + +Lightweight execution agents that: +- Connect to PULSE server via HTTP and WebSocket +- Execute commands, scripts, and workflows on target infrastructure +- Report execution status and results back to server +- Send heartbeat with system metrics every 30 seconds +- Support multiple concurrent workflow executions +- Auto-reconnect on connection loss + +**Technology Stack:** +- Node.js 20.x +- Axios (HTTP client) +- WebSocket (ws package) +- Child_process (command execution) +- OS module (system metrics) + +**Key Files:** +- `worker.js` - Main worker agent +- `.env` - Worker configuration (NOT in git) +- `package.json` - Dependencies + +#### 3. MariaDB Database +**Location:** `10.10.10.50:3306` +**Database:** `pulse` +**User:** `pulse_user` (access from 10.10.10.65 only) +**Password:** `ZE6BuNtBG6P&g*gDpZRY` + +**Tables:** + +```sql +-- Users table (SSO managed) +CREATE TABLE users ( + id VARCHAR(36) PRIMARY KEY, + username VARCHAR(255) UNIQUE NOT NULL, + display_name VARCHAR(255), + email VARCHAR(255), + groups TEXT, + last_login TIMESTAMP, + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP +); + +-- Worker nodes +CREATE TABLE workers ( + id VARCHAR(36) PRIMARY KEY, + name VARCHAR(255) UNIQUE NOT NULL, + status VARCHAR(50) NOT NULL, + last_heartbeat TIMESTAMP NULL, + api_key VARCHAR(255), + metadata JSON, + INDEX idx_status (status), + INDEX idx_heartbeat (last_heartbeat) +); + +-- Workflow definitions +CREATE TABLE workflows ( + id VARCHAR(36) PRIMARY KEY, + name VARCHAR(255) NOT NULL, + description TEXT, + definition JSON NOT NULL, + created_by VARCHAR(255), + created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, + updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, + INDEX idx_name (name) +); + +-- Workflow executions +CREATE TABLE executions ( + id VARCHAR(36) PRIMARY KEY, + workflow_id VARCHAR(36) NOT NULL, + 
status VARCHAR(50) NOT NULL, + started_by VARCHAR(255), + started_at TIMESTAMP NULL, + completed_at TIMESTAMP NULL, + logs JSON, + FOREIGN KEY (workflow_id) REFERENCES workflows(id) ON DELETE CASCADE, + INDEX idx_workflow (workflow_id), + INDEX idx_status (status), + INDEX idx_started (started_at) +); +``` + +#### 4. Authentication System +**LLDAP Server:** `10.10.10.39:3890` +**Authelia Server:** `10.10.10.39:9091` +**Auth Domain:** `auth.lotusguild.org` + +**Authentication Flow:** +1. User accesses `https://pulse.lotusguild.org` +2. Nginx Proxy Manager forwards auth check to Authelia +3. If not authenticated, redirect to `auth.lotusguild.org` +4. User logs in via LLDAP (admin/employee groups) +5. Authelia sets headers and redirects back to PULSE +6. PULSE trusts headers: `Remote-User`, `Remote-Name`, `Remote-Email`, `Remote-Groups` +7. User session is auto-created/updated in database + +**Allowed Groups:** +- `admin` - Full access including delete operations +- `employee` - Standard access, can execute workflows + +#### 5. Deployment Pipeline +**Git Repository:** `https://code.lotusguild.org/LotusGuild/pulse` +**Webhook Endpoint:** `http://10.10.10.65:9000/hooks/pulse-deploy` +**Webhook Secret:** `c0dd85e473d0efdd3653b77bb38408b14015e7e020e59ad7d446b6c1fab1940d` + +**Deployment Flow:** +1. Developer pushes code to Gitea +2. Gitea triggers webhook with SHA256 signature +3. Webhook service validates signature +4. Deployment script `/usr/local/bin/pulse_deploy.sh` runs +5. Script backs up `.env`, pulls latest code, restores `.env` +6. Installs dependencies with `npm install --production` +7. Restarts PULSE service via systemd +8. Verifies service is running + +**Deployment Script Location:** `/usr/local/bin/pulse_deploy.sh` +**Webhook Config:** `/etc/webhook/hooks.json` +**Webhook Service:** `systemd` service on port 9000 + +## API Endpoints + +### Authentication (SSO via Authelia) +All API endpoints require SSO authentication via Authelia headers. 
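As a rough illustration of what "SSO authentication via Authelia headers" can look like on the server side, the sketch below reads the four forwarded headers (`Remote-User`, `Remote-Name`, `Remote-Email`, `Remote-Groups`) and checks group membership. The function names `userFromHeaders` and `hasAnyGroup` are hypothetical and not taken from `server.js`, and the comma-separated format of `Remote-Groups` is an assumption about how the proxy forwards groups.

```javascript
// Hypothetical helpers; names are illustrative and not taken from server.js.
// Assumes header names arrive lower-cased (as Node's HTTP parser delivers them)
// and that Remote-Groups is a comma-separated list.
const ALLOWED_GROUPS = ['admin', 'employee'];

function userFromHeaders(headers) {
  const username = headers['remote-user'];
  if (!username) return null; // no SSO identity was forwarded by the proxy

  const groups = String(headers['remote-groups'] || '')
    .split(',')
    .map((g) => g.trim())
    .filter(Boolean);

  return {
    username,
    displayName: headers['remote-name'] || username,
    email: headers['remote-email'] || null,
    groups,
  };
}

function hasAnyGroup(user, required = ALLOWED_GROUPS) {
  return Boolean(user) && user.groups.some((g) => required.includes(g));
}

// Example: an admin-only check of the kind the DELETE routes would need.
const user = userFromHeaders({
  'remote-user': 'jdoe',
  'remote-groups': 'employee, admin',
});
console.log(hasAnyGroup(user, ['admin'])); // true
```

A check like this is only meaningful if the backend is reached through the proxy, since the headers are trusted as-is.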
+ +### User Management +- `GET /api/user` - Get current user info (SSO headers) + +### Workers +- `GET /api/workers` - List all workers +- `POST /api/workers/heartbeat` - Worker heartbeat (requires `X-API-Key` header) +- `DELETE /api/workers/:id` - Delete worker (admin only) +- `POST /api/workers/:id/command` - Send direct command to worker + +### Workflows +- `GET /api/workflows` - List all workflows +- `POST /api/workflows` - Create new workflow +- `DELETE /api/workflows/:id` - Delete workflow (admin only) + +### Executions +- `GET /api/executions` - List all executions (recent 50) +- `POST /api/executions` - Start workflow execution +- `GET /api/executions/:id` - Get execution details with logs +- `POST /api/executions/:id/respond` - Respond to workflow prompt + +### Health +- `GET /health` - Health check (no auth required) + +## Workflow Definition Format + +Workflows are defined in JSON format with the following structure: + +```json +{ + "steps": [ + { + "name": "Step Name", + "type": "execute|prompt|wait", + "targets": ["all", "worker-name", "worker-group"], + "command": "shell command to execute", + "condition": "JavaScript expression", + "timeout": 300000, + "message": "Prompt message for user", + "options": ["Yes", "No", "Cancel"], + "duration": 5000 + } + ] +} +``` + +### Step Types + +#### 1. Execute Step +Executes a shell command on target workers. + +```json +{ + "name": "Update packages", + "type": "execute", + "targets": ["all"], + "command": "apt update && apt upgrade -y", + "timeout": 600000 +} +``` + +**Fields:** +- `name` - Step description +- `type` - Must be "execute" +- `targets` - Array of worker names or ["all"] +- `command` - Shell command to execute +- `timeout` - Timeout in milliseconds (default: 300000) +- `condition` - Optional JavaScript condition to evaluate + +#### 2. Prompt Step +Pauses workflow and prompts user for input. 
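One way to picture the pause is as a small state transition: the execution moves to a waiting state when it reaches a prompt step and resumes only when one of the offered options comes back. This is a minimal sketch under assumptions; the `status` values, the `pendingPrompt` field, and the function names `pauseForPrompt` and `applyPromptResponse` are hypothetical, and only `message`, `options`, and `promptResponse` come from this document.

```javascript
// Hypothetical state helpers; status values and function names are assumptions.
function pauseForPrompt(state, step, stepIndex) {
  // Record what is being asked so a later response can be validated.
  return {
    ...state,
    status: 'waiting',
    pendingPrompt: { step: stepIndex, message: step.message, options: step.options },
  };
}

function applyPromptResponse(state, response) {
  if (state.status !== 'waiting' || !state.pendingPrompt) {
    throw new Error('Execution is not waiting for a prompt response');
  }
  if (!state.pendingPrompt.options.includes(response)) {
    throw new Error(`Invalid response: ${response}`);
  }
  // Store the answer where later conditions can read it, then resume.
  return { ...state, status: 'running', promptResponse: response, pendingPrompt: null };
}

const step = { name: 'User confirmation', type: 'prompt', message: 'Proceed?', options: ['Yes', 'No'] };
const waiting = pauseForPrompt({ status: 'running' }, step, 1);
const resumed = applyPromptResponse(waiting, 'Yes');
console.log(waiting.status, resumed.status, resumed.promptResponse); // waiting running Yes
```

Keeping the transition as pure functions makes it easy to reject responses that are not among the step's `options` before any later step runs.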
+ +```json +{ + "name": "User confirmation", + "type": "prompt", + "message": "Proceed with system reboot?", + "options": ["Yes", "No", "Cancel"] +} +``` + +**Fields:** +- `name` - Step description +- `type` - Must be "prompt" +- `message` - Message to display to user +- `options` - Array of options for user to choose from + +The user's response is stored in `promptResponse` variable for use in conditions. + +#### 3. Wait Step +Delays execution for a specified duration. + +```json +{ + "name": "Wait for services", + "type": "wait", + "duration": 10000 +} +``` + +**Fields:** +- `name` - Step description +- `type` - Must be "wait" +- `duration` - Delay in milliseconds + +### Conditions + +Steps can have optional conditions that determine if they should execute: + +```json +{ + "name": "Reboot servers", + "type": "execute", + "command": "reboot", + "condition": "promptResponse === 'Yes'" +} +``` + +Conditions are JavaScript expressions evaluated in the workflow context. Available variables: +- `promptResponse` - The most recent user prompt response +- `state` - The full execution state object + +### Example Workflows + +#### Simple Command Execution +```json +{ + "steps": [ + { + "name": "Check disk space", + "type": "execute", + "targets": ["all"], + "command": "df -h" + } + ] +} +``` + +#### Interactive System Update +```json +{ + "steps": [ + { + "name": "Update package list", + "type": "execute", + "targets": ["all"], + "command": "apt update" + }, + { + "name": "User approval", + "type": "prompt", + "message": "Packages updated. Proceed with upgrade?", + "options": ["Yes", "No"] + }, + { + "name": "Upgrade packages", + "type": "execute", + "targets": ["all"], + "command": "apt upgrade -y", + "condition": "promptResponse === 'Yes'" + }, + { + "name": "Reboot confirmation", + "type": "prompt", + "message": "Upgrade complete. 
Reboot servers?", + "options": ["Yes", "No"] + }, + { + "name": "Reboot servers", + "type": "execute", + "targets": ["all"], + "command": "reboot", + "condition": "promptResponse === 'Yes'" + } + ] +} +``` + +#### Backup with Verification +```json +{ + "steps": [ + { + "name": "Create backup", + "type": "execute", + "targets": ["all"], + "command": "tar -czf /tmp/backup-$(date +%Y%m%d).tar.gz /opt/pulse-worker" + }, + { + "name": "Wait for backup", + "type": "wait", + "duration": 5000 + }, + { + "name": "Verify backup", + "type": "execute", + "targets": ["all"], + "command": "tar -tzf /tmp/backup-*.tar.gz > /dev/null && echo 'OK' || echo 'FAILED'" + }, + { + "name": "Cleanup decision", + "type": "prompt", + "message": "Backup complete. Delete old backups?", + "options": ["Yes", "No"] + }, + { + "name": "Cleanup old backups", + "type": "execute", + "targets": ["all"], + "command": "find /tmp -name 'backup-*.tar.gz' -mtime +7 -delete", + "condition": "promptResponse === 'Yes'" + } + ] +} +``` + +## WebSocket Protocol + +### Client → Server Messages + +#### Worker Connection +```json +{ + "type": "worker_connect", + "worker_id": "uuid", + "worker_name": "pulse-worker-01" +} +``` + +#### Command Result +```json +{ + "type": "command_result", + "execution_id": "uuid", + "worker_id": "uuid", + "success": true, + "stdout": "command output", + "stderr": "", + "duration": 1234, + "timestamp": "2025-11-30T12:00:00Z" +} +``` + +#### Workflow Result +```json +{ + "type": "workflow_result", + "execution_id": "uuid", + "worker_id": "uuid", + "success": true, + "message": "Workflow completed", + "timestamp": "2025-11-30T12:00:00Z" +} +``` + +#### Pong Response +```json +{ + "type": "pong", + "worker_id": "uuid" +} +``` + +### Server → Client Messages + +#### Execute Command +```json +{ + "type": "execute_command", + "execution_id": "uuid", + "step_index": 0, + "command": "uptime", + "timeout": 300000, + "worker_id": "uuid" +} +``` + +#### Execute Workflow +```json +{ + "type": 
"execute_workflow", + "execution_id": "uuid", + "workflow": { + "name": "Workflow Name", + "steps": [...] + } +} +``` + +#### Ping +```json +{ + "type": "ping" +} +``` + +#### Broadcast Messages +```json +{ + "type": "worker_update", + "worker_id": "uuid", + "status": "online" +} +``` + +```json +{ + "type": "workflow_created", + "workflow_id": "uuid" +} +``` + +```json +{ + "type": "execution_started", + "execution_id": "uuid", + "workflow_id": "uuid" +} +``` + +```json +{ + "type": "execution_prompt", + "execution_id": "uuid", + "prompt": { + "message": "Proceed?", + "options": ["Yes", "No"], + "step": 2 + } +} +``` + +```json +{ + "type": "execution_status", + "execution_id": "uuid", + "status": "completed" +} +``` + +## Environment Variables + +### PULSE Server (.env) +```bash +# Server Configuration +PORT=8080 +HOST=0.0.0.0 +SECRET_KEY=change-this-to-a-secure-random-string + +# MariaDB Configuration +DB_HOST=10.10.10.50 +DB_PORT=3306 +DB_NAME=pulse +DB_USER=pulse_user +DB_PASSWORD=ZE6BuNtBG6P&g*gDpZRY + +# Worker API Key (for worker authentication) +WORKER_API_KEY=5709f45547622803ad0af4726e43aea3aa3f412b25ca5df0c5a0da7929579c53 + +NODE_ENV=production +``` + +### PULSE Worker (.env) +```bash +# PULSE Server Configuration +PULSE_SERVER=http://10.10.10.65:8080 +PULSE_WS=ws://10.10.10.65:8080 + +# Worker Configuration +WORKER_NAME=pulse-worker-01 +WORKER_API_KEY=5709f45547622803ad0af4726e43aea3aa3f412b25ca5df0c5a0da7929579c53 + +# Heartbeat interval (seconds) +HEARTBEAT_INTERVAL=30 + +# Max concurrent tasks +MAX_CONCURRENT_TASKS=5 +``` + +## Infrastructure Details + +### LXC Container Specifications + +#### PULSE Server (ID: 122) +- **OS:** Debian 13 (Trixie) +- **Hostname:** pulse-web-server +- **IP:** 10.10.10.65 +- **Storage:** 8GB on Ceph pool (appPool) +- **RAM:** 4096 MiB +- **CPU:** 4 cores +- **Container Type:** Unprivileged +- **Features:** Standard (no nesting, no FUSE) +- **Network:** vmbr0 bridge, DHCP (static assignment via router) + +#### PULSE 
Worker (ID: 153) +- **OS:** Debian 13 (Trixie) +- **Hostname:** pulse-worker-01 +- **IP:** 10.10.10.151 (originally 10.10.10.189, changed via DHCP) +- **Storage:** 8GB on Ceph pool (appPool) +- **RAM:** 512 MiB +- **CPU:** 1 core +- **Container Type:** Unprivileged +- **Features:** Standard +- **Network:** vmbr0 bridge, DHCP + +### Proxmox Cluster +- **Ceph Storage:** High availability distributed storage +- **Backend:** Ceph RBD +- **Pools:** appPool (containers), mediafs (templates) +- **Replication:** Data replicated across cluster nodes + +### Nginx Proxy Manager Configuration + +**Domain:** `pulse.lotusguild.org` + +**Proxy Host Settings:** +- Scheme: http +- Forward Hostname/IP: 10.10.10.65 +- Forward Port: 8080 +- Cache Assets: Yes +- Block Common Exploits: Yes +- Websockets Support: Yes +- Force SSL: Yes +- HTTP/2 Support: Yes + +**Custom Nginx Configuration:** +```nginx +include /snippets/authelia-location.conf; +location / { + add_header Strict-Transport-Security $hsts_header always; + + # Authelia auth check + auth_request /authelia; + auth_request_set $user $upstream_http_remote_user; + auth_request_set $groups $upstream_http_remote_groups; + auth_request_set $name $upstream_http_remote_name; + auth_request_set $email $upstream_http_remote_email; + + # Pass auth headers to backend + proxy_set_header Remote-User $user; + proxy_set_header Remote-Groups $groups; + proxy_set_header Remote-Name $name; + proxy_set_header Remote-Email $email; + + # Redirect to login on 401 + error_page 401 =302 https://auth.lotusguild.org/?rd=$scheme://$http_host$request_uri; + + # Websockets + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection "upgrade"; + proxy_http_version 1.1; + + # Standard proxy headers + include /etc/nginx/conf.d/include/proxy.conf; +} +``` + +### Authelia Configuration + +**Location:** Authelia server `/etc/authelia/configuration.yml` + +**PULSE Access Rule:** +```yaml +access_control: + rules: + # ... other rules ... 
+ + # PULSE Workflow Orchestration + - domain: pulse.lotusguild.org + policy: one_factor + subject: + - group:admin + - group:employee +``` + +**LDAP Backend:** +- Server: `ldap://10.10.10.39:3890` +- User: `uid=autheliaapplication,ou=people,dc=example,dc=com` +- Base DN: `dc=example,dc=com` +- Users DN: `ou=people` +- Groups DN: `ou=groups` + +### Gitea Webhook Configuration + +**Gitea Server Configuration:** +```ini +[webhook] +ALLOWED_HOST_LIST = 10.10.10.0/24 +``` + +**Repository Webhook:** +- URL: `http://10.10.10.65:9000/hooks/pulse-deploy` +- Method: POST +- Content Type: application/json +- Secret: `c0dd85e473d0efdd3653b77bb38408b14015e7e020e59ad7d446b6c1fab1940d` +- Trigger: Push events on main branch +- Active: Yes + +## Systemd Services + +### PULSE Server +**Service File:** `/etc/systemd/system/pulse.service` + +```ini +[Unit] +Description=PULSE Workflow Orchestration Server +After=network.target + +[Service] +Type=simple +User=root +WorkingDirectory=/opt/pulse-server +ExecStart=/usr/bin/node server.js +Restart=always +RestartSec=10 + +[Install] +WantedBy=multi-user.target +``` + +**Commands:** +```bash +systemctl daemon-reload +systemctl enable pulse +systemctl start pulse +systemctl status pulse +journalctl -u pulse -f +``` + +### PULSE Worker +**Service File:** `/etc/systemd/system/pulse-worker.service` + +```ini +[Unit] +Description=PULSE Worker Agent +After=network.target + +[Service] +Type=simple +User=root +WorkingDirectory=/opt/pulse-worker +ExecStart=/usr/bin/node worker.js +Restart=always +RestartSec=10 +LimitNOFILE=65536 + +[Install] +WantedBy=multi-user.target +``` + +**Commands:** +```bash +systemctl daemon-reload +systemctl enable pulse-worker +systemctl start pulse-worker +systemctl status pulse-worker +journalctl -u pulse-worker -f +``` + +### Webhook Service +**Service File:** `/etc/systemd/system/webhook.service` + +```ini +[Unit] +Description=Webhook Listener for Auto Deploy +After=network.target + +[Service] 
+ExecStart=/usr/bin/webhook -hooks /etc/webhook/hooks.json -port 9000 -verbose +Restart=always +User=root + +[Install] +WantedBy=multi-user.target +``` + +**Webhook Configuration:** `/etc/webhook/hooks.json` +```json +[ + { + "id": "pulse-deploy", + "execute-command": "/usr/local/bin/pulse_deploy.sh", + "command-working-directory": "/opt/pulse-server", + "response-message": "Deploying PULSE server...", + "trigger-rule": { + "match": { + "type": "payload-hash-sha256", + "secret": "c0dd85e473d0efdd3653b77bb38408b14015e7e020e59ad7d446b6c1fab1940d", + "parameter": { + "source": "header", + "name": "X-Gitea-Signature" + } + } + } + } +] +``` + +**Commands:** +```bash +systemctl daemon-reload +systemctl enable webhook +systemctl start webhook +systemctl status webhook +journalctl -u webhook -f +``` + +## Development Workflow + +### Making Changes to PULSE Server + +1. **SSH into server or work locally:** + ```bash + ssh root@10.10.10.65 + cd /opt/pulse-server + ``` + +2. **Make changes to code:** + ```bash + nano server.js + # or + nano public/index.html + ``` + +3. **Test changes locally:** + ```bash + systemctl restart pulse + systemctl status pulse + journalctl -u pulse -f + ``` + +4. **Commit and push:** + ```bash + git add . + git commit -m "Description of changes" + git push + ``` + +5. **Automatic deployment happens:** + - Gitea webhook triggers + - Deployment script runs + - Service restarts automatically + - Check logs: `journalctl -u webhook -f` + +### Making Changes to PULSE Worker + +1. **SSH into worker:** + ```bash + ssh root@10.10.10.151 + cd /opt/pulse-worker + ``` + +2. **Make changes:** + ```bash + nano worker.js + ``` + +3. **Restart service:** + ```bash + systemctl restart pulse-worker + systemctl status pulse-worker + journalctl -u pulse-worker -f + ``` + +4. 
**If adding to git (future):** + - Workers could be deployed from a separate repo + - Or use same repo with different directory + +### Adding Dependencies + +**Server:** +```bash +cd /opt/pulse-server +npm install +git add package.json package-lock.json +git commit -m "Add dependency: " +git push +``` + +**Worker:** +```bash +cd /opt/pulse-worker +npm install +# Manual restart required (not in git yet) +systemctl restart pulse-worker +``` + +## Troubleshooting + +### Server Won't Start + +1. **Check logs:** + ```bash + journalctl -xeu pulse + ``` + +2. **Common issues:** + - Database connection failed: Check MariaDB is running and credentials + - Port 8080 already in use: Check for other processes + - Missing .env file: Restore from backup or recreate + +3. **Test manually:** + ```bash + cd /opt/pulse-server + node server.js + ``` + +### Worker Won't Connect + +1. **Check logs:** + ```bash + journalctl -u pulse-worker -f + ``` + +2. **Common issues:** + - Wrong server URL in .env: Should be `http://10.10.10.65:8080` + - Wrong API key: Must match server's WORKER_API_KEY + - Network issue: Check firewall and connectivity + +3. **Test connection manually:** + ```bash + curl http://10.10.10.65:8080/health + ``` + +4. **Test heartbeat:** + ```bash + cd /opt/pulse-worker + node -e " + const axios = require('axios'); + const crypto = require('crypto'); + axios.post('http://10.10.10.65:8080/api/workers/heartbeat', { + worker_id: crypto.randomUUID(), + name: 'test', + metadata: {} + }, { + headers: {'X-API-Key': '5709f45547622803ad0af4726e43aea3aa3f412b25ca5df0c5a0da7929579c53'} + }).then(() => console.log('OK')).catch(err => console.error(err.message)); + " + ``` + +### Authentication Issues + +1. **Can't access web interface:** + - Check Authelia is running + - Verify you're in admin or employee group in LLDAP + - Check Nginx Proxy Manager configuration + - Try accessing directly: `http://10.10.10.65:8080/health` + +2. 
**Check auth headers:** + ```bash + curl -H "Remote-User: testuser" \ + -H "Remote-Groups: admin" \ + http://10.10.10.65:8080/api/user + ``` + +### Database Issues + +1. **Can't connect to database:** + ```bash + # From server + cd /opt/pulse-server + node -e " + const mysql = require('mysql2/promise'); + (async () => { + try { + const pool = mysql.createPool({ + host: '10.10.10.50', + user: 'pulse_user', + password: 'ZE6BuNtBG6P&g*gDpZRY', + database: 'pulse' + }); + await pool.query('SELECT 1'); + console.log('✓ Database connection OK'); + await pool.end(); + } catch (err) { + console.error('✗ Database error:', err.message); + } + })(); + " + ``` + +2. **Inspect database:** + ```bash + # Install mysql client first + apt install default-mysql-client + + mysql -h 10.10.10.50 -u pulse_user -p'ZE6BuNtBG6P&g*gDpZRY' pulse + + # Run queries + SHOW TABLES; + SELECT * FROM workers; + SELECT * FROM workflows; + SELECT * FROM executions ORDER BY started_at DESC LIMIT 10; + ``` + +### Deployment Issues + +1. **Webhook not triggering:** + - Check webhook service: `systemctl status webhook` + - Check Gitea webhook configuration + - Verify secret matches in both places + - Check webhook logs: `journalctl -u webhook -f` + +2. **Deployment fails:** + - Check deployment script: `/usr/local/bin/pulse_deploy.sh` + - Check git connectivity from server + - Verify .env file is preserved + +3. 
**Manual deployment:** + ```bash + /usr/local/bin/pulse_deploy.sh + ``` + +## Security Considerations + +### Secrets Management + +**NEVER commit these to git:** +- `.env` files (both server and worker) +- Database passwords +- API keys +- Webhook secrets +- SSH keys + +**Protected by .gitignore:** +``` +.env +.env.local +.env.*.local +*.db +*.sqlite +*.backup +node_modules/ +``` + +### Network Security + +- **Worker API Key:** All worker heartbeats require valid API key +- **SSO Authentication:** All web access requires Authelia login +- **Group Authorization:** Only admin/employee groups can access +- **Database Access:** Restricted to 10.10.10.65 only +- **Internal Network:** All communication on private 10.10.10.0/24 + +### Access Control + +**Admin users can:** +- Create/delete workflows +- Execute workflows +- Delete workers +- View all executions + +**Employee users can:** +- Execute workflows +- View workflows +- View workers +- View executions + +### Recommended Practices + +1. **Rotate secrets regularly:** + - Worker API key every 90 days + - Webhook secret every 90 days + - Database passwords every 180 days + +2. **Monitor access:** + - Review Authelia logs for suspicious activity + - Check worker connections regularly + - Monitor execution logs for anomalies + +3. **Backup strategy:** + - Database: Regular mysqldump backups + - .env files: Secure backup location + - Workflow definitions: Stored in database (backed up with DB) + +4. 
**LXC container security:** + - Unprivileged containers + - Regular OS updates via apt + - Minimal installed packages + - No SSH access (console access only via Proxmox) + +## Monitoring and Maintenance + +### Health Checks + +**Server health:** +```bash +curl http://10.10.10.65:8080/health +``` + +**Expected response:** +```json +{ + "status": "ok", + "timestamp": "2025-11-30T12:00:00.000Z", + "database": "connected", + "auth": "authelia-sso" +} +``` + +### Log Monitoring + +**Real-time logs:** +```bash +# Server +journalctl -u pulse -f + +# Worker +journalctl -u pulse-worker -f + +# Webhook +journalctl -u webhook -f +``` + +**Log analysis:** +```bash +# Last 100 lines +journalctl -u pulse -n 100 + +# Since specific time +journalctl -u pulse --since "1 hour ago" + +# Errors only +journalctl -u pulse -p err +``` + +### Resource Monitoring + +**Worker metrics:** +- Dashboard shows CPU, memory, load average +- Heartbeat includes system metrics +- Check worker status every 30 seconds + +### Performance Tuning + +**Server optimization:** +- Node.js runs single-threaded by default +- Consider PM2 for clustering if needed +- WebSocket connections scale to ~10k per process +- Database connection pool: 10 connections + +**Worker optimization:** +- Workers limit concurrent tasks (default: 5) +- Adjust MAX_CONCURRENT_TASKS based on worker resources +- Monitor CPU and memory usage +- Consider worker groups for task distribution + +### Maintenance Tasks + +**Weekly:** +- Review worker health and remove dead workers +- Check execution logs for failures +- Monitor database growth + +**Monthly:** +- Review and clean old execution logs +- Check for outdated workflow definitions +- Update dependencies (security patches) +- Review Authelia access logs + +**Quarterly:** +- Rotate secrets (API keys, webhook secrets) +- Review and optimize workflows +- Backup database +- Test disaster recovery procedures + +## Future Enhancements + +### Planned Features + +1. 
**Workflow Scheduler:** + - Cron-like scheduling for workflows + - Recurring execution support + - Time-based triggers + +2. **Advanced Worker Management:** + - Worker groups and tags + - Resource-based task assignment + - Worker health monitoring + - Auto-scaling capabilities + +3. **Enhanced Execution Engine:** + - Parallel step execution + - Step dependencies and DAGs + - Retry logic with backoff + - Execution templates + +4. **Notification System:** + - Email notifications on completion/failure + - Slack/Discord webhooks + - Custom notification handlers + +5. **Audit and Compliance:** + - Detailed audit logs + - Execution history retention policies + - Compliance reporting + +6. **UI Enhancements:** + - Visual workflow builder (drag-and-drop) + - Real-time execution viewer + - Execution comparison and diff + - Mobile-responsive design improvements + +7. **API Improvements:** + - GraphQL API + - REST API versioning + - API rate limiting + - API documentation (Swagger/OpenAPI) + +8. **Advanced Workflows:** + - Conditional workflows (if/else) + - Loops and iterations + - Error handling and rollback + - Workflow composition (call workflows from workflows) + +9. **Security Enhancements:** + - Workflow approval system + - Execution authorization per workflow + - Secret management integration (Vault) + - Command whitelisting + +10. 
**Monitoring & Observability:** + - Prometheus metrics export + - Grafana dashboards + - Distributed tracing + - Performance profiling + +### Scaling Considerations + +**Horizontal Scaling:** +- Multiple PULSE servers behind load balancer +- Redis for shared state and WebSocket pub/sub +- Database read replicas +- Worker auto-scaling based on queue depth + +**Vertical Scaling:** +- Increase server resources (CPU/RAM) +- Optimize database queries and indexes +- Implement caching layer +- Worker resource allocation tuning + +### Migration Path + +**To Kubernetes:** +- Containerize PULSE server and workers +- Use Helm charts for deployment +- Implement StatefulSets for database +- ConfigMaps for configuration +- Secrets for sensitive data + +**To High Availability:** +- Active-active PULSE servers +- PostgreSQL with replication +- Redis cluster for session storage +- Load balancer with health checks +- Shared storage for artifacts + +## Contributing + +### Development Setup + +1. **Clone repository:** +```bash + git clone https://code.lotusguild.org/LotusGuild/pulse.git + cd pulse +``` + +2. **Install dependencies:** +```bash + npm install +``` + +3. **Create .env file:** +```bash + cp .env.example .env + # Edit .env with your configuration +``` + +4. **Run locally:** +```bash + node server.js +``` + +5. 
**Access dashboard:** +``` + http://localhost:8080 +``` + +### Code Style + +- **JavaScript:** ES6+ syntax +- **Indentation:** 2 spaces +- **Quotes:** Single quotes preferred +- **Semicolons:** Required +- **Naming:** camelCase for variables, PascalCase for classes + +### Testing + +**Manual testing checklist:** +- [ ] Worker connects and sends heartbeat +- [ ] Workflow creation succeeds +- [ ] Workflow execution starts +- [ ] Commands execute on workers +- [ ] User prompts display correctly +- [ ] WebSocket real-time updates work +- [ ] Authentication via Authelia works +- [ ] Worker deletion works (admin only) +- [ ] Database queries are efficient + +### Pull Request Process + +1. Create feature branch: `git checkout -b feature/my-feature` +2. Make changes and commit: `git commit -m "Add feature"` +3. Push to Gitea: `git push origin feature/my-feature` +4. Create pull request in Gitea +5. Wait for review and approval +6. Merge to main branch +7. Automatic deployment via webhook + +## Support and Resources + +### Documentation +- **Main README:** `/opt/pulse-server/README.md` +- **This Guide:** `Claude.md` +- **Authelia Docs:** https://www.authelia.com/docs/ +- **LLDAP Docs:** https://github.com/lldap/lldap + +### Logging Locations +- **PULSE Server:** `journalctl -u pulse` +- **PULSE Worker:** `journalctl -u pulse-worker` +- **Webhook:** `journalctl -u webhook` +- **Nginx:** `/var/log/nginx/` +- **Authelia:** Check Authelia container logs + +### Common Commands Reference +```bash +# Server Management +systemctl status pulse +systemctl restart pulse +systemctl stop pulse +systemctl start pulse +journalctl -u pulse -f + +# Worker Management +systemctl status pulse-worker +systemctl restart pulse-worker +journalctl -u pulse-worker -f + +# Deployment +/usr/local/bin/pulse_deploy.sh +journalctl -u webhook -f +git status +git log --oneline -10 + +# Database +mysql -h 10.10.10.50 -u pulse_user -p'ZE6BuNtBG6P&g*gDpZRY' pulse + +# Health Checks +curl 
http://10.10.10.65:8080/health +curl https://pulse.lotusguild.org/health + +# Container Management (from Proxmox host) +pct list +pct status 122 +pct status 153 +pct enter 122 +pct enter 153 +``` + +## Glossary + +- **PULSE:** Pipelined Unified Logic & Server Engine - the name of this orchestration platform +- **Worker:** An agent node that executes commands and workflows +- **Workflow:** A defined sequence of steps to accomplish a task +- **Execution:** An instance of a workflow being run +- **Step:** A single action within a workflow (execute, prompt, wait) +- **Heartbeat:** Periodic status update from worker to server +- **LXC:** Linux Container - lightweight virtualization +- **Ceph:** Distributed storage system +- **Authelia:** SSO authentication server +- **LLDAP:** Lightweight LDAP server for user management +- **Gitea:** Self-hosted Git service +- **Webhook:** HTTP callback triggered by events (like git push) +- **SSO:** Single Sign-On - authentication system +- **MariaDB:** MySQL-compatible database server + +## Version History + +- **v1.0.0** (2025-11-30) - Initial release + - Basic workflow orchestration + - Worker management + - Authelia SSO integration + - Web dashboard + - Git deployment pipeline + - Interactive workflows with prompts + - Command execution on workers + +--- + +**Last Updated:** November 30, 2025 +**Project Status:** Active Development +**License:** Internal - Lotus Guild +**Maintainer:** Jared (jared@lotusguild.org) \ No newline at end of file