Commit Graph

29 Commits

Author SHA1 Message Date
ba5ba6f899 Security and reliability fixes: XSS, race conditions, logging
- Fix XSS in workflow list: escape name/description/created_by with escapeHtml()
- Fix broadcast() race: snapshot browserClients Set before iterating
- Fix waitForCommandResult: use position-based matching (worker omits command_id in result)
- Fix scheduler duplicate-run race: atomic UPDATE with affectedRows check
- Suppress toast alerts for automated executions (gandalf:/scheduler: prefix)
- Add waiting_for_input + prompt fields to GET /executions/:id
- Add GET /api/workflows/:id endpoint
- Add interactive prompt UI: respondToPrompt(), WebSocket execution_prompt handler
- Add workflow editor modal (admin-only)
- Remove sensitive console.log calls (WS payload, command output)
- Show error messages in list containers on load failure
- evalCondition: log warning instead of silently swallowing errors
- Worker reconnect: clean up stale entries for same dbWorkerId
- Null-check execState before continuing workflow steps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 17:30:32 -04:00
ba75a61bd2 Security hardening, bug fixes, and backend improvements
Security:
- validateWebhookUrl() rejects non-http/https and private/internal IPs
- Validate webhook URL on workflow create and update
- Replace error.message in all HTTP 500 responses with 'Internal server error'
- Add requireJSON middleware (HTTP 415 if Content-Type wrong) on POST/PUT routes
- Reject missing API keys in worker heartbeat (not just wrong ones)
- Validate prompt response against allowed options before accepting

Bugs fixed:
- goto infinite loop protection: stepVisits[] counter, fails at GOTO_MAX_VISITS (100)
- wait step: validate duration (no NaN/negative), cap at WAIT_STEP_MAX_MS (24h)
- _executionPrompts now stores {resolve, options} for option validation
- JSON.parse wrapped in try/catch: workflows/:id, executions/:id, internal/executions/:id, POST /api/executions, scheduler worker_ids
- pong handler uses ws.dbWorkerId (set on connect) not message.worker_id
- Worker disconnect now marks worker offline in DB and broadcasts update
- command validation (type + empty check) on POST /api/workers/:id/command
- workflow_id required check on POST /api/executions

Performance & reliability:
- markStaleWorkersOffline() runs every 60s, marks workers without recent heartbeat offline
- Named constants: PROMPT_TIMEOUT_MS, COMMAND_TIMEOUT_MS, QUICK_CMD_TIMEOUT_MS,
  WEBHOOK_TIMEOUT_MS, WAIT_STEP_MAX_MS, GOTO_MAX_VISITS, WORKER_STALE_MINUTES

New features:
- GET /api/health (auth required): version, uptime, worker counts
- DELETE /api/executions/completed: bulk delete finished executions (admin)
- schedule_value positive-integer validation for interval/hourly schedule types
- Request logging middleware: [HTTP] METHOD /path STATUS Xms

Code quality:
- All console.log on error paths changed to console.error
- Removed stray debug console.log in POST /api/workflows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 10:59:07 -04:00
2d6a0f1054 Add rate limiting, cron scheduling, webhooks, dry-run, execution filtering, and UX improvements
- Rate limiting: 300 req/15min general, 20 req/min on POST /api/executions
- Cron schedule type support using cron-parser for full cron expressions
- Webhook notifications: POST to workflow webhook_url on execution complete/failed
- Dry-run mode: simulate workflow execution without running any commands
- Global execution timeout via EXECUTION_MAX_MINUTES env var (default 60min)
- Execution filtering: status, workflow_id, started_by, after, before, search
- Event-driven command result delivery (replaces 500ms DB polling)
- Atomic log appends via JSON_ARRAY_APPEND (no read-modify-write race)
- Separate browserClients/workerClients sets (workers no longer receive broadcasts)
- Stale execution cleanup on startup (mark running→failed after crash)
- Scheduler overlap prevention (skip if same workflow already running)
- Frontend: webhook_url field in create/edit workflow modals
- Frontend: dry-run checkbox in workflow param modal
- Frontend: ESC closes modals, ws.onerror handler added
- Frontend: selectedExecutions changed from Array to Set (O(1) ops)
- Frontend: XSS fixes via escapeHtml() on all user-controlled innerHTML
- Frontend: param modal keydown listener deduplication fix
- Remove unused npm packages (bcryptjs, body-parser, cors, js-yaml, jsonwebtoken)
- Add express-rate-limit and cron-parser dependencies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 23:06:09 -04:00
58c172e131 Security hardening, bug fixes, and performance improvements
Security fixes:
- Replace new Function() condition eval with vm.runInNewContext() (RCE fix)
- Add admin checks to DELETE executions, all scheduled-commands endpoints
- Remove api_key from GET /api/workers response (was exposed to all employees)
- Separate browserClients/workerClients sets; broadcast() now sends to browsers only
- Add worker WebSocket auth: reject if api_key provided but invalid
- Fix XSS: escapeHtml() on step_name, duration, worker_id, user info, execution_id

Bug fixes:
- Replace DB-polling waitForCommandResult with event-driven _commandResolvers Map
- Replace non-atomic addExecutionLog with JSON_ARRAY_APPEND (fixes concurrent write race)
- Add stale execution recovery on startup: running→failed with log entry
- Fix calculateNextRun returning null for unknown types (now throws)
- Fix scheduler overlap: skip if previous execution still running
- Fix JSON double-parse on worker_ids column
- Fix switchTab() bare event.target reference
- Fix selectedExecutions Array→Set (O(1) lookups, fixes performance regression)
- Fix param modal event listener leak (delegated handler, removes before re-adding)
- Add ws.onerror handler (was silently swallowing WebSocket errors)
- Move misplaced routes to before server.listen()

Performance/cleanup:
- DB connection pool 10→50
- EXECUTION_RETENTION_DAYS default 1→30 (matches docs)
- Remove unused packages: bcryptjs, body-parser, cors, js-yaml, jsonwebtoken
- Remove generateUUID() wrapper, use crypto.randomUUID() directly
- Remove dead example workflow constants
- Add ESC key handler to close modals
- Fix clearCompletedExecutions limit 1000→9999
- Add security notice to README.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 22:53:25 -04:00
0fee118d1d Suppress toast notifications for automated executions
Automated executions (started_by gandalf: or scheduler:) no longer
trigger success/failure toast alerts for connected browser users.
Server now includes is_automated flag in command_result broadcasts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 16:26:18 -05:00
937bddbe2f Fix waitForCommandResult: match by position, not command_id
Worker does not echo command_id back in command_result message.
Previously this caused all workflow steps to time out after 120s.
Now: find the command_sent entry for the commandId, then take the
next command_result after it — safe since steps run sequentially.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 11:21:30 -05:00
bf9b14bc96 Add interactive workflow system with prompt steps, workflow editor
- executeWorkflowSteps: rewritten with step-id/goto branching support
- executePromptStep: async pause via _executionPrompts Map, 60min timeout
- POST /api/executions/:id/respond: resolves pending prompt from browser
- PUT /api/workflows/🆔 admin-only workflow editing, broadcasts workflow_updated
- GET /api/workflows/🆔 fetch single workflow for edit modal
- GET /api/executions/🆔 now includes waiting_for_input + prompt fields
- index.html: prompt/prompt_response/step_skipped log entry rendering
- index.html: execution_prompt WebSocket handler refreshes open modal
- index.html: workflow_updated WebSocket handler reloads workflow list
- index.html: Edit button + modal for in-browser workflow editing
- index.html: respondToPrompt keeps modal open, refreshes execution view
- Interactive Link Troubleshooter v2 workflow: 45-step wizard with
  copper/fiber branches, clean/swap/reseat actions, re-test loops,
  CRC error path, performance diagnostics, SUCCESS/ESCALATE terminals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 16:55:02 -05:00
033237482d feat: workflow param substitution + link troubleshooter support
Adds {{param_name}} template substitution to the workflow execution engine
so workflows can accept user-supplied inputs at run time.

server.js:
- applyParams() helper — substitutes {{name}} in command strings with
  validated values (alphanumeric + safe punctuation only); unsafe values
  throw and fail the execution cleanly
- executeCommandStep() / executeWorkflowSteps() — accept params={} and
  apply substitution before dispatching commands to workers
- POST /api/executions — accepts params:{} from client; validates required
  params against definition.params[]; logs params in initial execution log

index.html:
- loadWorkflows() caches definition in _workflowRegistry keyed by id;
  shows "[N params]" badge on parameterised workflows
- executeWorkflow() checks for definition.params; if present, shows
  param input modal instead of plain confirm()
- showParamModal() — builds labelled input form from param definitions,
  marks required fields, focuses first input, Enter submits
- submitParamForm() — validates required fields, calls startExecution()
- startExecution() — POSTs {workflow_id, params} and switches to executions tab
- Param input modal — terminal-aesthetic overlay, no external dependencies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 16:20:05 -05:00
6d945a1913 feat: Gandalf M2M API, manual/automated execution sub-tabs, cleanup tuning
- server.js: add authenticateGandalf middleware (X-Gandalf-API-Key header)
  and two internal endpoints used by Gandalf link diagnostics:
    POST /api/internal/command  — submit SSH command to a worker, returns execution_id
    GET  /api/internal/executions/:id — poll execution status/logs
  Also tag automated executions as started_by 'gandalf:*' / 'scheduler:*';
  add hide_internal query param to GET /api/executions; change cleanup
  from daily/30d to hourly/1d to keep execution history lean
- index.html: add Manual / Automated sub-tabs on Execution History tab so
  Gandalf diagnostic runs don't clutter the manual run view; persists
  selected tab to localStorage; dashboard recent-run strip filters to
  manual runs only; sub-tabs show live counts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 16:04:22 -05:00
e7707d7edb Add abort execution feature for stuck/running processes 2026-01-08 22:11:59 -05:00
ea7a2d82e6 Fix execution details endpoint - remove reference to deleted activeExecutions 2026-01-08 22:06:25 -05:00
8224f3b6a4 Add missing helper functions for workflow execution (addExecutionLog, updateExecutionStatus) 2026-01-08 22:03:00 -05:00
06eb2d2593 Add detailed logging to workflow creation endpoint
This will help diagnose the 'Cannot read properties of undefined' error
by logging each step of the workflow creation process.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 23:30:50 -05:00
b85bd58c4b Fix: Remove duplicate old workflow execution code
Removed old/obsolete workflow execution system that was conflicting
with the new executeWorkflowSteps() engine:

Removed:
- activeExecutions Map (old tracking system)
- executeWorkflow() - old workflow executor
- executeNextStep() - old step processor
- executeCommandStep() - old command executor (duplicate)
- handleUserInput() - unimplemented prompt handler
- Duplicate app.post('/api/executions') endpoint
- app.post('/api/executions/:id/respond') endpoint

This was causing "Cannot read properties of undefined (reading 'target')"
error because the old code was being called instead of the new engine.

The new executeWorkflowSteps() engine is now the only workflow system.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 23:24:42 -05:00
018752813a Implement Complete Workflow Execution Engine
Added full workflow execution engine that actually runs workflow steps:

Server-Side (server.js):
- executeWorkflowSteps() - Main workflow orchestration function
- executeCommandStep() - Executes commands on target workers
- waitForCommandResult() - Polls for command completion
- Support for step types: execute, wait, prompt (prompt skipped for now)
- Sequential step execution with failure handling
- Worker targeting: "all" or specific worker IDs/names
- Automatic status updates (running -> completed/failed)
- Real-time WebSocket broadcasts for step progress
- Command result tracking with command_id for workflows
- Only updates status for non-workflow quick commands

Client-Side (index.html):
- Enhanced formatLogEntry() with workflow-specific log types
- step_started - Shows step number and name with amber color
- step_completed - Shows completion with green checkmark
- waiting - Displays wait duration
- no_workers - Error when no workers available
- worker_offline - Warning for offline workers
- workflow_error - Critical workflow errors
- Better visual feedback for workflow progress

Workflow Definition Format:
{
  "steps": [
    {
      "name": "Step Name",
      "type": "execute",
      "targets": ["all"] or ["worker-name"],
      "command": "your command here"
    },
    {
      "type": "wait",
      "duration": 5
    }
  ]
}

Features:
- Executes steps sequentially
- Stops on first failure
- Supports multiple workers per step
- Real-time progress updates
- Comprehensive logging
- Terminal-themed workflow logs

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 23:19:12 -05:00
4511fac486 Phase 10: Command Scheduler
Added comprehensive command scheduling system:

Backend:
- New scheduled_commands database table
- Scheduler processor runs every minute
- Support for three schedule types: interval, hourly, daily
- calculateNextRun() function for intelligent scheduling
- API endpoints: GET, POST, PUT (toggle), DELETE
- Executions automatically created and tracked
- Enable/disable schedules without deleting

Frontend:
- New Scheduler tab in navigation
- Create Schedule modal with worker selection
- Dynamic schedule input based on type
- Schedule list showing status, next/last run times
- Enable/Disable toggle for each schedule
- Delete schedule functionality
- Terminal-themed scheduler UI
- Integration with existing worker and execution systems

Schedule Types:
- Interval: Every X minutes (e.g., 30 for every 30 min)
- Hourly: Every X hours (e.g., 2 for every 2 hours)
- Daily: At specific time (e.g., 03:00 for 3 AM daily)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 23:13:27 -05:00
9f972182b2 Phase 5: Auto-cleanup and pagination for executions
Changes:
Server-side:
- Added automatic cleanup of old executions (runs daily)
- Configurable retention period via EXECUTION_RETENTION_DAYS env var (default: 30 days)
- Cleanup runs on server startup and every 24 hours
- Only cleans completed/failed executions, keeps running ones
- Added pagination support to /api/executions endpoint
- Returns total count, limit, offset, and hasMore flag

Client-side:
- Implemented "Load More" button for execution pagination
- Loads 50 executions at a time
- Appends additional executions when "Load More" clicked
- Shows total execution count info
- Backward compatible with old API format

Benefits:
- Automatic database maintenance
- Prevents execution table from growing indefinitely
- Better performance with large execution histories
- User can browse all executions via pagination
- Configurable retention policy per deployment

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:50:39 -05:00
7656f4a151 Add execution cleanup functionality
Changes:
- Added DELETE /api/executions/:id endpoint
- Added "Clear Completed" button to Executions tab
- Deletes all completed and failed executions
- Broadcasts execution_deleted event to update all clients
- Shows count of deleted executions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:36:51 -05:00
e13fe9d22f Fix worker ID mapping - use database ID for command routing
Problem:
- Workers generate random UUID on startup (runtime ID)
- Database stores workers with persistent IDs (database ID)
- UI sends commands using database ID
- Server couldn't find worker connection (stored by runtime ID)
- Result: 400 Bad Request "Worker not connected"

Solution:
- When worker connects, look up database ID by worker name
- Store WebSocket connection in Map using BOTH IDs:
  * Runtime ID (from worker_connect message)
  * Database ID (from database lookup by name)
- Commands from UI use database ID → finds correct WebSocket
- Cleanup both IDs when worker disconnects

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:29:23 -05:00
8bda9672d6 Fix worker command execution and execution status updates
Changes:
- Removed duplicate /api/executions/:id endpoint that didn't parse logs
- Added workers Map to track worker_id -> WebSocket connection
- Store worker connections when they send worker_connect message
- Send commands to specific worker instead of broadcasting to all clients
- Clean up workers Map when worker disconnects
- Update execution status to completed/failed when command results arrive
- Add proper error handling when worker is not connected

Fixes:
- execution.logs.forEach is not a function (logs now properly parsed)
- Commands stuck in "running" status (now update to completed/failed)
- Commands not reaching workers (now sent to specific worker WebSocket)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:23:02 -05:00
4974730dc8 Remove database migrations after direct schema fixes
Changes:
- Removed all migration code from server.js
- Database schema fixed directly via MySQL:
  * Dropped users.role column (SSO only)
  * Dropped users.password column (SSO only)
  * Added executions.started_by column
  * Added workflows.created_by column
  * All tables now match expected schema
- Server startup will be faster without migrations

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:11:07 -05:00
df581e85a8 Remove password column from users table
Changes:
- Drop password column from users table (SSO authentication only)
- PULSE uses Authelia SSO, not password-based authentication
- Fixes 500 error: Field 'password' doesn't have a default value

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 22:01:58 -05:00
e4574627f1 Add last_login column to users table migration
Changes:
- Add last_login TIMESTAMP column to existing users table
- Complete the users table migration with all required columns
- Fixes 500 error: Unknown column 'last_login' in 'INSERT INTO'

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 20:33:04 -05:00
1d994dc8d6 Add migration to update users table schema
Changes:
- Add display_name, email, and groups columns to existing users table
- Handle MariaDB lack of IF NOT EXISTS in ALTER TABLE
- Gracefully skip columns that already exist
- Fixes 500 error when authenticating users

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 20:28:47 -05:00
02ed71e3e0 Allow NULL workflow_id in executions table for quick commands
Changes:
- Modified executions table schema to allow NULL workflow_id
- Removed foreign key constraint that prevented NULL values
- Added migration to update existing table structure
- Quick commands can now be stored without a workflow reference

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 20:27:02 -05:00
cff058818e Fix quick command executions not appearing in execution tab
Changes:
- Create execution record in database when quick command is sent
- Store initial log entry with command details
- Broadcast execution_started event to update UI
- Display quick commands as "[Quick Command]" in execution list
- Fix worker communication to properly track all executions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-07 20:24:11 -05:00
05c304f2ed Updated websocket handler 2026-01-07 20:20:18 -05:00
bb247628b0 Workflow & Command system 2025-11-30 13:03:18 -05:00
dcca2b9e50 Initial PULSE server commit with Authelia SSO integration 2025-11-29 19:26:20 -05:00