Commit Graph

23 Commits

Author SHA1 Message Date
86be5fd1c1 Add efficient process wait utility function
Add wait_for_process(), which uses kill -0 instead of ps -p to
check whether a process is running. This is more efficient
because kill -0 is a shell builtin that only tests for process
existence, whereas ps spawns a new process on every check.

Includes optional spinner for visual feedback during waits.

#7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:51:36 -05:00
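
The polling approach described in 86be5fd1c1 might look roughly like the sketch below. Only the name wait_for_process() and the use of kill -0 come from the commit message; the argument order, spinner frames, and 0.1 s poll interval are assumptions made for illustration.

```bash
# Hypothetical sketch; the real wait_for_process() may differ.
# Usage: wait_for_process PID [show_spinner]   (show_spinner: 1 to enable)
wait_for_process() {
    local pid=$1 show_spinner=${2:-0}
    local frames='|/-\' i=0

    # kill -0 delivers no signal; it only reports whether the PID
    # exists and can be signalled by the current user.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$show_spinner" = 1 ]; then
            printf '\r%s' "${frames:$((i % 4)):1}" >&2
            i=$((i + 1))
        fi
        sleep 0.1
    done

    # Erase the spinner character once the process is gone.
    if [ "$show_spinner" = 1 ]; then
        printf '\r \r' >&2
    fi
    return 0
}
```

Note that kill -0 also fails when the caller lacks permission to signal the target, so a sketch like this is only reliable for processes owned by the same user (or when running as root).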
a491ae4592 Cache disk list to avoid multiple lsblk calls
Add get_disk_list() function that caches the output of lsblk
on first call. Subsequent calls return the cached value,
reducing overhead when multiple functions need to iterate
over disk devices.

#8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:51:13 -05:00
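
A minimal sketch of the caching idea in a491ae4592, assuming the cache lives in global variables. The variable names DISK_LIST and DISK_LIST_LOADED and the exact lsblk flags are assumptions, not taken from the script.

```bash
# Hypothetical sketch; variable names are illustrative.
DISK_LIST=""
DISK_LIST_LOADED=0

# Populate DISK_LIST on the first call; later calls are no-ops.
get_disk_list() {
    if [ "$DISK_LIST_LOADED" -eq 0 ]; then
        # -d: no partitions/children, -n: no header, -o NAME: names only
        DISK_LIST=$(lsblk -dn -o NAME 2>/dev/null)
        DISK_LIST_LOADED=1
    fi
}

# Example use (commented out so the sketch has no side effects):
#   get_disk_list
#   while IFS= read -r disk; do echo "/dev/$disk"; done <<< "$DISK_LIST"
```

The sketch stores the result in a global rather than printing it, because a function called as $(get_disk_list) runs in a subshell and any cache it sets there is discarded when the subshell exits.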
7514e2ba7c Add logging infrastructure without subshell overhead
Add optional logging to a file via the PROXDOC_LOGFILE environment
variable. Uses exec redirection with tee instead of subshells,
which is more efficient for long-running diagnostics.

Usage: PROXDOC_LOGFILE=/tmp/proxdoc.log ./proxDoc.sh --diags

#9

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:50:45 -05:00
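
The exec-plus-tee redirection mentioned in 7514e2ba7c could be sketched as follows. PROXDOC_LOGFILE is the variable named in the commit; the function name setup_logging and the choice to append (tee -a) and to merge stderr into the log are assumptions.

```bash
# Hypothetical sketch; setup_logging is an illustrative name.
setup_logging() {
    if [ -n "${PROXDOC_LOGFILE:-}" ]; then
        # Redirect this shell's stdout and stderr through a single tee
        # process for the rest of the run. Output still reaches the
        # terminal and is also appended to the log file.
        exec > >(tee -a "$PROXDOC_LOGFILE") 2>&1
    fi
}
```

Because tee runs asynchronously, the last lines of output may reach the log file slightly after the script itself has exited.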
f7ed682bdb Standardize error handling with cleanup trap
- Add cleanup function called on EXIT trap
- Add ERRORS_OCCURRED and WARNINGS_OCCURRED counters
- Make handle_error support non-fatal errors via an optional parameter
- Add proper exit codes for INT (130) and TERM (143) signals
- Add summary of errors/warnings at the end of diagnostics
- Redirect error messages to stderr

#10

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:50:14 -05:00
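
The pieces listed in f7ed682bdb fit together roughly as in this sketch. The names cleanup, handle_error, ERRORS_OCCURRED and WARNINGS_OCCURRED and the exit codes 130 and 143 come from the commit message; the argument convention (second parameter 0 means non-fatal, default is fatal) and the summary wording are assumptions.

```bash
# Hypothetical sketch of the error-handling layout.
ERRORS_OCCURRED=0
WARNINGS_OCCURRED=0

# handle_error MESSAGE [FATAL]; FATAL defaults to 1 (assumed convention).
handle_error() {
    local message=$1 fatal=${2:-1}
    echo "ERROR: $message" >&2            # errors go to stderr
    ERRORS_OCCURRED=$((ERRORS_OCCURRED + 1))
    if [ "$fatal" -eq 1 ]; then
        exit 1
    fi
}

# Runs on every exit path, including the signal traps below.
cleanup() {
    echo "Summary: $ERRORS_OCCURRED error(s), $WARNINGS_OCCURRED warning(s)" >&2
}

trap cleanup EXIT
trap 'exit 130' INT    # 128 + SIGINT (2)
trap 'exit 143' TERM   # 128 + SIGTERM (15)
```

Calling exit from the INT and TERM traps lets the EXIT trap run cleanup once, so the summary is printed on interruption as well as on normal completion.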
148a7ac644 Add validation for potentially empty variables
Add fallback values for variables that might be empty when
system information is unavailable. Use parameter expansion
with default values (${var:-Default}) to ensure meaningful
output even when commands fail or return empty results.

Affected functions: get_cpu_info, get_ram_info, get_network_info

#11

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:49:23 -05:00
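
As an illustration of the ${var:-Default} fallback from 148a7ac644, the sketch below reads a CPU model name and substitutes a placeholder when nothing is found. The function name show_cpu_model, the optional file argument, and the "Unknown" default are assumptions; the real get_cpu_info may gather its data differently.

```bash
# Hypothetical sketch; not the script's actual get_cpu_info.
# Usage: show_cpu_model [cpuinfo_file]
show_cpu_model() {
    local source_file=${1:-/proc/cpuinfo} model
    # Take the text after "model name : " on the first matching line.
    model=$(awk -F': *' '/^model name/ {print $2; exit}' "$source_file" 2>/dev/null)
    # ${model:-Unknown} covers both an unset and an empty variable.
    echo "CPU: ${model:-Unknown}"
}
```

The colon form (:-) matters here: a failed command substitution leaves the variable set but empty, which the plain ${var-Default} form would not replace.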
67d4b76324 Extract magic strings into named constants
Define pattern constants at the top of the script for:
- VIRTUAL_IFACE_PATTERN: virtual/firewall interface patterns
- STORAGE_CONTROLLER_PATTERN: HBA/storage controller detection
- DISK_DEVICE_PATTERN: disk device name patterns
- EXCLUDED_PCI_PATTERN: PCI devices to exclude from listing

This improves maintainability and makes patterns easier to modify.

#12

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:48:51 -05:00
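
A sketch of what two of the constants from 67d4b76324 might contain. The constant names come from the commit; the regular expressions are assumptions assembled from the interface prefixes and PCI device classes listed in commits b65bcb1c4c and 8a12db93ae, and the two filter helpers are hypothetical.

```bash
# Hypothetical values; the actual patterns in the script may differ.
readonly VIRTUAL_IFACE_PATTERN='^(veth|fwbr|fwln|fwpr|tap)'
readonly EXCLUDED_PCI_PATTERN='Host bridge|PCI bridge|ISA bridge|SMBus|IOMMU|Dummy|USB controller|Audio device|Encryption controller|Multimedia controller'

# Read interface names on stdin, drop virtual/firewall interfaces.
filter_physical_ifaces() {
    grep -Ev "$VIRTUAL_IFACE_PATTERN"
}

# Read lspci-style lines on stdin, drop uninteresting device classes.
filter_pci_devices() {
    grep -Ev "$EXCLUDED_PCI_PATTERN"
}
```

With the patterns defined once, a change such as adding a new interface prefix touches a single line instead of every grep call.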
6633a0a9a1 Implement selective checks with --checks option
Add the ability to run only specific diagnostic checks using
--checks=cpu,ram,disk syntax. This allows users to perform
targeted diagnostics without running the full suite.

Supported checks: cpu, ram, memory, storage, disk, network,
hardware, temps, services, ceph, vms, containers

#13

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:46:59 -05:00
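
Parsing the --checks=cpu,ram,disk syntax from 6633a0a9a1 could be sketched as below. The list of check names is taken from the commit message; the function names and the behavior on an unknown name (print a message and return non-zero) are assumptions.

```bash
# Hypothetical sketch of --checks parsing.
VALID_CHECKS="cpu ram memory storage disk network hardware temps services ceph vms containers"

is_valid_check() {
    local candidate=$1 known
    for known in $VALID_CHECKS; do
        [ "$candidate" = "$known" ] && return 0
    done
    return 1
}

# Usage: parse_checks --checks=cpu,ram,disk
# Prints one requested check per line; fails on the first unknown name.
parse_checks() {
    local list=${1#--checks=} check
    local -a requested
    IFS=, read -r -a requested <<< "$list"
    for check in "${requested[@]}"; do
        if is_valid_check "$check"; then
            printf '%s\n' "$check"
        else
            echo "Unknown check: $check" >&2
            return 1
        fi
    done
}
```

A caller could then loop over the printed names and dispatch to the matching diagnostic function for each one.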
eff8eb3a3c Add timeout protection to external commands
Add a configurable CMD_TIMEOUT constant and apply timeouts to
smartctl and ceph commands that may hang on unresponsive disks
or network issues. This prevents the script from blocking
indefinitely.

#14

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:46:26 -05:00
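
One way to apply the CMD_TIMEOUT constant from eff8eb3a3c is a small wrapper around the coreutils timeout command, sketched here. The wrapper name run_with_timeout and the default of 30 seconds are assumptions; the commit only states that a configurable constant exists and that smartctl and ceph calls are covered.

```bash
# Hypothetical sketch; the default value is illustrative.
CMD_TIMEOUT=${CMD_TIMEOUT:-30}

# Usage: run_with_timeout COMMAND [ARGS...]
run_with_timeout() {
    timeout "$CMD_TIMEOUT" "$@"
    local status=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "WARNING: '$*' timed out after ${CMD_TIMEOUT}s" >&2
    fi
    return "$status"
}

# Example (commented out; /dev/sda is a placeholder device):
#   run_with_timeout smartctl -H /dev/sda
```

Returning the original status lets callers distinguish a timeout (124) from an ordinary command failure.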
07989c8788 Add examples section to help documentation
Expand the help output to include practical usage examples
for common operations like full diagnostics, quick health
checks, service monitoring, and Ceph health checks.

#15

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:45:46 -05:00
c8fadf924b Add input validation with whitelist of valid options
Implement strict input validation using a whitelist approach.
Only accept options that match the expected pattern and are in
the approved list. This prevents injection attacks and invalid
inputs from being processed.

#16

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:45:19 -05:00
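
The two-stage validation described in c8fadf924b (shape check, then whitelist lookup) might look like the sketch below. The option names are those mentioned elsewhere in this log, plus --help as an assumption; the regular expression and the function name validate_option are also assumptions.

```bash
# Hypothetical sketch; the approved list in the script may differ.
VALID_OPTIONS=(--diags --quick --drives --ceph --node-exporter
               --hwmon --ct-list --checks --help)

# Usage: validate_option OPTION   (returns 0 if acceptable)
validate_option() {
    local opt=$1 name allowed
    local shape='^--[a-z][a-z-]*(=[a-z,]+)?$'

    # Stage 1: reject anything that does not look like --name or --name=a,b
    [[ $opt =~ $shape ]] || return 1

    # Stage 2: the option name itself must be on the whitelist.
    name=${opt%%=*}
    for allowed in "${VALID_OPTIONS[@]}"; do
        [[ $name == "$allowed" ]] && return 0
    done
    return 1
}
```

Rejecting by shape first means strings containing spaces, semicolons, or other shell metacharacters never reach the lookup at all.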
c25e3ccc76 Fix variable quoting in disk iteration loops
Replace unsafe for loops with properly quoted while loops when
iterating over disk devices. This prevents word splitting issues
with device names containing special characters.

#17

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 10:44:43 -05:00
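
The loop change in c25e3ccc76 follows a common Bash pattern, sketched below. The function name each_disk and the message it prints are hypothetical; only the move from an unquoted for loop to a quoted while read loop is taken from the commit.

```bash
# Hypothetical sketch of the safer iteration pattern.
# Before (unsafe): for disk in $(lsblk -dn -o NAME); do ...; done
#   The unquoted expansion is split on whitespace and glob-expanded.

# After: read one name per line from stdin, keeping each line intact.
each_disk() {
    local disk
    while IFS= read -r disk; do
        [ -n "$disk" ] || continue          # skip blank lines
        printf 'Checking /dev/%s\n' "$disk"
    done
}

# Example (commented out):
#   lsblk -dn -o NAME | each_disk
```

IFS= preserves leading and trailing whitespace and -r stops read from interpreting backslashes, so each line arrives in the variable exactly as produced.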
e1dac4c08c #4 2026-02-02 15:52:09 -05:00
d0d1a3b174 #3 2026-02-02 15:41:59 -05:00
08290a1a49 #1 2026-02-02 15:38:59 -05:00
f5df832941 #5 2026-02-02 15:35:30 -05:00
806d883476 #2 2026-02-02 15:29:45 -05:00
b65bcb1c4c Filter out integrated devices and firewall interfaces
- PCI: Exclude USB controller, Audio device, Encryption controller,
  Multimedia controller (integrated motherboard devices)
- Network: Also filter fwbr*, fwln*, fwpr*, tap* interfaces
  (firewall bridges and VM tap devices)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:41:17 -05:00
8a12db93ae Condense network output and show all PCI devices
- Filter out veth* interfaces from network statistics and NIC details
  (these are container virtual interfaces that clutter the output)
- Show all interesting PCI devices instead of just VGA/ethernet/raid
  (excludes Host bridge, PCI bridge, ISA bridge, SMBus, IOMMU, Dummy)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:34:17 -05:00
6e3cafa98d Fix memory count, HBA detection, remove redundant disk health
- Fix memory slot counting: use grep on Locator/Size fields instead
  of awk pattern that wasn't matching dmidecode output correctly
- Add SATA to HBA detection patterns - was missing SATA controllers
- Remove get_disk_health from runDiags - redundant with DriveAtlas
  which shows the same info in a better format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:29:14 -05:00
ca23a30bd3 Fix memory slot count bug in get_memory_details
The previous grep-based counting was including Size: lines from
Physical Memory Array sections, causing incorrect counts (e.g., 5/4
instead of 4/4). Now uses awk to only count Size: lines within
Memory Device sections.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:24:13 -05:00
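
The section-aware counting described in ca23a30bd3 can be sketched with a small awk state machine. The function name count_memory_slots, the populated/total output format, and the use of "No Module Installed" to detect empty slots are assumptions; the sketch expects dmidecode-style text on stdin.

```bash
# Hypothetical sketch; not the script's actual get_memory_details.
# Usage: dmidecode -t memory | count_memory_slots
count_memory_slots() {
    awk '
        # Entering a Memory Device record.
        /^Memory Device$/ { in_device = 1; next }
        # Any other unindented line (Handle, Physical Memory Array, ...)
        # ends the current record.
        /^[^ \t]/ { in_device = 0 }
        # Count only the plain "Size:" field inside Memory Device records.
        in_device && /^[ \t]+Size:/ {
            total++
            if ($0 !~ /No Module Installed/) populated++
        }
        END { printf "%d/%d\n", populated, total }
    '
}
```

Anchoring the match to indented lines that begin with "Size:" also keeps fields such as "Volatile Size:" from being counted.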
d9e546f75d Remove interactive features for remote-only execution
- Remove interactive menu (requires stdin)
- Remove --connect option (requires stdin)
- Remove --save option (not practical for remote execution)
- Show help when run without arguments
- Update help to show curl usage example
- Update README for remote-only usage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:22:37 -05:00
575c60b1fa Update to v1.1.0: Add interactive menu, DriveAtlas, and monitoring integrations
- Add interactive numbered menu when run without arguments
- Add DriveAtlas integration (--drives) for physical drive bay mapping
- Add Ceph cluster health monitoring (--ceph)
- Add Node Exporter status check (--node-exporter)
- Add hwmon daemon status check (--hwmon)
- Add quick health check mode (--quick)
- Add container list option (--ct-list)
- Full diagnostics now includes all monitoring checks
- Update README with new features and changelog

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:18:53 -05:00
f9f59da191 first commit 2025-01-01 18:28:45 -05:00