Monitoring

Monitor machine health, containers, services, repositories, and run diagnostics.

Rediacc provides built-in monitoring commands to inspect machine health, running containers, services, repository status, and system diagnostics.

Machine Health

Get a comprehensive health report for a machine:

rdc machine health --name server-1

This reports:

  • System: uptime, disk usage, datastore usage
  • Containers: running, healthy, unhealthy counts
  • Storage: SMART health status
  • Issues: identified problems

Use --output json for machine-readable output.

List Containers

View all running containers across all repositories on a machine:

rdc machine containers --name server-1

| Column | Description |
| --- | --- |
| Name | Container name |
| Status | Uptime or exit reason |
| State | Running, exited, etc. |
| Health | Healthy, unhealthy, none |
| CPU | CPU usage percentage |
| Memory | Memory usage / limit |
| Repository | Which repository owns the container |

Options:

  • --health-check: Perform active health checks on containers
  • --output json: Machine-readable JSON output

JSON output includes full container details (labels, port_mappings, image, id) plus repository (resolved name), repository_guid (original GUID), domain, and autoRoute.
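As one way to consume this output, the sketch below groups container names by the resolved repository field. It assumes the JSON is an array of objects and that each object carries a name key; only repository and repository_guid are confirmed by the description above, and the sample data is invented for illustration.

```python
import json
from collections import defaultdict

# Hypothetical sample; the real output shape and values may differ.
SAMPLE = """[
  {"name": "web", "repository": "shop", "repository_guid": "guid-1"},
  {"name": "db", "repository": "shop", "repository_guid": "guid-1"},
  {"name": "api", "repository": "blog", "repository_guid": "guid-2"}
]"""


def group_by_repository(raw: str) -> dict[str, list[str]]:
    """Map each resolved repository name to the names of its containers."""
    groups = defaultdict(list)
    for container in json.loads(raw):
        groups[container["repository"]].append(container["name"])
    return dict(groups)


print(group_by_repository(SAMPLE))
```

In practice the raw string would come from the output of rdc machine containers --name server-1 --output json rather than from an inline sample.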

List Services

View systemd services related to Rediacc on a machine:

rdc machine services --name server-1

| Column | Description |
| --- | --- |
| Name | Service name |
| State | Active, inactive, failed |
| Sub-state | Running, dead, etc. |
| Restarts | Restart count |
| Memory | Service memory usage |
| Repository | Associated repository |

Options:

  • --stability-check: Flag unstable services (failed, >3 restarts, auto-restart)
  • --output json: Machine-readable JSON output

JSON output includes full service details with repository (resolved name) and repository_guid (original GUID).
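The three criteria listed for --stability-check can be read as a simple predicate. The sketch below is one interpretation, not the actual implementation: the parameter names are hypothetical, and it assumes "auto-restart" refers to the systemd sub-state of that name.

```python
def is_unstable(state: str, sub_state: str, restarts: int) -> bool:
    """Return True if a service matches any of the listed instability criteria.

    Assumed reading of the criteria: the service has failed, has restarted
    more than 3 times, or is currently in the auto-restart sub-state.
    """
    return state == "failed" or restarts > 3 or sub_state == "auto-restart"


print(is_unstable("active", "running", 0))       # stable service
print(is_unstable("active", "running", 4))       # too many restarts
print(is_unstable("activating", "auto-restart", 1))
```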

List Repositories

View repositories on a machine with detailed stats:

rdc machine repos --name server-1

| Column | Description |
| --- | --- |
| Name | Repository name |
| Size | Disk image size |
| Mount | Mounted or unmounted |
| Docker | Docker daemon running or stopped |
| Containers | Container count |
| Disk Usage | Actual disk usage within the repository |
| Modified | Last modification time |

Options:

  • --search <text>: Filter by name or mount path
  • --output json: Machine-readable JSON output

JSON output includes name (resolved) and guid (original GUID), and nests each repository’s containers (with domain, autoRoute, repository/repository_guid) and services arrays.

Storage Health

Inspect BTRFS fragmentation and reflink sharing across all repositories on a machine:

rdc machine query --name server-1 --storage-health

| Column | Description |
| --- | --- |
| Size | LUKS image file size (what the repo looks like) |
| Unique | Actual unique data owned only by this repo |
| Shared | Data blocks reused across repos via BTRFS reflinks (free copies) |
| Extents | Number of file extents (higher = more fragmented) |
| Frag | Fragmentation level: low, moderate, or high |

The summary shows total savings from BTRFS reflinks:

14 repos, 224.3 GB virtual size
Unique data: 323.7 MB | Shared: 224.0 GB | Efficiency: 99.9%
  • Virtual size is the sum of all repo image sizes. This is the apparent size of the repos, but it double-counts blocks shared via reflinks.
  • Unique data is the actual storage consumed by repo data that exists in only one repo. This is what you would free by deleting a repo.
  • Shared is data reused across repos via BTRFS reflinks. Forking a repo creates reflink copies that share blocks until either side writes new data, at which point blocks diverge.
  • Efficiency is the percentage of data reused via reflinks. Higher is better. A machine with many forks from the same parent will show near-100% efficiency.
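The sample summary is consistent with efficiency being computed as shared data divided by the sum of unique and shared data. The sketch below works through that arithmetic; the formula is inferred from the sample numbers rather than taken from the tool itself, and the function name is hypothetical.

```python
def reflink_efficiency(unique_gb: float, shared_gb: float) -> float:
    """Percentage of data that is shared via reflinks (assumed formula)."""
    total = unique_gb + shared_gb
    if total == 0:
        return 0.0
    return 100.0 * shared_gb / total


# Values from the sample summary, with MB converted to GB (decimal units).
unique_gb = 323.7 / 1000
shared_gb = 224.0

print(f"Virtual size: {unique_gb + shared_gb:.1f} GB")
print(f"Efficiency: {reflink_efficiency(unique_gb, shared_gb):.1f}%")
```

With these inputs the result rounds to the 224.3 GB virtual size and 99.9% efficiency shown in the summary.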

Repos with high fragmentation and zero shared blocks can be safely defragmented with btrfs filesystem defragment. Repos with shared blocks should NOT be defragmented because defrag replaces shared blocks with unique copies, increasing disk usage.
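The defragmentation rule above can be expressed as a small check. This is a sketch under assumptions: the parameter names are hypothetical, and it takes the Frag level and shared byte count as already extracted from the storage-health report.

```python
def safe_to_defragment(frag: str, shared_bytes: int) -> bool:
    """Apply the rule above: only highly fragmented repos with no shared blocks.

    Defragmenting a repo that shares blocks would replace those blocks with
    unique copies and increase disk usage, so any shared data rules it out.
    """
    return frag == "high" and shared_bytes == 0


print(safe_to_defragment("high", 0))          # candidate for defragmentation
print(safe_to_defragment("high", 4096))       # shares blocks, leave alone
print(safe_to_defragment("moderate", 0))      # not fragmented enough
```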

The scan runs in parallel and takes 5-15 seconds depending on the number and size of repos. When --storage-health is not specified, a one-line hint appears after the query output as a reminder.

BTRFS Scrub

Rediacc automatically schedules a weekly BTRFS scrub on every machine. The scrub reads every data block on the datastore, verifies checksums, and reports any corruption. This catches silent data corruption (bitrot) before it propagates to backups and forks.

The scrub runs every Sunday at 02:00 local time (machine timezone) with a randomized delay of up to 1 hour. It runs at the lowest I/O priority (ionice idle, nice 19) so it does not interfere with running services. On SSD-backed machines, expect roughly 8 minutes per 100 GB of datastore.
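For planning purposes, the SSD figure above translates into a rough duration estimate. The sketch below simply scales the stated 8 minutes per 100 GB linearly; actual scrub times depend on hardware and load, and the function name is hypothetical.

```python
def estimate_scrub_minutes(datastore_gb: float,
                           minutes_per_100_gb: float = 8.0) -> float:
    """Rough scrub duration, scaling the documented SSD rate linearly."""
    return datastore_gb / 100.0 * minutes_per_100_gb


# A 500 GB datastore on SSD: 5 x 8 minutes.
print(estimate_scrub_minutes(500))
```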

The scrub timer is installed automatically on the first daemon start after a renet upgrade. If a future renet version changes the scrub policy, the timer is updated on the next daemon start with no user action needed.

Scrub status

The result of the last scrub is saved outside the BTRFS volume (at /var/lib/rediacc/scrub-last-result.json) so it remains readable even if the volume has issues. The rdc machine query --system output includes a scrub_status field:

"scrub_status": {
  "last_run_human": "3 days ago",
  "status": "ok",
  "total_errors": 0,
  "uncorrectable": 0,
  "duration_seconds": 312
}

| Status | Meaning |
| --- | --- |
| ok | Last scrub completed with no errors |
| never_run | Scrub has not run yet (timer was just installed) |
| overdue | Last scrub was more than 14 days ago |
| errors_found | Scrub found checksum mismatches (check the total_errors and uncorrectable counts) |
| failed | Scrub process exited with a non-zero code |

If uncorrectable is greater than zero, the affected blocks cannot be repaired automatically (single-disk BTRFS has no redundant copy). Restore the affected repository from the most recent backup.
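A monitoring script could interpret the scrub_status object along these lines. This is a minimal sketch: the status meanings are copied from the table above, the function name is hypothetical, and it assumes fields such as uncorrectable may be absent (for example when the status is never_run).

```python
import json

# Status meanings as documented above.
MEANINGS = {
    "ok": "Last scrub completed with no errors",
    "never_run": "Scrub has not run yet",
    "overdue": "Last scrub was more than 14 days ago",
    "errors_found": "Scrub found checksum mismatches",
    "failed": "Scrub process exited with a non-zero code",
}

# The scrub_status object from the example above.
SAMPLE = """{
  "last_run_human": "3 days ago",
  "status": "ok",
  "total_errors": 0,
  "uncorrectable": 0,
  "duration_seconds": 312
}"""


def summarize_scrub(scrub_status: dict) -> tuple[str, bool]:
    """Return (meaning, needs_restore) for a scrub_status object."""
    status = scrub_status.get("status", "")
    meaning = MEANINGS.get(status, f"Unknown status: {status!r}")
    # Uncorrectable errors cannot be repaired on single-disk BTRFS,
    # so the affected repository has to be restored from backup.
    needs_restore = scrub_status.get("uncorrectable", 0) > 0
    return meaning, needs_restore


print(summarize_scrub(json.loads(SAMPLE)))
```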

Manual scrub

To run a scrub immediately (e.g. after a power failure or disk migration):

rdc term connect -m server-1 -c "sudo renet maintenance scrub --datastore /mnt/rediacc"

The result is saved to the same JSON file and immediately visible in the next rdc machine query --system.

Vault Status

Get a complete overview of a machine including deployment information:

rdc machine vault-status --name server-1

This provides:

  • Hostname and uptime
  • Memory, disk, and datastore usage
  • Total repositories, mounted count, Docker running count
  • Detailed per-repository information

Use --output json for machine-readable output.

Test Connection

Cloud adapter only. In local mode, use rdc term connect -m server-1 -c "hostname" to verify connectivity.

Verify SSH connectivity to a machine:

rdc machine test-connection --ip 203.0.113.50 --user deploy

Reports:

  • Connection status (success/failed)
  • Authentication method used
  • SSH key configuration
  • Public key deployment status
  • Known hosts entry

Options:

  • --port <number>: SSH port (default: 22)
  • --save -m server-1: Save verified host key to machine config

Diagnostics (doctor)

Run a comprehensive diagnostic check of your Rediacc environment:

rdc doctor

| Category | Checks |
| --- | --- |
| Environment | Node.js version, CLI version, SEA mode, Go installation, Docker availability |
| Renet | Binary location, version, CRIU, rsync, SEA embedded assets |
| Configuration | Active config, adapter, machines, SSH key |
| Virtualization | Checks if your system can run local virtual machines (rdc ops) |

Each check reports OK, Warning, or Error. Use this as a first step when troubleshooting any issue.

Exit codes: 0 = all passed, 1 = warnings, 2 = errors.
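The exit codes make rdc doctor easy to use as a gate in scripts. The sketch below maps the documented codes to messages and wraps the command with subprocess; the helper names are hypothetical, and it assumes rdc is on the PATH when run_doctor is called.

```python
import subprocess

# Exit codes as documented above.
EXIT_MEANINGS = {0: "all checks passed", 1: "warnings", 2: "errors"}


def describe_exit_code(code: int) -> str:
    """Translate an rdc doctor exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_doctor() -> str:
    """Run rdc doctor and describe its result (requires rdc on the PATH)."""
    result = subprocess.run(["rdc", "doctor"], check=False)
    return describe_exit_code(result.returncode)


print(describe_exit_code(1))
```

Calling run_doctor() from a CI step or pre-deployment script would return one of the three messages, letting the caller decide whether warnings should block the pipeline.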