Show HN: Open-source GPU cost analysis tool
# gpuaudit

Scan your cloud for GPU waste and get actionable recommendations to cut your spend.

```
$ gpuaudit scan --skip-eks
Found 38 GPU nodes across 47 nodes in gpu-cluster

gpuaudit — GPU Cost Audit for AWS
Account: 123456789012 | Regions: us-east-1 | Duration: 4.2s

┌──────────────────────────────────────────────┐
│ GPU Fleet Summary                            │
├──────────────────────────────────────────────┤
│ Total GPU instances:                   38    │
│ Total monthly GPU spend:          $127450    │
│ Estimated monthly waste:     $18200 (14%)    │
└──────────────────────────────────────────────┘

CRITICAL — 3 instance(s), $15400/mo potential savings

Instance                      Type                    Monthly  Signal  Recommendation
────────────────────────────  ──────────────────────  ───────  ──────  ──────────────────────────────────────────
gpu-cluster/ip-10-15-255-248  g6e.16xlarge (1× L40S)  $ 6752   idle    Node up 13 days with 0 GPU pods scheduled.
gpu-cluster/ip-10-22-250-15   g6e.16xlarge (1× L40S)  $ 6752   idle    Node up 1 day with 0 GPU pods scheduled.
...
```

## What it scans

- **EC2** — GPU instances (g4dn, g5, g6, g6e, p4d, p4de, p5, inf2, trn1) with CloudWatch metrics
- **SageMaker** — Endpoints with GPU utilization and invocation metrics
- **EKS** — Managed GPU node groups via the AWS EKS API
- **Kubernetes** — GPU nodes and pod allocation via the Kubernetes API (Karpenter, self-managed, any CNI)

## What it detects

- **Idle GPU instances** — running but doing nothing (low CPU + near-zero network for 24+ hours)
- **Oversized GPU** — multi-GPU instances where utilization suggests a single GPU would suffice
- **Pricing mismatch** — on-demand instances running 30+ days that should be Reserved Instances
- **Stale instances** — non-production instances running 90+ days
- **SageMaker low utilization** — endpoints with <10% GPU utilization
- **SageMaker oversized** — endpoints using <30% GPU memory on multi-GPU instances
- **K8s unallocated GPUs** — nodes with GPU capacity but no pods requesting GPUs

## Install

```sh
go install github.com/gpuaudit/cli/cmd/gpuaudit@latest
```

Or build from source:

```sh
git clone https://github.com/gpuaudit/cli.git
cd cli
go build -o gpuaudit ./cmd/gpuaudit
```

## Quick start

```sh
# Uses default AWS credentials (~/.aws/credentials or environment variables)
gpuaudit scan

# Specific profile and region
gpuaudit scan --profile production --region us-east-1

# Kubernetes cluster scan (uses KUBECONFIG or ~/.kube/config)
gpuaudit scan --skip-eks

# Specific kubeconfig and context
gpuaudit scan --kubeconfig ~/.kube/config --kube-context gpu-cluster

# JSON output for automation
gpuaudit scan --format json -o report.json

# Compare two scans to see what changed
gpuaudit diff old-report.json new-report.json

# Slack Block Kit payload (pipe to webhook)
gpuaudit scan --format slack -o - | \
  curl -X POST -H 'Content-Type: application/json' -d @- $SLACK_WEBHOOK

# Skip specific scanners
gpuaudit scan --skip-metrics     # faster, less accurate
gpuaudit scan --skip-sagemaker
gpuaudit scan --skip-eks         # skip the AWS EKS API (use --skip-k8s for the Kubernetes API)
gpuaudit scan --skip-k8s
```

## Comparing scans

Save scan results as JSON, then diff them later:

```sh
gpuaudit scan --format json -o scan-apr-08.json
# ... time passes, changes happen ...
gpuaudit scan --format json -o scan-apr-15.json
gpuaudit diff scan-apr-08.json scan-apr-15.json
```

```
gpuaudit diff — 2026-04-08 12:00 UTC → 2026-04-15 12:00 UTC

┌──────────────────────────────────────────────┐
│ Cost Delta                                   │
├──────────────────────────────────────────────┤
│ Monthly spend:   $142000 → $127450 (-$14550) │
│ Estimated waste: …
```
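Because scans serialize to JSON and `diff` compares any two reports, a scheduled drift check falls out naturally. Here is a minimal sketch of a weekly cron or CI job; it sticks to the flags documented above, and the report naming scheme and `SLACK_WEBHOOK` variable are placeholders rather than gpuaudit conventions.

```sh
#!/usr/bin/env sh
# Sketch of a weekly cron/CI job: scan, diff against the previous
# report, and post the Slack summary. File names and SLACK_WEBHOOK
# are placeholders, not gpuaudit conventions.
set -eu

today=$(date +%F)
gpuaudit scan --format json -o "scan-${today}.json"

# Diff against the most recent earlier report, if one exists.
prev=$(ls scan-*.json 2>/dev/null | grep -v "$today" | sort | tail -n 1)
if [ -n "$prev" ]; then
  gpuaudit diff "$prev" "scan-${today}.json"
fi

# Second scan emits the Slack Block Kit payload shown in Quick start.
gpuaudit scan --format slack -o - | \
  curl -sS -X POST -H 'Content-Type: application/json' -d @- "$SLACK_WEBHOOK"
```

The scan runs twice (once for the JSON report, once for the Slack payload) because the sketch only uses documented flags rather than assuming one output format can be converted to another.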
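You could also gate a pipeline on the numbers in the JSON report with `jq`. The report schema isn't documented in this README, so the field path below is an illustrative guess; inspect a real `report.json` and substitute the actual key before relying on it.

```sh
# Fail a CI job when estimated monthly waste crosses a budget gate.
# ASSUMPTION: '.summary.estimated_monthly_waste' is a hypothetical key,
# not taken from gpuaudit's docs -- check a real report.json first.
gpuaudit scan --format json -o report.json
waste=$(jq -r '.summary.estimated_monthly_waste // 0' report.json)
if [ "$(printf '%.0f' "$waste")" -gt 10000 ]; then
  echo "Estimated GPU waste \$${waste}/mo exceeds the \$10000 budget" >&2
  exit 1
fi
```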