Reliability and SLA
Uptime guarantees, the status page, idempotency, and how we handle scheduled maintenance.
A QR you printed last year should still scan today. A 30-second outage on our redirect path is a 30-second outage for every dynamic QR in the world. We treat reliability as the product.
Uptime targets
| Plan | Render endpoint SLA | Redirect endpoint SLA |
|---|---|---|
| Free | best effort | best effort |
| Starter | best effort | best effort |
| Pro | 99.5% | 99.5% |
| Agency | 99.9% | 99.9% |
| Enterprise | 99.95% | 99.95% with regional failover |
Uptime is measured monthly. SLA breach credits are applied automatically to the next invoice.
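To make the percentages concrete, the arithmetic below converts an uptime target into a monthly downtime budget. It is a minimal sketch that assumes a 30-day measurement window (the text above only says "monthly", so the exact window is an assumption), and `downtime_budget_minutes` is a hypothetical helper name, not part of any SDK.

```python
def downtime_budget_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the minutes of downtime allowed per month at a given uptime target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# Plan targets taken from the table above.
for plan, target in [("Pro", 99.5), ("Agency", 99.9), ("Enterprise", 99.95)]:
    print(f"{plan}: {downtime_budget_minutes(target):.1f} minutes per 30-day month")
```

Under that 30-day assumption, Pro allows roughly 216 minutes of downtime per month, Agency roughly 43 minutes, and Enterprise roughly 22 minutes.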
Health and status endpoints
| Endpoint | Auth | Purpose |
|---|---|---|
| GET /api/health/ | none | Cheap liveness check. Returns {"status":"ok"}. No DB call. |
| GET /api/status/ | none | Structured readiness. Probes DB, cache, and reports per-component state. |
Use /api/health/ for binary up/down monitoring (UptimeRobot, Railway
healthcheck, k8s liveness). Use /api/status/ for the dashboard
component view.
The public status page lives at status.qrstudio.agency and surfaces the same data plus historical incidents.
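If you are wiring the liveness check into your own monitor rather than a hosted one, a probe could look like the sketch below. It relies only on the documented `{"status":"ok"}` body. The host `api.qrstudio.agency` is borrowed from the other examples on this page and is an assumption for this endpoint; `is_healthy` and `check_liveness` are hypothetical names.

```python
import json
import urllib.request

# Assumed host; the /api/health/ path comes from the table above.
HEALTH_URL = "https://api.qrstudio.agency/api/health/"


def is_healthy(body: bytes) -> bool:
    """Return True only if the body is the documented {"status":"ok"} payload."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_liveness(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; connection failures and HTTP error statuses count as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

Keeping the parsing in `is_healthy` separate from the network call makes the check easy to unit test without touching the live endpoint.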
Idempotency
The /api/v1/generate/ endpoint is naturally idempotent: same input,
same output, cached for 24 hours. Re-sending the same request after a
network blip yields the same image, served from cache. The retry still
counts as a quota credit, because both calls produced a deliverable
for you.
For non-idempotent endpoints (creating a key, creating a dynamic QR,
posting to billing), include an Idempotency-Key header with a UUID
of your choice:
```shell
curl -X POST https://api.qrstudio.agency/api/v1/dynamic/ \
  -H "Authorization: Bearer <jwt>" \
  -H "Idempotency-Key: 4e1c8a76-9f91-4b8b-9b2c-6d7e8f9a0b1c" \
  -H "Content-Type: application/json" \
  -d '{"name":"Spring menu","destination_url":"https://x.com/menu"}'
```

We store the idempotency key for 24 hours. A retry with the same key
returns the original response; a retry with the same key but a
different body returns 409 Conflict.
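On the client side, the practical consequence is that the key and the request body should be created once per logical operation and reused unchanged on every retry. The sketch below illustrates that pattern; `new_operation` and `headers_for` are hypothetical helpers, and serializing the body once is a precaution that assumes the server compares bodies strictly, which this page does not specify.

```python
import json
import uuid


def new_operation(payload: dict) -> dict:
    """Bundle a payload with a fresh Idempotency-Key for one logical operation."""
    return {
        "idempotency_key": str(uuid.uuid4()),
        # Serialize once so every retry sends byte-identical content.
        "body": json.dumps(payload, sort_keys=True).encode("utf-8"),
    }


def headers_for(operation: dict, token: str) -> dict:
    """Build the headers for any attempt (first try or retry) of the operation."""
    return {
        "Authorization": f"Bearer {token}",
        "Idempotency-Key": operation["idempotency_key"],
        "Content-Type": "application/json",
    }


op = new_operation({"name": "Spring menu", "destination_url": "https://x.com/menu"})
first_attempt = headers_for(op, "<jwt>")
retry_attempt = headers_for(op, "<jwt>")
# Same key on both attempts, so the retry maps to the original response.
assert first_attempt["Idempotency-Key"] == retry_attempt["Idempotency-Key"]
```

Generating a new key inside the retry loop would defeat the purpose: each attempt would look like a new operation and could create a duplicate resource.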
Retry policy guidance
| Status | Retry? | Why |
|---|---|---|
| 200 OK | n/a | You got the result |
| 400-403 | No | Caller bug; retrying does not help |
| 404 | No | Resource truly missing |
| 409 | No | Idempotency mismatch; resolve client-side |
| 422 | No | Per-item validation failure on bulk |
| 429 | After quota reset | Re-running does not change quota |
| 5xx | Yes, with backoff | Likely transient |
Recommended backoff for 5xx: exponential, base 1 second, max 3 retries. Add jitter so concurrent clients do not stampede.
Our official SDKs implement this policy by default. If you roll your own client, follow it.
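For a hand-rolled client, the policy above could be sketched as follows. This is an illustration, not the SDKs' actual code: the function names are hypothetical, and "full jitter" (a uniform random delay between zero and the exponential ceiling) is one common choice, since the guidance does not say which jitter strategy to use.

```python
import random

RETRYABLE = range(500, 600)  # only 5xx is retried, per the table above


def should_retry(status: int) -> bool:
    """Return True for statuses the table marks as retryable with backoff."""
    return status in RETRYABLE


def backoff_delays(base: float = 1.0, max_retries: int = 3, rng=random.random):
    """Yield one jittered delay in seconds per retry, capped at base * 2**attempt."""
    for attempt in range(max_retries):
        ceiling = base * (2 ** attempt)  # 1s, 2s, 4s with the defaults
        yield ceiling * rng()            # full jitter: uniform in [0, ceiling)


def call_with_retries(send, sleep, **backoff):
    """Call send() until it returns a non-retryable status or retries run out."""
    status = send()
    for delay in backoff_delays(**backoff):
        if not should_retry(status):
            break
        sleep(delay)
        status = send()
    return status
```

Here `send` is any zero-argument callable that performs the request and returns the HTTP status code, and `sleep` would normally be `time.sleep`. Passing both in keeps the loop testable without real waiting. A 429 is deliberately not retried by this loop, because the table says to wait for the quota reset instead.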
Scheduled maintenance
We pre-announce any maintenance that may affect availability at least 72 hours in advance via:
- The status page banner
- Email to every workspace owner whose plan SLA is affected
Render endpoints can degrade during maintenance (longer p99 latency). Redirect endpoints stay up via regional failover on Enterprise.
What we do during incidents
- Acknowledge on the status page within 5 minutes.
- Escalate to on-call engineer via PagerDuty.
- Post a public RCA within 5 business days for any incident lasting over 30 minutes or affecting paid tiers.
The RCA archive lives at status.qrstudio.agency/history.
Regional architecture
- Primary region: Montreal (yul-1). Primary Postgres, Redis, and app servers.
- Read replicas: us-east, eu-west.
- Redirect failover (Enterprise): the redirect path runs in three regions with shared dynamic-QR config. If primary fails, the alternate region serves the 302 with eventual scan-log consistency.
Backups
- Postgres: continuous WAL archiving with PITR for the last 30 days.
- Daily full snapshot, retained 90 days.
- Render cache is not backed up (re-rendering on miss is the recovery plan).
- Encrypted at rest with AES-256.
Data retention
| Data | Retention |
|---|---|
| API keys | Until revoked |
| QrApiCall audit rows | 13 months (rolling) |
| DynamicQrScan rows | 13 months (rolling) |
| Webhook deliveries | 30 days |
| Stripe webhook events | 13 months |
Enterprise customers can negotiate longer retention windows.
What is next
- Security model: SSRF, hashing, signing
- API reference: status endpoint