# Reliability

> Operational rules for idempotency, rate limits, retries, concurrency, and short-lived output URLs.

## Replay-safe writes

Every write endpoint requires an `Idempotency-Key`. Reusing the same key with the same payload is safe and returns the original job. Reusing it with a different payload is rejected with `409 idempotency_conflict`.

| Behavior                    | Result                                                          |
| --------------------------- | --------------------------------------------------------------- |
| Same key, same payload      | Returns the original job instead of creating a duplicate write. |
| Same key, different payload | Returns `409 idempotency_conflict`.                             |
| Retention window            | Idempotency records are retained for 24 hours.                  |
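
The three rows above can be illustrated with a toy in-memory model. This is a sketch for reasoning about client behavior, not the service's implementation: `IdempotencyStore`, `IdempotencyConflict`, the payload fingerprinting, and the job ID format are all hypothetical, and treating a key as new once the 24-hour window has passed is an assumption.

```python
import hashlib
import json
import time
import uuid


class IdempotencyConflict(Exception):
    """Models the documented `409 idempotency_conflict` response."""


class IdempotencyStore:
    """Toy model of replay-safe writes (hypothetical, for illustration only)."""

    RETENTION_SECONDS = 24 * 60 * 60  # retention window from the table

    def __init__(self, clock=time.time):
        self._clock = clock
        self._records = {}  # key -> (payload fingerprint, job id, created at)

    @staticmethod
    def _fingerprint(payload):
        # Canonical JSON so key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def submit(self, key, payload):
        now = self._clock()
        fingerprint = self._fingerprint(payload)
        record = self._records.get(key)
        if record is not None and now - record[2] < self.RETENTION_SECONDS:
            if record[0] != fingerprint:
                # Same key, different payload.
                raise IdempotencyConflict("idempotency_conflict")
            # Same key, same payload: return the original job.
            return record[1]
        # Unknown key, or (assumption) a key whose record has aged out.
        job_id = f"job_{uuid.uuid4().hex[:12]}"
        self._records[key] = (fingerprint, job_id, now)
        return job_id
```

On the client side, the practical consequence is to generate one key per logical write (for example with `uuid.uuid4()`), store it alongside the pending request, and send that same key on every retry of that request.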

## Rate limits and concurrency

The API applies layered limits per key, per organization, and in some cases per IP. Limit checks happen before expensive work starts.

| Route                                          | Limit                                              |
| ---------------------------------------------- | -------------------------------------------------- |
| `POST /api/v1/source-assets`                   | Per-key and per-organization write limits.         |
| `GET /api/v1/source-assets/{ingest_id}`        | Read limits for polling source ingest state.       |
| `POST /api/v1/generations`                     | Per-key and per-organization write limits.         |
| `GET /api/v1/generations/{job_id}`             | Read limits for polling and recovery.              |
| `POST /api/v1/generations/{job_id}/regenerate` | Write limits plus organization in-flight job caps. |
| `GET /api/v1/usage/summary`                    | Read limits for reporting.                         |
| `GET /api/v1/billing/events`                   | Read limits for ledger export.                     |

<Note>
  The organization-wide in-flight cap is 50 simultaneous generation jobs.
</Note>
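
A client can avoid hitting the in-flight cap by bounding its own concurrency. The following is a minimal sketch, assuming each job is wrapped in a callable that submits a generation and blocks until it finishes; `InFlightLimiter`, `run_all`, and the callable shape are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

ORG_IN_FLIGHT_CAP = 50  # organization-wide cap stated in the note above


class InFlightLimiter:
    """Client-side guard that keeps at most `limit` jobs in flight at once."""

    def __init__(self, limit=ORG_IN_FLIGHT_CAP):
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, job):
        # `job` is a hypothetical callable that submits a generation and
        # blocks until it reaches a terminal state, so the slot is held for
        # the job's whole lifetime.
        with self._slots:
            return job()


def run_all(limiter, jobs, workers=64):
    """Run every job on a thread pool while respecting the limiter."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(limiter.run, jobs))
```

Because the cap applies to the whole organization, a process that shares the organization with other workers would need a lower local limit than 50.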

## Retry strategy

* Retry `429` responses after at least the advertised `retry_after` value.
* Retry retryable `5xx` responses with exponential backoff and jitter.
* Keep the same idempotency key when repeating the same write.
* Do not retry validation, authorization, entitlement, or idempotency conflict errors unchanged. Fix the request or credentials first.
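
The rules above can be sketched as a single delay function. This is an illustrative helper, not part of any official client: `retry_delay` and its parameters are hypothetical, the caller is assumed to have already extracted `retry_after` (in seconds) from the response, and the set of `5xx` codes treated as retryable is an assumption because this page does not enumerate them.

```python
import random
from typing import Optional

# Assumption: which 5xx codes count as retryable is not listed on this page.
RETRYABLE_5XX = frozenset({500, 502, 503, 504})


def retry_delay(
    status: int,
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 30.0,
    rng=random.random,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is advised.

    `attempt` is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    backoff = min(cap, base * (2 ** attempt))
    if status == 429:
        # Wait at least the advertised value, plus a little jitter so that
        # many clients do not all resume at the same instant.
        floor = retry_after if retry_after is not None else backoff
        return floor + rng() * base
    if status in RETRYABLE_5XX:
        # Exponential backoff with full jitter: uniform in [0, backoff).
        return rng() * backoff
    # Validation, authorization, entitlement, and idempotency conflict
    # errors land here: retrying the same request will not help.
    return None
```

When the function returns a delay for a write, the retried request should carry the same idempotency key as the original attempt.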

## Signed URLs are temporary

Returned output URLs are intentionally short-lived. API responses also send `Cache-Control: no-store` so clients always read fresh job and billing state.

Re-read the job if a signed output URL has expired. Do not assume the first URL returned for a job remains valid for the lifetime of your workflow.
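
One way to follow this advice is to fetch the job immediately before every download attempt instead of storing the URL. The sketch below assumes two caller-supplied callables: `fetch_job`, which reads the job by ID, and `download`, which raises `UrlExpired` when the signed URL is rejected. Those names, the `UrlExpired` exception, and the `output_url` field are hypothetical, since this page does not describe the job payload.

```python
from typing import Callable


class UrlExpired(Exception):
    """Raised by the hypothetical `download` callable for an expired URL."""


def download_output(
    job_id: str,
    fetch_job: Callable[[str], dict],
    download: Callable[[str], bytes],
    max_refreshes: int = 2,
) -> bytes:
    """Download a job's output, re-reading the job whenever the URL expires."""
    for _ in range(max_refreshes + 1):
        # Always re-read the job so the URL is as fresh as possible.
        job = fetch_job(job_id)
        try:
            return download(job["output_url"])  # field name is an assumption
        except UrlExpired:
            continue  # the next loop iteration fetches a new signed URL
    raise UrlExpired(f"output URL for {job_id} kept expiring")
```

Persisting the job ID rather than the signed URL keeps long-running workflows from depending on a link that may no longer be valid.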
