kisenon

Troubleshooting

Common alpha-era failure modes and how to recover.

Kisenon is in alpha. Here are the failure modes you are most likely to hit, and how to get past each.

Endpoint stuck in Pending

Symptom: an endpoint sits in Pending for more than a few seconds on creation. The console shows no error; the CLI shows the same state.

Likely cause: the compute pod cannot be scheduled. The most common reasons are cluster pressure (no node has the requested CPU or memory free) or a stale image-pull credential. Both are operator-side issues.

What to do:

  1. Wait 60 seconds. Transient pressure usually clears once another endpoint suspends.
  2. If it does not clear, delete the endpoint and recreate. The control plane will retry scheduling on a different node.
  3. If recreation also lands in Pending, the cluster most likely has a persistent scheduling problem that you cannot fix from the client side. File a report at the GitHub tracker with the endpoint id and the wall-clock time.
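
Step 1 can be scripted as a bounded poll. The sketch below is a generic helper, not part of keon: wait_while_pending is a hypothetical name, and because this page does not name a command that prints the endpoint state, the helper takes that command as its arguments and assumes it prints a single word such as Pending.

```shell
# Poll a state-printing command until it stops printing "Pending"
# or the timeout passes.
# Usage: wait_while_pending <timeout_seconds> <pause_seconds> <command> [args...]
# Returns 0 (and prints the new state) once the state changes,
# 1 if it is still Pending at the timeout, 2 if the command itself fails.
wait_while_pending() {
  timeout=$1; pause=$2; shift 2
  start=$(date +%s)
  while :; do
    state=$("$@") || return 2
    [ "$state" != "Pending" ] && { printf '%s\n' "$state"; return 0; }
    [ $(( $(date +%s) - start )) -ge "$timeout" ] && return 1
    sleep "$pause"
  done
}
```

Called with a timeout of 60 and a pause of 5, it would cover the 60-second wait from step 1; the 5-second interval is an arbitrary choice. A return code of 1 is the cue to move on to step 2.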

FATAL: endpoint unavailable on first connect

Symptom: psql returns FATAL: endpoint unavailable on the very first connection to a freshly created endpoint, or after a long idle.

Likely cause: a cold-start race. The endpoint was in the Stopped state and your connection attempt woke it, but Postgres was still replaying WAL up to the branch HEAD when your client gave up.

What to do: retry the connection. A cold start typically completes in 300–500 ms, but the very first wake of a freshly created project, or the first wake after 24 hours or more of idle, can take 10–30 seconds while the pageserver page cache warms. Most drivers tolerate this if you allow at least one retry; raw psql does not retry by default.

# psql with one explicit retry
for i in 1 2; do psql "$URI" -c '\q' && break; sleep 5; done

If the endpoint is still unavailable after 30 seconds, the pod itself may have failed. Check the endpoint state in the console and follow the steps under "Endpoint stuck in Pending" above.
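
A single retry after 5 seconds covers the typical case but can fall short of the 10–30 second first wake. As a rough sketch, the helper below retries any command until it succeeds or a deadline passes. retry_until is a hypothetical name, not a keon or psql feature, and the 2-second pause in the usage comment is an arbitrary choice.

```shell
# Retry a command until it succeeds or the deadline passes.
# Usage: retry_until <deadline_seconds> <pause_seconds> <command> [args...]
# Returns 0 on the first success, 1 if the deadline passes first.
retry_until() {
  deadline=$1; pause=$2; shift 2
  start=$(date +%s)
  while :; do
    "$@" && return 0
    now=$(date +%s)
    [ $((now - start)) -ge "$deadline" ] && return 1
    sleep "$pause"
  done
}

# Example (not run here): allow up to 30 seconds for a cold start.
# retry_until 30 2 psql "$URI" -c '\q'
```

A deadline of 30 seconds matches the threshold above: if the helper returns 1, treat the endpoint as failed rather than slow.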

keon connection-string returns branch_not_found

Symptom:

$ keon connection-string my-feature --project prj_abc...
Error: branch_not_found: my-feature

…but the branch exists in the console.

Likely cause: the name doesn't match a branch in the project you targeted — usually a typo, or the wrong --project. The CLI resolves a branch by id or by name, whether you pass it as the positional argument or via --branch, so a name that does exist will resolve either way.

What to do: confirm the branch name and project:

keon branches list --project prj_abc...
keon connection-string my-feature --project prj_abc...
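
To catch a typo without reading the listing by eye, the output can be filtered for the exact name. This is only a sketch: branch_listed is a hypothetical helper, and it assumes that keon branches list prints branch names as plain text, which this page does not confirm.

```shell
# Succeeds when the given name appears as a whole word in the text on stdin.
# -F treats the name as a literal string, -w rejects partial matches such as
# "my-featur" inside "my-feature", and -q suppresses output.
# Usage: keon branches list --project <project> | branch_listed <name>
branch_listed() {
  grep -qwF -- "$1"
}
```

A non-zero exit status for a name you expected to find points at a typo or at the wrong --project value. Note that -w treats a hyphen as a word boundary, so a longer name that extends the one you passed after a hyphen would also match.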

Sign-in returns access_denied

Symptom: Google or GitHub OAuth completes, but the console redirects to an error page citing access_denied.

Likely cause: during alpha, sign-in is gated by an email allowlist. If your address has not been onboarded, the sign-in callback rejects it even though the OAuth step itself succeeded.

What to do: apply at Alpha access. Once your address is added to the allowlist, sign-in completes normally on the next attempt.

Console session expires mid-session

Symptom: the console works for a while, then suddenly returns 401s on every API call until you sign out and back in.

Likely cause: the cp-signed JWT expired and its refresh window has lapsed. The console mints a short-lived JWT (≈15 minutes) and, while you're active, refreshes it in the background against /v1/auth/refresh. The refresh window is 12 hours and is extended by each successful refresh, so an active session renews indefinitely, but a tab left untouched for longer than that window can no longer refresh.

What to do: sign out and back in. For headless or long-running automation, use an nsk_ API key instead of a browser session — API keys do not expire and are revoked explicitly. See Auth.

Where to file bugs

For anything not covered here:

  • Product bugs and feature requests: the GitHub tracker.
  • Security vulnerabilities: Security — never on the public tracker.

Concrete repro steps, the affected ids (project, branch, endpoint), and a wall-clock timestamp shorten the round-trip dramatically.
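
One way to collect those details consistently is a small stub generator like the sketch below. report_stub is a hypothetical helper, not part of keon, and the field layout is only a suggestion. The timestamp is printed in UTC so that it is unambiguous across time zones.

```shell
# Print the ids and a UTC wall-clock timestamp to paste into a report.
# Usage: report_stub <project_id> <branch_id> <endpoint_id>
# Any id you leave out is printed as "n/a".
report_stub() {
  printf 'project:  %s\n' "${1:-n/a}"
  printf 'branch:   %s\n' "${2:-n/a}"
  printf 'endpoint: %s\n' "${3:-n/a}"
  printf 'observed (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
```

Run it as close to the failure as possible so the timestamp lines up with the event, then add the repro steps below the printed lines.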