Architecture
Workspace structure
| Path | Responsibility |
|---|---|
| `apps/gproxy` | Executable service entry (Axum + embedded frontend assets) |
| `crates/gproxy-core` | AppState, route orchestration, auth, request execution |
| `crates/gproxy-provider` | Channel implementations, retry, OAuth, dispatch, tokenizer |
| `crates/gproxy-middleware` | Protocol conversion middleware, usage extraction |
| `crates/gproxy-protocol` | OpenAI/Claude/Gemini types and conversion models |
| `crates/gproxy-storage` | SeaORM storage layer, query models, async write queue |
| `crates/gproxy-admin` | Admin/user domain services |
Startup phase
Main startup sequence:
- Read `gproxy.toml` and apply CLI/ENV overrides.
- Connect to DB and auto-sync schema.
- Initialize provider registry, credential pool, and credential statuses.
- Ensure admin user (`id=0`) and admin key exist.
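The first startup step, layering overrides on top of the file configuration, can be sketched as follows. This is a minimal illustration rather than gproxy's actual code: the `Settings` fields, the `resolve_settings` function, and the key names are hypothetical, and the precedence shown (file, then ENV, then CLI) is an assumption, since the text only says that CLI/ENV values override the file.

```rust
use std::collections::HashMap;

/// Hypothetical settings; the real gproxy configuration has its own fields.
#[derive(Debug, Clone, PartialEq)]
struct Settings {
    host: String,
    port: u16,
    database_url: String,
}

/// Apply override layers on top of the values read from the config file.
/// Later layers win, so CLI beats ENV here (an assumed precedence).
fn resolve_settings(
    file: Settings,
    env: &HashMap<String, String>,
    cli: &HashMap<String, String>,
) -> Result<Settings, String> {
    let mut settings = file;
    for layer in [env, cli] {
        if let Some(v) = layer.get("host") {
            settings.host = v.clone();
        }
        if let Some(v) = layer.get("port") {
            // Reject malformed overrides instead of silently keeping the old value.
            settings.port = v
                .parse::<u16>()
                .map_err(|e| format!("invalid port {v:?}: {e}"))?;
        }
        if let Some(v) = layer.get("database_url") {
            settings.database_url = v.clone();
        }
    }
    Ok(settings)
}
```

Failing fast on a malformed override keeps a typo in an environment variable from being masked by the file default, which matters because the database connection is made right after this step.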
Request phase
Core request chain:
- Auth: parse and validate `x-api-key` (or compatible headers).
- Routing: resolve target provider by scoped/unscoped routes + dispatch.
- Credential selection: select available credential and do retry/failover.
- Protocol conversion: convert or pass through by provider protocol rules.
- Recording: persist upstream/downstream requests and usage.
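Two steps of this chain, key extraction and credential failover, can be sketched with the standard library alone. Everything here is illustrative: `extract_api_key`, `execute_with_failover`, and the `Attempt` classification are hypothetical names, header keys are assumed to be lowercased already, and `authorization: Bearer` is only an assumed example of a "compatible header".

```rust
use std::collections::HashMap;

/// Outcome of one upstream attempt (hypothetical classification).
#[derive(Debug, PartialEq)]
enum Attempt {
    Success(String),
    Retryable(String), // e.g. a rate limit or transient failure
    Fatal(String),     // retrying with another credential will not help
}

/// Auth step: read the key from `x-api-key`, falling back to an assumed
/// `authorization: Bearer <key>` header. Empty keys are treated as missing.
fn extract_api_key(headers: &HashMap<String, String>) -> Option<&str> {
    let bearer = headers
        .get("authorization")
        .and_then(|v| v.strip_prefix("Bearer "));
    headers
        .get("x-api-key")
        .map(String::as_str)
        .or(bearer)
        .map(str::trim)
        .filter(|k| !k.is_empty())
}

/// Credential step: try credentials in order and fail over on retryable errors.
/// Returns the first success, the first fatal error, or the last retryable error.
fn execute_with_failover<F>(credentials: &[&str], mut call: F) -> Result<String, String>
where
    F: FnMut(&str) -> Attempt,
{
    let mut last_error = String::from("no credential available");
    for &cred in credentials {
        match call(cred) {
            Attempt::Success(body) => return Ok(body),
            Attempt::Retryable(err) => last_error = err, // move on to the next credential
            Attempt::Fatal(err) => return Err(err),
        }
    }
    Err(last_error)
}
```

Passing the upstream call as a closure keeps the failover loop independent of any HTTP client, so the same loop could wrap a real request or a test stub.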
Credential health status
- `healthy`: available.
- `partial`: partially available (typically model-level cooldown).
- `dead`: unavailable (excluded from scheduling).
Default cooldown behavior:
- rate limit: around `60s`
- transient failure: around `15s`
This mechanism limits how much a single unstable credential can lower the overall success rate.
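One way to picture the three states and the cooldown defaults is a per-credential map of model-level cooldown deadlines. This is a sketch under assumptions, not the project's data model: the `Credential` struct, its `dead` flag, and the method names are hypothetical, the text does not say what marks a credential as dead, and the durations are the approximate defaults quoted above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Health {
    Healthy,
    Partial,
    Dead,
}

#[derive(Debug, Clone, Copy)]
enum Failure {
    RateLimit,
    Transient,
}

/// Approximate default cooldowns from the text: about 60s and about 15s.
fn default_cooldown(failure: Failure) -> Duration {
    match failure {
        Failure::RateLimit => Duration::from_secs(60),
        Failure::Transient => Duration::from_secs(15),
    }
}

/// Hypothetical per-credential state: model name -> time the cooldown ends.
#[derive(Default)]
struct Credential {
    dead: bool,
    cooldown_until: HashMap<String, Instant>,
}

impl Credential {
    /// Put one model on cooldown for this credential.
    fn record_failure(&mut self, model: &str, failure: Failure, now: Instant) {
        self.cooldown_until
            .insert(model.to_string(), now + default_cooldown(failure));
    }

    /// Dead wins; otherwise any active model cooldown means Partial.
    fn health(&self, now: Instant) -> Health {
        if self.dead {
            Health::Dead
        } else if self.cooldown_until.values().any(|&until| until > now) {
            Health::Partial
        } else {
            Health::Healthy
        }
    }

    /// A credential can serve a model if it is not dead and that model
    /// has no cooldown still running.
    fn is_available_for(&self, model: &str, now: Instant) -> bool {
        !self.dead
            && self
                .cooldown_until
                .get(model)
                .map_or(true, |&until| until <= now)
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the methods, makes the cooldown logic easy to test without sleeping.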