BYOK Enhancement System — Per-Request Limits, Key Management
October 19, 2025
This release significantly expands EnginifyAI’s Bring Your Own Key (BYOK) capabilities. If you use your own AI provider API keys, the platform now enforces per-request credit ceilings and token limits based on your plan tier. It also tracks API key health and last-used timestamps, and records detailed per-request usage data for analytics. These controls give you more visibility into, and governance over, how your keys are used, while platform-side enforcement protects against runaway costs.
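Conceptually, the tier-based limits can be pictured as a per-plan lookup table. The sketch below is illustrative only; the tier names, field names, and numeric values are hypothetical and are not EnginifyAI's published limits.

```typescript
// Hypothetical per-tier limits table; real tier names and values may differ.
interface TierLimits {
  maxCreditsPerRequest: number; // per-request credit ceiling
  maxInputTokens: number;       // prompt size cap
  maxOutputTokens: number;      // completion size cap
}

const TIER_LIMITS: Record<string, TierLimits> = {
  free:     { maxCreditsPerRequest: 5,   maxInputTokens: 8_000,   maxOutputTokens: 2_000 },
  pro:      { maxCreditsPerRequest: 25,  maxInputTokens: 32_000,  maxOutputTokens: 8_000 },
  business: { maxCreditsPerRequest: 100, maxInputTokens: 128_000, maxOutputTokens: 16_000 },
};

// The ceiling scales with tier: a request is rejected up front when its
// estimated cost exceeds the tier's per-request allowance.
function withinCeiling(tier: string, estimatedCredits: number): boolean {
  const limits = TIER_LIMITS[tier];
  return limits !== undefined && estimatedCredits <= limits.maxCreditsPerRequest;
}
```

Keeping the limits in one table like this makes the scaling rule ("higher tiers get higher per-request allowances") a data change rather than a code change.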
New Features
- Per-request credit ceilings — Each plan tier now has a maximum credit cost per individual AI request. This prevents any single generation from consuming an unexpectedly large portion of your credits. The ceiling scales with your tier — higher tiers get higher per-request allowances.
- Token input/output limits — Plan tiers now enforce maximum input and output token counts per request. This provides a safety net against accidentally sending extremely long prompts or receiving unexpectedly large responses, keeping costs predictable.
- API key status tracking — Your stored API keys now have a status indicator (valid, invalid, revoked, or unknown) along with timestamps for when the key was last validated and last used. You can also assign a friendly label to each key (e.g., “Production key” or “Testing key”) so you can easily tell them apart.
- Detailed per-request usage tracking — Every AI request now generates a detailed usage record including the model used, tokens consumed, estimated cost, and variance analysis comparing actual usage against expected usage. This data powers the analytics that help you optimize your model choices and prompt efficiency.
- Comprehensive audit logging — All quota-related events — credit consumption, limit checks, BYOK usage — are now recorded in an audit trail. This provides a complete history of how your quota has been used.
- Quota enforcement in all creation workflows — Credit checks, model allowlist enforcement, and token limit validation now run before every AI call across builders, prompt creation, library creation, and version creation. Each check produces user-friendly error messages with clear guidance on how to resolve the issue, including upgrade paths when applicable.
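The pre-flight sequence described above, allowlist check, token limit check, then credit checks, each producing an actionable error, can be sketched roughly as follows. The function names, error codes, and message wording here are assumptions for illustration, not EnginifyAI's actual API.

```typescript
interface QuotaError {
  code: string;             // machine-readable reason
  message: string;          // user-friendly guidance
  upgradeAvailable: boolean;
}

interface PlanTier {
  name: string;
  allowedModels: string[];
  maxCreditsPerRequest: number;
  maxInputTokens: number;
}

// Hypothetical pre-flight validation run before every AI call.
// Returns null when all checks pass, or the first failing check's error.
function preflightCheck(
  tier: PlanTier,
  model: string,
  inputTokens: number,
  estimatedCredits: number,
  remainingCredits: number,
): QuotaError | null {
  if (!tier.allowedModels.includes(model)) {
    return {
      code: "MODEL_NOT_ALLOWED",
      message: `The model "${model}" is not available on the ${tier.name} plan.`,
      upgradeAvailable: true,
    };
  }
  if (inputTokens > tier.maxInputTokens) {
    return {
      code: "INPUT_TOKENS_EXCEEDED",
      message: `Prompt is ${inputTokens} tokens; the ${tier.name} plan allows up to ${tier.maxInputTokens}. Try shortening the prompt.`,
      upgradeAvailable: true,
    };
  }
  if (estimatedCredits > tier.maxCreditsPerRequest) {
    return {
      code: "PER_REQUEST_CEILING",
      message: `This request would cost about ${estimatedCredits} credits; the per-request ceiling on ${tier.name} is ${tier.maxCreditsPerRequest}.`,
      upgradeAvailable: true,
    };
  }
  if (estimatedCredits > remainingCredits) {
    return {
      code: "INSUFFICIENT_CREDITS",
      message: `You have ${remainingCredits} credits left, but this request needs about ${estimatedCredits}.`,
      upgradeAvailable: true,
    };
  }
  return null; // all checks passed; proceed with the AI call
}
```

Running the cheapest, most specific checks first means the user gets one precise failure reason (and an upgrade path) instead of a generic error after the call has already been attempted.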
Improvements
- Quota-aware model selection — The model selector now factors in your tier’s allowlist when displaying available models, so you only see models you’re authorized to use. This eliminates the confusing experience of selecting a model and then being told you can’t use it.
- User-friendly quota error alerts — When you hit a quota limit, a dedicated alert component displays your current tier, what limit was reached, and what your options are — including upgrading or adjusting your request. This replaces generic error messages with clear, actionable guidance.
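The two improvements above amount to filtering the model list by the tier's allowlist and turning a limit violation into structured, actionable text. A minimal sketch, with hypothetical helper and field names:

```typescript
interface QuotaAlertInput {
  tierName: string;      // the user's current plan
  limitReached: string;  // human-readable name of the limit that was hit
  canUpgrade: boolean;   // whether a higher tier would lift this limit
}

// Quota-aware model selection: only surface models the tier allows,
// so a user never picks a model they are then told they cannot use.
function visibleModels(allModels: string[], allowlist: string[]): string[] {
  const allowed = new Set(allowlist);
  return allModels.filter((m) => allowed.has(m));
}

// Quota alert: turn a limit violation into the three pieces the alert
// shows: current tier, which limit was reached, and the user's options.
function alertLines(input: QuotaAlertInput): string[] {
  return [
    `Current plan: ${input.tierName}`,
    `Limit reached: ${input.limitReached}`,
    input.canUpgrade
      ? "Options: upgrade your plan, or adjust the request to fit your limits."
      : "Options: adjust the request to fit your limits.",
  ];
}
```

Filtering at display time and explaining at failure time are two sides of the same rule set, so both helpers can be driven by the same tier data.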