Our LLM proxy simplifies AI billing by creating a double-entry ledger for every model call. Send your prompt, chosen model, and Customer ID; we handle routing to the provider, returning the response, and attributing tokens by model and type, all in one request.
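Under the hood, that's a single HTTP call. Here is a minimal sketch of what it could look like in Python; the endpoint path and field names are placeholders for illustration, not the actual API:

```python
import requests

# Hypothetical endpoint and field names -- adjust to your deployment.
PROXY_URL = "https://gateway.example.com/v1/proxy"

def run_prompt(prompt: str, model: str, customer_id: str) -> dict:
    """Send one model call through the billing proxy.

    The proxy routes to the provider, returns the response, and
    attributes input/output tokens to the given customer.
    """
    resp = requests.post(
        PROXY_URL,
        json={
            "model": model,              # e.g. "gpt-4o-mini"
            "prompt": prompt,
            "customer_id": customer_id,  # who this usage is attributed to
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # provider response plus token attribution
```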
Andreessen Horowitz just declared:
“AI is driving a shift towards outcome-based pricing. Software is becoming labor.”
But where is the infrastructure for outcome-based billing & accounting?
This is not a payments problem; it's an accounting problem, and getting it wrong leads to catastrophic failure. The future is outcome-based, but the tools to build it don't exist.
Until now.
Why is monetizing AI agents so difficult? Because the financial models that powered the last decade of SaaS are fundamentally broken in the world of Generative AI.
SaaS APIs: Predictable, Per-Call Cost
API calls have a fixed, amortizable compute cost. The size of the JSON payload has a negligible impact on the price.
Billed by the Request
The billable event is the API call itself.
AI Agents: Variable, Computational Cost
An agent's cost is directly tied to the “work” it performs. It's a metered, computational resource, not a fixed endpoint.
Billed by the Computation
The billable events are the input tokens, output tokens, the number of thoughts, and every downstream tool call the agent makes.
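To make the contrast concrete, here is a rough sketch of metering one agent run by tokens and tool calls. The rates are placeholder numbers, not real provider pricing:

```python
from dataclasses import dataclass

# Illustrative rates only -- real provider pricing varies by model.
INPUT_RATE_PER_1K = 0.0005    # $ per 1K input tokens
OUTPUT_RATE_PER_1K = 0.0015   # $ per 1K output tokens
TOOL_CALL_FLAT_FEE = 0.002    # $ per downstream tool call

@dataclass
class Step:
    input_tokens: int
    output_tokens: int
    tool_calls: int

def agent_run_cost(steps: list[Step]) -> float:
    """Sum the metered cost of every step in one agent run."""
    total = 0.0
    for s in steps:
        total += s.input_tokens / 1000 * INPUT_RATE_PER_1K
        total += s.output_tokens / 1000 * OUTPUT_RATE_PER_1K
        total += s.tool_calls * TOOL_CALL_FLAT_FEE
    return total

# A three-step run: plan, call a tool, synthesize the answer.
print(agent_run_cost([Step(1200, 300, 0), Step(400, 80, 1), Step(2500, 600, 0)]))
```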
This creates a massive accounting problem that traditional billing systems cannot solve. Trying to price a variable, multi-step agentic workflow like a simple API call forces you to either guess—and consistently underbill or overcharge your customers—or build a complex, brittle accounting system from scratch.
Simple token counters and API proxies are failing. They see the token count and total bill from your model provider, but they can't tell you the profitability of a single agent run.
StringCost isn't a library that invades your code; it's a runtime-aware control plane that intelligently inspects your agent's traffic. We integrate at the network level using a secure, signed-URL architecture. Your code calls our Gateway, and we run the request through an asynchronous deep prompt inspection engine, using a meta-classifier to understand the business intent of the action without adding a single millisecond of latency to your user's request.
Automatic instrumentation is the core of this ledger. Our Local Sidecar and asynchronous classifier automatically record every discrete computational action—whether it's an LLM call or an external tool—as a distinct line item with two sides: the provider cost it incurs, and the revenue you attribute to the customer for that action.
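As a rough illustration, a line item of this kind could be modeled like this; the field names are hypothetical, not StringCost's actual schema:

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass
class LedgerEntry:
    """One double-entry line item for a single computational action."""
    run_id: str              # the agent run this action belongs to
    action_type: str         # e.g. "synthesis", "tool_selection", "evaluation"
    debit_cost: Decimal      # what the provider or tool charged you
    credit_revenue: Decimal  # what you bill the customer for this action

def run_pnl(entries: list[LedgerEntry]) -> Decimal:
    """Profit and loss for one agent run: revenue minus cost."""
    return sum((e.credit_revenue - e.debit_cost for e in entries), Decimal("0"))
```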
The result is a perfect, auditable, real-time P&L for every single agent run. You can finally answer critical business questions: “Which agents are most profitable?”, “Which tools are driving the most cost?”, and “What is the true margin on our AI features?”
Why waste months integrating different providers? We've done the work so you can focus on building.
Access OpenAI, Anthropic, Google Gemini, Cohere, Groq, and hundreds more through a single, unified, OpenAI-compatible endpoint.
Write your code in the standard OpenAI format. We build on top of production-grade open-source tooling that automatically handles the complex prompt and response transformations for you.
A new, cheaper model just dropped? Swap to it by changing a single parameter in your config. No refactoring, no new SDKs, no vendor lock-in.
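Because the endpoint is OpenAI-compatible, that swap is a one-line change. A minimal sketch using the standard OpenAI Python SDK; the gateway URL and model string are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at the unified gateway.
client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder for your deployment
    api_key="YOUR_GATEWAY_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet",  # swap providers by changing this string
    messages=[{"role": "user", "content": "Summarize our Q3 results."}],
)
print(response.choices[0].message.content)
```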
Your AI bill is a black box. StringCost gives you X-ray vision.
While other proxies just count total tokens, we provide a double-entry ledger for every agent run. Our asynchronous background worker inspects every prompt to give you a true P&L statement for your AI.
Our Event Collector logs the raw event instantly and returns the response to your user with zero delay.
A background Worker (polling every 200ms) calls a meta-classifier to tag every request with an action_type (e.g., synthesis, tool_selection, evaluation).
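In spirit, that background loop looks something like the sketch below; the queue, classifier, and ledger interfaces are illustrative stand-ins, not our actual internals:

```python
import time

POLL_INTERVAL_S = 0.2  # matches the ~200ms polling cadence

def classify_pending_events(queue, classifier, ledger):
    """Poll for raw events, tag each with an action_type, write to the ledger."""
    while True:
        for event in queue.fetch_unclassified(limit=100):
            # The meta-classifier inspects the prompt and infers intent,
            # e.g. "synthesis", "tool_selection", or "evaluation".
            action_type = classifier.classify(event.prompt)
            ledger.tag(event.id, action_type=action_type)
        time.sleep(POLL_INTERVAL_S)
```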
Finally, you can answer critical business questions. What's the P&L of your Tree-of-Thought agent? Are “evaluation” steps costing more than “synthesis” steps? StringCost gives you the answers.
Stop embedding sk-xxx keys in your agents, clients, or servers. Our architecture is built on a dynamic, signed-URL model that makes key leakage impossible.
Your application asks the Control Plane for permission to run a call.
The Control Plane returns a short-lived, single-use signed URL that contains the encrypted credentials and user context.
Your agent uses this temporary URL to call the Gateway. Our system validates the signature, checks for replay attacks, and proxies the call.
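Put together, the flow looks roughly like this; the endpoint paths, headers, and field names are hypothetical:

```python
import requests

CONTROL_PLANE = "https://control.example.com"  # placeholder URLs
# 1. Ask the Control Plane for permission to run a call.
grant = requests.post(
    f"{CONTROL_PLANE}/v1/signed-urls",
    headers={"Authorization": "Bearer YOUR_SERVICE_TOKEN"},
    json={"customer_id": "cust_123", "model": "gpt-4o-mini"},
    timeout=10,
).json()

# 2. The response contains a short-lived, single-use signed URL;
#    no provider key ever touches your application code.
signed_url = grant["signed_url"]

# 3. Use the temporary URL to call the Gateway. The Gateway validates
#    the signature, checks for replay, and proxies the call upstream.
resp = requests.post(
    signed_url,
    json={"messages": [{"role": "user", "content": "Hello"}]},
    timeout=30,
)
print(resp.json())
```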
Stop subsidizing your customers' AI usage. StringCost is built for B2B, allowing your users to provide their own provider keys.
Let your customers enter their own OpenAI, Gemini, or Anthropic keys.
We encrypt their key at rest using pgcrypto and set a configurable TTL (e.g., 1 hour).
Our pg_cron job automatically and permanently deletes expired keys. You get all the benefits of BYOK without the risk or liability.
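For intuition, here is how a BYOK write like this can be wired up with pgcrypto from Python via psycopg; the table and column names are illustrative:

```python
import psycopg

TTL = "1 hour"  # configurable time-to-live for the stored key

def store_customer_key(conn: psycopg.Connection, customer_id: str,
                       provider_key: str, passphrase: str) -> None:
    """Encrypt a customer's provider key at rest with pgcrypto and stamp an expiry."""
    with conn.cursor() as cur:
        cur.execute(
            """
            INSERT INTO byok_keys (customer_id, encrypted_key, expires_at)
            VALUES (%s, pgp_sym_encrypt(%s, %s), now() + %s::interval)
            """,
            (customer_id, provider_key, passphrase, TTL),
        )
    conn.commit()

# A pg_cron job can then run periodically, e.g.:
#   DELETE FROM byok_keys WHERE expires_at < now();
# so expired keys are permanently removed.
```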
StringCost is not a simple SaaS tool; it's a production-grade stack designed for serious enterprise deployment. You get the control of an on-premise solution with the flexibility of the cloud.
Run the high-performance Gateway and Control Plane in your own Kubernetes cluster. This ensures your sensitive prompts, keys, and AI responses never leave your network, giving you maximum security and compliance.
Use our powerful, managed Classifier service in our cloud, or deploy the entire stack—including the Worker—within your own VPC. The choice is yours.
Our system is a production-grade, K8s-native application, packaged with Helm for easy, repeatable deployment to any certified Kubernetes cluster, whether it's GKE, EKS, AKS, or self-hosted.
We ensure your infrastructure is always in a reliable state. Database migrations run automatically as Kubernetes Jobs before any service starts, guaranteeing that your deployments are safe, idempotent, and cleanly reversible.
Stop building brittle, insecure, and non-monetizable AI apps. Start building on a true enterprise-grade control plane.