Start free. Scale when your agents do.
Start free, add seats at €29 / core-seat / month, and pay only for the events you send. Self-host whenever you want. No credit card to begin.
Developer
€0, free forever
Everything you need to start building and testing agents.
Get Started
No credit card required
- 50k events / month
- 14-day data access
- 2 users
- 3 scenarios, 3 simulations, 3 custom evals
- Community support (GitHub & Discord)
Most popular
Growth
€29 / core-seat / month
For teams shipping agents to production.
Try for Free
- Everything in Developer, plus:
- 200k events included, then €5 / 100k
- 30-day retention included (extend at €3 / GB)
- Unlimited lite-users
- Unlimited simulations, evals, prompts
- Private Slack / Teams support
- Volume discounts above 20 users
Enterprise
Custom
For regulated teams that need control and assurance.
Talk to Sales
- Hybrid, self-hosted or on-prem
- Custom data retention
- Custom SSO / RBAC
- Audit logs & SLAs
- ISO 27001 reports, InfoSec & legal review
- Custom Terms, DPA
- Forward Deployed Engineer
- Billing via AWS / Azure / GCP Marketplace
Only pay for what you use.
Events cost €5 per 100k beyond the 200k included each month, on top of €29 / core-seat / month. Storage is billed only for data you keep beyond the included 30-day retention, at €3 per GB.
Events / month (200k to 60M): first 200k included, €5 / 100k after.
Stored beyond 30 days (0 GB to 1 TB): 30-day retention included, €3 / GB beyond.
With the 200k included events and no data stored beyond 30 days, estimated usage is €0 / month on top of €29 / core-seat / month. Enterprise gets custom usage pricing.
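The arithmetic behind the estimate can be sketched in a few lines of Python. This is an illustrative sketch only, not LangWatch's billing code: the function name `estimate_growth_monthly_eur` is hypothetical, overage is assumed to be billed pro rata rather than rounded up to whole 100k blocks, the 200k allowance is assumed to apply per organization, and volume discounts above 20 users are not modeled.

```python
SEAT_EUR = 29.0             # Growth: € per core-seat per month
INCLUDED_EVENTS = 200_000   # Growth: events included each month
EVENT_BLOCK = 100_000       # overage is priced per 100k events
EVENT_BLOCK_EUR = 5.0       # € per 100k events beyond the allowance
STORAGE_EUR_PER_GB = 3.0    # € per GB kept beyond the 30-day retention


def estimate_growth_monthly_eur(
    core_seats: int, events: int, extra_storage_gb: float = 0.0
) -> float:
    """Estimate a monthly Growth bill in euros (hypothetical helper)."""
    if core_seats < 0 or events < 0 or extra_storage_gb < 0:
        raise ValueError("inputs must be non-negative")

    seats = core_seats * SEAT_EUR
    # Assumption: overage is billed pro rata, not per started 100k block.
    overage_events = max(0, events - INCLUDED_EVENTS)
    usage = overage_events / EVENT_BLOCK * EVENT_BLOCK_EUR
    storage = extra_storage_gb * STORAGE_EUR_PER_GB
    return round(seats + usage + storage, 2)


# 3 seats, 1M events, 10 GB kept beyond 30 days:
# 3 * 29 + (800k / 100k) * 5 + 10 * 3 = 87 + 40 + 30
print(estimate_growth_monthly_eur(3, 1_000_000, 10))  # 157.0
```

If overage is in fact rounded up to whole 100k blocks, replacing the division with `math.ceil(overage_events / EVENT_BLOCK)` would model that instead.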
Compare every plan.
| Feature | Developer | Growth | Enterprise |
|---|---|---|---|
| Agent Simulations (available in all packages) | | | |
| Simulated users (LLM-powered user simulator): An AI plays the user, generating realistic messages from your scenario description. | | | |
| Multi-turn conversation testing: Test full back-and-forth dialogues, not just single prompts. | | | |
| Judge agent, evaluate & verdict at any turn: A judge scores the conversation against criteria and can decide pass/fail at any turn. | | | |
| Configurable success criteria: Define what good means in natural language per scenario. | | | |
| Scripted to auto-pilot simulations: From fully scripted turns to fully automated runs, choose your level of control. | | | |
| Tool-call verification across long dialogues: Assert the right tools were called at the right moments throughout a conversation. | | | |
| Framework-agnostic adapters: Works with LangGraph, CrewAI, Pydantic AI and any framework via a one-method adapter. | | | |
| Run locally or in CI/CD: Runs in pytest / vitest and in your CI pipeline. | | | |
| Simulation visualizer (visual debugging): Replay and inspect each simulated run step by step. | | | |
| Pause, evaluate & annotate mid-conversation: Stop a run at any turn to inspect, score, or annotate. | | | |
| Open-source Scenario SDK (Python + TypeScript): The Scenario testing framework is open source. | | | |
| Voice agent testing: End-to-end voice simulations with ElevenLabs, OpenAI Realtime, Twilio, Pipecat, Gemini Live. | | | |
| Voice latency metrics & noise/interruption injection: TTFB, p50/p95 latency, plus background-noise and interruption injection. | | | |
| Adversarial / red-teaming: Crescendo escalation, refusal detection and backtracking to surface vulnerabilities. | | | |
| Evaluations | | | |
| Offline experiments via SDK: Run batch evals over datasets from code. | | | |
| Offline experiments via UI: Run and compare experiments with a no-code wizard. | | | |
| CI/CD integration: Gate merges on eval results in your pipeline. | | | |
| Multi-modal evaluations: Evaluate text, images and more. | | | |
| Online evaluations, Monitors: Continuously evaluate production traffic and alert on drops. | | | |
| Evaluation by thread: Score whole conversations/threads, not just single messages. | | | |
| Guardrails (code integration): Run evals inline as guardrails in your app. | | | |
| Built-in evals: RAGAS, hallucination, toxicity, PII, LLM-as-a-judge and more, out of the box. | | | |
| Create reusable evaluators org-wide: Define an evaluator once and share it across projects. | | | |
| Custom scoring: Attach your own scores to traces and spans. | | | |
| Build custom evals via workflows: Compose evaluators visually in the workflow builder. | | | |
| Annotations / annotation inbox: Human-in-the-loop review queue for labelling and feedback. | | | |
| Datasets, programmatic access: Create and query evaluation datasets from the API/SDK. | | | |
| Generate datasets with AI: Bootstrap datasets automatically with AI. | | | |
| Build dataset from traces: Turn real production traces into evaluation datasets. | | | |
| Images in datasets: Store and evaluate image inputs in datasets. | | | |
| Observability | | | |
| Traces and graphs (agents): Full agent traces with nested spans and the execution graph. | | | |
| Session tracking (chats / threads): Group traces into conversations with a thread id. | | | |
| User tracking: Attribute activity to a user id for per-user analytics. | | | |
| Topic clustering: Automatically cluster conversations by topic. | | | |
| Token and cost tracking: Automatic token and cost accounting per provider, prompt and model. | | | |
| Native framework integrations: First-class integrations with popular agent frameworks. | | | |
| SDKs (Python, TypeScript): Official SDKs for Python and TypeScript. | | | |
| OpenTelemetry (TypeScript, Go, custom): OTel-native instrumentation, including Go and custom setups. | | | |
| Proxy-based logging (via LiteLLM): Capture calls through a LiteLLM proxy without code changes. | | | |
| Custom via API: Send spans directly via the ingestion API. | | | |
| Multi-modal: Capture text, image and audio payloads. | | | |
| Additional usage: Events beyond your monthly allowance are billed per 100k. | | €5 / 100k | Custom |
| Custom usage pricing: Negotiated volume rates for large deployments. | | | |
| Prompt Management | | | |
| Prompt version control (code, UI & API): Versioned prompts editable from code, the UI, or the API. | | | |
| Liquid template syntax: Templating with variables and logic via Liquid. | | | |
| Prompt data model: Structured prompts with messages, inputs and config. | | | |
| Prompt management via GitHub: Sync and review prompts through GitHub. | | | |
| Playground: Iterate on prompts side by side across models. | | | |
| Prompt experiments / A/B testing: Compare prompt versions on quality, cost and latency. | | | |
| Webhooks & Slack: Notify on prompt changes via webhooks and Slack. | | | |
| Prompt tags (deployment stages): Deploy labels (e.g. staging, production) for prompts. | | | |
| AI Governance (Enterprise) | | | |
| Oversight & policy controls: Org-wide control over which assistants, providers and tools each team can use. | | | |
| Audit log & CSV export: Every change is logged and exportable to CSV. | | | |
| Security (Enterprise) | | | |
| Custom SSO (Okta, Azure, AWS, Google): Connect your own identity provider via SAML/OIDC. | | | |
| SSO enforcement: Require SSO for all members of your organization. | | | |
| Enterprise RBAC (org, project, team): Granular role-based access control across org, project and team scopes. | | | |
| SCIM API for automated user provisioning: Automatically provision and de-provision users from your IdP. | | | |
| Audit logs: Full security audit trail of access and actions. | | | |
| Data masking: Mask sensitive content in traces and exports. | | | |
| Data-retention management: Configure custom retention windows and deletion policies. | | | |
| S3 data export (data retention): Continuously export trace data to your own S3 bucket for long-term retention. | | | |
| Compliance (Enterprise) | | | |
| GDPR / ISO 27001 reports: Compliance reports and documentation on request. | | | |
| Custom T&C contracts: Negotiated terms, DPAs and custom contracts. | | | |
| InfoSec / legal reviews / PenTest report: Security questionnaires, legal reviews and penetration-test reports. | | | |
| Support | | | |
| Private Slack / Teams channel: A shared channel with the LangWatch team. | | | |
| Onboarding & architectural guidance: Hands-on help designing your setup. | | | |
| Dedicated support engineer (deployment & hosting): A named engineer for deployment and hosting questions. | | | |
| Solution architect during evaluation & rollout: Architect support through evaluation and rollout. | | | |
| Direct access to the product team: A direct line to product for feedback and requests. | | | |
| Billing via AWS / Azure / GCP Marketplace: Consolidate billing through your cloud marketplace. | | | |
| Response-time SLO: Guaranteed first-response times. | | | |
| Invoice billing: Pay by invoice instead of card. | | | |
| Support SLA: Contractual uptime and support service-level agreement. | | | |
Pricing questions, answered.
- How does billing work?
- Billing combines seats and usage. Growth is €29 per core-seat / month. On top of that you pay for usage: €5 per 100k events beyond the 200k monthly allowance, and €3 per GB only if you retain data beyond the included 30 days. Pay by card, or on Enterprise by invoice or through the AWS / Azure / GCP marketplace.
- How do I track my usage?
- Your usage (events, storage, and cost by user, prompt and model) is visible live in the dashboard, with anomaly alerts. Enterprise adds an org-wide governance dashboard showing spend by team and top spenders.
- What are events?
- An event is a single ingested span: one LLM call, tool call, or retrieval step within a trace. A typical multi-step agent run produces several events.
- How does per-user pricing work?
- Paid Growth seats are €29 per core-seat / month. Add or remove seats anytime; volume discounts apply above 20 users.
- Do I need a credit card to start?
- No. The Developer plan is free forever: sign up and start sending events with no card required.
- Can I self-host?
- Yes. LangWatch runs fully self-hosted with Docker Compose on your own ClickHouse, so nothing leaves your environment. Enterprise self-hosting adds SSO, RBAC, SLAs and support.
- Can I change plans or cancel anytime?
- Yes. Upgrade, downgrade or cancel whenever you like.
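Because an event is defined above as a single ingested span, monthly volume can be roughed out from how often your agent runs and how many spans each run produces. The sketch below is a back-of-the-envelope helper, not part of any LangWatch SDK: `monthly_events` and `allowance_check` are hypothetical names, and the 30-day month and the example traffic figures are assumptions.

```python
DEVELOPER_EVENTS = 50_000          # Developer: 50k events / month
GROWTH_INCLUDED_EVENTS = 200_000   # Growth: 200k events included


def monthly_events(runs_per_day: int, spans_per_run: int, days: int = 30) -> int:
    """Each ingested span (LLM call, tool call, retrieval step) is one event."""
    if min(runs_per_day, spans_per_run, days) < 0:
        raise ValueError("inputs must be non-negative")
    return runs_per_day * spans_per_run * days


def allowance_check(events: int) -> dict[str, bool]:
    """Report whether a monthly event count fits each plan's allowance."""
    return {
        "fits_developer": events <= DEVELOPER_EVENTS,
        "fits_growth_included": events <= GROWTH_INCLUDED_EVENTS,
    }


# Assumed traffic: 500 agent runs per day, 6 spans per run.
events = monthly_events(runs_per_day=500, spans_per_run=6)
print(events, allowance_check(events))
# 90000 {'fits_developer': False, 'fits_growth_included': True}
```

In this assumed scenario the agent outgrows the Developer allowance but stays inside the events included with Growth, so no usage charge would apply.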
Try the whole platform, free.
Spin up in minutes on the Developer plan, or talk to us about Growth and Enterprise.