Does Tracehold sit in the data path between my agent and the cloud?

Yes. Tracehold acts as a single enforcement gateway that intercepts every tool call before it reaches your cloud provider. Every action is classified by risk level, evaluated against your policies, and either allowed, blocked, or held for human approval. You can also run in observe-only mode to start without blocking anything.

Which cloud providers and agent frameworks does Tracehold support?

AWS, GCP, and Azure have first-class support with native credential federation. We also support DigitalOcean, Hetzner, Cloudflare, and any provider with a REST API through custom classifier rules. On the agent side, we ship SDK wrappers for LangChain, Claude Agent SDK, OpenAI Agents SDK, and CrewAI, plus a generic Python client for custom frameworks and internal tools.

What happens if the Tracehold gateway goes down?

Tracehold supports configurable fail modes per organization. You can choose fail-open (agents continue with logging only) or fail-closed (agents are paused until the gateway recovers). Most production deployments use fail-closed for critical workloads and fail-open for non-critical ones.

Is Tracehold self-hosted or managed?

Both. You can self-host Tracehold in your own VPC so your audit trail and credentials never leave your infrastructure, or use our managed deployment. The entire stack runs on Docker Compose with PostgreSQL, Redis, and an OpenTelemetry collector.

How is Tracehold different from a WAF or prompt injection filter?

WAFs and prompt filters protect the input side. Tracehold protects the output side: the actual cloud actions your agent takes. We classify every tool call by risk and blast radius, issue short-lived credentials scoped to each task, and maintain a tamper-evident audit trail. Think of it as IAM and audit for AI agents, not another input filter.

Does Tracehold align with OWASP and NIST standards?

Yes. Tracehold is built around the OWASP Top 10 for Agentic Applications 2026 (ASI01 through ASI10) and maps every gateway decision to NIST SP 800-53 controls (AC-6, AU-2, AU-12, CM-5, IA-9, SC-28, SI-7). The immutable audit trail exports as compliance evidence for SOC 2 and NIST AI RMF assessments.

Task-scoped credentials for AI agents: AWS STS vs GCP WIF vs Azure Managed Identity

Your AI agent does not need a permanent API key. Here is how to issue short-lived, least-privilege credentials per task using each cloud's native identity federation, and why the old way will eventually burn you.

The permanent key problem

Most AI agent setups ship with a long-lived API key baked into the environment. AWS access keys in .env, a GCP service account JSON on disk, an Azure client secret in a vault that nobody rotates. The agent starts, reads the key, and now it has the same permissions for every task it runs until someone remembers to rotate.

This works until it doesn't. All it takes is a prompt injection trick, a confused-deputy bug, or simple scope creep, and suddenly the agent is calling iam:AttachRolePolicy with a key that has AdministratorAccess. The blast radius is the entire account.

The fix is not a better secret manager. The fix is credentials that are born scoped to one task and die when the task ends.

All three major clouds already support this. The mechanisms are different, the terminology is different, but the outcome is the same: a short-lived token that can only do what the current task requires.

How task-scoped credentials work

The idea is simple:

1. Agent starts a task ("provision staging-v2 for QA").
2. Before the first tool call, a credential is minted that covers only the actions this task needs (e.g. ec2:RunInstances, ec2:CreateTags in a specific VPC).
3. The credential has a TTL, typically 15 minutes. It auto-expires even if nobody revokes it.
4. When the task ends, the credential is revoked explicitly as a belt-and-suspenders measure.
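The lifecycle above can be modeled in a few lines. This is a minimal in-memory sketch, not any cloud's API: the TaskCredential class and its field names are hypothetical and only illustrate the three properties that matter (scoped actions, a TTL, explicit revocation).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TaskCredential:
    """Hypothetical model of a credential scoped to a single task."""
    task_id: str
    allowed_actions: frozenset
    issued_at: datetime
    ttl: timedelta = timedelta(minutes=15)
    revoked: bool = False

    @property
    def expires_at(self) -> datetime:
        return self.issued_at + self.ttl

    def permits(self, action: str, now: datetime) -> bool:
        """True only while the credential is live and the action is in scope."""
        if self.revoked or now >= self.expires_at:
            return False
        return action in self.allowed_actions

    def revoke(self) -> None:
        """Explicit revocation at task end; the TTL still applies regardless."""
        self.revoked = True


issued = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
cred = TaskCredential(
    task_id="provision-staging-v2",
    allowed_actions=frozenset({"ec2:RunInstances", "ec2:CreateTags"}),
    issued_at=issued,
)
```

A hijacked agent holding this credential would fail both the scope check (iam:AttachRolePolicy is not in allowed_actions) and, after 15 minutes, the expiry check.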

No permanent key is stored. No broad permissions are inherited. If the agent gets hijacked mid-task, the attacker gets a 15-minute token that can only launch t3.medium instances in staging. That is a bad day. It is not an account takeover.

AWS STS: AssumeRole with inline session policies

AWS Security Token Service is the most mature option and the one most teams hit first.

The mechanism

Your organization creates a single gateway IAM role (e.g. TraceholdGatewayRole) whose trust policy allows only your credential issuer to assume it. The role itself has wide permissions, but that's fine because every AssumeRole call narrows them with an inline session policy such as this one:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ec2:RunInstances", "ec2:CreateTags"],
      "Resource": "*"
    }
  ]
}

The resulting session can do only what both the role's policies and the inline session policy allow. This is the key insight: the agent gets the intersection of the two, so a session policy can narrow the role's permissions but never widen them.
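The intersection rule can be illustrated with a deliberately simplified model. The sketch below considers only Allow statements and action names with * wildcards. It ignores explicit Deny statements, resources, and conditions, all of which real IAM evaluation takes into account. The function names are illustrative, not part of any AWS SDK.

```python
from fnmatch import fnmatchcase


def allowed_by(patterns, action):
    """True if any pattern (with * wildcards) matches the action.

    IAM action names are case-insensitive, so both sides are lowercased.
    """
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def effective_allow(role_actions, session_actions, action):
    """An action is permitted only if BOTH policies allow it."""
    return allowed_by(role_actions, action) and allowed_by(session_actions, action)


role_actions = ["ec2:*", "iam:*"]                         # broad gateway role
session_actions = ["ec2:RunInstances", "ec2:CreateTags"]  # per-task session policy

print(effective_allow(role_actions, session_actions, "ec2:RunInstances"))
print(effective_allow(role_actions, session_actions, "iam:AttachRolePolicy"))
```

The first call is allowed by both policies. The second is allowed by the role but not by the session policy, so the session cannot perform it.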

The call

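import boto3

sts = boto3.client("sts")

# agent_id, task_id, and inline_session_policy_json are supplied by the credential issuer
response = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/TraceholdGatewayRole",
    RoleSessionName=f"tracehold-{agent_id[:8]}-{task_id[:8]}",
    Policy=inline_session_policy_json,  # narrows the role to this task
    DurationSeconds=900,  # 15 minutes, the minimum STS accepts
)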

The Credentials block of the response contains three secret values (AccessKeyId, SecretAccessKey, and SessionToken) plus an Expiration timestamp. Hand the three values to the agent as environment variables. They expire in 15 minutes whether or not anyone revokes them.
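A small helper can turn the AssumeRole response into the environment variables that the AWS SDKs and CLI read. This is a sketch: the function name is made up, and the sample response below is a fake stand-in with placeholder values, shaped like the Credentials block that boto3 returns.

```python
def credentials_to_env(assume_role_response: dict) -> dict:
    """Map the STS Credentials block to the standard AWS environment variables."""
    creds = assume_role_response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


# Fake response for illustration only; real values come from sts.assume_role().
sample_response = {
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLEKEYID",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-session-token",
        "Expiration": "2025-01-01T12:15:00Z",
    }
}

agent_env = credentials_to_env(sample_response)
```

When launching the agent as a subprocess, merge agent_env into a copy of the parent environment rather than exporting it globally, so the credentials are visible only to that one task.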

The tradeoff

Strengths:

Inline session policies are powerful. You can scope down to specific resource ARNs, conditions, even IP ranges.
Session names show up in CloudTrail, so you get free traceability back to the agent and task.
No additional infrastructure. STS is built into every AWS account.
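Because the session name is what appears in CloudTrail, it is worth building it defensively. STS requires RoleSessionName to be 2 to 64 characters drawn from letters, digits, underscore, and the characters +=,.@-. The helper below is a sketch with a hypothetical name that replaces anything else with a hyphen and follows the same tracehold-{agent}-{task} shape used in the call above.

```python
import re

# Anything outside the characters STS accepts in RoleSessionName.
_DISALLOWED = re.compile(r"[^\w+=,.@-]", re.ASCII)


def build_session_name(agent_id: str, task_id: str) -> str:
    """Build a CloudTrail-friendly session name from agent and task IDs."""
    def clean(value: str) -> str:
        # Replace disallowed characters, then keep the first 8 characters.
        return _DISALLOWED.sub("-", value)[:8]

    name = f"tracehold-{clean(agent_id)}-{clean(task_id)}"
    return name[:64]  # STS rejects names longer than 64 characters


print(build_session_name("agent:42/alpha", "task 9f3c1d77aa"))
# prints tracehold-agent-42-task-9f3
```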

Weaknesses:

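Individual STS sessions cannot be revoked early. If you need instant revocation, you have to attach a deny policy to the role (typically conditioned on aws:TokenIssueTime), which cuts off every session issued before the cutoff, not just the one you are worried about.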
The maximum session duration is 12 hours (role-level setting), and inline session policies are limited to 2048 characters. Complex permission sets require creative policy design.
The credential issuer needs network access to STS. In a VPC without a NAT gateway or an STS VPC endpoint, the call hangs until it times out rather than returning a clear error.
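Two of these weaknesses can be handled in code. The sketch below, with hypothetical function names, serializes a session policy compactly and rejects it if it exceeds the 2048-character limit, then builds the role-level deny policy used for emergency revocation. The deny document follows the shape of the policy AWS attaches when you revoke active sessions from the IAM console. Attaching it to the role is left out.

```python
import json
from datetime import datetime, timezone

MAX_SESSION_POLICY_CHARS = 2048


def build_session_policy(actions, resources) -> str:
    """Return a compact session policy, or raise if it is too long for STS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": sorted(resources)}
        ],
    }
    # Compact separators save characters against the 2048 limit.
    text = json.dumps(policy, separators=(",", ":"))
    if len(text) > MAX_SESSION_POLICY_CHARS:
        raise ValueError(f"session policy is {len(text)} chars, limit is 2048")
    return text


def build_revoke_older_sessions_policy(cutoff: datetime) -> str:
    """Deny everything for sessions issued before the cutoff (affects ALL such sessions)."""
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": ["*"],
                "Resource": ["*"],
                "Condition": {"DateLessThan": {"aws:TokenIssueTime": stamp}},
            }
        ],
    }
    return json.dumps(policy)


session_policy = build_session_policy(
    ["ec2:RunInstances", "ec2:CreateTags"],
    ["arn:aws:ec2:us-east-1:123456789012:instance/*"],
)
```

Failing fast on the size check in the issuer is friendlier than letting STS reject the call at task start.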

When to use it

AWS STS is the right choice when your agents run on AWS infrastructure and you need fine-grained, per-action scoping. It is the most flexible of the three options.

GCP Workload Identity Federation: token exchange without service account keys

GCP's approach to short-lived credentials is Workload Identity Federation (WIF). The goal is the same as STS but the mechanism is different: instead of assuming a role, you exchange an external identity token for a GCP access token.

The mechanism

You configure a workload identity pool and a provider that trusts your credential issuer (e.g. an OIDC provider, an AWS account, or a custom token service). When the agent needs credentials, your issuer mints a token, exchanges it with GCP's Security Token Service, and then impersonates a service account.

The flow:

1. Your credential issuer generates a signed JWT or OIDC token that asserts the agent's identity and task scope.
2. The token is exchanged via sts.googleapis.com/v1/token for a federated access token.
3. The federated token impersonates a GCP service account via iamcredentials.googleapis.com/v1/serviceAccounts/{sa}:generateAccessToken.
4. The resulting access token has the service account's permissions, scoped by IAM conditions.
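The two HTTP calls in this flow can be sketched with the standard library. The code below only builds the requests and does not send them. The function names, pool audience, and service account email are placeholders, while the field names follow the public REST interfaces of the Security Token Service and the IAM Credentials API as I understand them. Verify them against the current API references before relying on this.

```python
import json
import urllib.request

STS_URL = "https://sts.googleapis.com/v1/token"
CLOUD_PLATFORM_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def build_sts_exchange_request(subject_jwt: str, audience: str) -> urllib.request.Request:
    """Step 2: exchange the issuer's JWT for a federated access token."""
    body = {
        "grantType": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": CLOUD_PLATFORM_SCOPE,
        "requestedTokenType": "urn:ietf:params:oauth:token-type:access_token",
        "subjectToken": subject_jwt,
        "subjectTokenType": "urn:ietf:params:oauth:token-type:jwt",
    }
    return urllib.request.Request(
        STS_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_impersonation_request(
    federated_token: str, sa_email: str, lifetime_seconds: int = 900
) -> urllib.request.Request:
    """Step 3: use the federated token to mint a short-lived service account token."""
    url = (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{sa_email}:generateAccessToken"
    )
    body = {"scope": [CLOUD_PLATFORM_SCOPE], "lifetime": f"{lifetime_seconds}s"}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {federated_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder audience: project number, pool ID, and provider ID are examples.
AUDIENCE = (
    "//iam.googleapis.com/projects/123456789012/locations/global/"
    "workloadIdentityPools/agent-pool/providers/issuer"
)
exchange = build_sts_exchange_request("signed.jwt.here", AUDIENCE)
```

Sending each request with urllib.request.urlopen returns a JSON body. The federated token from the first response is passed as federated_token to the second call.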

Scoping permissions

Unlike AWS inline session policies, GCP does not support general per-request permission narrowing at the token level (Credential Access Boundaries can downscope a token, but only for Cloud Storage). Instead, you scope through:

Multiple service accounts: create one per permission set (e.g. bigquery-reader@, storage-writer@) and impersonate the right one per task.
IAM Conditions: use resource.name conditions, time-based conditions, or custom attributes to limit what the service account can do.
Short TTL: the access token lifetime defaults to 1 hour but can be shortened (for example to 15 minutes) via the lifetime parameter.
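The first two scoping techniques can be sketched together: a lookup from a task's permission set to a pre-built service account, and an IAM policy binding that carries a condition. Everything named here is hypothetical (the permission set names, the example-project service accounts, the staging- bucket prefix). The condition uses two expression forms that IAM Conditions supports, resource.name.startsWith(...) and request.time < timestamp(...).

```python
from datetime import datetime, timezone

# Hypothetical mapping: one pre-built service account per permission set.
SERVICE_ACCOUNTS = {
    "bigquery-read": "bigquery-reader@example-project.iam.gserviceaccount.com",
    "storage-write": "storage-writer@example-project.iam.gserviceaccount.com",
}


def pick_service_account(permission_set: str) -> str:
    """Choose which service account to impersonate; unknown sets are refused."""
    try:
        return SERVICE_ACCOUNTS[permission_set]
    except KeyError:
        raise ValueError(f"no service account for permission set {permission_set!r}")


def conditional_binding(role: str, sa_email: str, bucket_prefix: str, expires_at: datetime) -> dict:
    """Build an IAM binding limited to one bucket prefix and a cutoff time."""
    stamp = expires_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    expression = (
        f'resource.name.startsWith("projects/_/buckets/{bucket_prefix}") '
        f'&& request.time < timestamp("{stamp}")'
    )
    return {
        "role": role,
        "members": [f"serviceAccount:{sa_email}"],
        "condition": {
            "title": "task-scoped-access",
            "description": "Limits access to one bucket prefix until the cutoff",
            "expression": expression,
        },
    }


binding = conditional_binding(
    "roles/storage.objectViewer",
    pick_service_account("storage-write"),
    "staging-",
    datetime(2025, 1, 1, 12, 15, tzinfo=timezone.utc),
)
```

The binding dictionary is what you would add to the bindings list of a resource's IAM policy. Applying it is out of scope for this sketch.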

The tradeoff

Strengths:

No service account keys on disk, ever. The entire flow is keyless.
Workload Identity Federation supports cross-cloud identity (e.g. an AWS-hosted agent can get GCP credentials without storing GCP secrets).
Access tokens are bearer tokens, simpler to pass around than STS's three-value credential set.

Weaknesses:

Per-request scoping is coarser than AWS. You cannot attach an inline policy to a token exchange. Scoping requires pre-built service accounts or IAM conditions.
Initial setup is more complex. You need a workload identity pool, a provider, and the trust chain configured before you can issue a single token.
Revocation: you can disable the service account, but individual tokens cannot be revoked. TTL is the primary control.

When to use it

GCP WIF is the right choice when your agents touch GCP resources and you want to eliminate service account keys entirely. It is also the best option for cross-cloud setups where agents run on AWS or Azure but need GCP access.

Azure Managed Identity: identity at the infrastructure level

Azure takes a different approach. Instead of issuing tokens from a central credential service, Azure assigns an identity to the compute resource itself (the VM, the container, the Function). The agent inherits the identity from the infrastructure it runs on.

The mechanism

There are two flavors:

System-assigned: Azure creates an identity tied to the lifecycle of the resource. Delete the VM, the identity goes with it.
User-assigned: you create the identity independently and attach it to one or more resources. This is the one you want for agents, because you can share a single identity across a fleet and manage its permissions centrally.

The agent retrieves a token from the Azure Instance Metadata Service (IMDS):

GET http://169.254.169.254/metadata/identity/oauth2/token
    ?api-version=2018-02-01
    &resource=https://management.azure.com/
Metadata: true

No credentials are passed. The VM's identity is proven by the fact that the request comes from the VM itself. The response is a bearer token with the identity's permissions.
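In Python, the same request can be built with the standard library. The sketch below only constructs the request, since it can be sent successfully only from inside an Azure resource with a managed identity. IMDS requires the Metadata: true header, and a user-assigned identity is selected with the client_id query parameter. The function name is illustrative.

```python
import urllib.parse
import urllib.request
from typing import Optional

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(
    resource: str = "https://management.azure.com/",
    client_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build the IMDS token request; pass client_id for a user-assigned identity."""
    params = {"api-version": "2018-02-01", "resource": resource}
    if client_id is not None:
        params["client_id"] = client_id
    url = f"{IMDS_TOKEN_URL}?{urllib.parse.urlencode(params)}"
    # IMDS rejects requests that lack this header.
    return urllib.request.Request(url, headers={"Metadata": "true"}, method="GET")


request = build_imds_token_request()
```

On an Azure VM, urllib.request.urlopen(request) returns a JSON body whose access_token field is the bearer token. In most applications DefaultAzureCredential from the Azure SDK does this for you.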

Scoping permissions

Azure scoping happens through RBAC role assignments:

Assign roles at the resource group or resource level, not the subscription level.
Use custom roles with only the actions the task needs (e.g. Microsoft.Compute/virtualMachines/read + Microsoft.Compute/virtualMachines/deallocate/action).
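Where the target service supports them, add Azure RBAC role assignment conditions to narrow access further. Conditional Access for workload identities does not cover managed identities, so do not rely on it for time-based or location-based restrictions.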
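A custom role definition for a narrowly scoped task might look like the sketch below. The builder function, role name, subscription ID, and resource group are placeholders. The dictionary follows the JSON shape that Azure custom role definitions use (Name, IsCustom, Actions, NotActions, AssignableScopes), and the scope is a resource group rather than the whole subscription.

```python
import json


def build_custom_role(name: str, actions, subscription_id: str, resource_group: str) -> dict:
    """Build a custom role definition limited to one resource group."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": f"Task-scoped role: {name}",
        "Actions": sorted(actions),
        "NotActions": [],
        "AssignableScopes": [
            f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        ],
    }


role = build_custom_role(
    "agent-vm-deallocate",
    [
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/deallocate/action",
    ],
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    resource_group="staging-rg",  # placeholder
)

print(json.dumps(role, indent=2))
```

The printed JSON can be saved to a file and used as the role definition input for whichever tool you use to create custom roles.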

The tradeoff

Strengths:

Zero credential management. No keys, no tokens to store, no rotation to schedule. The identity is the infrastructure.
Simplest developer experience: the SDK handles token retrieval automatically via DefaultAzureCredential.
System-assigned identities are automatically cleaned up when the resource is deleted. No orphaned credentials.

Weaknesses:

Scoping is per-identity, not per-request. You cannot narrow permissions on a per-task basis without switching identities.
Task-level scoping requires creating multiple user-assigned identities (one per permission set) and attaching/detaching them per task. This is operationally heavier than STS inline policies.
Only works when the agent runs on Azure compute. No equivalent of WIF's cross-cloud federation.
IMDS is only reachable from the VM itself. If your credential issuer runs outside the VM, you need a different approach.

When to use it

Azure Managed Identity is the right choice when your agents run on Azure compute and you want zero credential management overhead. For fine-grained per-task scoping, combine it with multiple user-assigned identities and automate the attachment.

Side-by-side comparison

Dimension	AWS STS	GCP WIF	Azure MI
Per-request scoping	Inline session policies (fine-grained)	Service account + IAM conditions (coarse)	RBAC role assignment (coarse)
Credential format	AccessKeyId + SecretKey + SessionToken	Bearer access token	Bearer access token
Token TTL	15 min to 12 hours (default 1 hour)	15 min to 1 hour	1 hour (adjustable)
Early revocation	Not possible for sessions	Not possible for tokens	Not possible for tokens
Cross-cloud	No	Yes (WIF accepts external OIDC)	No
Setup complexity	Low (built into every account)	Medium (pool + provider + SA chain)	Low (assign identity to resource)
Key on disk	No (STS is keyless)	No (WIF is keyless)	No (IMDS is keyless)
Agent location	Anywhere with STS access	Anywhere with OIDC token	Azure compute only

The pattern: a credential gateway

Regardless of which cloud you use, the pattern is the same:

1. Intercept the tool call before it reaches the cloud API.
2. Classify the action to determine what permissions it needs.
3. Mint a credential scoped to exactly those permissions, with a TTL tied to the task duration.
4. Hand the credential to the agent as environment variables or a bearer token.
5. Log everything: which agent, which task, which action, which permissions, when it was issued, when it expired.
6. Revoke on task completion as a safety net, even though the TTL handles it.
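The six steps fit into a small skeleton. This is an in-memory sketch with made-up names (CredentialGateway, ACTION_PERMISSIONS), not Tracehold's implementation: the mint step returns a plain dictionary where a real gateway would call STS, WIF, or a managed identity, and the audit log is a Python list.

```python
from dataclasses import dataclass, field

# Hypothetical classifier table: tool call name -> minimum cloud permissions.
ACTION_PERMISSIONS = {
    "launch_instance": ["ec2:RunInstances", "ec2:CreateTags"],
    "read_bucket": ["s3:GetObject"],
}


@dataclass
class CredentialGateway:
    ttl_seconds: int = 900
    active: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _log(self, event, agent_id, task_id, tool_call, now, permissions=()):
        self.audit_log.append({
            "event": event, "agent": agent_id, "task": task_id,
            "tool_call": tool_call, "permissions": list(permissions), "at": now,
        })

    def handle(self, agent_id, task_id, tool_call, now):
        """Intercept, classify, mint, and log. Returns None if the call is blocked."""
        permissions = ACTION_PERMISSIONS.get(tool_call)
        if permissions is None:  # unknown action: no credential is minted
            self._log("blocked", agent_id, task_id, tool_call, now)
            return None
        credential = {
            "task": task_id,
            "permissions": permissions,
            "expires_at": now + self.ttl_seconds,
        }
        self.active[task_id] = credential
        self._log("issued", agent_id, task_id, tool_call, now, permissions)
        return credential

    def complete(self, agent_id, task_id, now):
        """Revoke on task completion, even though the TTL would expire it anyway."""
        credential = self.active.pop(task_id, None)
        if credential is not None:
            self._log("revoked", agent_id, task_id, None, now, credential["permissions"])


gateway = CredentialGateway()
issued = gateway.handle("agent-1", "task-1", "launch_instance", now=1000.0)
gateway.complete("agent-1", "task-1", now=1200.0)
```

Timestamps are passed in as plain numbers to keep the sketch deterministic. A real gateway would read the clock itself and write the log to durable, append-only storage.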

This is what a credential gateway does. It sits between the agent and the cloud, maps every action to the minimum credential it needs, and ensures nothing gets a permanent key or a broad role.

The alternative is trusting that your agent will never be tricked, never drift, and never use its AdministratorAccess key for something you didn't intend. That bet gets worse every time you add an agent.

If you want to see task-scoped credentials in action against a live agent workflow, book a 30-minute walkthrough. We run it on our sandbox, no SDK install.
