We’re building a cloud security agent with Strands Agents that triages AWS security findings. The agent uses the AWS Knowledge MCP server to search and read AWS documentation, looking up remediation guides, CLI syntax, and best practices as part of its analysis. This worked great in development and made the agent significantly more accurate than when it only consumed the remediation reference docs a human would read. Then we ran our new eval suite with 10 parallel test workers and started seeing random test failures due to rate limits. Ultimately, we doubled the eval worker parallelism by using the Knowledge MCP tools available on the authenticated AWS MCP server endpoint, but the configuration was not straightforward. This post documents the problem and the solution using AWS’ mcp-proxy-for-aws.
The problem: rate limits on the unauthenticated Knowledge Base endpoint
The AWS Knowledge MCP server has a public, unauthenticated endpoint at https://knowledge-mcp.global.api.aws. No credentials needed, no setup required. It’s the endpoint shown in examples and blog posts.
But the unauthenticated AWS Knowledge MCP service seems to rate-limit at roughly 3-4 concurrent connections. I say ‘seems to’ because as of Mar 30, 2026, I could not find any documented quotas for the AWS MCP or Knowledge MCP services.
Connect more than that and you get HTTP 429 responses:
httpx.HTTPStatusError: Client error '429 Too Many Requests'
for url 'https://knowledge-mcp.global.api.aws'
For a single agent running locally, this limit is fine. But it breaks parallel eval workers, both in CI and locally, because each worker has its own agent and each agent makes multiple search_documentation and read_documentation calls, depending on the finding. And running evals in serial is no fun.
So I started looking for ways to avoid the rate limit without impacting test fidelity. The following solution appears to double the Knowledge MCP rate limit (quota) to ~7 concurrent connections by switching to the authenticated AWS MCP service.
The solution: IAM-authenticated requests via the AWS MCP service
AWS offers an IAM-authenticated MCP endpoint that uses AWS’ standard SigV4 request signing scheme. Authenticated requests get higher rate limits because AWS can identify and account for your workload.
The mcp-proxy-for-aws package handles SigV4 signing. It provides aws_iam_streamablehttp_client as a drop-in replacement for the standard streamable_http_client.
Here’s the working configuration. Getting here took more debugging than I feel should be necessary, so I’m sharing the solution.
Install the dependency
pip install mcp-proxy-for-aws
# or in pyproject.toml
dependencies = [
    "mcp-proxy-for-aws>=1.1.0",
]
Connect Strands MCPClient to the AWS MCP service
Here is a fully-legible example of using the aws_iam_streamablehttp_client with the AWS MCP service:
from mcp_proxy_for_aws.client import aws_iam_streamablehttp_client
from strands import Agent
from strands.models import BedrockModel
from strands.tools.mcp import MCPClient

# Create the MCP client with IAM authentication
client = MCPClient(
    lambda: aws_iam_streamablehttp_client(
        endpoint="https://aws-mcp.us-east-1.api.aws/mcp",
        aws_service="aws-mcp",
    ),
)

# Create the agent; this starts the MCP client and loads tools
model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-6",
    max_tokens=8192,
)
agent = Agent(
    model=model,
    system_prompt="You are a helpful assistant with access to AWS documentation.",
    tools=[client],
)

# The agent can now use search_documentation, read_documentation, etc.
result = agent("How do I enable MFA for the root user in AWS?")
print(result)
That’s it. With the authenticated endpoint, I have been able to run up to 7 workers using the Knowledge MCP in parallel without 429s. Our eval suite uses 5 workers reliably.
But there are a couple of things that aren’t obvious from the documentation. I felt like I was tabbing in circles through docs trying to figure out what I was doing wrong. So here are the errors and their explanations.
aws_iam_streamablehttp_client config errors for Knowledge MCP
1. The authenticated Knowledge MCP endpoint URL is different from the unauthenticated one
The IAM-authenticated endpoint for the Knowledge MCP uses the aws-mcp service, so in us-east-1 it is:
https://aws-mcp.us-east-1.api.aws/mcp
Not https://knowledge-mcp.global.api.aws, and not https://knowledge-mcp.us-east-1.api.aws/mcp either. The AWS MCP service hosts the Knowledge Base tools, and the aws-mcp endpoint URL supports AWS SigV4 authentication. Of course, you can try substituting your favorite region into the endpoint URL.
If you use the global knowledge-mcp URL with aws_iam_streamablehttp_client, you’ll get one of these errors, neither of which tells you that the endpoint URL is invalid:
ValueError: Failed to load tool <strands.tools.mcp.MCPClient object at 0x...>:
Failed to start MCP client: the client initialization failed
ValueError: Failed to load tool <strands.tools.mcp.MCPClient object at 0x...>:
Connection to the MCP server was closed
2. The aws_service parameter must be aws-mcp
The aws_service parameter controls the SigV4 signing scope. The mcp-proxy-for-aws README shows bedrock-agentcore in its examples, but that’s for AgentCore Gateway endpoints.
For the AWS MCP service and Knowledge Base tools, you need aws-mcp:
aws_iam_streamablehttp_client(
    endpoint="https://aws-mcp.us-east-1.api.aws/mcp",
    aws_service="aws-mcp",
)
This will route the request to the aws-mcp service which hosts the Knowledge Base tools directly.
Wrong service name = the request is signed with the wrong scope. You’ll get the same errors as with the incorrect endpoint URL:
ValueError: Failed to load tool <strands.tools.mcp.MCPClient object at 0x...>:
Failed to start MCP client: the client initialization failed
ValueError: Failed to load tool <strands.tools.mcp.MCPClient object at 0x...>:
Connection to the MCP server was closed
Nothing in the exception mentions SigV4, the service name, or authentication. You just see a generic MCP client failure.
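To see why the service name matters so much, here is a minimal sketch of the standard SigV4 signing-key derivation using only the Python standard library. It is not how mcp-proxy-for-aws is implemented (that package delegates signing to the AWS SDK); the helper names, the date, and the secret below are placeholders I made up for illustration. The point is that the service name is baked into both the credential scope and the signing key, so a request signed for bedrock-agentcore cannot validate against aws-mcp.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """HMAC-SHA256 of msg under key, returned as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def credential_scope(date: str, region: str, service: str) -> str:
    """SigV4 credential scope: <date>/<region>/<service>/aws4_request."""
    return f"{date}/{region}/{service}/aws4_request"


def signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key; the service name is one of the HMAC inputs."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder values for illustration only; this is not a real credential.
SECRET = "EXAMPLE-SECRET-KEY"
DATE = "20260330"
REGION = "us-east-1"

good = signing_key(SECRET, DATE, REGION, "aws-mcp")
wrong = signing_key(SECRET, DATE, REGION, "bedrock-agentcore")

print(credential_scope(DATE, REGION, "aws-mcp"))
print("keys differ:", good != wrong)
```

Because the two keys differ, every signature computed with the wrong service name differs too, and the server has no way to accept it.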
If you look at the exception’s cause, you’ll see a clue that something is wrong with auth:
| Traceback (most recent call last):
| File "/Users/developer/dev/agent/.venv/lib/python3.11/site-packages/mcp/client/streamable_http.py", line 565, in handle_request_async
| await self._handle_post_request(ctx)
| File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 222, in __aexit__
| await self.gen.athrow(typ, value, traceback)
| File "/Users/developer/dev/agent/.venv/lib/python3.11/site-packages/httpx/_client.py", line 1590, in stream
| yield response
| File "/Users/developer/dev/agent/.venv/lib/python3.11/site-packages/mcp/client/streamable_http.py", line 358, in _handle_post_request
| response.raise_for_status()
| File "/Users/developer/dev/agent/.venv/lib/python3.11/site-packages/httpx/_models.py", line 829, in raise_for_status
| raise HTTPStatusError(message, request=request, response=self)
| httpx.HTTPStatusError: Client error '401 Unauthorized' for url 'https://aws-mcp.us-east-1.api.aws/mcp'
| For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/401
+------------------------------------
In the example error above, I had switched to the (correct) aws-mcp regional endpoint but still thought I was talking to a knowledge-mcp service (which doesn’t exist, though I assumed it did based on the global endpoint).
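Digging the HTTP status out of a wrapped exception can be automated. The following is a minimal sketch, not part of Strands or the MCP SDK: the helper names iter_causes and find_http_status are hypothetical, and FakeHTTPError stands in for httpx.HTTPStatusError (which exposes the status as response.status_code) so the example runs without a network call.

```python
def iter_causes(exc):
    """Yield exc and every exception reachable through __cause__,
    __context__, and exception-group members."""
    seen = set()
    stack = [exc]
    while stack:
        current = stack.pop()
        if current is None or id(current) in seen:
            continue
        seen.add(id(current))
        yield current
        stack.append(current.__cause__)
        stack.append(current.__context__)
        # ExceptionGroup (Python 3.11+) keeps its members in .exceptions
        stack.extend(getattr(current, "exceptions", ()))


def find_http_status(exc):
    """Return the first integer HTTP status found in the chain, or None."""
    for err in iter_causes(exc):
        response = getattr(err, "response", None)
        status = getattr(response, "status_code", None)
        if isinstance(status, int):
            return status
    return None


# Stand-ins for httpx.HTTPStatusError and its response object.
class FakeResponse:
    status_code = 401


class FakeHTTPError(Exception):
    def __init__(self, message, response):
        super().__init__(message)
        self.response = response


try:
    try:
        raise FakeHTTPError("401 Unauthorized", FakeResponse())
    except FakeHTTPError as inner:
        # Mimic the generic wrapper error surfaced to the caller.
        raise ValueError("Failed to start MCP client") from inner
except ValueError as outer:
    status = find_http_status(outer)

print(status)
```

A 401 or 403 in the chain points at signing or credentials, while a 429 points back at rate limiting.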
Note: The way I got out of this debugging hole was to create an integration test that only tried to configure a connection and use the connection to the Knowledge Base MCP. That enabled me to iterate quickly through a bunch of configurations.
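A connection-only check along those lines might look like the sketch below. It is an assumption-laden reconstruction, not the actual test from our suite: probe_all, try_connection, and CANDIDATES are hypothetical names, the candidate list is just the combinations discussed in this post, and the tool-name check uses a substring match because I am assuming the server may prefix its tool names. The Strands and mcp-proxy-for-aws imports happen inside try_connection so the probing loop can be exercised with a fake connector.

```python
# Candidate (endpoint, aws_service) pairs taken from the combinations above.
CANDIDATES = [
    ("https://aws-mcp.us-east-1.api.aws/mcp", "aws-mcp"),
    ("https://aws-mcp.us-east-1.api.aws/mcp", "bedrock-agentcore"),
    ("https://knowledge-mcp.global.api.aws", "aws-mcp"),
]


def try_connection(endpoint, aws_service):
    """Open one IAM-signed MCP connection and return the tool names it exposes."""
    # Imported lazily so this module loads without the packages installed.
    from mcp_proxy_for_aws.client import aws_iam_streamablehttp_client
    from strands.tools.mcp import MCPClient

    client = MCPClient(
        lambda: aws_iam_streamablehttp_client(
            endpoint=endpoint,
            aws_service=aws_service,
        ),
    )
    with client:
        tools = client.list_tools_sync()
    return [tool.tool_name for tool in tools]


def probe_all(candidates, connect=try_connection):
    """Try each configuration and record a one-line outcome per pair."""
    results = {}
    for endpoint, aws_service in candidates:
        try:
            names = connect(endpoint, aws_service)
        except Exception as exc:
            results[(endpoint, aws_service)] = f"failed: {exc}"
            continue
        # Substring match in case the server prefixes tool names (assumption).
        found = any("search_documentation" in name for name in names)
        results[(endpoint, aws_service)] = "ok" if found else "connected, tool missing"
    return results
```

With AWS credentials configured, calling probe_all(CANDIDATES) and printing the result gives a quick table of which configurations connect, without running a full agent or eval.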
Adding a fallback for local development
During local development, you might not always have AWS credentials configured. A fallback to the unauthenticated endpoint keeps things working:
import logging

from mcp.client.streamable_http import streamable_http_client
from mcp_proxy_for_aws.client import aws_iam_streamablehttp_client
from strands.tools.mcp import MCPClient

logger = logging.getLogger(__name__)


def create_aws_knowledge_client():
    """Create MCP client with IAM auth, falling back to unauthenticated."""
    try:
        import botocore.session

        # Check up front whether any AWS credentials can be resolved
        session = botocore.session.get_session()
        credentials = session.get_credentials()
        if credentials is None:
            raise ValueError("No AWS credentials available")
        return MCPClient(
            lambda: aws_iam_streamablehttp_client(
                endpoint="https://aws-mcp.us-east-1.api.aws/mcp",
                aws_service="aws-mcp",
            ),
        )
    except Exception as e:
        logger.warning(
            f"Falling back to unauthenticated AWS Knowledge MCP: {e}. "
            f"IAM-authenticated requests avoid rate limiting."
        )
        return MCPClient(
            lambda: streamable_http_client(
                url="https://knowledge-mcp.global.api.aws",
            ),
        )
The warning is important as it tells you when you’re not using the production Knowledge MCP configuration and that you have a stricter rate limit / quota.
The result
After switching to IAM-authenticated requests, our parallel eval suite (10 test cases, 5 workers) runs reliably without any 429 errors. Before this change, 2 of 10 tests would fail randomly on every run due to rate limiting.
If you’re building Strands agents that use AWS documentation, I suggest evaluating whether your use case would benefit from the higher limits of the authenticated AWS MCP server endpoint.
—
*We’re building Kitt, an AI agent that takes your team from security alert to ready-to-apply fix in minutes. Kitt auto-triages findings using your account, code, and IAM context, then proposes exact CLI or IaC fixes. If your team spends hours each week figuring out which findings matter and how to fix them, we’d love to hear about your workflow.*