Debugging AccessDenied in AWS IAM

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

Ugh… That AccessDenied error looks like the start of a goose chase that could last two hours or two weeks.

Understanding why access was denied in AWS and implementing a secure solution can be complicated. Sometimes it’s not even clear where to start and what to do when you get stuck. (Good news: AWS will improve the AccessDenied error messages in Sept 2021 by adding the policy type that caused the deny.)

Here’s an approach to debugging AWS access control problems:

Read logs, guess, and check by using the application
Examine CloudTrail
Explore with integration tests
Simulate whether actions are allowed by policy

This should help you investigate and resolve even the most complicated access control problems.

Example: Debugging access for a Secure Inbox

The example we’ll use for this article is constructing a Secure Inbox where a publishing process in one AWS account needs to encrypt and deliver files to an S3 bucket in another account. The Secure Inbox pattern looks like:

Figure 1. Secure Inbox Pattern

The tricky bit of the Secure Inbox pattern is getting the security policies right, where:

a publisher service in Account 111 must publish reports to the customer in Account 222
the service temporarily stores an object in its outbox S3 bucket, encrypted with the customer’s reports KMS key in Account 222
the service publishes the report to the customer-managed inbox S3 bucket in Account 222
the published report data is also encrypted with the customer-managed reports encryption key

The critical API actions are s3:PutObject to the internal outbox S3 bucket managed by the service and s3:CopyObject to deliver the object to the customer. Both actions use the customer-managed key to encrypt the customer’s data and keep them in control of it.

This Secure Inbox implementation depends on IAM, S3 bucket, and KMS key policies all working together correctly across accounts.

Read logs, guess, and check by using the application

Most engineers start by reading app logs and testing through the app, but this turns into a quagmire fast. In this case, the s3:PutObject action to the outbox bucket appears simple enough:

# _s3_client is a boto3.client('s3')
response = self._s3_client.put_object(ACL='private',
                                      ServerSideEncryption='aws:kms',
                                      SSEKMSKeyId=kms_encryption_key_id,
                                      Bucket=bucket_name,
                                      Key=key,
                                      Body=body_bytes)

The ‘obvious’ part is to request server-side encryption with aws:kms and to pass the customer’s KMS encryption key ARN in the S3 PUT API call.

AWS KMS provides customer managed encryption keys and an API. The really neat thing about the KMS API is that you don’t have to give full access to an encryption key. Instead, you can Allow use of:

specific API actions like kms:Encrypt and kms:GenerateDataKey
for particular encryption keys
for particular AWS principals: IAM roles and users or entire AWS accounts

But what you don’t see, and what isn’t documented directly in the s3:PutObject API docs, are the specific permissions the KMS key policy must grant in order to PUT the object. The KMS policy generated by the AWS wizard when you provision a key allows the necessary actions, but that policy is very broad.

Further, from the application perspective, if you go ahead and catch the ClientError from the S3 client’s put_object method and print out the error response, you’ll still only see something like:

Unexpected error putting object into bucket: { 
  'Error': {
    'Code': 'AccessDenied',
    'Message': 'Access Denied'
  },
... snip ...
}

Still nothing more than AccessDenied. AWS doesn’t give detailed information about how to debug permissions problems in API responses.
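
A minimal sketch of what the application can extract from that error, assuming the usual botocore error response shape. The describe_client_error helper is hypothetical (it is not part of boto3), and the request ID in the example is made up:

```python
def describe_client_error(error_response, operation_name="unknown"):
    """Summarize a botocore-style error response dict in one line.

    Hypothetical helper: pass it `e.response` and `e.operation_name`
    from a caught botocore.exceptions.ClientError.
    """
    error = error_response.get("Error", {})
    metadata = error_response.get("ResponseMetadata", {})
    return "{op} failed: code={code} message={msg!r} request_id={rid}".format(
        op=operation_name,
        code=error.get("Code", "Unknown"),
        msg=error.get("Message", ""),
        rid=metadata.get("RequestId", "n/a"),
    )


# The error response from the example above, plus a made-up request ID.
example = {
    "Error": {"Code": "AccessDenied", "Message": "Access Denied"},
    "ResponseMetadata": {"RequestId": "EXAMPLE-REQUEST-ID"},
}
print(describe_client_error(example, "PutObject"))
```

Logging the operation name and request ID alongside the code at least records which call failed and when, even though the response does not say which permission was missing.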

This is really important to understand if you’re a security or platform engineer — application engineers often cannot debug permissions problems from their application code, even if they want to.

We need a better tool for understanding the system. The next tool to use is CloudTrail.

CloudTrail

CloudTrail provides very useful information if engineers have access and know how to query it. CloudTrail is an AWS service that provides an audit log of important events that occur in your AWS account. The logs, called trails, record most AWS API usage, but not all. CloudTrail log events include important request parameters and the principal (IAM user or role, AWS service) they were executed with.

If you have CloudTrail enabled in your account (you definitely should) and access to view the trail, you may be able to find valuable clues as to why access was denied, and to which resource. (Note: CloudTrail does not log some data plane events at all, such as sqs:SendMessage, and does not log others by default, such as s3:PutObject.)

Here’s an example AccessDenied event logged to CloudTrail in the service account 111 that reveals why a report object couldn’t be stored in the outbox bucket:

Figure 2. AccessDenied event in CloudTrail

Progress!

You can find events like this by filtering for Error Code AccessDenied in the AWS CloudTrail console.
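
The same search can be sketched in code. This assumes the response shape of CloudTrail's LookupEvents API, where each item carries the full record as a JSON string under "CloudTrailEvent". As far as I know, LookupEvents cannot filter on error code directly, so the sketch filters client-side. Both function names are hypothetical:

```python
import json


def access_denied_events(lookup_events):
    """Yield parsed CloudTrail records whose errorCode is AccessDenied.

    `lookup_events` is an iterable of dicts shaped like the items in the
    "Events" list of a LookupEvents response.
    """
    for event in lookup_events:
        record = json.loads(event["CloudTrailEvent"])
        if record.get("errorCode") == "AccessDenied":
            yield record


def fetch_events(cloudtrail_client, event_source="kms.amazonaws.com"):
    """Page through LookupEvents for one event source (assumed usage)."""
    paginator = cloudtrail_client.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[
            {"AttributeKey": "EventSource", "AttributeValue": event_source}
        ]
    )
    for page in pages:
        yield from page["Events"]
```

With credentials that can read CloudTrail, usage would look like `for record in access_denied_events(fetch_events(boto3.client("cloudtrail"))): print(record["errorMessage"])`.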

The skuenzli user was denied access to use the kms:GenerateDataKey API. Now, I’m running through this example from my laptop with nearly Admin permissions. I know I have access to invoke kms:GenerateDataKey.

The real issue is hidden inside the errorMessage of the event:

{
    "eventVersion": "1.05",
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "AIDAJREII7F7Q2K7QMCLE",
        "arn": "arn:aws:iam::111:user/skuenzli",
        "accountId": "111",
        "accessKeyId": "ASIAEXAMPLE",
        "userName": "skuenzli",
        "sessionContext": {
            "sessionIssuer": {},
            "webIdFederationData": {},
            "attributes": {
                "mfaAuthenticated": "false",
                "creationDate": "2019-12-03T15:55:35Z"
            }
        },
        "invokedBy": "AWS Internal"
    },
    "eventTime": "2019-12-03T15:55:35Z",
    "eventSource": "kms.amazonaws.com",
    "eventName": "GenerateDataKey",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "AWS Internal",
    "userAgent": "AWS Internal",
    "errorCode": "AccessDenied",
    "errorMessage": "User: arn:aws:iam::111:user/skuenzli is not authorized to perform: kms:GenerateDataKey on resource: arn:aws:kms:us-east-1:222:key/e9d04e90-8148-45fe-9a75-411650eea80f",
    "requestParameters": null,
    "responseElements": null,
    "requestID": "722abc0e-77c2-42e0-8448-4f0469420f3a",
    "eventID": "92edcefe-6d12-4357-85f4-20f709f3e413",
    "readOnly": true,
    "eventType": "AwsApiCall",
    "recipientAccountId": "111"
}

Aha:

"User: arn:aws:iam::111:user/skuenzli is not authorized to perform: kms:GenerateDataKey on resource: arn:aws:kms:us-east-1:222:key/e9d04e90-8148-45fe-9a75-411650eea80f"

The skuenzli user is not permitted to invoke kms:GenerateDataKey with account 222’s encryption key. This is what was really being denied access.
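
When there are many such events, the message can be split into its parts mechanically. This is a rough sketch that assumes the message follows the "User: ... is not authorized to perform: ... on resource: ..." shape shown above. Other services may phrase it differently, and parse_denied_message is a hypothetical helper:

```python
import re

DENIED_PATTERN = re.compile(
    r"User: (?P<principal>\S+) is not authorized to perform: "
    r"(?P<action>\S+) on resource: (?P<resource>\S+)"
)


def _account_id(arn):
    """Return the account field of an ARN, or '' if the ARN has none."""
    parts = arn.split(":")
    return parts[4] if len(parts) > 4 else ""


def parse_denied_message(error_message):
    """Split an AccessDenied errorMessage into principal, action, resource."""
    match = DENIED_PATTERN.search(error_message)
    if match is None:
        return None
    details = match.groupdict()
    # Flag the cross-account case, where resource policies matter most.
    details["cross_account"] = (
        _account_id(details["principal"]) != _account_id(details["resource"])
    )
    return details


message = (
    "User: arn:aws:iam::111:user/skuenzli is not authorized to perform: "
    "kms:GenerateDataKey on resource: "
    "arn:aws:kms:us-east-1:222:key/e9d04e90-8148-45fe-9a75-411650eea80f"
)
print(parse_denied_message(message))
```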

The s3:PutObject action invokes kms:GenerateDataKey on the IAM principal’s behalf. s3:PutObject does the same for kms:Encrypt.

Time to update the KMS encryption key policy.
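
As a sketch of what that update might contain, here is a narrowly scoped key policy statement built as a Python dict. The Sid and the choice of principal are assumptions for illustration, and the short account IDs follow this article's shorthand (real account IDs have 12 digits):

```python
import json


def key_policy_statement(principal_arn, actions):
    """Build one Allow statement for a KMS key policy (illustrative only)."""
    return {
        "Sid": "AllowPublisherUseOfReportsKey",  # hypothetical name
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": sorted(actions),
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


statement = key_policy_statement(
    "arn:aws:iam::111:user/skuenzli",
    ["kms:GenerateDataKey", "kms:Encrypt"],
)
print(json.dumps(statement, indent=2))
```

The statement would be added to the key policy of the reports key in account 222. The copy step described later also decrypts the object, so it would likely need kms:Decrypt as well.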

It’s also time for a new tactic: fine-grained integration tests.

Fine-grained integration tests

You may start debugging access problems like this by investigating them through the application. But if an application normally takes several minutes to run and indicate integration success or error, then that feedback loop is probably too slow to let you focus on and resolve the access control problem. You’ll forget what you were testing by the time the results arrive.

Instead, create ‘fine-grained’ integration tests or Lambda test events that isolate one aspect of the integration. Integration tests are essential for verifying that complicated integrations keep working over time. Let’s break the example’s integration down.

In the Secure Inbox use case, there are two integrations at work:

store the object in account 111’s S3 bucket, encrypting with account 222’s encryption key
copy the object to account 222’s S3 bucket, encrypting with account 222’s encryption key

Start with an integration test that exercises the full Secure Inbox use case: it stores an object in the outbox bucket and copies it to the inbox bucket. Execute the store and copy operations with a representative test object. Executing just that portion of code directly should only take a few seconds, enabling you to iterate on KMS encryption key and S3 bucket policy quickly.
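
A minimal sketch of such a test, reusing the put_object parameters from the earlier snippet. The bucket names and object key are hypothetical, and RecordingClient is a stand-in so the sketch runs without AWS credentials. A real test would pass boto3.client('s3') instead:

```python
class RecordingClient:
    """Stand-in for boto3.client('s3') that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not set, i.e. the API methods.
        def record(**kwargs):
            self.calls.append((name, kwargs))
            return {}
        return record


def store_and_copy(s3_client, outbox_bucket, inbox_bucket, key, body, kms_key_id):
    """Run the store and copy steps of the Secure Inbox use case."""
    s3_client.put_object(
        ACL="private",
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
        Bucket=outbox_bucket,
        Key=key,
        Body=body,
    )
    s3_client.copy_object(
        CopySource={"Bucket": outbox_bucket, "Key": key},
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
        Bucket=inbox_bucket,
        Key=key,
    )


# Dry run: shows the shape of the two calls without touching AWS.
client = RecordingClient()
store_and_copy(
    client,
    "example-outbox",  # hypothetical bucket in account 111
    "example-inbox",   # hypothetical bucket in account 222
    "reports/test.txt",
    b"test report",
    "arn:aws:kms:us-east-1:222:key/e9d04e90-8148-45fe-9a75-411650eea80f",
)
for name, kwargs in client.calls:
    print(name, kwargs["Bucket"], kwargs["Key"])
```

Against real accounts, either call raises botocore.exceptions.ClientError on AccessDenied, so the test fails at the step whose policy still needs work.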

Depending on your testing philosophy and reality of your feedback loop, you may want to break the tests down further. The copy operation is deceptively simple.

During the s3:CopyObject operation, S3:

gets the object from the source bucket, outbox
decrypts the object
generates a data key and encrypts the object
puts the object into the target bucket, inbox

This uses several S3 and KMS API actions and an additional resource that were not used in the store step. The copy operation warrants its own integration test.
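
One way to keep track of what the copy needs is to write the steps down next to the permission each one is likely to require. The mapping below is my reading of the steps above, not an authoritative list, and the resource labels are informal:

```python
# (step, IAM action it is assumed to need, resource the action applies to)
COPY_STEPS = [
    ("get the object from the outbox bucket", "s3:GetObject", "outbox bucket"),
    ("decrypt the object", "kms:Decrypt", "reports key"),
    ("generate a data key and encrypt", "kms:GenerateDataKey", "reports key"),
    ("put the object into the inbox bucket", "s3:PutObject", "inbox bucket"),
]


def missing_permissions(granted):
    """Return the (action, resource) pairs from COPY_STEPS not yet granted."""
    return [
        (action, resource)
        for _, action, resource in COPY_STEPS
        if (action, resource) not in granted
    ]


# Permissions exercised by the store step alone, plus read access to the outbox.
granted_so_far = {
    ("s3:GetObject", "outbox bucket"),
    ("kms:GenerateDataKey", "reports key"),
    ("s3:PutObject", "inbox bucket"),
}
print(missing_permissions(granted_so_far))
```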

Cross-account Access

When your use case involves cross-account access, test with an actual representative AWS account setup in order to get the resource policies correct. In this use case, account 222’s S3 & KMS resource policies need to allow account 111’s IAM principals to use resources. If you try to develop this sort of solution entirely inside of account 111, it’s likely your IAM permissions grant unintended access that allows the store and copy to succeed.

Once you have a quick and easy way to test the integration, you can iterate and learn much more quickly and be confident your solution really works when the policies are done.
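
On the S3 side, the resource policy in question is the inbox bucket policy in account 222. The following is only a sketch: the bucket name, role name, and Sid are hypothetical, and a production policy would probably add conditions, such as requiring the reports key:

```python
import json


def inbox_bucket_policy(bucket_name, publisher_arn):
    """Build a bucket policy letting one external principal deliver objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPublisherToDeliverReports",  # hypothetical name
                "Effect": "Allow",
                "Principal": {"AWS": publisher_arn},
                "Action": "s3:PutObject",
                # Object-level actions apply to the objects, hence the "/*".
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


policy = inbox_bucket_policy(
    "example-inbox",                    # hypothetical bucket in account 222
    "arn:aws:iam::111:role/publisher",  # hypothetical role in account 111
)
print(json.dumps(policy, indent=2))
```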

Now let’s take a look at another tool that can accelerate feedback.

Policy simulator

Once you’ve narrowed the access problem down to particular API actions and resources, you can use the IAM policy simulator to discover and fix problems in IAM and Resource policies.

With the policy simulator you can simulate AWS API actions with all of the contextual information we’ve been talking about here:

the actual IAM user or role
current or proposed policies
one or more API actions, like s3:PutObject
specific resources: buckets, KMS keys and their policies
policy condition keys

The simulator will tell you if an action is allowed, which policy allowed or denied it, and basic diagnostic information about why an action was not permitted.

The simulator web UI is a little clunky, but it’s improving. See this tutorial on Testing an S3 policy using the IAM simulator for an introduction to the mechanics. Also be aware that the UI (as of Jan 2021):

does not always prominently indicate errors that affect the simulation; for example, if the simulator is unable to retrieve a resource policy, it proceeds with the simulation without the policy instead of throwing an error
does not indicate an action was denied by a Service Control Policy; however this information is available in the API response
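
The simulation can also be run through the API, which exposes the Service Control Policy detail mentioned above. The sketch below assumes the response shape of IAM's SimulatePrincipalPolicy action as exposed by boto3. The function names are hypothetical, and pagination of large result sets is ignored:

```python
def summarize_simulation(response):
    """Reduce a SimulatePrincipalPolicy response to one row per action."""
    rows = []
    for result in response.get("EvaluationResults", []):
        org_detail = result.get("OrganizationsDecisionDetail", {})
        rows.append({
            "action": result["EvalActionName"],
            "resource": result.get("EvalResourceName", "*"),
            # One of "allowed", "explicitDeny", or "implicitDeny".
            "decision": result["EvalDecision"],
            # None when the response carries no Organizations detail.
            "allowed_by_scp": org_detail.get("AllowedByOrganizations"),
            "missing_context": result.get("MissingContextValues", []),
        })
    return rows


def simulate(iam_client, principal_arn, actions, resource_arns):
    """Simulate actions for a principal; iam_client is boto3.client('iam')."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=list(actions),
        ResourceArns=list(resource_arns),
    )
    return summarize_simulation(response)
```

A row with decision "implicitDeny" and allowed_by_scp set to False would point at a Service Control Policy rather than at the principal's own policies.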

Still, policy simulation can be a very useful diagnostic tool within a policy development cycle because it tells you what IAM is going to decide about an action without having to execute it. This is particularly useful for actions that would otherwise write or delete data or resources.

Let’s recap.

Summary

This approach to debugging and resolving AWS security policy issues:

showed how to investigate and understand why access is denied
shortened feedback loops for both investigating the problem and developing the solution
identified issues to be aware of with cross-account use cases and diagnostic tools
