Continuous Security Policy Engineering

Review and improve AWS security policies continuously

This guide will help you create a process to review AWS security policies continuously so you can understand and reduce risk from excess privileges. k9 calls this continuous security policy engineering.

Many organizations spend a lot of energy, time, and money implementing and monitoring access controls with little to show for it. Those organizations may have deployed several security tools and stored a ton of data in a SIEM. But they still have poor security policies and wait a long time for them.

Why is that?

Scale IAM Security Operations

Learn how organizations scale IAM SecOps by executing the k9 Security Katas

Learn

Interviews with DevOps practitioners reveal the pain of engineering good AWS security policies and analyzing access. These practitioners generally have the practical responsibility of creating and maintaining AWS security policies. Practitioners report AWS security policies are hard to get right and difficult to validate. This overloads cloud security specialists with an unending stream of high-stakes policy development and reviews. Rushing security policy engineering puts the organization and customers at risk. Holding up deployments for security policy engineering delays projects. This situation costs a lot and produces bad results such as data breaches, accidental resource destruction, and burnout.

Let’s analyze this situation starting with common organization-level security policy requirements.

Suppose an organization has information confidentiality requirements such as:

Must preserve privacy of users’ data
Must preserve confidentiality of organization’s intellectual property

These requirements might be collected into a ‘least privilege access’ goal for the organization’s applications or at least applications with ‘critical’ data.

But no tool is going to be able to tell you directly whether you’ve implemented “least privilege” access, let alone do it for you. That tool won’t have all the necessary information to make and implement a decision. Implementing and verifying least privilege access requires information spread across many people and tools. Because most security policy engineering processes are neither well defined nor optimized, security specialists spend more time collecting and synthesizing data than analyzing and improving security policies. And people who are not AWS security experts can’t help.

Let’s fix that!

Needs

Start with the high-level needs of the security control process. First, you need a process to implement least privilege access. Then you must ensure that access stays least privilege over time while adapting to application and business changes.

You need a process control loop that continuously reviews AWS security policies and converges to least privilege access.

Let’s see how to implement process control for an application’s security policies.

Start with the software application, which is the process whose behavior we want to control.

Figure 1. Process behavior to be controlled

Now let’s introduce a process that enforces the organization’s requirement of least privilege access for confidential data with a feedback process control loop.

Securing Access with Process Control

Process controllers maintain process output or operating conditions by modeling process operation, measuring process inputs and outputs, and adjusting inputs to ensure the process operates within the desired range.

Process control is pervasive:

Thermostats control climate in buildings
Cruise control keeps automobiles at the desired speed
Electrical utilities monitor load and generate electricity to match
Agile teams gather product feedback and prioritize the most important work

You can use a feedback process controller to ensure information security requirements are met.

This diagram illustrates securing access to an application’s data with a process that integrates feedback:

Figure 2. Secure data using process control with feedback

Organizations often have elements of a control process, particularly ‘sensor’ tools that gather raw telemetry. But maybe:

Measurements are neither converted into understandable, actionable information nor reviewed
Security controls are not updated based on the collected information
Measurements are difficult to understand, or do not make it clear whether the configuration is correct
People are overloaded by the volume of measurements

Let’s examine each component’s responsibilities and trace information through the control loop. As we step through each component in the process, think about whether:

Your access control process has an implementation of each logical component
Components are automated or manual
Information flows between each component
This process control loop completes in your team or organization

Process Controller

The first process control component is the controller. Suppose the organization implements the “least privilege access for confidential data” constraint by ensuring between 1 and 5 authorized principals have access to Confidential application data. That ensures someone has access and provides room for the application, backup processes, and, say, tier-3 customer service to have access to that data.

We could write infrastructure code that infrastructure management tools like Terraform or CloudFormation understand to:

Classify data and compute resources
Specify who is allowed access to the data
Deny everyone else access

Enable engineers to make good access management decisions by adopting libraries with usable interfaces that encapsulate expert AWS security knowledge. This greatly simplifies both implementing and reviewing security policy changes.

If the application’s data is stored in an S3 bucket, you could implement that with the k9 Security Terraform module for S3:

module "s3_bucket" {
  source = "git@github.com:k9securityio/terraform-aws-s3-bucket.git"

  logical_name = "credit-applications"
  logging_target_bucket = "secureorg-logs-bucket"

  org   = "secureorg"
  owner = "credit-team"
  env   = "prod"
  app   = "credit-processor"

  confidentiality = "Confidential"

  allow_administer_resource_arns = [
      "arn:aws:iam::111:user/ci", 
      "arn:aws:iam::111:role/admin"
  ]
  allow_read_data_arns = [
      "arn:aws:iam::111:role/credit-processor",
      "arn:aws:iam::111:role/cust-service"
  ]
  allow_write_data_arns = ["arn:aws:iam::111:role/credit-processor"]
}

This code describes the desired state of the credit processor application’s data resources and security policies.

The example declares access to data in the bucket with words like administer_resource, read_data, and write_data. Changes to the principals granted each of those access capabilities are easy to implement and review. Reviewing the ~200-line bucket policy generated by the library is much harder. In a code review, an engineer or tool could see the data is tagged Confidential and 4 principals are allowed to access it, so the least privilege access constraint is met.
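The review check described here can be sketched in a few lines of Python. This is a minimal illustration and not part of the k9 Security module: the function names and the dictionary layout are hypothetical, and the 1-to-5 limit is the example constraint stated earlier.

```python
def authorized_principals(access):
    """Return the distinct principal ARNs across all access capabilities."""
    principals = set()
    for arns in access.values():
        principals.update(arns)
    return principals


def meets_least_privilege(confidentiality, access, min_principals=1, max_principals=5):
    """Check the example constraint: Confidential data has 1 to 5 authorized principals."""
    if confidentiality != "Confidential":
        return True  # the example constraint only covers Confidential data
    count = len(authorized_principals(access))
    return min_principals <= count <= max_principals


# Values copied from the Terraform example; the dictionary layout is hypothetical.
credit_applications_access = {
    "administer_resource": ["arn:aws:iam::111:user/ci", "arn:aws:iam::111:role/admin"],
    "read_data": ["arn:aws:iam::111:role/credit-processor", "arn:aws:iam::111:role/cust-service"],
    "write_data": ["arn:aws:iam::111:role/credit-processor"],
}

print(len(authorized_principals(credit_applications_access)))  # 4
print(meets_least_privilege("Confidential", credit_applications_access))  # True
```

The credit-processor role appears under both read_data and write_data, so counting distinct principals rather than list entries gives 4, not 5.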

Codifying security best practice into libraries makes those practices accessible to every team and engineer.

Once the team has decided on the desired access to grant, the actuator component implements any needed changes.

Actuator

The process controller communicates the desired state of the system to an actuator. The actuator is an infrastructure management tool that knows how to examine the running system and converge it to our desired state by:

Computing changes required to converge reality to desired state
Applying those changes as resource tags and security policy updates

Infrastructure management tools split computation of changes and application into two steps so that operators can review the changes and verify the changes are safe and will do what is desired.

Engineers can review the change plan to double-check the least privilege access constraint is met prior to applying.

Once reviewed and approved, operators apply changes using the infrastructure management tool.
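The two-step plan and apply flow can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not how Terraform or CloudFormation work internally: state is modeled as a mapping from capability name to a set of principal ARNs, the function names are hypothetical, and the `user/tester` principal is invented to stand in for a manual change.

```python
def plan_changes(desired, actual):
    """Compute the grants to add and remove so that actual matches desired.

    Both arguments map a capability name to a set of principal ARNs.
    """
    plan = {"add": [], "remove": []}
    for capability in sorted(set(desired) | set(actual)):
        want = desired.get(capability, set())
        have = actual.get(capability, set())
        plan["add"] += [(capability, arn) for arn in sorted(want - have)]
        plan["remove"] += [(capability, arn) for arn in sorted(have - want)]
    return plan


def apply_changes(actual, plan):
    """Return a new state with the reviewed plan applied; the input is not modified."""
    updated = {capability: set(arns) for capability, arns in actual.items()}
    for capability, arn in plan["add"]:
        updated.setdefault(capability, set()).add(arn)
    for capability, arn in plan["remove"]:
        updated[capability].discard(arn)
    return updated


desired = {
    "read_data": {"arn:aws:iam::111:role/credit-processor", "arn:aws:iam::111:role/cust-service"},
    "write_data": {"arn:aws:iam::111:role/credit-processor"},
}
actual = {
    "read_data": {"arn:aws:iam::111:role/credit-processor"},
    # Hypothetical principal added by hand outside the infrastructure code.
    "write_data": {"arn:aws:iam::111:role/credit-processor", "arn:aws:iam::111:user/tester"},
}

plan = plan_changes(desired, actual)  # step 1: compute the plan, then review it
for action in ("add", "remove"):
    for capability, arn in plan[action]:
        print(action, capability, arn)

converged = apply_changes(actual, plan)  # step 2: apply only after approval
```

Keeping the plan as plain data between the two steps is what makes the review possible: a person or a tool can inspect exactly which grants will be added or removed before anything changes.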

At this point several things can and do happen.

The infrastructure management tool may apply the desired changes successfully. Or it may fail, fully or partially. Even if change application succeeds, another actor may reconfigure the system manually in response to a production incident or to test something. A competing control plane may overwrite changes. The world is a complex place.

So while we have described how access should be configured and used a tool to implement that in the running system, we need to verify reality matches our desired state.

Sensor

Sensors gather data from the running system so that we can analyze its actual state. Because running systems change for many reasons, we need sensors to collect this data continuously so the control loop can verify constraints are met.

In the case of analyzing access control, sensors may read low-level telemetry like:

The actual security policies
The actual tags on the resource
Whether there are any differences between actual and desired state

The sensor can transmit this raw telemetry directly to the process controller for evaluation. Alternatively, a sensor may compute higher-level measurements that are easier to understand and send those to the process controller.
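As a rough sketch of turning raw telemetry into a higher-level measurement, the Python below groups individual access records by principal and flags any principal that was not expected. The record format, function names, and sample data are assumptions made for illustration; this is not the format of any particular tool's report.

```python
from collections import defaultdict


def summarize_effective_access(records):
    """Group raw (principal, capability) records into capabilities per principal."""
    summary = defaultdict(set)
    for principal, capability in records:
        summary[principal].add(capability)
    return dict(summary)


def unexpected_principals(summary, expected):
    """Return principals that have access but are not in the expected set."""
    return sorted(set(summary) - set(expected))


# Hypothetical raw records, one per (principal, capability) pair.
raw_records = [
    ("admin", "administer_resource"),
    ("ci", "administer_resource"),
    ("credit-processor", "read_data"),
    ("credit-processor", "write_data"),
    ("cust-service", "read_data"),
]
expected = {"admin", "ci", "credit-processor", "cust-service"}

summary = summarize_effective_access(raw_records)
for principal in sorted(summary):
    print(principal, sorted(summary[principal]))
print("unexpected:", unexpected_principals(summary, expected))
```

A per-principal summary like this answers the question a reviewer actually has ("who can do what?") without requiring them to read each record or policy statement.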

An example of a higher-level measurement is the effective access that IAM principals have to data and API actions, as reported by k9 Security. Here’s how k9 reports the effective access to the credit applications bucket:

Figure 3. Example of high level measurement: effective access to S3 bucket

This report clearly shows the access capabilities each of the admin, ci, cust-service, and credit-processor principals have and that no unexpected users or roles have access.

People are an essential part of the process controller. They find this information much easier to interpret, and far more accurate, than simulating policies in their heads.

Closing the loop

Now let’s close the control loop. The people and tools that implement the process controller receive the sensor’s information and evaluate compliance with the “least privilege access” constraint. The controller actually reviews the effects of the AWS security policies. Then the controller quickly decides if additional control actions are needed.

If the constraint is violated, the cause could be that too many principals were given access, that the data source contains a mixture of data, or any number of other issues.
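The controller's evaluation step might look like the following Python sketch. It assumes the sensor reports a set of principal names with access and that the desired set is known from the infrastructure code; the function name, the finding messages, and the `data-science` and `tester` principals are hypothetical. It covers only principal-count and membership checks, not issues such as mixed data in one data source.

```python
def evaluate_access(measured_principals, desired_principals, max_principals=5):
    """Compare measured access with desired access and list constraint violations."""
    findings = []
    measured = set(measured_principals)
    desired = set(desired_principals)
    if not measured:
        findings.append("no principal has access")
    if len(measured) > max_principals:
        findings.append(f"too many principals have access: {len(measured)} > {max_principals}")
    for principal in sorted(measured - desired):
        findings.append(f"unexpected access: {principal}")
    for principal in sorted(desired - measured):
        findings.append(f"missing access: {principal}")
    return findings


desired = {"admin", "ci", "credit-processor", "cust-service"}
# Hypothetical measurement: two principals gained access outside the desired state.
measured = desired | {"data-science", "tester"}

findings = evaluate_access(measured, desired)
if findings:
    for finding in findings:
        print("ACTION NEEDED:", finding)
else:
    print("constraint met; no control action needed")
```

An empty findings list means the loop iteration ends with no control action; a non-empty list gives the team a concrete starting point for the next change.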

If you are required to implement least privilege access to application data, you’ll need to execute a process control loop like this frequently to keep up with changes in the environment. Start by reviewing access once per sprint and adjust to fit your application delivery process.

Organizations should adopt standardized models, processes, and self-service tools that help all teams implement common constraints like least privilege access. This includes:

Classifying data sources according to a standard so everyone has the context they need (e.g. Tagging Guide, Terraform context module)
Generating secure policies with infrastructure code libraries (e.g. Terraform, CDK)
Analyzing effective access principals have to API actions and data (e.g. k9 Security SaaS)
Notifying teams when the actual state of access does not match the desired state

These tools help teams implement and execute the access control loop easily and autonomously. Teams can declare the access they intend, detect when reality drifts, and address the issue themselves. This improves and scales information security while unburdening security specialists from repetitive work.

Review the k9 Security Katas for an example of how to scale AWS IAM security operations with a process designed for frequent execution by domain experts, not security experts.

Summary

Implement effective security control processes by:

Constructing a process control loop with integrated controller, actuator, and sensor components
Standardizing security controls and architecture so you can adopt or create reusable process controller, actuator, and sensor libraries
Incorporating people, processes, and automation into the process controller
Supporting people’s decision making with automation
Executing the control loop frequently so you detect and resolve hazard conditions quickly

Go Fast, Safely.

Contact Us

Please contact us with questions or comments. We’d love to discuss AWS security with you.
