Scaling IAM Security for Major Cloud Platforms

In a recent episode of the ScaleToZero podcast powered by Cloudanix, k9 Security’s founder, Stephen Kuenzli, broke down one of the most persistent challenges in cloud security: how to scale identity and access management (IAM) in large, fast-moving engineering organizations.

The conversation explored what holds companies back, what scalable IAM actually looks like, and how emerging tools like AI and policy automation are reshaping the landscape.

Special thanks to host Purusottam Mupunu and the ScaleToZero team for fostering such a thoughtful and insightful discussion. The podcast continues to be a valuable platform for security practitioners navigating real-world cloud challenges.

This post captures the key lessons from that discussion, along with examples of how leading teams are tackling IAM security in practice.

What Gets in the Way of Scalable IAM

Most organizations still lean on centralized security teams to oversee IAM. In theory, this ensures control and consistency. In practice, it creates bottlenecks. With dozens of application engineers for every security specialist, this model doesn’t scale. Worse, security often gets blamed for slowing delivery.

Another stumbling block is how organizations interpret “least privilege.” Many teams pursue it literally, trying to strip away every unused permission across every resource. This quickly becomes unmanageable. IAM policies are sprawling, and cloud providers expose tens of thousands of permissions. Teams get stuck playing “privilege golf” when they could be addressing risk more strategically.

The reality is that least privilege needs to be framed in terms of business intent. Instead of focusing on individual permissions, teams should ask: Can this principal administer the resource? Can they read or delete data? The first goals are to align access with responsibility and to secure access to resources using language that everyone understands.
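To make the idea concrete, here is a minimal Python sketch that rolls individual IAM actions up into a handful of capability names. The capability names and the action patterns in `CAPABILITY_PATTERNS` are illustrative assumptions, not k9 Security's actual classification, and the sketch only handles concrete action names (not wildcards inside a policy).

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from capability names to IAM action patterns.
# Both the names and the patterns are assumptions made for illustration.
CAPABILITY_PATTERNS = {
    "administer-resource": ["s3:PutBucketPolicy", "s3:DeleteBucket*", "kms:PutKeyPolicy"],
    "read-data": ["s3:GetObject*", "kms:Decrypt"],
    "write-data": ["s3:PutObject", "kms:Encrypt", "kms:GenerateDataKey*"],
    "delete-data": ["s3:DeleteObject*"],
}


def summarize_capabilities(allowed_actions):
    """Return the sorted capability names implied by a list of allowed IAM actions."""
    capabilities = set()
    for action in allowed_actions:
        for capability, patterns in CAPABILITY_PATTERNS.items():
            # IAM action names are case-insensitive, so compare in lowercase.
            if any(fnmatchcase(action.lower(), p.lower()) for p in patterns):
                capabilities.add(capability)
    return sorted(capabilities)
```

A reviewer can then ask whether a principal should be able to "delete-data" rather than debating each individual permission.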

Self-Service and Secure Defaults

To scale IAM without sacrificing security, security teams must shift from oversight to enablement. This means providing delivery teams with the tools and patterns they need to apply proven security controls themselves.

One approach Stephen described is building self-service reference architectures with embedded security. One example is a data perimeter pattern in which each application receives its own KMS key and encrypts all of its data with that key. Access is then controlled through the key policy. When combined with k9’s open-source policy generators for CDK and Terraform, teams can adopt this approach with minimal friction.

This pattern prevents lateral access across applications, even when they share the same AWS account. More importantly, it removes the burden of writing complex IAM policies from application teams. Security becomes part of the framework, not a gate or an afterthought.
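As a rough illustration of the per-application key pattern, the Python sketch below builds a KMS key policy document for one application. It is not the output of k9's generators; the function names, statement IDs, and example role ARNs are assumptions, and the action lists follow the administrator and user groupings AWS uses in its default key policies.

```python
import json

# Action groupings modeled on the AWS default key policy; treat them as a starting point.
KEY_ADMIN_ACTIONS = [
    "kms:Create*", "kms:Describe*", "kms:Enable*", "kms:List*", "kms:Put*",
    "kms:Update*", "kms:Revoke*", "kms:Disable*", "kms:Get*", "kms:Delete*",
    "kms:TagResource", "kms:UntagResource",
    "kms:ScheduleKeyDeletion", "kms:CancelKeyDeletion",
]
KEY_USE_ACTIONS = [
    "kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*",
    "kms:GenerateDataKey*", "kms:DescribeKey",
]


def build_app_key_policy(admin_role_arns, app_role_arns):
    """Build a key policy that lets admins manage the key and only the app use it."""
    if not admin_role_arns or not app_role_arns:
        raise ValueError("both administrator and application roles are required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowKeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": sorted(admin_role_arns)},
                "Action": KEY_ADMIN_ACTIONS,
                "Resource": "*",
            },
            {
                "Sid": "AllowAppDataUse",
                "Effect": "Allow",
                "Principal": {"AWS": sorted(app_role_arns)},
                "Action": KEY_USE_ACTIONS,
                "Resource": "*",
            },
        ],
    }


def render(policy):
    """Serialize the policy document to JSON for use in infrastructure code."""
    return json.dumps(policy, indent=2)
```

Because this sketch omits the usual statement that delegates to the account root, IAM policies alone cannot grant access to the key; only principals named in the key policy can use it. A real deployment should weigh that against the risk of locking itself out if the named roles are deleted.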

Start with What You Can See

For teams wondering where to begin, Stephen recommends starting with the basics: audit who has IAM administrator access. In many environments, especially older ones, there are multiple accounts with administrative rights that no longer need them. These permissions often linger after an incident or as a workaround that was never cleaned up.
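A first pass at that audit can be sketched in a few lines of Python. The helper below takes policy documents that have already been fetched (the `policies_by_principal` input shape is an assumption) and flags principals with an Allow statement for `*` or `iam:*` on all resources. It is a coarse heuristic: it ignores `NotAction`, conditions, explicit denies, permissions boundaries, and service control policies.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def grants_iam_admin(policy_document):
    """Return True if any Allow statement grants '*' or 'iam:*' on all resources."""
    for statement in _as_list(policy_document.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(statement.get("Action", []))]
        resources = _as_list(statement.get("Resource", []))
        if "*" in resources and any(a in ("*", "iam:*") for a in actions):
            return True
    return False


def find_iam_admins(policies_by_principal):
    """Return the sorted principals whose policies grant IAM administration.

    policies_by_principal maps a principal name to a list of policy documents
    (hypothetical input shape; fetching the documents is out of scope here).
    """
    return sorted(
        principal
        for principal, documents in policies_by_principal.items()
        if any(grants_iam_admin(doc) for doc in documents)
    )
```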

Another common issue is long-lived access keys, particularly for users and applications that could use roles instead. These credentials are frequently involved in breaches. Rotating or removing them is often a low-effort, high-impact improvement.
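Finding those keys is mostly a date comparison. The Python sketch below filters key metadata shaped like the `AccessKeyMetadata` entries returned by IAM's ListAccessKeys API (`UserName`, `AccessKeyId`, `Status`, `CreateDate`). The 90-day threshold and the function name are assumptions; pick whatever limit matches your own policy.

```python
from datetime import datetime, timedelta, timezone


def find_stale_access_keys(key_metadata, max_age_days=90, now=None):
    """Return active access keys created more than max_age_days ago.

    key_metadata is a list of dicts with 'UserName', 'AccessKeyId', 'Status',
    and a timezone-aware 'CreateDate'.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]
```

Inactive keys are skipped here because they cannot be used to authenticate, though they are still worth deleting.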

These simple assessments can surface immediate risks and give teams a foundation to build from.

IAM as Part of the Application Lifecycle

A major theme in the conversation was the importance of integrating IAM into the software development lifecycle (SDLC). For applications, IAM policies should be defined, tested, and deployed alongside the code. The permissions an app will use in production should be the same ones it has in dev and staging. This approach minimizes surprises and ensures consistent enforcement.

This idea maps closely to Stephen’s earlier writing on Continuous Security Policy Engineering, where IAM policy management follows a feedback loop: declare the intended access, apply it through infrastructure as code, measure the effective access, and review the results. It’s a scalable model that prioritizes continuous alignment over one-time fixes.
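The measure and review steps of that loop boil down to comparing what was declared with what is actually in effect. Here is a minimal Python sketch of that comparison; the data shape (principal mapped to a set of capability names) and the finding labels are assumptions for illustration, not the format of any particular tool.

```python
def review_access(declared, effective):
    """Compare declared access with measured effective access.

    Both arguments map a principal name to a set of capability names.
    Returns a list of (principal, capability, finding) tuples, where finding
    is 'unexpected' (effective but not declared) or 'missing' (declared but
    not effective).
    """
    findings = []
    for principal in sorted(set(declared) | set(effective)):
        want = declared.get(principal, set())
        have = effective.get(principal, set())
        for capability in sorted(have - want):
            findings.append((principal, capability, "unexpected"))
        for capability in sorted(want - have):
            findings.append((principal, capability, "missing"))
    return findings
```

An empty result means the environment matches intent; anything else feeds the next iteration of the loop.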

For human access, the implementation path is different. People often require different permissions in dev versus prod, and organizations may leverage Just-In-Time (JIT) access tools to grant time-bound permissions. Many teams are also adopting entitlement-aware approval processes that evaluate what access is being requested, not just who is requesting it.
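The two ideas in that paragraph, time-bound grants and approval that depends on the entitlement, can be sketched together in Python. Everything here is hypothetical: the `AccessGrant` type, the `HIGH_RISK_ENTITLEMENTS` set, and the rule that only those entitlements need an approver are assumptions, not the behavior of any specific JIT product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical set of entitlements that require a human approver.
HIGH_RISK_ENTITLEMENTS = {"prod:administer-resource", "prod:delete-data"}


@dataclass(frozen=True)
class AccessGrant:
    principal: str
    entitlement: str
    expires_at: datetime

    def is_active(self, now):
        """A grant is usable only until its expiry time."""
        return now < self.expires_at


def request_access(principal, entitlement, duration, now, approved_by=None):
    """Issue a time-bound grant, requiring approval for high-risk entitlements."""
    if entitlement in HIGH_RISK_ENTITLEMENTS and approved_by is None:
        raise PermissionError(f"{entitlement} requires an approver")
    return AccessGrant(principal, entitlement, now + duration)
```

The approval decision keys off what is being requested, so a low-risk request in dev can be granted automatically while the same person asking for prod administration goes through review.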

AI and the Future of Security Workflows

The conversation also explored how AI and Model Context Protocol (MCP) agents are starting to shape the future of security operations. Stephen ran a four-hour experiment in which he configured a lightweight MCP server to pull findings from AWS Security Hub, then asked an LLM (Claude) to identify the most important issues and propose fixes. The outcome was surprisingly effective.

These kinds of agentic workflows could help security teams surface high-risk issues, summarize context, and route alerts to the right owners. Rather than just generating more noise, well-designed agents could actually support better decisions and faster resolution.

However, volume is a real concern. As AI accelerates software delivery and infrastructure changes, the number of security events, pull requests, and policy diffs will grow. Teams need to rethink how they triage, approve, and measure changes. Stephen emphasized the importance of evaluating existing processes under load and redesigning for throughput, not just accuracy.

He also pointed to a future where agents could help quantify risk in a consistent, context-aware way. Even if teams don’t fully adopt quantitative risk models, consistent severity scoring can improve prioritization and help reduce bias in manual triage.
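Consistent scoring does not have to be sophisticated to be useful. The Python sketch below ranks findings with a fixed rule so the same inputs always produce the same order. The severity labels mirror the ones Security Hub uses, but the numeric weights, the environment multipliers, and the finding shape are all assumptions chosen for illustration, not a method described in the episode.

```python
# Hypothetical weights; tune them to your own risk model.
SEVERITY_WEIGHT = {"INFORMATIONAL": 0, "LOW": 1, "MEDIUM": 4, "HIGH": 7, "CRITICAL": 10}
ENVIRONMENT_WEIGHT = {"dev": 1, "staging": 2, "prod": 3}


def score(finding):
    """Score a finding as severity weight times environment weight."""
    return SEVERITY_WEIGHT[finding["severity"]] * ENVIRONMENT_WEIGHT[finding["environment"]]


def prioritize(findings):
    """Sort findings by descending score, breaking ties by id for a stable order."""
    return sorted(findings, key=lambda f: (-score(f), f["id"]))
```

Under these weights, a medium-severity finding in prod outranks a high-severity finding in dev, which is the kind of context-aware ordering that is hard to apply consistently by hand.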

Usable Security at Scale

Scaling IAM isn’t just about automation or tooling. It’s about making security usable by default. Stephen invoked the design philosophy from Don Norman’s The Design of Everyday Things: good systems make the right action easy to identify, execute, and verify.

If developers don’t understand any of the buttons on the organization’s IAM “remote control,” that’s a failure of design. Instead, security should be delivered as clear, composable building blocks that support engineers without requiring them to become IAM experts.

At k9, that means providing reusable policy generators, context-aware analyzers, and a process that keeps access aligned over time. IAM security becomes continuous, consistent, and built into delivery, not bolted on.

The Bottom Line

Scaling IAM security requires a fundamental shift from oversight to enablement, making secure access patterns the default rather than an afterthought. The organizations that succeed will be those that integrate IAM into their development workflows, provide self-service tools with embedded security, and design systems that developers can use without becoming IAM experts.

Ready to transform your approach? Start by exploring the resources below, from the full podcast discussion, to k9’s practical tools for implementing these strategies in your own environment:
