Having Fun With Cloud Services

Over the last decade, the use of public cloud infrastructures such as Microsoft Azure and Amazon Web Services (AWS) has grown dramatically. More and more companies are migrating on-premises infrastructure to the cloud. However, the fact that cloud provider application programming interfaces (APIs) are also easily accessible via the internet has opened a new window for adversaries to take advantage.

The purpose of this blog series is to cover a range of cloud providers’ services and explore their respective attack surfaces from the offense and defense perspectives. This post is the first in a series of posts about cloud security. Our cloud research is an ongoing effort, and in this post, we are going to focus on cloud core services and permissions abuse. As we move on, we plan to cover the application stack and also explore Containers-as-a-Service (CaaS) attack surfaces.

We got the idea for this series when my workmate Igal Gofman and I sat down and brainstormed about threats that can impact our customers, and how we can help them mitigate those threats. One of the things we wanted to do was to better understand the attack surfaces of cloud environments.

We had both played with cloud resources at the companies where we worked or on personal projects at home, but we only had a basic understanding of this world. Igal had a more thorough background from his previous work at Microsoft, so he was familiar with Azure security and Active Directory. I was interested in virtualization and Big Data, so I had played a lot with cloud services.

Igal focused on the Azure and SaaS application attack surface, while I focused on the lower infrastructure side of things. We also had the privilege of speaking with our customers and consulting with them on everyday use cases, so we could understand how they operate their cloud environments.

To better understand the topics ahead, it’s best that you dive in with a basic understanding of AWS cloud services.

In a nutshell, you can use the cloud provider’s APIs to create resources such as virtual machines, storage locations, and databases. Using the same APIs, you can also manage resources that already exist. Without further ado, here’s how you can really make the most of your cloud provider’s capabilities.

AWS Components 101

AWS architecture can be intimidating as it contains many components and services. To better understand this post, here are a few components you should get familiar with.

Identity and Access Management (IAM)

AWS Identity and Access Management (IAM) enables you to securely manage access to services and resources. Using IAM, you can create and manage users and groups, and use permissions to allow and deny their access to AWS resources.

Policies and Permissions

Access in AWS is managed by creating policies and attaching them to IAM identities (users, groups of users, or roles) or AWS resources. A policy is an object in AWS that, when associated with an identity or resource, defines their permissions. AWS evaluates these policies when a principal entity (user or role) makes a request. Permissions in the policies determine whether the request is allowed or denied.

Amazon Simple Storage Service (S3)

Storage for the internet. You can use it to store and retrieve any amount of data at any time, from anywhere on the web.

EC2 (Amazon Elastic Compute Cloud)

A web service that enables you to launch and manage Linux/UNIX and Windows Server instances in Amazon’s data centers.

IAM Role

An IAM role is an IAM identity that you can create in your account that has specific permissions. An IAM role is similar to an IAM user, in that it is an AWS identity with permission policies that determine what the identity can and cannot do in AWS. However, instead of being uniquely associated with one person, a role is intended to be assumable by anyone who needs it. Also, a role does not have standard long-term credentials, such as a password or access keys, associated with it.

Assume Role

Returns a set of temporary security credentials that you can use to access AWS resources that you might not normally have access to. These temporary credentials consist of an access key ID, a secret access key, and a security token. Typically, you use Assume Role within your account or for cross-account access.

Access Key

The combination of an access key ID (like AKIAIOSFODNN7EXAMPLE) and a secret access key (like wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). You then use access keys to sign API requests that you make to AWS.
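To make "signing API requests" a little more concrete, here is a minimal sketch of how a signing key can be derived from a secret access key under AWS Signature Version 4. It only covers the key derivation and the final HMAC. Building the canonical request and the string to sign is left out, and the function names are our own, not part of any SDK.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date: str, region: str, service: str) -> bytes:
    """Derive a Signature Version 4 signing key scoped to a day, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature for an already-built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # The secret below is the documentation placeholder quoted above, not a real key.
    key = derive_signing_key(
        "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY", "20200101", "us-east-1", "ec2"
    )
    print(sign(key, "example-string-to-sign"))
```

The point for an attacker is that the secret access key never travels with the request. Whoever holds it can produce valid signatures for any API the identity is allowed to call.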

AWS Systems Manager Agent (SSM Agent)

SSM gives you visibility and control of your infrastructure on AWS. Systems Manager provides a unified user interface so you can view operational data from multiple AWS services, and it allows you to automate operational tasks across your AWS resources. With Systems Manager, you can group resources like Amazon EC2 instances, Amazon S3 buckets, or Amazon RDS instances. You can also group by application, view operational data for monitoring and troubleshooting, and take action on your groups of resources.

Public APIs Authentication

We decided to look at the cloud provider API layer first because, when you adopt the attacker mindset, you usually look one layer below the entity you would like to exploit. In our case, AWS compute, AWS storage and AWS IAM can all be controlled natively through AWS APIs.

In order to access the public APIs we discussed above, the cloud provider asks us to authenticate. Authentication is performed by presenting secrets that are attached to a cloud identity. After successful authentication, the cloud provider authorizes each request by evaluating the policies that apply to that identity to determine whether the requested action is allowed or denied.
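The allow-or-deny decision can be sketched in a few lines. The model below is deliberately simplified and is our own illustration: it only looks at identity policy statements with Effect, Action and Resource, and ignores conditions, resource policies, permission boundaries and organization-level policies. It does capture the core rule that a request is denied by default, an Allow grants access, and an explicit Deny always wins.

```python
from fnmatch import fnmatchcase


def evaluate(statements, action, resource):
    """Return "Allow" or "Deny" for a single request.

    Simplified model: unmatched requests are implicitly denied, a matching
    Allow grants access, and a matching Deny overrides any Allow.
    """
    allowed = False
    for stmt in statements:
        action_match = any(fnmatchcase(action, a) for a in stmt["Action"])
        resource_match = any(fnmatchcase(resource, r) for r in stmt["Resource"])
        if not (action_match and resource_match):
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"  # explicit deny ends the evaluation
        allowed = True
    return "Allow" if allowed else "Deny"


# Illustrative statements; the bucket name is made up.
STATEMENTS = [
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["*"]},
    {"Effect": "Deny", "Action": ["s3:DeleteObject"], "Resource": ["arn:aws:s3:::backups/*"]},
]

if __name__ == "__main__":
    print(evaluate(STATEMENTS, "s3:GetObject", "arn:aws:s3:::backups/db.dump"))
    print(evaluate(STATEMENTS, "s3:DeleteObject", "arn:aws:s3:::backups/db.dump"))
    print(evaluate(STATEMENTS, "ec2:RunInstances", "*"))
```

A broad statement such as the first one ("s3:*" on every resource) is exactly the kind of overly generous permission discussed next.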

Igal and I are fans of authorization issues. We deal with them daily. We believe overly broad permissions are one of the main issues companies face today. Authorization issues are usually the outcome of user mistakes, which come free to adversaries, meaning they don’t need to invest a lot of resources to exploit them. The cloud is not spared from authorization issues, but with the right tools it can handle them fairly well, and probably better than other environments.

There are several ways to manage cloud authorization and access. The most popular way is to use long-term keys, which are called Access Keys in AWS and Certificates in Azure, or some other type of secret. These are just like your Active Directory secrets: they can be the well-known “Domain Admin” credentials with full permissions, or regular credentials that help you move laterally through the network along the kind of paths the excellent BloodHound framework maps.

Many organizations buy into the misconception that the cloud is more secure than on-premises infrastructure. In reality, the cloud is designed with security in mind, but by default it is only as secure as the organization’s investment in its security. If you fail to follow best practices and to fully utilize the security features the cloud provider offers, you are likely to introduce many misconfigurations into your setup.

Cloud providers understood that regular credentials such as private keys and passwords create higher security risks because humans are not very good at handling them. Consequently, they created the concept of managed identities. The purpose of managed identities is to allow local cloud resources to access the cloud API without being handed stored credentials.

They do this by generating temporary tokens that can be used by cloud computing resources. This seems like black magic at first: you give a compute resource (a function or a virtual machine) a managed identity, and suddenly it can perform elevated operations without having any credentials stored! But how does this mysterious process work?

How can my endpoint already have built-in credentials?

We started delving into this black magic and looked at how the CLI tools, for example, use those credentials. We discovered that the cloud SDKs go through a URL, http://169.254.169.254. This address looks like it should not work: it belongs to the link-local range that Windows uses for APIPA (Automatic Private IP Addressing) and is not routable. After further reading, we understood that this endpoint is the instance metadata service, which is probably implemented somewhere in the part of the cloud hypervisor related to the machine. It exposes web services that return information about the instance. That information can also include credentials!

Let’s look at some examples:
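As one example, the sketch below reads the temporary credentials of the role attached to an EC2 instance, using only the Python standard library. It is a sketch under assumptions: it only works when run on an EC2 instance that has a role attached, it uses the session-token flow (a PUT request for a token, then the token in a header), and the helper names are ours. The metadata paths and header names are the ones AWS documents for the instance metadata service.

```python
import json
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest"
CREDENTIALS_PATH = "/meta-data/iam/security-credentials/"


def get_session_token(ttl_seconds=60, timeout=2):
    """Request a metadata session token (a PUT with a TTL header)."""
    request = urllib.request.Request(
        METADATA_BASE + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def _get(path, token, timeout=2):
    """GET a metadata path, presenting the session token."""
    request = urllib.request.Request(
        METADATA_BASE + path, headers={"X-aws-ec2-metadata-token": token}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_credentials(body):
    """Extract the temporary credential set from the JSON the service returns."""
    doc = json.loads(body)
    return {
        "access_key_id": doc["AccessKeyId"],
        "secret_access_key": doc["SecretAccessKey"],
        "session_token": doc["Token"],
        "expiration": doc["Expiration"],
    }


def fetch_instance_credentials():
    """Return the temporary credentials of the role attached to this instance."""
    token = get_session_token()
    # The directory listing returns the name of the attached role.
    role_name = _get(CREDENTIALS_PATH, token).splitlines()[0]
    return parse_credentials(_get(CREDENTIALS_PATH + role_name, token))


if __name__ == "__main__":
    print(fetch_instance_credentials()["access_key_id"])
```

Anything that can issue these requests from inside the machine, including an attacker with code execution or a server-side request forgery bug, gets the same access key ID, secret access key and session token that the SDK would get.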

Now we have something like regular old Mimikatz: credentials we can lift from a machine. We can do lateral movement the old-fashioned way! But what is the best way to illustrate the old way of lateral movement? Of course, it’s BloodHound!

Our goal right now is to build a BloodHound alternative for the cloud. BloodHound is famous for its use of Neo4j. We already had experience with this database, so we decided to use it as well.

Igal and I had the concept of risk in the backs of our minds for a long time. We believe that security should revolve around risk, and calculated risk in particular. To manage security in large environments, the people in charge should:

  1. Be familiar with their risks and security issues,
  2. Understand the scale of those security issues,
  3. Understand how hackers can take advantage of those risks and security issues.

To illustrate this concept, here are some security issues as examples:

  1. One user has access to an important EC2 instance through a single easy attack step
  2. One user has access to an important EC2 instance through multiple easy attack steps
  3. One user has access to an important EC2 instance through a single but complicated attack step
  4. Many users have access to an important EC2 instance through a single easy attack step

If we look at those examples, the first one might seem critical if you did not have all the information available to you, but in reality, it might not be. If we play the statistics game, the chance that an attacker finds this one specific user is low, while the fourth example might be much more critical because any of many users gives the attacker a way in. Because the human resources of our customers are limited, we should focus on fixing the most important issues first.
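To make the statistics game tangible, here is a toy model of the four examples. Everything in it is an assumption made for illustration: each user is compromised independently with the same probability, each attack step has a fixed chance of success, and the numbers themselves are invented. It is not the scoring our tool uses.

```python
from math import prod


def reach_probability(num_users, p_user_compromised, step_success):
    """Chance that at least one user is compromised and every attack step succeeds."""
    p_any_user = 1 - (1 - p_user_compromised) ** num_users
    return p_any_user * prod(step_success)


# Invented numbers: 2% chance per user, easy step = 0.9, complicated step = 0.2.
P_USER = 0.02
SCENARIOS = {
    "1. one user, single easy step": reach_probability(1, P_USER, [0.9]),
    "2. one user, multiple easy steps": reach_probability(1, P_USER, [0.9, 0.9, 0.9]),
    "3. one user, single complicated step": reach_probability(1, P_USER, [0.2]),
    "4. many users, single easy step": reach_probability(50, P_USER, [0.9]),
}

if __name__ == "__main__":
    # Print the scenarios from most to least likely under this toy model.
    for name, probability in sorted(SCENARIOS.items(), key=lambda item: -item[1]):
        print(f"{probability:.3f}  {name}")
```

Under these assumptions the fourth scenario dominates the other three by a wide margin, simply because fifty users give the attacker fifty chances to get a foothold.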

You should also calculate risk based on the sophistication of the adversary. For example, if your organization (such as a bank) has a more sophisticated defense infrastructure, you should prioritize more sophisticated attack vectors.

BloodHound uses Neo4j to answer those questions in the Active Directory space, and we are going to do the same thing for AWS permissions. We gather all the AWS entities (EC2 instances, Lambda functions, users and roles, access keys), and we use the AWS Policy Simulator API to calculate the permissions every user has on every resource.

Here’s how it looks in Neo4j:

To understand why those permissions are an issue, you would need an excellent grasp of AWS, but we want to make this concept easy to understand even if you’re not that proficient with AWS yet. To do that, we aggregate permissions into attack vectors. For example, if a user has both the “lambda:GetFunction” and “lambda:UpdateFunctionCode” permissions, we aggregate them into the “Update Lambda Function” attack.
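The aggregation idea can be sketched as a lookup from attack names to the permissions each attack needs. The first entry is the example from the text. The second entry, the dictionary layout and the function name are illustrative assumptions and not the tool's actual rule set.

```python
# Attack name -> set of permissions that must all be granted.
ATTACKS = {
    "Update Lambda Function": {"lambda:GetFunction", "lambda:UpdateFunctionCode"},
    # Hypothetical second rule, added only to show a single-permission attack.
    "Run Command on Instance": {"ssm:SendCommand"},
}


def aggregate_attacks(granted_permissions, attacks=ATTACKS):
    """Return the attacks whose required permissions are all granted."""
    granted = {permission.lower() for permission in granted_permissions}
    return sorted(
        name
        for name, required in attacks.items()
        if {permission.lower() for permission in required} <= granted
    )


if __name__ == "__main__":
    user_permissions = ["Lambda:GetFunction", "Lambda:UpdateFunctionCode", "s3:GetObject"]
    print(aggregate_attacks(user_permissions))
```

Each attack found this way becomes a single edge in the graph, so a reader sees "this user can update that function" instead of a pile of raw permission names.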

After running the aggregated query, this is how it looks:

 

Now we can focus on the risk side of the solution we talked about before. To calculate risk, we use Cypher, the Neo4j query language. It helps to know some Cypher before reading these queries, but it’s not mandatory.

Let’s have a look at some query language examples:

MATCH (start:IAMAccessKey {NodeId: "iam_access_key_AKIAIL4WCUER5XXXXX"}), (end:EC2Instance {NodeId: "ec2_instance_i-0e6e9db484fXXXXXX"})
MATCH p = (start)-[r:Attack*]->(end)
RETURN p AS shortestPath, reduce(cost=0, rel in relationships(p) | cost + rel.weight) AS totalCost
ORDER BY totalCost ASC
LIMIT 1

This query looks for paths of any length from the access key to the EC2 instance by following Attack relationships, sums the weight of every step on each path, and returns the path with the lowest total cost.

We can make it more generic:

MATCH (start), (end) 
MATCH p = (start)-[r:Attack*]->(end) 
RETURN p AS shortestPath, reduce(cost=0, rel in relationships(p) | cost + rel.weight) AS totalCost 
ORDER BY totalCost ASC 
LIMIT 5

We can guide our queries to focus on important resources:

MATCH (start:IAMAccessKey), (end:EC2Instance)
MATCH p = (start)-[r:Attack*]->(end)
RETURN p AS shortestPath, reduce(cost=0, rel in relationships(p) | cost + rel.weight) AS totalCost
ORDER BY totalCost ASC
LIMIT 5

This way we query only the paths from any access key to any EC2 instance, sorted by difficulty. We can now find interesting choke points, point them out to the relevant people, and get the issues fixed!
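If you don't read Cypher, the same question can be asked of a plain list of weighted edges. The sketch below is an analogue of the queries above and not what the tool runs: instead of listing every path and sorting by total cost, it uses Dijkstra's algorithm to go straight to the cheapest one. The node names and weights are made up, and weights are assumed to be non-negative.

```python
import heapq


def cheapest_attack_path(edges, start, end):
    """Return (total_cost, [nodes]) for the cheapest path, or None if unreachable.

    edges is an iterable of (source, target, weight) tuples describing
    directed attack steps; a lower weight means an easier step.
    """
    graph = {}
    for source, target, weight in edges:
        graph.setdefault(source, []).append((target, weight))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == end:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for target, weight in graph.get(node, []):
            if target not in visited:
                heapq.heappush(queue, (cost + weight, target, path + [target]))
    return None


# Made-up attack graph: two routes from an access key to an EC2 instance.
EDGES = [
    ("access_key", "role", 1),
    ("role", "ec2_instance", 5),
    ("access_key", "lambda_function", 2),
    ("lambda_function", "ec2_instance", 1),
]

if __name__ == "__main__":
    print(cheapest_attack_path(EDGES, "access_key", "ec2_instance"))
```

Nodes that show up on many cheap paths are the choke points worth fixing first.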

You can find our tool here:

Right now, the tool does not focus on privilege escalation (PE) methods such as those described below. However, those can be easily added. We recommend you read this post as it is very informative.

We hope you enjoyed reading this post, and we would love to hear your comments! In part two of this blog series, we are going to explore the Azure stack.
