Dimensioning & Costing AWS Workloads using AWS Cost & Usage Reports (CUR)

Table of Contents

Introduction

This article describes an approach for correlating the business outcome of an AWS workload with the cost of its AWS resources.

The end goal of this activity is to arrive at a point where, given the input parameters of a workload (like data ingested), we should be able to predict the cost of AWS resources needed to run this workload.

We begin by running the workload several times, with varying input parameters. Each time, we measure both the input metrics of the workload, like amount of data ingested & the AWS bill it generates.

Eventually, we arrive at a correlation like this:

WorkloadInput Metric #1Input Metric #2Input Metric #3Total Cost of All AWS Resources Involved
Workload Run #1M1aM2aM3aC1
Workload Run #2M1bM2bM3bC2
Workload Run #3M1cM2cM3cC3

Different AWS services are used to gather this data:

DataService
Input MetricsStandard CloudWatch service-level metrics or custom business-oriented metrics, like orders processed
AWS CostAWS Cost & Usage Reports

Create AWS Cost & Usage Reports (CUR)

The AWS CUR contains the most comprehensive set of cost & usage data available. Reporting can be enabled on an account or organization-level. Reports can be configured to breakdown costs by the hour. Once enabled, the first report might take up to 24 hours to be delivered. Reports are delivered into an S3 bucket you choose. Reports are updated at least once a day & at most thrice a day.

Follow these steps to enable CUR:

Open https://console.aws.amazon.com/billing/home#/reports/create

Provide a name for the report & select “include resource IDs” & auto-refresh:

On the next screen, click Configure to configure the S3 bucket where the reports will be delivered.

You can either create a new bucket or choose an existing one:

AWS will generate a bucket policy for this bucket, to allow AWS billing to put reports in it:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "billingreports.amazonaws.com"
      },
      "Action": [
        "s3:GetBucketAcl",
        "s3:GetBucketPolicy"
      ],
      "Resource": "arn:aws:s3:::my-cur-bucket",
      "Condition": {
        "StringEquals": {
          "aws:SourceArn": "arn:aws:cur:us-east-1:123456789012:definition/*",
          "aws:SourceAccount": "123456789012"
        }
      }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "billingreports.amazonaws.com"
      },
      "Action": [
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-cur-bucket/*",
      "Condition": {
        "StringEquals": {
          "aws:SourceArn": "arn:aws:cur:us-east-1:123456789012:definition/*",
          "aws:SourceAccount": "123456789012"
        }
      }
    }
  ]
}

Verify the policy & save:

Provide a report path prefix, select hourly granularity & enable Athena:

Complete the CUR creation wizard.

Activate Cost Allocation Tags

Since an AWS account could be running several workloads at any given time, one way to isolate a single instance of a workload is to tag all resources of that workload with a unique value. However, in order for any tags to appear in CUR, they must be activated as cost allocation tags. To do so:

Open https://console.aws.amazon.com/billing/home#/tags

If you’re using AWS Organizations, cost allocation tags can only be enabled in the master account.

Cost allocation tags enabled in the master account will show up in CURs in member accounts.

Search for the tag you’ll be using for your workloads, select & activate it:

You cannot add a new tag to the cost allocation tags console. The console lists tags that already exist in your account/organization. You must select & activate one of them.

If the tag you wish to use is not listed here, apply it to a resource temporarily & wait a few hours for it to show up here. You can then activate it.

Configure Amazon Athena to Query CUR

A CUR report can be sizeable, even for small/medium sized accounts. Amazon Athena can be used to query it using SQL to fetch the information we need after running a workload.

To configure Athena, we use a CloudFormation template generated by AWS when we configured CUR. Start by opening the CUR bucket & look for crawler-cfn.yml. Copy its Object URL:

Next, open https://console.aws.amazon.com/cloudformation/home#/stacks/create/template

Create a CloudFormation stack using the object URL:

The stack creates:

  1. An AWS Glue database, table & crawler
  2. A Lambda function to trigger the Glue crawler every time AWS creates or updates CUR in the bucket
  3. The S3 event trigger for the Lambda function
  4. The associated Lambda permissions & IAM roles & policies

Example — Measure Workload Cost

Everything we have done so far was a one-time setup activity as a prerequisite to the “costing workloads” activity. Now let’s see how to use this setup to measure AWS costs for a workload.

As an example workload, we will use “option 3” of an AWS-provided streaming data solution:

AWS Streaming Data Solution for Amazon Kinesis

Launch the provided CloudFormation template. Ensure you tag the workload:

We can now use the Kinesis Data Generator (KDG) to simulate incoming data to the workload.

Leave the KDG running for a while. Watch the Kinesis CloudWatch metrics.

After this simulated workload has run for a while, we need to wait a few hours for the CUR to be updated.

After the CUR has updated, query Athena to see the cost records associated with this workload:

SELECT * FROM "athenacurcfn_workload_c_u_r"."workloadcur"
WHERE resource_tags_user_workload = 'Perf Test 1'

To see the total cost of the workload, run:

SELECT SUM(line_item_unblended_cost)
FROM "athenacurcfn_workload_c_u_r"."workloadcur"
WHERE resource_tags_user_workload = 'Perf Test 1'

CUR Fields of Interest

FieldDescription
line_item_usage_account_idThe AWS account ID that consumed the resource
line_item_usage_start_date
line_item_usage_end_date
Start & end time for which the current line item captures the cost. This is usually the start & end of an hour for reports with hourly granularity.
line_item_product_codeThe AWS service that was consumed. EG: AmazonKinesis
line_item_usage_typeThe metric of the resource used for billing. EG: Storage-ShardHour & PutRequestPayloadUnits for Kinesis
line_item_resource_idThe unique resource identifier. EG: KDS ARN, S3 bucket name
line_item_usage_amountThe quantity of resource used is relation to line_item_usage_type. EG: 1 ShardHour for KDS
line_item_unblended_cost
line_item_blended_cost
See:
Why does the “blended” annotation appear on some line items in my AWS bill?
Understanding your AWS Cost Datasets: A Cheat Sheet
line_item_line_item_descriptionHuman-readable description of the charge. Examples:
> $0.015 per provisioned shard-hour (Kinesis)
> $0.005 per 1,000 PUT, COPY, POST, or LIST requests (S3)
> AWS Lambda – Requests Free Tier – 1,000,000 Requests – US East (Northern Virginia)
pricing_public_on_demand_costThe public price of the resource. You pay less based on various factors like credits, reservations & savings plans, agreements with AWS, support fees, etc
resource_tags_user_YOUR-TAGThe cost allocation tags you activate

For detailed description of all CUR fields, see Data dictionary.

To create a correlation between the workload inputs & its cost, we can use a combination of CloudWatch metrics like PutRecordsBytes / IncomingBytes for a Kinesis stream & relevant fields from the CUR like line_item_usage_amount & line_item_blended_cost.

Conclusion

This article describes how you can use the AWS CUR to gain detailed insights into your AWS billing. Although it may not be practical to use the CUR directly for larger workloads, understanding it will give you an appreciation of how other tools & services in the market use the CUR as their data source & provide actionable insights.

References

The AWS Cost and Usage Report (CUR)

Configure Cost and Usage Reports

What are AWS Cost and Usage Reports?

About the Author ✍🏻

Harish KM is a Principal DevOps Engineer at QloudX & a top-ranked AWS Ambassador since 2020. 👨🏻‍💻

With over a decade of industry experience as everything from a full-stack engineer to a cloud architect, Harish has built many world-class solutions for clients around the world! 👷🏻‍♂️

With over 20 certifications in cloud (AWS, Azure, GCP), containers (Kubernetes, Docker) & DevOps (Terraform, Ansible, Jenkins), Harish is an expert in a multitude of technologies. 📚

These days, his focus is on the fascinating world of DevOps & how it can transform the way we do things! 🚀

Leave a Reply

Your email address will not be published. Required fields are marked *