4. Modifying CloudFront Origin Requests using Lambda@Edge functions

Introduction

In the previous post in this series, we explored distributing a static website using CloudFront. We encountered an issue: CloudFront did not serve the index.html page for subdirectory URLs such as /post1/, a layout that is common for static sites. In this post, we will address that by attaching a Lambda@Edge function to our CloudFront distribution. Here’s an outline of what we will cover:

What are Lambda@Edge functions?
Creating a Lambda function
Configuring IAM roles and permissions
Associating the Lambda function with CloudFront

Let’s dive into each of these topics and learn how Lambda@Edge can help resolve this issue.

What are Lambda@Edge functions?

Lambda@Edge is an extension of AWS Lambda that lets you run serverless code at AWS locations around the globe, closer to end users. This lets you customize content delivery and improve performance by handling requests and responses at the edge, often without a round trip to an origin such as an S3 bucket or EC2 instance.

Benefits of Lambda@Edge

Low Latency: Since Lambda@Edge runs at AWS edge locations, it reduces latency by executing functions closer to users, improving load times and responsiveness.
Content Customization: You can manipulate requests and responses in real time. For instance, you can:
- Rewrite URLs to simplify site navigation.
- Modify headers to manage security or routing.
- Serve personalized content based on location, device, or other attributes.
Security Enhancements: You can add security headers, block unwanted traffic, or inspect requests to prevent malicious activity before reaching your origin server.
Seamless Scaling: Like AWS Lambda, Lambda@Edge automatically scales in response to traffic spikes without requiring any manual configuration.

Common Use Cases

URL Rewrites and Redirects: A prime example is ensuring that requests to subdirectories automatically load the index.html file, which we’ll explore in this post.
A/B Testing: Route a percentage of users to different versions of your site without modifying your main application.
Header Manipulation: Add or modify headers like CORS or security headers to ensure secure communication between your website and users.
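
As an illustration of the header manipulation use case, here is a minimal sketch of a handler that could be attached to the origin-response event. The function name addSecurityHeaders and the header values are assumptions made for this example and are not part of the setup built in this series. In a deployed function you would export it as the handler.

```javascript
// Hypothetical origin-response handler that adds security headers to every response.
// The header values below are illustrative assumptions, not recommendations.
const SECURITY_HEADERS = {
  "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
  "X-Content-Type-Options": "nosniff",
  "X-Frame-Options": "DENY",
};

const addSecurityHeaders = async (event) => {
  const response = event.Records[0].cf.response;

  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    // CloudFront keys headers by lowercase name; each entry is a list of { key, value } pairs.
    response.headers[name.toLowerCase()] = [{ key: name, value }];
  }
  // Returning the response hands it back to CloudFront.
  return response;
};
```

Because the function only mutates the headers object, it can be exercised locally by passing a hand-written event with an empty headers map.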

How Lambda@Edge Works

Lambda@Edge functions are triggered by CloudFront events. A function is associated with a CloudFront distribution and runs in response to one of four events:

Viewer Request: After CloudFront receives a request from the viewer, before it checks the cache.
Origin Request: Before CloudFront forwards the request to the origin server, which happens only on a cache miss (this is where we’ll focus).
Origin Response: After CloudFront receives the response from the origin, before it caches the response.
Viewer Response: Before CloudFront returns the response to the viewer.

By associating a Lambda@Edge function with CloudFront, we can inject custom logic at different stages of the content delivery process.
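
To make the four stages concrete, the sketch below reads the event type from the payload CloudFront passes to a function. The helper describeStage and the trimmed sampleEvent are hypothetical and only include the fields used here; a real event carries more.

```javascript
// describeStage is a hypothetical helper used only for illustration.
const describeStage = (event) => {
  const { cf } = event.Records[0];
  // One of "viewer-request", "origin-request", "origin-response" or "viewer-response".
  const type = cf.config.eventType;
  // Request events carry only cf.request; response events carry cf.request and cf.response.
  const hasResponse = cf.response !== undefined;
  return `${type} (${hasResponse ? "response" : "request"} stage)`;
};

// A trimmed-down event, assumed for this example.
const sampleEvent = {
  Records: [
    {
      cf: {
        config: { distributionId: "EXAMPLE", eventType: "origin-request" },
        request: { method: "GET", uri: "/post1/", headers: {} },
      },
    },
  ],
};

console.log(describeStage(sampleEvent)); // prints "origin-request (request stage)"
```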

Implementing the Lambda function

Let’s now look at how to create a Lambda function that modifies origin requests, so CloudFront can automatically serve the index.html file for subdirectory requests.

The full event payload that CloudFront passes to a Lambda@Edge function for each event type is documented in the Lambda@Edge event structure reference in the Amazon CloudFront Developer Guide.

Implement the JavaScript Lambda Handler

Below is the code for the Lambda handler:

// index.mjs

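// Matches URIs that end in a file extension of 2 to 5 characters, such as .html, .css or .jpg.
const hasExtension = /(.+)\.[a-zA-Z0-9]{2,5}$/;

export const handler = async (event) => {
  const request = event.Records[0].cf.request;

  const url = request.uri;

  if (url && url.endsWith("/")) {
    // Directory request: /post1/ becomes /post1/index.html
    request.uri += "index.html";
  } else if (url && !url.match(hasExtension)) {
    // No file extension: /post1 becomes /post1/index.html
    request.uri += "/index.html";
  }
  // Returning the request tells CloudFront to forward it to the origin.
  return request;
};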

Explanation

Directory Requests: If the URL ends with a /, the function appends index.html to it.
- Input: /post1/
- Output: /post1/index.html
No File Extension: If the URL does not match a regular expression for a file extension like .html, .css, .jpg, etc., the handler assumes the request is for a directory and appends /index.html.
- Input: /post1
- Output: /post1/index.html
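
The two rules above can be checked locally before deploying. The sketch below extracts the same branching into a pure function; rewriteUri is a hypothetical helper introduced for this example and is not part of the handler itself.

```javascript
const hasExtension = /(.+)\.[a-zA-Z0-9]{2,5}$/;

// rewriteUri is a hypothetical pure helper that mirrors the handler's branching.
const rewriteUri = (uri) => {
  if (!uri) return uri;
  if (uri.endsWith("/")) return `${uri}index.html`;
  if (!hasExtension.test(uri)) return `${uri}/index.html`;
  return uri; // already points at a file
};

for (const uri of ["/", "/post1/", "/post1", "/css/site.css", "/post1/index.html"]) {
  console.log(`${uri} -> ${rewriteUri(uri)}`);
}
// / -> /index.html
// /post1/ -> /post1/index.html
// /post1 -> /post1/index.html
// /css/site.css -> /css/site.css
// /post1/index.html -> /post1/index.html
```

One caveat of the extension check: a directory name that ends in a dot followed by 2 to 5 alphanumeric characters, such as /releases/v2.10, is treated as a file and left unchanged.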

Packaging the Lambda handler

Next, we’ll package the Lambda function using Terraform’s archive_file data source to zip the handler for deployment:

# lambda.tf

data "archive_file" "zip" {
  type        = "zip"
  source_file = "${path.module}/index.mjs"
  output_path = "${path.module}/index.zip"
}

The data source above creates an index.zip archive that we can upload to Lambda.

Configuring Terraform to Provision the Lambda Function

# lambda.tf

resource "aws_s3_bucket" "code" {
  bucket        = "code-bucket-12345667789"
  force_destroy = true
}

The aws_s3_bucket resource above creates an S3 bucket where the Lambda function’s packaged code will be stored.

# lambda.tf

resource "aws_s3_object" "code_package" {
  bucket      = aws_s3_bucket.code.id
  key         = "index.zip"
  source      = data.archive_file.zip.output_path
  source_hash = data.archive_file.zip.output_base64sha256
}

The aws_s3_object above uploads the zipped Lambda handler index.zip to the S3 bucket. The source_hash ensures that the object is updated in S3 whenever the zip file content changes.

# lambda.tf

resource "aws_lambda_function" "edge_lambda" {
  function_name    = "origin-request-modifier-lambda-at-edge"
  s3_bucket        = aws_s3_bucket.code.id
  s3_key           = aws_s3_object.code_package.key
  source_code_hash = aws_s3_object.code_package.source_hash
  # We will create this IAM role in the next section
  role             = aws_iam_role.lambda_role.arn
  handler          = "index.handler"
  runtime          = "nodejs20.x"
  timeout          = 5
  memory_size      = 128
  package_type     = "Zip"
  publish          = true
  description      = "A Lambda@Edge function that modifies the origin request."
}

The aws_lambda_function resource provisions the Lambda function by specifying key details such as the function name and the S3 bucket and key of the zipped code. It also defines the IAM role that the function will use for execution. The publish = true setting publishes a numbered version of the function on each deployment. This is a requirement for Lambda@Edge: CloudFront can only be associated with a numbered version, not with $LATEST or an alias. We’ll reference this version when constructing the full Lambda ARN for use in the CloudFront configuration. Note that Lambda@Edge functions must be created in the US East (N. Virginia) Region, us-east-1.
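
To illustrate what a version-qualified ARN looks like, here is a small sketch that checks whether an ARN ends in a numeric version. The helper isVersionedArn and its regular expression are assumptions written for this example; they are a loose sanity check for the standard aws partition, not an official validation rule.

```javascript
// Matches a Lambda function ARN that ends in a numeric version,
// e.g. arn:aws:lambda:us-east-1:123456789012:function:my-function:3
const VERSIONED_ARN = /^arn:aws:lambda:[a-z0-9-]+:\d{12}:function:[A-Za-z0-9_-]+:\d+$/;

// isVersionedArn is a hypothetical helper used only for illustration.
const isVersionedArn = (arn) => VERSIONED_ARN.test(arn);

const base = "arn:aws:lambda:us-east-1:123456789012:function:origin-request-modifier-lambda-at-edge";

console.log(isVersionedArn(`${base}:1`)); // prints true
console.log(isVersionedArn(base)); // prints false
console.log(isVersionedArn(`${base}:$LATEST`)); // prints false
```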

Create the IAM role and policies for the Lambda function

To learn more about the policy statements shown below, please read the AWS guide on setting up IAM permissions and roles for Lambda@Edge.

# lambda.tf

data "aws_iam_policy_document" "lambda_assume_role_policy" {
  statement {
    sid     = "LambdaServiceAssumeRole"
    actions = ["sts:AssumeRole"]
    principals {
      type = "Service"
      identifiers = [
        "lambda.amazonaws.com",
        "edgelambda.amazonaws.com",
        "replicator.lambda.amazonaws.com",
      ]
    }
  }
}

The aws_iam_policy_document data source above defines the trust policy that allows the AWS Lambda, Lambda@Edge and Lambda replication services to assume the role used to execute the Lambda function.

# lambda.tf

data "aws_iam_policy_document" "lambda_exec_role_policy" {
  statement {
    sid = "AllowLambdaToWriteLogs"
    actions = [
      "logs:CreateLogGroup",
      "logs:CreateLogStream",
      "logs:PutLogEvents"
    ]
    resources = [
      "arn:aws:logs:*:*:log-group:/aws/cloudfront/*"
    ]
  }

  statement {
    sid    = "LambdaCreateDeletePermission"
    effect = "Allow"
    actions = [
      "lambda:CreateFunction",
      "lambda:DeleteFunction",
      "lambda:DisableReplication"
    ]
    resources = [
      "arn:aws:lambda:*:*:function:*"
    ]
  }

  statement {
    sid    = "IamPassRolePermission"
    effect = "Allow"
    actions = [
      "iam:PassRole"
    ]
    resources = ["*"]
    condition {
      test     = "StringEqualsIfExists"
      variable = "iam:PassedToService"
      values   = ["lambda.amazonaws.com"]
    }
  }

  statement {
    sid    = "CloudFrontListDistributions"
    effect = "Allow"
    actions = [
      "cloudfront:ListDistributionsByLambdaFunction"
    ]
    resources = ["*"]
  }
}

The aws_iam_policy_document data source above defines the following IAM policy statements:

AllowLambdaToWriteLogs: This statement allows the creation of log groups and log streams and the writing of log events in CloudWatch Logs. This is necessary for logging and monitoring Lambda executions. The resources field restricts this permission to log groups whose names start with /aws/cloudfront/. Lambda@Edge writes a function's own logs to log groups named /aws/lambda/us-east-1.<function-name> in the Region closest to where the function runs, so you may need to extend this pattern to cover /aws/lambda/ if those logs do not appear.
LambdaCreateDeletePermission: This statement grants permission to create and delete Lambda functions and to disable their replication.
IamPassRolePermission: This statement allows an IAM role to be passed to the Lambda service. The condition restricts the permission to roles passed to lambda.amazonaws.com, preventing misuse of iam:PassRole with other services.
CloudFrontListDistributions: This statement allows listing the CloudFront distributions that are associated with a Lambda function. This is useful for managing the integration between CloudFront and Lambda@Edge.

# lambda.tf

resource "aws_iam_role" "lambda_role" {
  name               = "origin-request-modifier-lambda-at-edge-role"
  assume_role_policy = data.aws_iam_policy_document.lambda_assume_role_policy.json
}

resource "aws_iam_policy" "policy" {
  name   = "origin-request-modifier-lambda-at-edge-policy"
  policy = data.aws_iam_policy_document.lambda_exec_role_policy.json
}

resource "aws_iam_role_policy_attachment" "lambda_policy_attachment" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = aws_iam_policy.policy.arn
}

Associating the Lambda function with the CloudFront distribution

Finally, we will associate the Lambda function with our CloudFront distribution. Modify the default_cache_behavior block to include the lambda_function_association section.

# cloudfront.tf

resource "aws_cloudfront_distribution" "static_site_distribution" {
  ...
  default_cache_behavior {
    cache_policy_id        = data.aws_cloudfront_cache_policy.policy.id
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = aws_s3_bucket.site.bucket_regional_domain_name
    viewer_protocol_policy = "redirect-to-https"
    compress               = true

    # The block below is the new addition to the existing CloudFront configuration
    lambda_function_association {
      event_type   = "origin-request"
      # The version of the Lambda function will only be available if "publish" is set to "true" in the Lambda configuration
      lambda_arn   = "${aws_lambda_function.edge_lambda.arn}:${aws_lambda_function.edge_lambda.version}"
      include_body = false
    }
  }
  ...
}

Seeing Lambda@Edge in action

With the Lambda function now associated with our CloudFront distribution and configured to trigger on origin-request events, it’s time to test the setup. We want to verify that our distribution correctly loads the index.html file from subdirectories. If everything has been configured properly, requests such as /post1 and /post1/ should now return the page stored at /post1/index.html.

[Screenshot: page load after Lambda@Edge]
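
If you prefer to script this check rather than use a browser, a minimal sketch could look like the following. The helper checkPaths is hypothetical, it assumes Node.js 18 or later for the global fetch, and the domain in the usage comment is a placeholder for your own distribution's domain name.

```javascript
// checkPaths is a hypothetical helper; fetchFn defaults to the global fetch in Node.js 18+.
const checkPaths = async (baseUrl, paths, fetchFn = fetch) => {
  const results = [];
  for (const path of paths) {
    // Resolve each path against the distribution's base URL and record the status code.
    const res = await fetchFn(new URL(path, baseUrl));
    results.push({ path, status: res.status, ok: res.status === 200 });
  }
  return results;
};

// Usage (placeholder domain, replace it with your distribution's domain name):
// checkPaths("https://d111111abcdef8.cloudfront.net", ["/", "/post1", "/post1/"]).then(console.table);
```

Passing the fetch function as a parameter keeps the helper testable without network access. Keep in mind that CloudFront may serve cached responses, so an invalidation can be needed before the new behavior is visible for paths requested earlier.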

Conclusion

In this post, we explored how to add a Lambda@Edge function to a CloudFront distribution to modify origin requests. We implemented a Lambda handler that resolves URLs to index.html, packaged the function, and used Terraform to deploy it along with the necessary IAM roles and permissions. By associating this Lambda function with CloudFront, we ensure that static sites can serve index.html pages from subdirectories seamlessly.

In the next post, we’ll take things a step further by setting up a custom domain and SSL certificate for our CloudFront distribution using Route 53 and AWS Certificate Manager. Stay tuned!
