Automated AWS Reporting: SQS And Lambda

Alex Johnson

In this article, we'll explore how to build an automated report generator on AWS using SQS (Simple Queue Service) and Lambda. This setup allows you to process data, generate reports, and store them efficiently in S3, making them accessible through CloudFront. This project focuses on infrastructure setup rather than code implementation, providing a solid foundation for your reporting needs. Let's dive into the details!

Project Overview

Our system is designed around an AWS Lambda function triggered by an AWS SQS queue. This queue receives JSON data, which our Lambda function uses to generate static report websites. These reports are then stored in AWS S3 for easy access and distribution. You can implement all operations using the AWS UI or the AWS CLI, whichever you prefer.
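
To make the "JSON in, static page out" idea concrete, here is a minimal sketch of how a payload could be turned into an HTML report. It assumes the payload carries a reportTitle string and a data array of flat objects, as in the sample message shown later in this article. The function names are hypothetical and the real project may render its pages differently.

```javascript
// Hypothetical helper: turns a report payload into a single static HTML page.
// Assumes payload.reportTitle is a string and payload.data is an array of flat objects.

// Escape text so values from the payload cannot inject markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderReportHtml(payload) {
  const rows = payload.data;
  // Use the keys of the first row as the table columns.
  const columns = rows.length > 0 ? Object.keys(rows[0]) : [];
  const head = columns.map((c) => `<th>${escapeHtml(c)}</th>`).join("");
  const body = rows
    .map((row) => {
      const cells = columns.map((c) => `<td>${escapeHtml(row[c])}</td>`).join("");
      return `<tr>${cells}</tr>`;
    })
    .join("\n");
  return [
    "<!doctype html>",
    `<html><head><meta charset="utf-8"><title>${escapeHtml(payload.reportTitle)}</title></head>`,
    `<body><h1>${escapeHtml(payload.reportTitle)}</h1>`,
    `<table><thead><tr>${head}</tr></thead><tbody>`,
    body,
    "</tbody></table></body></html>",
  ].join("\n");
}
```

Escaping every value matters here because the queue may carry data that originated from end users.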

To give you a glimpse of what the Lambda function produces, here are some examples:

  • Generated website report (GIF quality might be reduced)

    [GIF: generated website report]

  • Generated Excel file (GIF quality might be reduced)

    [GIF: generated XLSX file]

AWS Services Workflow - An Overview

The Big Picture

At a high level, here’s how our system works:

  1. An SQS queue receives messages containing JSON payloads.
  2. These messages can be sent via an API, AWS API Gateway, AWS CLI, AWS SDK, or other methods.
  3. The queue then triggers an AWS Lambda function.
  4. The Lambda function processes the JSON data and generates both a static website report and an XLSX file.
  5. The static website is stored in an AWS S3 bucket.
  6. Finally, the website files stored in S3 are served via CloudFront, ensuring low latency and edge caching.
  7. (Optional) You can integrate with AWS SES to automate sending reports to users via email.
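
Steps 3 to 5 can be sketched as a small Lambda entry point. This is only an illustration under stated assumptions: the deps object, its renderReport and putObject functions, and the reports/MESSAGE_ID/index.html key layout are all hypothetical. Only event.Records, record.body, and record.messageId come from the shape of an SQS-triggered Lambda event.

```javascript
// Sketch of the Lambda entry point. `deps` is a hypothetical object that bundles
// the report renderer and the S3 uploader, so the flow can run without AWS access.
function createHandler(deps) {
  return async function handler(event) {
    const uploadedKeys = [];
    // An SQS-triggered invocation delivers one or more messages in event.Records.
    for (const record of event.Records) {
      const payload = JSON.parse(record.body); // the JSON that was sent to the queue
      const html = deps.renderReport(payload);
      // Hypothetical key layout: one folder per SQS message ID.
      const key = `reports/${record.messageId}/index.html`;
      await deps.putObject({ key, body: html, contentType: "text/html" });
      uploadedKeys.push(key);
    }
    return { uploadedKeys };
  };
}
```

Injecting the renderer and uploader keeps the handler easy to exercise locally. In a deployed function, putObject would wrap a real S3 client call.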

What are AWS Lambda, SQS, and S3?

It’s essential to understand the core services we’re using:

  • AWS Lambda: This is an event-driven, serverless Function-as-a-Service (FaaS) provided by AWS. You only pay for the compute time you consume, making it highly cost-effective. However, there are limits, such as a 15-minute (900 seconds) maximum runtime and configurable memory up to 10 GB. These limits are subject to change, so always check the official AWS website for the latest details. There are multiple ways to package and manage dependencies for Lambda functions.
    • Lambda Layers: These are great for sharing code across multiple functions and for shrinking each function's deployment package. Contrary to a common misconception, they don't reduce cold start times. If cold starts are a problem, options such as provisioned concurrency address them directly.
    • Docker Images: These offer more control over the runtime environment and support larger dependency sets (up to 10 GB per image, compared with 250 MB unzipped for a .zip package including its layers). Note that functions deployed as container images can't use Lambda Layers. Their dependencies are built into the image, which is stored in Amazon ECR.
  • AWS SQS: This is a message queuing service that enables decoupled communication between components. It ensures that messages are processed reliably, even if parts of your system are temporarily unavailable.
  • AWS S3: This is an object storage service for storing and retrieving data. It's highly scalable, durable, and cost-effective, making it ideal for storing our generated reports.
  • AWS CloudFront: A content delivery network (CDN) that distributes your content with low latency. It caches content at edge locations worldwide, ensuring fast access for users regardless of their location.

Hands-On: How It's Developed

Now, let’s walk through the steps to set up our automated report generator.

1. Install AWS CLI and Connect to AWS

First, you'll need the AWS Command Line Interface (CLI) to interact with AWS services from your terminal:

  • Install: brew install awscli (if you're using macOS with Homebrew)
  • Check Installation: aws --version
  • Connect to AWS: aws configure (This will prompt you for your AWS Access Key ID, Secret Access Key, Region, and Output format.)
  • Verify Connection: aws s3 ls (This command lists your S3 buckets and confirms that you're connected to AWS.)

For more information on the AWS CLI, check out the AWS CLI User Guide.

In the following sections, we'll assume you're logged into the AWS console and have the AWS CLI configured.

2. Create a Lambda Function - AWS User Interface

Let's start by creating our Lambda function. This function will process the data from SQS and generate the reports.

  1. Go to the AWS Lambda page.
  2. Click Create function.
  3. Choose Author from scratch.
  4. Select Node.js 22.x as the runtime.
  5. Give your function a name (e.g., report-generator-lambda).
  6. Under Permissions, select Create a new role with basic Lambda permissions (or choose an existing role that you created for this function).
  7. Click Create function.
  8. Once the function is created, navigate to Configuration > Environment variables and set the following environment variable:
    • S3_BUCKET_NAME: The name of your S3 bucket (we'll create this in the next step).
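
Inside the function, the variable can be read from process.env. A minimal sketch follows, assuming you want the function to fail fast when the variable is missing. The helper name getBucketName is hypothetical.

```javascript
// Reads the bucket name from the environment and fails fast when it is missing,
// so a misconfigured function reports a clear error instead of a failed upload.
function getBucketName(env = process.env) {
  const name = env.S3_BUCKET_NAME;
  if (!name || name.trim() === "") {
    throw new Error("S3_BUCKET_NAME environment variable is not set");
  }
  return name.trim();
}
```

Passing env as a parameter is only there to make the helper easy to test. In Lambda it defaults to process.env.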

3. Create an S3 Bucket - AWS User Interface

Next, we'll create an S3 bucket to store our generated reports. S3 provides a scalable and cost-effective storage solution.

  1. Open the AWS S3 page.
  2. Click Create bucket.
  3. Enter a unique bucket name (e.g., report-generator-bucket).
  4. Choose the same region as your Lambda Function.
  5. Configure options as needed (versioning, encryption, etc.).
  6. Click Create bucket.

It’s crucial to keep your bucket private since we’ll use CloudFront to serve the content, eliminating the need to make the bucket publicly accessible.

4. Configure CloudFront to Serve AWS S3 Bucket Content

CloudFront will help us distribute our reports with low latency and high availability.

4.1. Create a CloudFront Distribution

  1. Open the AWS CloudFront page.
  2. Click Create Distribution.
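  3. Under Origin, choose your S3 bucket. Because the bucket is private, enable origin access control (OAC) for this origin and apply the bucket policy that CloudFront suggests, so that only your distribution can read the objects.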
  4. Under Default Cache Behavior Settings, find Origin request policy and change it to CORS-S3Origin.
  5. Click Create Distribution.

The CORS-S3Origin policy forwards the CORS-related request headers (Origin, Access-Control-Request-Method, and Access-Control-Request-Headers) to S3, allowing browsers to load your files correctly.

4.2. Configure S3 CORS

CORS settings on the bucket let browsers make cross-origin requests for your report files (for example, when a page on another domain fetches them through CloudFront).

  1. Open the AWS S3 page.
  2. Select your bucket.
  3. Go to the Permissions tab.
  4. Click Edit under Cross-origin resource sharing (CORS).
  5. Add allowed origins, methods (GET, HEAD), and headers to match your CloudFront settings.
  6. Save the changes.

Here’s an example CORS configuration:

[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET", "HEAD"],
    "AllowedOrigins": ["https://YOUR_CLOUDFRONT_DOMAIN.cloudfront.net"],
    "ExposeHeaders": []
  }
]

Replace YOUR_CLOUDFRONT_DOMAIN.cloudfront.net with your actual CloudFront domain.

4.3. (Optional) Configure Custom Cache Policy

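For optimal performance, you can configure a custom cache policy with a longer TTL (Time-To-Live). Long TTLs suit files that never change after upload. If you overwrite a report under the same path, you'll need a CloudFront invalidation before viewers see the new version.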

  1. Go to your CloudFront distribution.
  2. Click Behaviors.
  3. Select Cache policy and choose Create policy.
  4. Configure the settings and set the TTL values to the maximum (e.g., 10 years).
  5. Add the policy to your distribution and save.
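
CloudFront expresses TTLs in seconds, so "10 years" has to be converted. The sketch below builds an object that mirrors the field names of CloudFront's CachePolicyConfig structure as I understand it. Treat the shape as an assumption and verify it against the API reference before using it. The policy name is hypothetical, and leaving MinTTL at 0 is a choice made in this sketch, not something the steps above prescribe.

```javascript
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60; // 31,536,000 seconds

// Builds a cache policy configuration with long default and maximum TTLs.
function buildCachePolicyConfig(name, years) {
  const ttl = years * SECONDS_PER_YEAR;
  return {
    Name: name,
    MinTTL: 0, // assumption: still allow an object's Cache-Control header to shorten caching
    DefaultTTL: ttl,
    MaxTTL: ttl,
    ParametersInCacheKeyAndForwardedToOrigin: {
      EnableAcceptEncodingGzip: true,
      EnableAcceptEncodingBrotli: true,
      // Static report files do not vary by header, cookie, or query string.
      HeadersConfig: { HeaderBehavior: "none" },
      CookiesConfig: { CookieBehavior: "none" },
      QueryStringsConfig: { QueryStringBehavior: "none" },
    },
  };
}

console.log(buildCachePolicyConfig("reports-long-ttl", 10).MaxTTL);
```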

5. Set Up AWS S3 Bucket Role-Based Access Permissions for AWS Lambda - AWS UI

To allow your Lambda function to write reports to your S3 bucket, you need to configure the appropriate IAM permissions.

  1. Open the AWS IAM page.
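  2. Find the IAM role associated with your Lambda function. Roles created by the Lambda console are usually named after the function (for example, report-generator-lambda-role-xxxxxxxx), and the exact name is shown on the function's Configuration > Permissions tab.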
  3. Click on the role name to open it.
  4. Click Add permissions and select Attach policies.
  5. Select Create policy.
  6. Choose the JSON tab and paste the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME_HERE/*"
    }
  ]
}

Replace YOUR_BUCKET_NAME_HERE with your actual bucket name.

  7. Click Next: Tags (you can skip tags).
  8. Click Next: Review.
  9. Give your policy a name (e.g., lambda-s3-access-policy) and click Create policy.
  10. Back on the Attach policies screen, search for your newly created policy and select it.
  11. Click Attach policies.
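
With s3:PutObject granted, the function can write report files. The sketch below builds the parameters for one upload, using the Bucket, Key, Body, and ContentType fields accepted by the S3 PutObject operation. The extension-to-content-type table and the helper name are assumptions for illustration. Setting ContentType matters because browsers rely on it when CloudFront serves the file.

```javascript
// Hypothetical mapping from file extension to the Content-Type stored with the object.
const CONTENT_TYPES = {
  ".html": "text/html; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".json": "application/json",
  ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
};

// Builds the input object for a single S3 PutObject call.
function buildPutObjectInput(bucket, key, body) {
  const dot = key.lastIndexOf(".");
  const extension = dot === -1 ? "" : key.slice(dot).toLowerCase();
  return {
    Bucket: bucket,
    Key: key,
    Body: body,
    // Fall back to a generic binary type for unknown extensions.
    ContentType: CONTENT_TYPES[extension] || "application/octet-stream",
  };
}
```

With the AWS SDK for JavaScript v3 available, an object like this can be passed to PutObjectCommand from @aws-sdk/client-s3.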

6. Create an SQS Queue - AWS User Interface

Now, let’s create the SQS queue that will trigger our Lambda function.

  1. Open the AWS SQS page.
  2. Click Create queue.
  3. Choose the Standard queue type.
  4. Enter a queue name (e.g., report-generation-queue).
  5. Under Access policy, configure the following:
    • Set Send messages permission to your AWS account ID.
    • Set Receive messages permission to the ARN of your Lambda IAM role (you can find this in the IAM console).
  6. Click Create queue.

7. Add a Lambda Trigger to the SQS Queue

To automatically trigger the Lambda function when a message is added to the queue, we need to configure a trigger.

7.1. Configure IAM Permissions

First, ensure that the Lambda IAM role has the necessary permissions to receive messages from the SQS queue.

  1. Open the AWS IAM page.
  2. Find the IAM role associated with your Lambda function.
  3. Click Add permissions and select Attach policies.
  4. Select Create policy.
  5. Choose the JSON tab and paste the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:GetQueueAttributes",
        "sqs:DeleteMessage"
      ],
      "Resource": ["arn:aws:sqs:REGION:ACCOUNT_ID:QUEUE_NAME"]
    }
  ]
}

Replace REGION, ACCOUNT_ID, and QUEUE_NAME with your actual values.

  6. Click Next: Tags (you can skip tags).
  7. Click Next: Review.
  8. Give your policy a name (e.g., lambda-sqs-access-policy) and click Create policy.
  9. Back on the Attach policies screen, search for your newly created policy and select it.
  10. Click Attach policies.
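
The queue ARN used in the policy follows directly from the queue URL shown in the SQS console. A small sketch of that conversion is below. It assumes the standard URL form https://sqs.REGION.amazonaws.com/ACCOUNT_ID/QUEUE_NAME and the standard aws partition. The function name is hypothetical.

```javascript
// Derives arn:aws:sqs:REGION:ACCOUNT_ID:QUEUE_NAME from a queue URL.
function queueArnFromUrl(queueUrl) {
  const url = new URL(queueUrl);
  // The region is the second label of the host name, e.g. sqs.us-east-1.amazonaws.com.
  const hostMatch = /^sqs\.([a-z0-9-]+)\.amazonaws\.com$/.exec(url.hostname);
  // The path holds the account ID followed by the queue name.
  const pathParts = url.pathname.split("/").filter(Boolean);
  if (!hostMatch || pathParts.length !== 2) {
    throw new Error(`Unrecognized SQS queue URL: ${queueUrl}`);
  }
  const [accountId, queueName] = pathParts;
  return `arn:aws:sqs:${hostMatch[1]}:${accountId}:${queueName}`;
}

console.log(
  queueArnFromUrl("https://sqs.us-east-1.amazonaws.com/123456789012/report-generation-queue")
);
```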

7.2. Configure the SQS Trigger

Now, let's add the trigger to the SQS queue.

  1. Open your SQS queue in the AWS SQS console.
  2. Go to the Lambda triggers tab.
  3. Click Add trigger.
  4. Choose your Lambda function.
  5. Click Save.

8. Create a Lambda Layer for Code Reuse and Smaller Deployments - AWS UI

Lambda Layers allow you to package dependencies separately, promoting code reuse across multiple Lambda functions and reducing deployment package sizes. This is particularly useful for Node.js projects with many dependencies.

8.1. Create the Dependency Layer

  1. Open the AWS Lambda page.
  2. Go to Layers.
  3. Click Create layer.
  4. Enter a name for your layer (e.g., dependencies-layer).
  5. Upload a .zip file containing your dependencies.
  6. Select the Node.js 22.x runtime.
  7. Choose the x86_64 architecture.
  8. Click Create.

The structure of the .zip file is crucial. The layer must have a nodejs/node_modules/ structure at the root of the package for Lambda to find the dependencies.
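
A quick way to catch a wrongly structured archive before uploading it is to check the entry paths. The sketch below assumes you already have the list of paths inside layer.zip (for instance from a zip listing tool) and flags anything outside nodejs/node_modules/. The function name and the some-lib package are hypothetical.

```javascript
// Returns the archive entries that are NOT under nodejs/node_modules/.
// An empty result means the layout matches what is described above.
function findEntriesOutsideNodeModules(entryPaths) {
  const requiredPrefix = "nodejs/node_modules/";
  return entryPaths.filter((entry) => {
    // Directory entries leading to the prefix itself are fine.
    if (entry === "nodejs/" || entry === requiredPrefix) return false;
    return !entry.startsWith(requiredPrefix);
  });
}

// Example: the second archive was zipped from inside the nodejs folder by mistake.
console.log(findEntriesOutsideNodeModules(["nodejs/node_modules/some-lib/index.js"]));
console.log(findEntriesOutsideNodeModules(["node_modules/some-lib/index.js"]));
```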

In this project, layer.zip is produced by an npm script, so run it before the upload in step 5:

npm run create:layer:x64 # Installs Linux x64 production dependencies and creates layer.zip

8.2. Attach the Layer to Your Lambda Function

  1. Go to your Lambda function.
  2. Click Layers.
  3. Click Add a layer.
  4. Select Choose from layers.
  5. Choose your dependencies layer.
  6. Click Add.

9. Upload Your Code to Lambda Using the AWS CLI

Finally, let’s upload your Lambda function code using the AWS CLI.

  1. Compress your Lambda code into a zip file:

    npm run build:zip
    
  2. Upload the zip file to Lambda:

    aws lambda update-function-code \
      --function-name YOUR_LAMBDA_FUNCTION_NAME \
      --zip-file fileb://function.zip
    

Replace YOUR_LAMBDA_FUNCTION_NAME with your function name.

It’s important that the file structure in function.zip has index.js at the root level, because the default handler setting for a Node.js function is index.handler:

function.zip
└── index.js

Congratulations! You’ve now set up your static report generator.

Using SQS to Trigger Lambda

Let’s test our setup by sending messages to the SQS queue.

Sending Messages

You can send messages to your SQS queue using the AWS CLI:

Using a File:

aws sqs send-message \
  --queue-url https://sqs.REGION.amazonaws.com/ACCOUNT_ID/QUEUE_NAME \
  --message-body file://your-file.json

Replace REGION, ACCOUNT_ID, and QUEUE_NAME with your actual values.

Message Format:

The message body should be in JSON format. For example:

{
  "data": [
    {
      "id": "001",
      "name": "Alice Johnson",
      "value": 1250.5,
      "category": "Sales",
      "date": "2024-01-15"
    }
  ],
  "reportTitle": "Monthly Performance Report"
}
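
Since anything can be written to a queue, it helps to validate the body before generating a report. The sketch below checks only the two top-level fields visible in the example above (reportTitle and data). The function name and the exact rules are assumptions, not the project's actual validation.

```javascript
// Validates a raw SQS message body against the shape shown above.
// Returns { ok, errors, payload } instead of throwing, so callers can log all problems.
function validateReportMessage(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch (err) {
    return { ok: false, errors: ["body is not valid JSON"] };
  }
  if (payload === null || typeof payload !== "object" || Array.isArray(payload)) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const errors = [];
  if (typeof payload.reportTitle !== "string" || payload.reportTitle === "") {
    errors.push("reportTitle must be a non-empty string");
  }
  if (!Array.isArray(payload.data)) {
    errors.push("data must be an array");
  } else {
    payload.data.forEach((row, i) => {
      if (row === null || typeof row !== "object" || Array.isArray(row)) {
        errors.push(`data[${i}] must be an object`);
      }
    });
  }
  return { ok: errors.length === 0, errors, payload };
}
```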

Remember, you can also trigger this process through API Gateway or programmatically, by sending messages to the SQS queue from one of your own APIs.
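
For the programmatic route, the request has the same two parts as the CLI call: a queue URL and a JSON string as the body. Here is a minimal sketch that builds those parameters with the QueueUrl and MessageBody fields of the SQS SendMessage operation. The helper name and its argument layout are hypothetical.

```javascript
// Builds the input for an SQS SendMessage call from queue coordinates and a payload.
function buildSendMessageInput({ region, accountId, queueName }, payload) {
  return {
    QueueUrl: `https://sqs.${region}.amazonaws.com/${accountId}/${queueName}`,
    // SQS message bodies are strings, so the report payload is serialized here.
    MessageBody: JSON.stringify(payload),
  };
}

const input = buildSendMessageInput(
  { region: "us-east-1", accountId: "123456789012", queueName: "report-generation-queue" },
  { reportTitle: "Monthly Performance Report", data: [] }
);
console.log(input.QueueUrl);
```

With the AWS SDK for JavaScript v3 installed, an object like this can be passed to SendMessageCommand from @aws-sdk/client-sqs.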

Lambda Function Verification

To verify that your Lambda function is working correctly, check the logs and S3 bucket contents.

Checking Logs

Check the recent Lambda function executions:

aws logs tail /aws/lambda/YOUR_LAMBDA_FUNCTION_NAME --since 5m

Replace YOUR_LAMBDA_FUNCTION_NAME with your function name.

Checking S3 Bucket Contents

Verify that the reports are being generated in your S3 bucket:

aws s3 ls s3://YOUR_BUCKET_NAME_HERE/ --recursive

Replace YOUR_BUCKET_NAME_HERE with your bucket name.

Checking CloudFront Distribution

You can also check the CloudFront distribution:

aws cloudfront get-distribution --id YOUR_DISTRIBUTION_ID

Replace YOUR_DISTRIBUTION_ID with your distribution ID.

Alternatively, you can view all these details in the AWS console.
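
To open a generated report in a browser, combine the distribution's domain name with an object key from the S3 listing. A small sketch follows. The function name, the example domain, and the example key are hypothetical.

```javascript
// Builds the public CloudFront URL for an S3 object key.
// Each path segment is encoded separately so that "/" separators are preserved.
function reportUrl(distributionDomain, objectKey) {
  const encodedKey = objectKey.split("/").map(encodeURIComponent).join("/");
  return `https://${distributionDomain}/${encodedKey}`;
}

console.log(reportUrl("d111111abcdef8.cloudfront.net", "reports/2024 01/index.html"));
```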

Important Considerations

This setup is a common requirement for many companies, but some adjustments might be necessary depending on your specific needs:

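  1. Configure retry behavior on the SQS queue: set the visibility timeout to at least six times the Lambda function's timeout (the value AWS recommends for Lambda event sources), so a message isn't redelivered while it's still being processed.

  2. Add a DLQ (Dead-Letter Queue) to the source SQS queue through a redrive policy, so that messages exceeding maxReceiveCount are moved aside instead of being retried indefinitely. A DLQ configured on the Lambda function itself applies only to asynchronous invocations, not to SQS triggers.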

  3. Use CloudWatch for monitoring and logging to track errors, execution times, and performance metrics.

  4. Consider scalability to ensure SQS and Lambda can handle bursts of messages efficiently.

    • Avoid bottlenecks, such as SQS filling up faster than Lambda can process messages (queue bottleneck).
    • Ensure Lambda concurrency is sufficient to avoid processing bottlenecks.
  5. Establish a CI/CD pipeline for automated testing, easy rollbacks, and other best practices.
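
One failure-handling detail is worth a sketch: by default, if the handler throws while processing a batch, every message in that batch returns to the queue. Lambda's partial batch response lets the function report only the failed messages. The example below assumes ReportBatchItemFailures is enabled on the SQS trigger. The processRecord callback is a hypothetical stand-in for the report generation logic.

```javascript
// Processes each SQS record independently and reports only the failures,
// so successfully handled messages are not retried.
async function processBatch(event, processRecord) {
  const batchItemFailures = [];
  for (const record of event.Records) {
    try {
      await processRecord(JSON.parse(record.body));
    } catch (err) {
      // Only this message becomes visible on the queue again for a retry.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  // This return shape is what Lambda expects for partial batch responses.
  return { batchItemFailures };
}
```

Messages that keep failing are then retried until the queue's redrive policy moves them to the dead-letter queue.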

Conclusion

Setting up an automated report generator with AWS SQS and Lambda is a powerful way to streamline data processing and reporting. By leveraging these serverless services, you can build a scalable, cost-effective, and reliable system. Remember to consider the best practices mentioned above to ensure your setup is robust and efficient.

For further reading on AWS services and best practices, check out the AWS Documentation.
