Get ready to dive into the exciting world of serverless computing! In this project, we will build an automated, event-driven pipeline that processes images. We are leaving traditional servers behind and embracing an architecture that responds to events as they happen. The best part? We will use the AWS Serverless Application Model (SAM), a developer-friendly extension of CloudFormation that makes building and deploying serverless applications a breeze.
This project is the perfect way to get hands-on experience with a core serverless pattern. You will learn how services can work together without you needing to manage any underlying infrastructure. Let’s build something amazing!
Architecture Overview
So, what exactly are we building? Imagine an automated photo booth. Our architecture works just like one, with each AWS service playing a specific role in a chain reaction.
- The Upload: A user uploads a new image file to a designated S3 bucket. This is like putting your photo into the machine's slot. This S3 bucket is our "uploads" location.
- The Trigger: The moment the image arrives in the S3 bucket, it creates an event. This event is like the flash going off, signaling that there is a new picture to process.
- The Processor: This event automatically triggers an AWS Lambda function. Lambda is our serverless compute engine, the smart machine inside the booth. It runs our code without us needing to provision or manage any servers.
- The Action: The Lambda function's code will download the image, create a smaller thumbnail version of it, and upload that thumbnail to a second S3 bucket, our "thumbnails" location.
- The Record Keeping: After successfully creating the thumbnail, the function will write metadata about the process (like the original image's name, its size, and the location of its new thumbnail) into an Amazon DynamoDB table. This is like the booth printing a receipt with all the details of your photo session.
This entire flow is automated, scales with the number of uploads, and is cost-effective because you pay only for the compute time you use, billed by the millisecond.
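To make the trigger step concrete, here is a minimal sketch of how a function might pull the bucket name and object key out of the event that S3 sends. The helper name extract_uploads and the sample bucket and key are made up for illustration; the record layout follows the S3 notification format, trimmed to the fields used here.

```python
from urllib.parse import unquote_plus


def extract_uploads(event):
    """Return (bucket, key) pairs for every object-created record in an S3 event."""
    uploads = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue  # ignore anything that is not a new upload
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in notifications (a space arrives as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        uploads.append((bucket, key))
    return uploads


# Trimmed-down sample event; the bucket and key are invented for this example.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-uploads-bucket"},
                "object": {"key": "holiday+photos/beach+day.jpg", "size": 48213},
            },
        }
    ]
}

print(extract_uploads(sample_event))
```

Decoding the key matters in practice: without it, a file whose name contains spaces or special characters would be looked up under the wrong name.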
Globals and Transform
Every AWS SAM template begins with a couple of important declarations. Think of this as setting the ground rules for our project before we start building.
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31
    Description: A serverless image processing pipeline.
    Globals:
      Function:
        Timeout: 10 # Default timeout for all functions
        Runtime: python3.9
    Resources:
      # ... (Resources defined below)
The Transform: AWS::Serverless-2016-10-31 line is the magic wand. It is a mandatory declaration that tells AWS CloudFormation, "Hey, this isn't a regular template. It contains the simplified SAM syntax, so please process it accordingly." This transform unlocks all the special shortcuts that make SAM so powerful.
The Globals section is a handy feature for keeping your template clean and DRY (Don't Repeat Yourself). Anything you define here applies to every resource of that type in your template, unless a resource sets its own value for the same property. In this case, we are setting a default Timeout of 10 seconds and a Runtime of python3.9 for every AWS::Serverless::Function we define later. This saves us from typing the same lines over and over again.
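One way to picture how Globals interact with a function's own properties is as a dictionary merge in which the function's values win. The sketch below is only a simplified model for scalar properties such as Timeout and Runtime, not SAM's actual implementation (SAM merges maps and concatenates lists rather than replacing them). The function name and the Timeout override of 30 are hypothetical.

```python
def effective_function_properties(global_function, function_properties):
    """Start from the Globals values, then let the function's own values override them."""
    merged = dict(global_function)
    merged.update(function_properties)
    return merged


# Values from the Globals section of the template.
globals_function = {"Timeout": 10, "Runtime": "python3.9"}

# A function that keeps the global Runtime but sets its own (hypothetical) Timeout.
image_processor = {"Handler": "app.lambda_handler", "Timeout": 30}

print(effective_function_properties(globals_function, image_processor))
```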
The Storage Layer (S3 & DynamoDB)
Our pipeline needs places to store files and data. We will define two S3 buckets for our images and a DynamoDB table for our metadata.
    Resources:
      # S3 bucket for original image uploads
      UploadsBucket:
        Type: AWS::S3::Bucket
      # S3 bucket for resized thumbnails
      ThumbnailsBucket:
        Type: AWS::S3::Bucket
      # DynamoDB table to store image metadata
      ImageMetadataTable:
        Type: AWS::DynamoDB::Table
        Properties:
          AttributeDefinitions:
            - AttributeName: "ImageID"
              AttributeType: "S"
          KeySchema:
            - AttributeName: "ImageID"
              KeyType: "HASH"
          BillingMode: PAY_PER_REQUEST
This is quite straightforward. We declare two resources of type AWS::S3::Bucket, giving them logical names UploadsBucket and ThumbnailsBucket.
For our database, we declare an AWS::DynamoDB::Table. We define its schema with AttributeDefinitions (the attributes used in keys; DynamoDB does not require you to declare any other attributes up front) and a KeySchema (the primary key that uniquely identifies each item). Here, we use a single string attribute called ImageID as the partition key.
Notice the BillingMode: PAY_PER_REQUEST. This is a core benefit of serverless databases. Instead of paying for provisioned capacity you might not use, you pay only for the read and write operations you actually perform. This makes it incredibly cost effective for applications with new, spiky, or unpredictable traffic patterns.
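As a rough illustration of what one record in this table could look like, the sketch below builds an item for a processed image. Only ImageID comes from the template; using the object key as the ImageID, and the attribute names OriginalSizeBytes, ThumbnailLocation, and ProcessedAt, are assumptions made for this example. The save_metadata helper uses boto3's put_item call and is not run here, since it needs AWS credentials and a deployed table.

```python
from datetime import datetime, timezone


def build_metadata_item(image_key, size_bytes, thumbnails_bucket, thumbnail_key):
    """Build the item stored for one processed image."""
    return {
        "ImageID": image_key,  # partition key, the only attribute the table schema declares
        "OriginalSizeBytes": size_bytes,  # assumed attribute name
        "ThumbnailLocation": f"s3://{thumbnails_bucket}/{thumbnail_key}",  # assumed attribute name
        "ProcessedAt": datetime.now(timezone.utc).isoformat(),  # assumed attribute name
    }


def save_metadata(table_name, item):
    """Write the item to DynamoDB; each call is one billed write request."""
    import boto3  # imported lazily so the helper above works without AWS access

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)


item = build_metadata_item("beach day.jpg", 48213, "example-thumbnails-bucket", "beach day.jpg")
print(item["ImageID"], item["ThumbnailLocation"])
```

Because only the key attribute is declared in the template, extra attributes like these can be added or dropped later without changing the table definition.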
The Compute Layer (Lambda Function & IAM Role)
This is the heart of our pipeline, where the actual work gets done. Here we define our Lambda function and, more importantly, what triggers it and what permissions it has.
    Resources:
      # ... (Storage resources from above)
      # The Lambda function that processes the images
      ImageProcessorFunction:
        Type: AWS::Serverless::Function
        Properties:
          CodeUri: src/ # Points to a local folder with your Python code
          Handler: app.lambda_handler # The file and function to execute
          Policies:
            # These SAM Policies automatically generate the correct IAM permissions
            - S3ReadPolicy:
                BucketName: !Ref UploadsBucket
            - S3WritePolicy:
                BucketName: !Ref ThumbnailsBucket
            - DynamoDBCrudPolicy:
                TableName: !Ref ImageMetadataTable
          Environment:
            Variables:
              THUMBNAILS_BUCKET_NAME: !Ref ThumbnailsBucket
              TABLE_NAME: !Ref ImageMetadataTable
          Events:
            # This section defines what triggers the function
            FileUpload:
              Type: S3
              Properties:
                Bucket: !Ref UploadsBucket
                Events: s3:ObjectCreated:*
Let's unpack this powerful resource, AWS::Serverless::Function.
- CodeUri and Handler: These properties tell Lambda where to find your code. CodeUri points to the local directory (e.g., src/) containing your Python files, and Handler specifies the file and function to run (e.g., app.lambda_handler, meaning the lambda_handler function in app.py).
- Policies: This is a major superpower of SAM. Instead of manually crafting complex IAM policy documents, SAM provides Policy Templates. Here, we simply state that our function needs to read from the UploadsBucket, write to the ThumbnailsBucket, and have create, read, update, and delete permissions on our ImageMetadataTable. SAM expands each template into an IAM policy scoped to the named resource, which is simpler and faster than writing the policies by hand. Note that DynamoDBCrudPolicy grants more than this function strictly needs, since it only writes items; a narrower template such as DynamoDBWritePolicy would be closer to least privilege.
- Environment: This section allows you to pass variables to your function at runtime. We are telling our function the names of the thumbnail bucket and the DynamoDB table it needs to talk to.
- Events: This is the critical connection that makes our pipeline event-driven. We are defining a trigger named FileUpload. Its Type is S3, and we configure it to listen on our UploadsBucket for any s3:ObjectCreated:* events. This simple block is what creates the automatic trigger. Now, anytime a new file is uploaded, AWS will automatically invoke our ImageProcessorFunction and pass it information about the event. ✨
One caveat: the S3 event makes UploadsBucket depend on the function, while S3ReadPolicy with !Ref UploadsBucket makes the function's role depend on the bucket. CloudFormation may reject this combination with a circular dependency error. A common workaround is to give the uploads bucket an explicit BucketName and build the policy's BucketName from that same string instead of using !Ref.
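To show how the pieces of this resource meet in code, here is a minimal sketch of what src/app.py could look like. It is not the author's implementation: the process_record and make_thumbnail helpers, the 128-pixel thumbnail size, the reuse of the original key in the thumbnails bucket, and the ThumbnailKey attribute are all assumptions. It also assumes Pillow is listed in src/requirements.txt so that sam build bundles it; boto3 is already available in the Lambda Python runtime.

```python
import io
import os
from urllib.parse import unquote_plus

THUMBNAIL_SIZE = (128, 128)  # assumed maximum width and height in pixels


def make_thumbnail(image_bytes):
    """Shrink an image with Pillow and return the encoded bytes."""
    from PIL import Image  # imported lazily; Pillow must be packaged with the function

    with Image.open(io.BytesIO(image_bytes)) as img:
        image_format = img.format or "PNG"
        img.thumbnail(THUMBNAIL_SIZE)  # resizes in place and keeps the aspect ratio
        out = io.BytesIO()
        img.save(out, format=image_format)
    return out.getvalue()


def process_record(record, s3, table, thumbnails_bucket, resize=make_thumbnail):
    """Handle one S3 record: download, resize, upload, then record metadata."""
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded

    original = s3.get_object(Bucket=bucket, Key=key)["Body"].read()  # needs S3ReadPolicy
    s3.put_object(Bucket=thumbnails_bucket, Key=key, Body=resize(original))  # needs S3WritePolicy

    item = {"ImageID": key, "ThumbnailKey": key}
    table.put_item(Item=item)  # needs write access to the table
    return item


def lambda_handler(event, context):
    """Entry point named by Handler: app.lambda_handler."""
    import boto3

    s3 = boto3.client("s3")
    # Both names come from the Environment section of the template.
    table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])
    thumbnails_bucket = os.environ["THUMBNAILS_BUCKET_NAME"]
    return [process_record(r, s3, table, thumbnails_bucket) for r in event.get("Records", [])]
```

Passing the S3 client, the table, and the resize function into process_record is a design choice made for this sketch: it lets you exercise the flow locally with simple stand-in objects before deploying anything.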
Deployment Steps (SAM CLI)
The AWS SAM CLI is your command line tool for building, testing, and deploying your serverless applications. It simplifies the entire workflow into a few easy commands.
- Install: First, make sure you have the AWS SAM CLI installed on your machine.
- Build: Navigate to your project's root directory in your terminal and run the sam build command. This command looks at your template.yml, finds your function code, pulls down any dependencies, and packages everything into a format that is ready for deployment.
      sam build
- Deploy: Now, deploy your application to the AWS cloud. The --guided flag makes this super easy for the first time, as it will walk you through a series of prompts to configure your deployment, such as the stack name and AWS region.
      sam deploy --guided
- Clean Up: When you are finished with your project and want to avoid any future costs, you can tear down the entire stack and all of its resources with a single command.
      sam delete
Congratulations! You have just designed and deployed an event-driven, serverless application using the power and simplicity of AWS SAM. You are well on your way to mastering modern cloud architecture.