Sharing notes from my ongoing learning journey — what I build, break and understand along the way.
AWS Serverless Guide: From Basics to Advanced
AWS Serverless Guide – From Zero to Advanced
To support my journey as a system administrator and to lay solid foundations in the security field, I’ve gathered my notes from an AWS Lambda training I completed. I wanted this write-up to cover both the basics and the finer details, so that you can understand the core concepts while also picking up tips that make a difference in real-world scenarios.
What exactly is “Serverless”?
- Are there no servers? There are — but you don’t manage them. OS patching, scaling, capacity planning are all AWS’s job.
- Billing model: Pay only for what you use (invocation time, request count, data transfer). Almost zero cost when idle.
- Core building blocks:
- Lambda (run code)
- API Gateway / ALB (entry point)
- S3 (object storage), DynamoDB (NoSQL), Aurora Serverless (RDS without server mgmt)
- SQS/SNS/EventBridge (events & messaging)
- Step Functions (orchestration)
- Cognito (identity)
- CloudFront + Lambda@Edge (CDN + edge logic)
- AppSync (GraphQL)
- Kinesis/MSK (streaming data)
Where it shines: Event-driven architectures, background jobs, APIs, ETL, integrations, mobile backends, low/variable traffic.
When to reconsider: Long-lived TCP connections, custom kernel modules, sustained high CPU/GPU — maybe go with Fargate/EKS/EC2.
1) The big picture (architecture map)
[Client/Web/Mobile]
|
CloudFront ──(Lambda@Edge, WAF)
|
API Gateway (REST/HTTP/WebSocket) ──> Lambda ──> DynamoDB
| | \
| | -> S3
| -> EventBridge → Step Functions → (Lambda/SageMaker/SQS...)
|
Cognito (AuthN/AuthZ)
Side channels:
S3 → Event → Lambda/SNS/SQS
SQS <→ Lambda (polling)
SNS → fan-out (SQS, Lambda, HTTP)
CloudWatch (logs/metrics/alarms), X-Ray (tracing)
Secrets Manager / Param Store (secrets/config)
2) Lambda: how it works at its core
- Runtimes: Node.js, Python, Go, Java, .NET, custom.
- Time limit: Up to 15 minutes per invocation.
- Memory: 128 MB–10+ GB (CPU scales with memory).
- Concurrency: Number of simultaneous copies running. AWS scales automatically; use reserved concurrency to cap it if needed.
- Start types:
- Cold start: New container setup (code download, runtime init).
- Warm start: Reusing an existing container → much faster.
- VPC? Putting Lambda inside a VPC can add latency due to ENI setup (less of an issue with AWS improvements, but still measurable).

Fine-tuning tips:
- Initialize SDK clients outside the handler (reuse on warm starts).
- Trim dependencies, keep deployment package small.
- Use Provisioned Concurrency for critical endpoints to remove cold start delays.
- Idempotency: Avoid duplicate processing on retries (idempotency keys + DynamoDB conditional writes).
Basic handler (Node.js)
// index.mjs
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
const ddb = new DynamoDBClient({});
export const handler = async (event) => {
const id = crypto.randomUUID();
await ddb.send(new PutItemCommand({
TableName: process.env.TABLE,
Item: { pk: { S: id }, ts: { N: String(Date.now()) } }
}));
return { statusCode: 200, body: JSON.stringify({ ok: true, id }) };
};
3) API Gateway: your single door in
- HTTP API (cheaper, simpler) and REST API (feature-rich)
- Auth: Cognito JWT, Lambda Authorizer, IAM permissions.
- Caching: REST API offers built-in caching (extra cost).
- Sync trigger: Invokes Lambda and returns its result.
- WebSocket: Real-time messaging.
Best practices:
- For common GETs, use CloudFront + S3 + Lambda@Edge/CloudFront Functions.
- Apply rate limits / usage plans.
- Use custom domain + ACM certs.
- Clean up error responses with mapping templates.
4) Storage: S3 and DynamoDB (serverless backbone)
S3
- File/object storage, data lake, static site hosting.
- Event triggers: Put/Delete → Lambda/SNS/SQS.
- Lifecycle transitions: Standard → IA → Glacier.
- Pre-signed URL for secure uploads/downloads.
DynamoDB
- Fully managed NoSQL. Single-digit ms latency, auto-scaling.
- Partition key (and optional sort key) design is crucial!
- On-demand billing is great for starters.
- GSI/LSI: Alternate query paths.
- Streams: Capture changes and process with Lambda.
- TTL: Auto-delete expired items.
- Transactions: ACID across items.
Design tips:
- Model based on your access patterns; prefer single-table design.
- Store large blobs in S3, metadata in DynamoDB.
- Avoid hot partitions (spread keys evenly).
5) Messaging & events: SQS, SNS, EventBridge
- SNS (push / fan-out): Broadcast one event to many subscribers.
- SQS (queue / pull): Buffer jobs for consumers, absorb traffic spikes. FIFO for ordering/deduplication.
- EventBridge: Service bus with pattern-based routing, cron/schedule, SaaS integrations.
Example pattern:
- S3 → EventBridge → Rule filters → SQS queue → Lambda consumes via polling.
- Failed jobs go to a DLQ after 3 retries with exponential backoff.
6) Step Functions: orchestration (retry, parallelism, human steps)
- Standard: Long-lived, durable workflows (months).
- Express: High-volume, low-latency (seconds).
- 200+ service integrations (do S3 copy, DynamoDB write, SageMaker call… without writing Lambda).
- Visual workflow, retry/catch, parallel/Map, Wait, Choice states.
Simple example (JSON):
{
"StartAt": "Validate",
"States": {
"Validate": {
"Type": "Task",
"Resource": "arn:aws:lambda:REGION:ACCT:function:validate",
"Next": "Process",
"Retry": [{ "ErrorEquals": ["States.ALL"], "MaxAttempts": 2, "BackoffRate": 2 }]
},
"Process": {
"Type": "Parallel",
"Branches": [
{ "StartAt": "Resize", "States": { "Resize": { "Type": "Task", "Resource": "arn:...:resize", "End": true } } },
{ "StartAt": "Index", "States": { "Index": { "Type": "Task", "Resource": "arn:...:index", "End": true } } }
],
"End": true
}
}
}
7) Identity & permissions: Cognito, IAM, resource-based policy
- Cognito User Pool: User registration, MFA, social login.
- Identity Pool: Gives federated identities temporary AWS creds.
- IAM execution role: Grants Lambda access to other AWS services.
- Resource-based policy: Controls who can invoke the function.
JWT-protected API (summary):
- Client gets token from Cognito.
- API Gateway’s JWT authorizer verifies it.
- Lambda gets user ID/claims via headers.
8) Edge logic: CloudFront + Lambda@Edge / CloudFront Functions
- Geo-based routing, header rewrites, A/B testing, bot blocking.
- Runs close to the user → low latency.
- CloudFront Functions (V8 runtime) are ultra-fast for lightweight tasks; Lambda@Edge is more powerful.
9) Observability: CloudWatch, Logs, Metrics, Alarms, X-Ray
- CloudWatch Logs:
console.log/print
output lives here; set retention to control costs. - Metrics/Alarms: p95 latency, error rate, throttles → set alerts.
- X-Ray tracing: See call chains, find slow steps.
- Use Embedded Metrics Format for custom app metrics.
10) Security essentials
- Least privilege IAM (only grant what’s needed).
- Secrets Manager / Parameter Store for secrets; never hard-code.
- VPC endpoints for private traffic.
- KMS encryption (S3/DynamoDB/Logs).
- WAF + Shield for L7 protection.
- Schema validation at API Gateway or Lambda entry.
- Auditing: CloudTrail, Config for change tracking.
11) Cost awareness (watch out for these)
- Lambda: invocations + duration(ms) × memory = GB-seconds.
- API Gateway REST is pricier; prefer HTTP API if possible.
- DynamoDB on-demand is easy but at scale, provisioned + autoscaling may be cheaper.
- CloudWatch Logs retention: reduce; lower log levels in prod.
- S3 Lifecycle: move to colder tiers.
- NAT Gateway data egress is costly; consider PrivateLink/VPC endpoints.
Rough calc example:
- 1M/day, 50ms, 512MB → ~0.5 GB-s / 1M → a few dozen USD/month.
- Surprise costs often come from data transfer & CloudWatch Logs.
12) Deployment & local dev: SAM, Serverless Framework, CDK
- AWS SAM: CloudFormation-based, local emulation.
- CDK: Infrastructure as code in TypeScript/Python.
- Serverless Framework: Multi-cloud, plugin ecosystem.
Minimal SAM example (template.yaml):
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
Api:
Type: AWS::Serverless::Api
Properties:
StageName: prod
EndpointConfiguration: REGIONAL
Table:
Type: AWS::DynamoDB::Table
Properties:
BillingMode: PAY_PER_REQUEST
AttributeDefinitions:
- AttributeName: pk
AttributeType: S
KeySchema:
- AttributeName: pk
KeyType: HASH
Fn:
Type: AWS::Serverless::Function
Properties:
CodeUri: src/
Handler: index.handler
Runtime: nodejs20.x
MemorySize: 512
Timeout: 10
Environment:
Variables:
TABLE: !Ref Table
Policies:
- DynamoDBCrudPolicy:
TableName: !Ref Table
Events:
Get:
Type: Api
Properties:
Path: /items
Method: get
RestApiId: !Ref Api
13) Patterns worth remembering
- S3 → EventBridge → SQS → Lambda: Process new files with retry isolation.
- API Gateway → Lambda → DynamoDB (idempotent): Avoid duplicate POSTs.
- Step Functions Saga: Compensate actions in multi-step workflows.
- Fan-out: One event to multiple consumers (SNS → SQS/Lambda).
- CQRS/Materialized View: Write to DynamoDB, read from GSI/ElasticSearch/OpenSearch.
- Event Sourcing: Store changes in a stream, rebuild current state.
14) Performance & resilience nuances
- Cold starts: Large packages, VPC init, heavy frameworks → slower. Use provisioned concurrency for critical paths.
- DynamoDB hot partitions: Spread keys.
- SQS visibility timeout: Must exceed processing time; adjust if needed.
- Retry strategies: Async (S3/SNS/EventBridge) retries + DLQ; sync errors → return proper status codes.
- Backpressure: Limit Lambda concurrency to protect downstreams.
- Large payloads: API Gateway max 10 MB; use pre-signed S3 for big uploads.
15) Common mistakes (and fixes)
- Overly broad IAM: Avoid
*:*
; use managed or scoped policies. - Infinite log retention: Set 3/7/14 days.
- DB connections in handler: Reuse global client; for RDS use RDS Proxy.
- Using Lambda for everything: Offload simple integrations to Step Functions service integrations.
- Forcing API Gateway REST: If simple, HTTP API saves money.
- S3 public buckets: Use CloudFront + OAC.
16) Pre-go-live checklist
Performance
- Minimal package size
- Global client reuse
- Provisioned concurrency for hot paths
- p95/p99 metrics alarmed
Security
- Least privilege IAM
- Secrets in Secrets Manager/Param Store
- Encryption at rest + in transit
- WAF / throttling / schema validation
Cost
- CloudWatch log retention set
- API Gateway type reviewed
- DynamoDB billing mode tuned
- Data transfer/NAT usage checked
Resilience
- Retry/backoff set
- DLQ/on-failure targets
- Idempotency handled
- Chaos/latency tested
In Short
Serverless delivers on “start fast, near-zero ops overhead, scale to demand.” Its real power is in composition: Lambda by itself is useful, but combined with S3, DynamoDB, SQS, SNS, EventBridge, Step Functions, you can build robust, cost-efficient architectures. The art is in deciding where to use Lambda, where to go direct service-to-service, and where to cache or VPC-isolate.