AWS Serverless Guide: From Basics to Advanced

To support my journey as a system administrator and to lay solid foundations in the security field, I’ve gathered my notes from an AWS Lambda training I completed. I wanted this write-up to cover both the basics and the finer details, so that you can understand the core concepts while also picking up tips that make a difference in real-world scenarios.

What exactly is “Serverless”?

  • Are there no servers? There are — but you don’t manage them. OS patching, scaling, capacity planning are all AWS’s job.
  • Billing model: Pay only for what you use (invocation time, request count, data transfer). Almost zero cost when idle.
  • Core building blocks:
    • Lambda (run code)
    • API Gateway / ALB (entry point)
    • S3 (object storage), DynamoDB (NoSQL), Aurora Serverless (RDS without server mgmt)
    • SQS/SNS/EventBridge (events & messaging)
    • Step Functions (orchestration)
    • Cognito (identity)
    • CloudFront + Lambda@Edge (CDN + edge logic)
    • AppSync (GraphQL)
    • Kinesis/MSK (streaming data)

Where it shines: Event-driven architectures, background jobs, APIs, ETL, integrations, mobile backends, low/variable traffic.
When to reconsider: Long-lived TCP connections, custom kernel modules, sustained high CPU/GPU — maybe go with Fargate/EKS/EC2.

1) The big picture (architecture map)

[Client/Web/Mobile]
        |
   CloudFront ──(Lambda@Edge, WAF)
        |
   API Gateway (REST/HTTP/WebSocket) ──> Lambda ──> DynamoDB
        |                                  |  \
        |                                  |   -> S3
        |                                  -> EventBridge → Step Functions → (Lambda/SageMaker/SQS...)
        |
   Cognito (AuthN/AuthZ)

Side channels:
S3 → Event → Lambda/SNS/SQS
SQS <→ Lambda (polling)
SNS → fan-out (SQS, Lambda, HTTP)
CloudWatch (logs/metrics/alarms), X-Ray (tracing)
Secrets Manager / Param Store (secrets/config)

2) Lambda: how it works at its core

  • Runtimes: Node.js, Python, Java, .NET, Ruby, plus custom (OS-only) runtimes, which is also how Go is deployed today.
  • Time limit: Up to 15 minutes per invocation.
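  • Memory: 128 MB to 10,240 MB (10 GB); CPU scales with the memory setting.
  • Concurrency: The number of execution environments serving requests at the same time. AWS scales this automatically; use reserved concurrency to cap it (which also guarantees that capacity to the function) if needed.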
  • Start types:
    • Cold start: New container setup (code download, runtime init).
    • Warm start: Reusing an existing container → much faster.
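  • VPC? Attaching Lambda to a VPC used to add noticeable cold-start latency from ENI setup. The network interfaces are now created when the function is configured rather than on each cold start, so the penalty is much smaller; measure it if latency matters.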

Fine-tuning tips:

  • Initialize SDK clients outside the handler (reuse on warm starts).
  • Trim dependencies, keep deployment package small.
  • Use Provisioned Concurrency for critical endpoints to remove cold start delays.
  • Idempotency: Avoid duplicate processing on retries (idempotency keys + DynamoDB conditional writes).
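
To make the first tip concrete, here is a minimal sketch of module-scope initialization. `createClient` is a hypothetical stand-in for any expensive setup step (building an SDK client, loading config); it is not an AWS API, and `initCount` exists only to make the reuse visible.

```javascript
// Hypothetical stand-in for expensive setup (SDK client, config load, ...).
let initCount = 0;
function createClient() {
  initCount += 1;
  return { send: (msg) => `sent:${msg}` };
}

// Module scope: runs once per execution environment, i.e. on a cold start.
const client = createClient();

// Handler scope: runs on every invocation and reuses the client above.
async function handler(event) {
  return { result: client.send(event.msg), initCount };
}
```

Calling `handler` repeatedly in the same process leaves `initCount` at 1, which is what happens across warm invocations.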

Basic handler (Node.js)

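// index.mjs
import { randomUUID } from "node:crypto";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

// Created once per execution environment and reused on warm starts.
const ddb = new DynamoDBClient({});

export const handler = async (event) => {
  const id = randomUUID();
  // Write a new item keyed by the generated id; TABLE comes from the function's environment.
  await ddb.send(new PutItemCommand({
    TableName: process.env.TABLE,
    Item: { pk: { S: id }, ts: { N: String(Date.now()) } },
  }));
  return { statusCode: 200, body: JSON.stringify({ ok: true, id }) };
};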
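
A handler that generates a fresh id on every call will write a second item when a request is retried. Below is a rough sketch of the idempotency-key idea from the tips. `FakeTable` is an in-memory stand-in I made up so the snippet runs without AWS; against a real table, the "put if absent" step would be a `PutItemCommand` with `ConditionExpression: "attribute_not_exists(pk)"`, which fails with an error named `ConditionalCheckFailedException` when the key already exists.

```javascript
// In-memory stand-in for a DynamoDB table (assumption, for illustration only).
// putIfAbsent mimics PutItem with ConditionExpression "attribute_not_exists(pk)".
class FakeTable {
  constructor() {
    this.items = new Map();
  }
  putIfAbsent(pk, item) {
    if (this.items.has(pk)) {
      const err = new Error("The conditional request failed");
      err.name = "ConditionalCheckFailedException";
      throw err;
    }
    this.items.set(pk, item);
  }
}

// Runs `work` at most once per idempotency key.
function processOnce(table, idempotencyKey, work) {
  try {
    table.putIfAbsent(idempotencyKey, { ts: Date.now() });
  } catch (err) {
    if (err.name === "ConditionalCheckFailedException") {
      return { duplicate: true }; // a retry of a request we already handled
    }
    throw err; // anything else is a real failure
  }
  return { duplicate: false, result: work() };
}
```

This version claims the key before doing the work, so a crash inside `work` leaves the key claimed. A production version would likely also record a status (in progress / completed) and the stored result.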

3) API Gateway: your single door in

  • HTTP API (cheaper, simpler) and REST API (feature-rich)
  • Auth: Cognito JWT, Lambda Authorizer, IAM permissions.
  • Caching: REST API offers built-in caching (extra cost).
  • Sync trigger: Invokes Lambda and returns its result.
  • WebSocket: Real-time messaging.

Best practices:

  • For common GETs, use CloudFront + S3 + Lambda@Edge/CloudFront Functions.
  • Apply rate limits / usage plans.
  • Use custom domain + ACM certs.
  • Clean up error responses with mapping templates (a REST API feature).

4) Storage: S3 and DynamoDB (serverless backbone)

S3

  • File/object storage, data lake, static site hosting.
  • Event triggers: Put/Delete → Lambda/SNS/SQS.
  • Lifecycle transitions: Standard → IA → Glacier.
  • Pre-signed URL for secure uploads/downloads.
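
For the event-trigger case, a small sketch of pulling bucket and key out of an S3 notification delivered directly to Lambda. The helper name `objectsFromS3Event` is my own; the one detail that tends to bite is that object keys arrive URL-encoded, with spaces as `+`.

```javascript
// Extracts bucket/key pairs from an S3 event notification.
// Keys arrive URL-encoded, with spaces encoded as "+".
function objectsFromS3Event(event) {
  return (event.Records ?? []).map((record) => ({
    bucket: record.s3.bucket.name,
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
    eventName: record.eventName,
  }));
}
```

If the notification is routed through SNS, SQS, or EventBridge first, the S3 payload is wrapped or shaped differently, so this helper would need adjusting.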

DynamoDB

  • Fully managed NoSQL. Single-digit ms latency, auto-scaling.
  • Partition key (and optional sort key) design is crucial!
  • On-demand billing is great for starters.
  • GSI/LSI: Alternate query paths.
  • Streams: Capture changes and process with Lambda.
  • TTL: Auto-delete expired items.
  • Transactions: ACID across items.
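
On TTL specifically: DynamoDB expects the TTL attribute to be a Number holding epoch time in seconds, and writing milliseconds is an easy mistake. A minimal sketch, where `expiresAt` is an assumed attribute name (use whichever attribute TTL is enabled on):

```javascript
// Adds a TTL attribute to an item in DynamoDB attribute-value form.
// "expiresAt" is an assumed attribute name for this sketch.
function withTtl(item, ttlSeconds, nowMs = Date.now()) {
  const expiresAt = Math.floor(nowMs / 1000) + ttlSeconds; // epoch seconds, not ms
  return { ...item, expiresAt: { N: String(expiresAt) } };
}
```

Expired items are removed in the background rather than at the exact expiry time, so reads that must never see stale items should still filter on the attribute.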

Design tips:

  • Model based on your access patterns; prefer single-table design.
  • Store large blobs in S3, metadata in DynamoDB.
  • Avoid hot partitions (spread keys evenly).
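
One common way to spread a hot key is write sharding: append a computed suffix so that one logical partition maps to several physical ones. The sketch below is an illustration under my own assumptions (the `#` separator, 10 shards, and an FNV-1a hash chosen only because it needs no imports).

```javascript
// FNV-1a 32-bit hash: small, deterministic, good enough to spread keys.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Maps one logical (hot) partition key onto one of `shards` physical keys.
function shardedKey(baseKey, itemId, shards = 10) {
  return `${baseKey}#${fnv1a(itemId) % shards}`;
}

// All physical keys to query when reading the logical partition back.
function allShardKeys(baseKey, shards = 10) {
  return Array.from({ length: shards }, (_, n) => `${baseKey}#${n}`);
}
```

The trade-off is on the read side: fetching the whole logical partition now means querying every shard key and merging the results.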

5) Messaging & events: SQS, SNS, EventBridge

  • SNS (push / fan-out): Broadcast one event to many subscribers.
  • SQS (queue / pull): Buffer jobs for consumers, absorb traffic spikes. FIFO for ordering/deduplication.
  • EventBridge: Event bus with pattern-based routing, cron/schedule, SaaS integrations.

Example pattern:

  • S3 → EventBridge → Rule filters → SQS queue → Lambda consumes via polling.
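  • Failed jobs move to a DLQ after 3 failed receives (redrive policy with maxReceiveCount = 3). SQS redelivers a message once its visibility timeout expires; it does not back off exponentially on its own, so any backoff has to come from the consumer (for example, by extending the message’s visibility timeout).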
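
On the consumer side of this pattern, a sketch of a batch handler that reports only the failed messages back, so the ones that succeeded are not redelivered. It assumes `ReportBatchItemFailures` is enabled on the event source mapping; `handleJob` is hypothetical business logic.

```javascript
// Processes an SQS batch and collects the IDs of messages that failed,
// in the shape Lambda expects for a partial batch response.
function processBatch(records, processRecord) {
  const batchItemFailures = [];
  for (const record of records) {
    try {
      processRecord(JSON.parse(record.body));
    } catch (err) {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}

// Hypothetical business logic: rejects jobs without an id.
function handleJob(job) {
  if (!job.id) throw new Error("missing id");
}

async function handler(event) {
  return processBatch(event.Records, handleJob);
}
```

Without the partial batch response, one bad message causes the whole batch to be retried.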

6) Step Functions: orchestration (retry, parallelism, human steps)

  • Standard: Long-lived, durable workflows (an execution can run for up to one year).
  • Express: High-volume, short-lived workflows (an execution can run for up to five minutes).
  • 200+ service integrations (do S3 copy, DynamoDB write, SageMaker call… without writing Lambda).
  • Visual workflow, retry/catch, parallel/Map, Wait, Choice states.

Simple example (JSON):

{
  "StartAt": "Validate",
  "States": {
    "Validate": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCT:function:validate",
      "Next": "Process",
      "Retry": [{ "ErrorEquals": ["States.ALL"], "MaxAttempts": 2, "BackoffRate": 2 }]
    },
    "Process": {
      "Type": "Parallel",
      "Branches": [
        { "StartAt": "Resize", "States": { "Resize": { "Type": "Task", "Resource": "arn:...:resize", "End": true } } },
        { "StartAt": "Index", "States": { "Index": { "Type": "Task", "Resource": "arn:...:index", "End": true } } }
      ],
      "End": true
    }
  }
}
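
To see what a retrier like the one on `Validate` does in practice, here is a small sketch that computes the wait before each retry. The defaults mirror the Amazon States Language (IntervalSeconds 1, MaxAttempts 3, BackoffRate 2); the function itself is my own illustration and ignores any delay cap or jitter settings.

```javascript
// Wait times (in seconds) before each retry attempt of a Step Functions retrier.
function retryDelays({ IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2 } = {}) {
  const delays = [];
  for (let attempt = 0; attempt < MaxAttempts; attempt++) {
    delays.push(IntervalSeconds * BackoffRate ** attempt);
  }
  return delays;
}

// The Validate state above sets MaxAttempts 2 and BackoffRate 2.
console.log(retryDelays({ MaxAttempts: 2, BackoffRate: 2 })); // [ 1, 2 ]
```

So `Validate` is retried after 1 second and again after 2 seconds before the error propagates.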

7) Identity & permissions: Cognito, IAM, resource-based policy

  • Cognito User Pool: User registration, MFA, social login.
  • Identity Pool: Gives federated identities temporary AWS creds.
  • IAM execution role: Grants Lambda access to other AWS services.
  • Resource-based policy: Controls who can invoke the function.

JWT-protected API (summary):

  • Client gets token from Cognito.
  • API Gateway’s JWT authorizer verifies it.
  • Lambda gets the user ID/claims from the event’s request context (requestContext.authorizer), not from headers.
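
For an HTTP API with a JWT authorizer, the verified claims reach the function inside the event's request context. A minimal sketch, assuming payload format 2.0; the `email` claim is an assumption about what the token carries.

```javascript
// Returns the verified JWT claims from an HTTP API (payload format 2.0) event,
// or undefined if the request did not pass through a JWT authorizer.
function claimsFrom(event) {
  return event.requestContext?.authorizer?.jwt?.claims;
}

async function handler(event) {
  const claims = claimsFrom(event);
  if (!claims) {
    // Defensive: with an authorizer attached, unauthenticated calls should not get here.
    return { statusCode: 401, body: JSON.stringify({ error: "unauthenticated" }) };
  }
  return {
    statusCode: 200,
    body: JSON.stringify({ userId: claims.sub, email: claims.email ?? null }),
  };
}
```

With a REST API and a Cognito user pool authorizer, the claims sit one level higher, at `event.requestContext.authorizer.claims`.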

8) Edge logic: CloudFront + Lambda@Edge / CloudFront Functions

  • Geo-based routing, header rewrites, A/B testing, bot blocking.
  • Runs close to the user → low latency.
  • CloudFront Functions (a restricted JavaScript runtime) are ultra-fast for lightweight tasks such as header or URL rewrites; Lambda@Edge is more powerful (network calls, longer execution).
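
As an example of the lightweight end of that spectrum, here is a sketch of a CloudFront Function for the viewer-response event that adds two security headers. The header values are illustrative choices, not recommendations from the training.

```javascript
// CloudFront Function (viewer response): adds security headers.
// CloudFront Functions expect a top-level function named `handler`.
function handler(event) {
  var response = event.response;
  var headers = response.headers;
  headers["strict-transport-security"] = { value: "max-age=63072000; includeSubDomains" };
  headers["x-content-type-options"] = { value: "nosniff" };
  return response;
}
```

Header names are lowercase and each value is wrapped in an object with a `value` field, which is the shape the CloudFront Functions event uses.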

9) Observability: CloudWatch, Logs, Metrics, Alarms, X-Ray

  • CloudWatch Logs: console.log/print output lives here; set retention to control costs.
  • Metrics/Alarms: p95 latency, error rate, throttles → set alerts.
  • X-Ray tracing: See call chains, find slow steps.
  • Use Embedded Metric Format (EMF) for custom app metrics.
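
An EMF record is just a structured JSON log line with an `_aws` metadata block, so it can be built by hand. A minimal sketch; the namespace, dimension, and metric names are placeholders I picked for the example.

```javascript
// Builds one Embedded Metric Format record. Printing it as a single JSON line
// lets CloudWatch extract the metric from the function's log stream.
function emfRecord({ namespace, dimensions, name, value, unit = "Count" }, nowMs = Date.now()) {
  return {
    _aws: {
      Timestamp: nowMs,
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          Dimensions: [Object.keys(dimensions)],
          Metrics: [{ Name: name, Unit: unit }],
        },
      ],
    },
    ...dimensions, // dimension values live at the top level
    [name]: value, // so does the metric value
  };
}

console.log(JSON.stringify(emfRecord({
  namespace: "MyApp", // assumed namespace
  dimensions: { Service: "orders" }, // assumed dimension
  name: "OrdersCreated",
  value: 1,
})));
```

In a real project, a maintained EMF helper library is probably a better choice than hand-rolling the envelope.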

10) Security essentials

  • Least privilege IAM (only grant what’s needed).
  • Secrets Manager / Parameter Store for secrets; never hard-code.
  • VPC endpoints for private traffic.
  • KMS encryption (S3/DynamoDB/Logs).
  • WAF for L7 request filtering; Shield for DDoS protection.
  • Schema validation at API Gateway or Lambda entry.
  • Auditing: CloudTrail, Config for change tracking.
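
For the schema-validation point, a deliberately tiny sketch of checking a request body at the Lambda entry. The schema shape (`{ field: { type, required } }`) is an assumption made for this example; a real project would more likely use JSON Schema, either in API Gateway request validation or through a validation library.

```javascript
// Validates a parsed request body against a tiny hand-rolled schema.
// Returns a list of error messages; an empty list means the body is valid.
function validate(body, schema) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = body?.[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field} is required`);
    } else if (typeof value !== rule.type) {
      errors.push(`${field} must be a ${rule.type}`);
    }
  }
  // Reject unknown fields instead of silently passing them downstream.
  for (const field of Object.keys(body ?? {})) {
    if (!(field in schema)) errors.push(`${field} is not allowed`);
  }
  return errors;
}
```

A handler would call this first and return a 400 with the error list when it is non-empty.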

11) Cost awareness (watch out for these)

  • Lambda: a per-request charge plus a compute charge, where compute = duration (s) × memory (GB) = GB-seconds.
  • API Gateway REST is pricier; prefer HTTP API if possible.
  • DynamoDB on-demand is easy but at scale, provisioned + autoscaling may be cheaper.
  • CloudWatch Logs retention: reduce; lower log levels in prod.
  • S3 Lifecycle: move to colder tiers.
  • NAT Gateway data egress is costly; consider PrivateLink/VPC endpoints.

Rough calc example:

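  • 1M invocations/day at 50 ms and 512 MB → 0.025 GB-s per invocation → about 750,000 GB-s and 30M requests per month → roughly 20 USD/month at typical list prices, before the free tier.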
  • Surprise costs often come from data transfer & CloudWatch Logs.
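
The same estimate as a small calculator. The two price constants are assumptions (commonly quoted us-east-1 x86 list prices, with no free tier applied), so check current pricing before relying on the numbers.

```javascript
// Assumed list prices (us-east-1, x86, no free tier); verify against current pricing.
const PRICE_PER_GB_SECOND = 0.0000166667;
const PRICE_PER_MILLION_REQUESTS = 0.2;

// Estimates monthly Lambda cost: compute (GB-seconds) plus request charges.
function monthlyLambdaCost({ invocationsPerDay, durationMs, memoryMb, days = 30 }) {
  const invocations = invocationsPerDay * days;
  const gbSeconds = invocations * (durationMs / 1000) * (memoryMb / 1024);
  const compute = gbSeconds * PRICE_PER_GB_SECOND;
  const requests = (invocations / 1e6) * PRICE_PER_MILLION_REQUESTS;
  return { gbSeconds, compute, requests, total: compute + requests };
}

console.log(monthlyLambdaCost({ invocationsPerDay: 1e6, durationMs: 50, memoryMb: 512 }));
```

Under these assumed prices, the workload comes to about 12.50 USD of compute plus 6 USD of requests per month. Data transfer, API Gateway, and logging are billed separately.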

12) Deployment & local dev: SAM, Serverless Framework, CDK

  • AWS SAM: CloudFormation-based, local emulation.
  • CDK: Infrastructure as code in TypeScript/Python.
  • Serverless Framework: Multi-cloud, plugin ecosystem.

Minimal SAM example (template.yaml):

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  Api:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      EndpointConfiguration: REGIONAL
  Table:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
  Fn:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: index.handler
      Runtime: nodejs20.x
      MemorySize: 512
      Timeout: 10
      Environment:
        Variables:
          TABLE: !Ref Table
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref Table
      Events:
        Get:
          Type: Api
          Properties:
            Path: /items
            Method: get
            RestApiId: !Ref Api

13) Patterns worth remembering

  • S3 → EventBridge → SQS → Lambda: Process new files with retry isolation.
  • API Gateway → Lambda → DynamoDB (idempotent): Avoid duplicate POSTs.
  • Step Functions Saga: Compensate actions in multi-step workflows.
  • Fan-out: One event to multiple consumers (SNS → SQS/Lambda).
  • CQRS/Materialized View: Write to DynamoDB, read from a GSI or Elasticsearch/OpenSearch.
  • Event Sourcing: Store changes in a stream, rebuild current state.

14) Performance & resilience nuances

  • Cold starts: Large packages, VPC init, heavy frameworks → slower. Use provisioned concurrency for critical paths.
  • DynamoDB hot partitions: Spread keys.
  • SQS visibility timeout: Must exceed processing time; adjust if needed.
  • Retry strategies: Async (S3/SNS/EventBridge) retries + DLQ; sync errors → return proper status codes.
  • Backpressure: Limit Lambda concurrency to protect downstreams.
  • Large payloads: API Gateway caps payloads at 10 MB and a synchronous Lambda invocation at 6 MB; use pre-signed S3 URLs for big uploads.
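
For retries you implement yourself (for example, when calling a throttled downstream from a function), exponential backoff with jitter is a common choice. The sketch below uses the "full jitter" variant; the base and cap values are arbitrary assumptions, and `random` is injectable only to make the function testable.

```javascript
// Exponential backoff with full jitter: a random delay between 0 and
// min(capMs, baseMs * 2^attempt). `attempt` starts at 0.
function backoffDelayMs(attempt, { baseMs = 100, capMs = 20000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Example: upper bounds grow 100, 200, 400, 800, 1600 ms for attempts 0..4.
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(attempt, backoffDelayMs(attempt));
}
```

The randomness matters as much as the growth: it keeps many concurrent executions from retrying in lockstep against the same downstream.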

15) Common mistakes (and fixes)

  • Overly broad IAM: Avoid *:*; use managed or scoped policies.
  • Infinite log retention: Set 3/7/14 days.
  • DB connections in handler: Reuse global client; for RDS use RDS Proxy.
  • Using Lambda for everything: Offload simple integrations to Step Functions service integrations.
  • Forcing API Gateway REST: If simple, HTTP API saves money.
  • S3 public buckets: Use CloudFront + OAC.

16) Pre-go-live checklist

Performance

  • Minimal package size
  • Global client reuse
  • Provisioned concurrency for hot paths
  • p95/p99 metrics alarmed

Security

  • Least privilege IAM
  • Secrets in Secrets Manager/Param Store
  • Encryption at rest + in transit
  • WAF / throttling / schema validation

Cost

  • CloudWatch log retention set
  • API Gateway type reviewed
  • DynamoDB billing mode tuned
  • Data transfer/NAT usage checked

Resilience

  • Retry/backoff set
  • DLQ/on-failure targets
  • Idempotency handled
  • Chaos/latency tested

In Short

Serverless delivers on “start fast, near-zero ops overhead, scale to demand.” Its real power is in composition: Lambda by itself is useful, but combined with S3, DynamoDB, SQS, SNS, EventBridge, Step Functions, you can build robust, cost-efficient architectures. The art is in deciding where to use Lambda, where to go direct service-to-service, and where to cache or VPC-isolate.
