AWS Serverless Guide – From Zero to Advanced

To support my journey as a system administrator and to lay solid foundations in the security field, I’ve gathered my notes from an AWS Lambda training I completed. I wanted this write-up to cover both the basics and the finer details, so that you can understand the core concepts while also picking up tips that make a difference in real-world scenarios.

What exactly is “Serverless”?

Are there no servers? There are — but you don’t manage them. OS patching, scaling, capacity planning are all AWS’s job.
Billing model: Pay only for what you use (invocation time, request count, data transfer). Almost zero cost when idle.
Core building blocks:
- Lambda (run code)
- API Gateway / ALB (entry point)
- S3 (object storage), DynamoDB (NoSQL), Aurora Serverless (RDS without server mgmt)
- SQS/SNS/EventBridge (events & messaging)
- Step Functions (orchestration)
- Cognito (identity)
- CloudFront + Lambda@Edge (CDN + edge logic)
- AppSync (GraphQL)
- Kinesis/MSK (streaming data)

Where it shines: Event-driven architectures, background jobs, APIs, ETL, integrations, mobile backends, low/variable traffic.
When to reconsider: Long-lived TCP connections, custom kernel modules, sustained high CPU/GPU — maybe go with Fargate/EKS/EC2.

1) The big picture (architecture map)

[Client/Web/Mobile]
      |
   CloudFront ──(Lambda@Edge, WAF)
      |
  API Gateway (REST/HTTP/WebSocket) ──> Lambda ──> DynamoDB
      |                                  |  \
      |                                  |   -> S3
      |                                  -> EventBridge → Step Functions → (Lambda/SageMaker/SQS...)
      |
   Cognito (AuthN/AuthZ)

Side channels:
S3 → Event → Lambda/SNS/SQS
SQS <→ Lambda (polling)
SNS → fan-out (SQS, Lambda, HTTP)
CloudWatch (logs/metrics/alarms), X-Ray (tracing)
Secrets Manager / Param Store (secrets/config)

2) Lambda: how it works at its core

Runtimes: Node.js, Python, Go, Java, .NET, custom.
Time limit: Up to 15 minutes per invocation.
Memory: 128 MB–10+ GB (CPU scales with memory).
Concurrency: Number of simultaneous copies running. AWS scales automatically; use reserved concurrency to cap it if needed.
Start types:
- Cold start: New container setup (code download, runtime init).
- Warm start: Reusing an existing container → much faster.
VPC? Putting Lambda inside a VPC can add latency due to ENI setup (less of an issue with AWS improvements, but still measurable).

Fine-tuning tips:

Initialize SDK clients outside the handler (reuse on warm starts).
Trim dependencies, keep deployment package small.
Use Provisioned Concurrency for critical endpoints to remove cold start delays.
Idempotency: Avoid duplicate processing on retries (idempotency keys + DynamoDB conditional writes).

Basic handler (Node.js)

// index.mjs
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
const ddb = new DynamoDBClient({});

export const handler = async (event) => {
  const id = crypto.randomUUID();
  await ddb.send(new PutItemCommand({
    TableName: process.env.TABLE,
    Item: { pk: { S: id }, ts: { N: String(Date.now()) } }
  }));
  return { statusCode: 200, body: JSON.stringify({ ok: true, id }) };
};

3) API Gateway: your single door in

HTTP API (cheaper, simpler) and REST API (feature-rich)
Auth: Cognito JWT, Lambda Authorizer, IAM permissions.
Caching: REST API offers built-in caching (extra cost).
Sync trigger: Invokes Lambda and returns its result.
WebSocket: Real-time messaging.

Best practices:

For common GETs, use CloudFront + S3 + Lambda@Edge/CloudFront Functions.
Apply rate limits / usage plans.
Use custom domain + ACM certs.
Clean up error responses with mapping templates.

4) Storage: S3 and DynamoDB (serverless backbone)

S3

File/object storage, data lake, static site hosting.
Event triggers: Put/Delete → Lambda/SNS/SQS.
Lifecycle transitions: Standard → IA → Glacier.
Pre-signed URL for secure uploads/downloads.

DynamoDB

Fully managed NoSQL. Single-digit ms latency, auto-scaling.
Partition key (and optional sort key) design is crucial!
On-demand billing is great for starters.
GSI/LSI: Alternate query paths.
Streams: Capture changes and process with Lambda.
TTL: Auto-delete expired items.
Transactions: ACID across items.

Design tips:

Model based on your access patterns; prefer single-table design.
Store large blobs in S3, metadata in DynamoDB.
Avoid hot partitions (spread keys evenly).

5) Messaging & events: SQS, SNS, EventBridge

SNS (push / fan-out): Broadcast one event to many subscribers.
SQS (queue / pull): Buffer jobs for consumers, absorb traffic spikes. FIFO for ordering/deduplication.
EventBridge: Service bus with pattern-based routing, cron/schedule, SaaS integrations.

Example pattern:

S3 → EventBridge → Rule filters → SQS queue → Lambda consumes via polling.
Failed jobs go to a DLQ after 3 retries with exponential backoff.

6) Step Functions: orchestration (retry, parallelism, human steps)

Standard: Long-lived, durable workflows (months).
Express: High-volume, low-latency (seconds).
200+ service integrations (do S3 copy, DynamoDB write, SageMaker call… without writing Lambda).
Visual workflow, retry/catch, parallel/Map, Wait, Choice states.

Simple example (JSON):

{
  "StartAt": "Validate",
  "States": {
    "Validate": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCT:function:validate",
      "Next": "Process",
      "Retry": [{ "ErrorEquals": ["States.ALL"], "MaxAttempts": 2, "BackoffRate": 2 }]
    },
    "Process": {
      "Type": "Parallel",
      "Branches": [
        { "StartAt": "Resize", "States": { "Resize": { "Type": "Task", "Resource": "arn:...:resize", "End": true } } },
        { "StartAt": "Index",  "States": { "Index":  { "Type": "Task", "Resource": "arn:...:index",  "End": true } } }
      ],
      "End": true
    }
  }
}

7) Identity & permissions: Cognito, IAM, resource-based policy

Cognito User Pool: User registration, MFA, social login.
Identity Pool: Gives federated identities temporary AWS creds.
IAM execution role: Grants Lambda access to other AWS services.
Resource-based policy: Controls who can invoke the function.

JWT-protected API (summary):

Client gets token from Cognito.
API Gateway’s JWT authorizer verifies it.
Lambda gets user ID/claims via headers.

8) Edge logic: CloudFront + Lambda@Edge / CloudFront Functions

Geo-based routing, header rewrites, A/B testing, bot blocking.
Runs close to the user → low latency.
CloudFront Functions (V8 runtime) are ultra-fast for lightweight tasks; Lambda@Edge is more powerful.

9) Observability: CloudWatch, Logs, Metrics, Alarms, X-Ray

CloudWatch Logs: console.log/print output lives here; set retention to control costs.
Metrics/Alarms: p95 latency, error rate, throttles → set alerts.
X-Ray tracing: See call chains, find slow steps.
Use Embedded Metrics Format for custom app metrics.

10) Security essentials

Least privilege IAM (only grant what’s needed).
Secrets Manager / Parameter Store for secrets; never hard-code.
VPC endpoints for private traffic.
KMS encryption (S3/DynamoDB/Logs).
WAF + Shield for L7 protection.
Schema validation at API Gateway or Lambda entry.
Auditing: CloudTrail, Config for change tracking.

11) Cost awareness (watch out for these)

Lambda: invocations + duration(ms) × memory = GB-seconds.
API Gateway REST is pricier; prefer HTTP API if possible.
DynamoDB on-demand is easy but at scale, provisioned + autoscaling may be cheaper.
CloudWatch Logs retention: reduce; lower log levels in prod.
S3 Lifecycle: move to colder tiers.
NAT Gateway data egress is costly; consider PrivateLink/VPC endpoints.

Rough calc example:

1M/day, 50ms, 512MB → ~0.5 GB-s / 1M → a few dozen USD/month.
Surprise costs often come from data transfer & CloudWatch Logs.

12) Deployment & local dev: SAM, Serverless Framework, CDK

AWS SAM: CloudFormation-based, local emulation.
CDK: Infrastructure as code in TypeScript/Python.
Serverless Framework: Multi-cloud, plugin ecosystem.

Minimal SAM example (template.yaml):

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  Api:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      EndpointConfiguration: REGIONAL
  Table:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
  Fn:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: index.handler
      Runtime: nodejs20.x
      MemorySize: 512
      Timeout: 10
      Environment:
        Variables:
          TABLE: !Ref Table
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref Table
      Events:
        Get:
          Type: Api
          Properties:
            Path: /items
            Method: get
            RestApiId: !Ref Api

13) Patterns worth remembering

S3 → EventBridge → SQS → Lambda: Process new files with retry isolation.
API Gateway → Lambda → DynamoDB (idempotent): Avoid duplicate POSTs.
Step Functions Saga: Compensate actions in multi-step workflows.
Fan-out: One event to multiple consumers (SNS → SQS/Lambda).
CQRS/Materialized View: Write to DynamoDB, read from GSI/ElasticSearch/OpenSearch.
Event Sourcing: Store changes in a stream, rebuild current state.

14) Performance & resilience nuances

Cold starts: Large packages, VPC init, heavy frameworks → slower. Use provisioned concurrency for critical paths.
DynamoDB hot partitions: Spread keys.
SQS visibility timeout: Must exceed processing time; adjust if needed.
Retry strategies: Async (S3/SNS/EventBridge) retries + DLQ; sync errors → return proper status codes.
Backpressure: Limit Lambda concurrency to protect downstreams.
Large payloads: API Gateway max 10 MB; use pre-signed S3 for big uploads.

15) Common mistakes (and fixes)

Overly broad IAM: Avoid *:*; use managed or scoped policies.
Infinite log retention: Set 3/7/14 days.
DB connections in handler: Reuse global client; for RDS use RDS Proxy.
Using Lambda for everything: Offload simple integrations to Step Functions service integrations.
Forcing API Gateway REST: If simple, HTTP API saves money.
S3 public buckets: Use CloudFront + OAC.

16) Pre-go-live checklist

Performance

Minimal package size
Global client reuse
Provisioned concurrency for hot paths
p95/p99 metrics alarmed

Security

Least privilege IAM
Secrets in Secrets Manager/Param Store
Encryption at rest + in transit
WAF / throttling / schema validation

Cost

CloudWatch log retention set
API Gateway type reviewed
DynamoDB billing mode tuned
Data transfer/NAT usage checked

Resilience

Retry/backoff set
DLQ/on-failure targets
Idempotency handled
Chaos/latency tested

In Short

Serverless delivers on “start fast, near-zero ops overhead, scale to demand.” Its real power is in composition: Lambda by itself is useful, but combined with S3, DynamoDB, SQS, SNS, EventBridge, Step Functions, you can build robust, cost-efficient architectures. The art is in deciding where to use Lambda, where to go direct service-to-service, and where to cache or VPC-isolate.

AWS Serverless Guide: From Basics to Advanced