What is High-Level Design?
2026-03-22 · aws · hld · distributed-systems · fundamentals
High-Level Design is the practice of choosing the right components, services, and infrastructure for a system and defining how they communicate. It's the architectural blueprint that determines whether your system will scale to millions of users or collapse under load.
When someone says "Design Twitter" or "Design a URL shortener" in an interview, they're asking for HLD — the bird's-eye view of how the entire system fits together.
graph TB
subgraph "High-Level Design Scope"
CLIENT["Client<br/>(Web/Mobile)"]
CDN["CDN<br/>(CloudFront)"]
LB["Load Balancer<br/>(ALB)"]
API1["API Server 1"]
API2["API Server 2"]
API3["API Server N"]
CACHE["Cache Layer<br/>(ElastiCache/Redis)"]
DB_PRIMARY["Primary DB<br/>(RDS PostgreSQL)"]
DB_REPLICA["Read Replica"]
QUEUE["Message Queue<br/>(SQS)"]
WORKER["Worker<br/>(Lambda)"]
STORAGE["Object Storage<br/>(S3)"]
SEARCH["Search<br/>(OpenSearch)"]
CLIENT --> CDN
CDN --> LB
LB --> API1
LB --> API2
LB --> API3
API1 --> CACHE
API2 --> CACHE
CACHE --> DB_PRIMARY
DB_PRIMARY --> DB_REPLICA
API1 --> QUEUE
QUEUE --> WORKER
WORKER --> STORAGE
API3 --> SEARCH
SEARCH --> DB_REPLICA
end
style CLIENT fill:#f1f5f9,stroke:#94a3b8
style CDN fill:#dbeafe,stroke:#3b82f6
style LB fill:#dbeafe,stroke:#3b82f6
style CACHE fill:#fef3c7,stroke:#f59e0b
style DB_PRIMARY fill:#dcfce7,stroke:#22c55e
style DB_REPLICA fill:#dcfce7,stroke:#22c55e
style QUEUE fill:#f3e8ff,stroke:#a855f7
style WORKER fill:#f3e8ff,stroke:#a855f7
style STORAGE fill:#fce7f3,stroke:#ec4899
style SEARCH fill:#fce7f3,stroke:#ec4899

Every box in this diagram is an HLD decision. Why ALB instead of NLB? Why Redis instead of Memcached? Why SQS instead of Kafka? HLD is about understanding these trade-offs and making the right choice for your specific requirements.
HLD vs LLD — The Full Picture
graph TB
subgraph SYSTEM["The System"]
subgraph HLD_SCOPE["HLD Scope"]
direction LR
C["Client"]
S1["Service A"]
S2["Service B"]
DB["Database"]
Q["Queue"]
C --> S1
S1 --> S2
S1 --> DB
S2 --> Q
end
subgraph LLD_SCOPE["LLD Scope (inside Service A)"]
CTRL["Controller"]
SVC["Service Layer"]
REPO["Repository"]
MODEL["Domain Models"]
PATTERN["Design Patterns"]
CTRL --> SVC
SVC --> REPO
SVC --> MODEL
SVC --> PATTERN
end
end
S1 -.->|"Zoom in"| LLD_SCOPE
style HLD_SCOPE fill:#dbeafe,stroke:#3b82f6
style LLD_SCOPE fill:#dcfce7,stroke:#22c55e

| Aspect | HLD | LLD |
|---|---|---|
| Question | What components do we need? How do they talk? | How do we implement one component internally? |
| Decisions | SQL vs NoSQL, REST vs gRPC, sync vs async | Which design pattern? Interface or abstract class? |
| Failure mode | "The system can't handle 10K requests/sec" | "Adding a new payment method requires changing 15 files" |
| Diagram type | Architecture diagram, data flow diagram | Class diagram, sequence diagram |
| AWS mapping | EC2, RDS, SQS, ElastiCache, CloudFront | Spring Boot, JPA, Design Patterns |
| Interview | "Design Instagram" (whiteboard) | "Design a parking lot" (code) |
The Building Blocks of Every Distributed System
Every system, from a startup's MVP to Netflix's global infrastructure, is assembled from these building blocks. In this series, we map every concept to AWS services.
1. Compute — Where Does Your Code Run?
graph LR
subgraph "More Control ←→ Less Management"
EC2["EC2<br/>Full server<br/>You manage everything"]
ECS["ECS/Fargate<br/>Containers<br/>AWS manages servers"]
LAMBDA["Lambda<br/>Functions<br/>AWS manages everything"]
end
EC2 -->|"Need OS-level<br/>access, GPUs"| EC2
ECS -->|"Microservices,<br/>long-running tasks"| ECS
LAMBDA -->|"Event-driven,<br/>short tasks"| LAMBDA
style EC2 fill:#fee2e2,stroke:#ef4444
style ECS fill:#fef3c7,stroke:#f59e0b
style LAMBDA fill:#dcfce7,stroke:#22c55e

| Service | When to Use | Cost Model | Max Execution |
|---|---|---|---|
| EC2 | Full control needed, GPU workloads, legacy apps | Per hour (running) | Unlimited |
| ECS/Fargate | Microservices, Docker containers, consistent workloads | Per vCPU + memory (running) | Unlimited |
| Lambda | Event handlers, API endpoints, async processing | Per request + duration | 15 minutes |
Rule of Thumb
Start with Lambda for new services. Move to Fargate when you need long-running processes or consistent throughput. Use EC2 only when you need OS-level access or specific hardware.
2. Storage — How Do You Persist Data?
Choosing the right database is the most impactful HLD decision you'll make. There is no "best" database — only the best database for your access pattern.
flowchart TD
START["What's your data like?"]
Q1{"Need complex<br/>joins and<br/>transactions?"}
Q2{"Need flexible<br/>schema?"}
Q3{"Need sub-ms<br/>reads?"}
Q4{"Need full-text<br/>search?"}
Q5{"Need to store<br/>files/images?"}
Q6{"Need time-series<br/>data?"}
RDS["RDS (PostgreSQL/MySQL)<br/>Relational, ACID,<br/>strong consistency"]
DYNAMO["DynamoDB<br/>Key-value/document,<br/>single-digit ms at any scale"]
REDIS["ElastiCache (Redis)<br/>In-memory, sub-ms,<br/>TTL-based expiry"]
SEARCH["OpenSearch<br/>Full-text search,<br/>fuzzy matching, analytics"]
S3["S3<br/>Object storage,<br/>unlimited scale, $0.023/GB"]
TIMESTREAM["Timestream<br/>Purpose-built for<br/>IoT/metrics/logs"]
START --> Q1
Q1 -->|Yes| RDS
Q1 -->|No| Q2
Q2 -->|Yes| DYNAMO
Q2 -->|No| Q3
Q3 -->|Yes| REDIS
Q3 -->|No| Q4
Q4 -->|Yes| SEARCH
Q4 -->|No| Q5
Q5 -->|Yes| S3
Q5 -->|No| Q6
Q6 -->|Yes| TIMESTREAM
style RDS fill:#dcfce7,stroke:#22c55e
style DYNAMO fill:#dbeafe,stroke:#3b82f6
style REDIS fill:#fef3c7,stroke:#f59e0b
style SEARCH fill:#f3e8ff,stroke:#a855f7
style S3 fill:#fce7f3,stroke:#ec4899
style TIMESTREAM fill:#e0e7ff,stroke:#6366f1

| Database | Type | Consistency | Latency | Scale | Cost |
|---|---|---|---|---|---|
| RDS (PostgreSQL) | Relational | Strong (ACID) | ~5ms | Vertical + read replicas | $$$ |
| DynamoDB | Key-value/Document | Eventual (configurable strong) | ~5ms | Horizontal (unlimited) | $ per request |
| ElastiCache (Redis) | In-memory | Eventual | <1ms | Clustered | $$ per node |
| S3 | Object store | Strong (as of 2020) | ~100ms | Unlimited | $ per GB |
| OpenSearch | Search engine | Near real-time | ~50ms | Horizontal | $$$ per node |
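The decision tree above can be encoded as a simple first-match rule chain. This is only an illustrative sketch of the question order — the class name, method, and flags are made up for this example, not a real selection tool:

```java
// Illustrative encoding of the database decision tree above.
// The flags mirror the order of questions in the flowchart.
public class DatastorePicker {

    public static String pick(boolean needsJoinsAndTransactions,
                              boolean needsFlexibleSchema,
                              boolean needsSubMillisecondReads,
                              boolean needsFullTextSearch,
                              boolean storesFiles,
                              boolean timeSeries) {
        if (needsJoinsAndTransactions) return "RDS (PostgreSQL/MySQL)";
        if (needsFlexibleSchema)       return "DynamoDB";
        if (needsSubMillisecondReads)  return "ElastiCache (Redis)";
        if (needsFullTextSearch)       return "OpenSearch";
        if (storesFiles)               return "S3";
        if (timeSeries)                return "Timestream";
        return "Re-examine the access pattern";
    }

    public static void main(String[] args) {
        // Complex joins and transactions -> relational
        System.out.println(pick(true, false, false, false, false, false));
        // User-uploaded images -> object storage
        System.out.println(pick(false, false, false, false, true, false));
    }
}
```

Note that the order of the questions matters: ask about joins and transactions first, because a relational need overrides everything else.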
3. Networking — How Do Components Communicate?
graph TB
subgraph Sync["Synchronous (request-response)"]
C1["Client"]
API_GW["API Gateway"]
SVC1["Service"]
C1 -->|"HTTP request"| API_GW
API_GW -->|"Forward"| SVC1
SVC1 -->|"HTTP response"| API_GW
API_GW -->|"Response"| C1
end
subgraph Async["Asynchronous (fire-and-forget)"]
SVC2["Producer<br/>Service"]
SQS["SQS Queue"]
CONSUMER["Consumer<br/>Service"]
SVC2 -->|"Send message"| SQS
SQS -->|"Poll message"| CONSUMER
Note1["Producer doesn't wait<br/>for consumer to finish"]
end
subgraph Event["Event-Driven (pub-sub)"]
PUB["Publisher"]
SNS["SNS Topic"]
SUB1["Subscriber 1<br/>(Email)"]
SUB2["Subscriber 2<br/>(Analytics)"]
SUB3["Subscriber 3<br/>(Notification)"]
PUB -->|"Publish event"| SNS
SNS -->|"Fan out"| SUB1
SNS -->|"Fan out"| SUB2
SNS -->|"Fan out"| SUB3
end
style Sync fill:#dbeafe,stroke:#3b82f6
style Async fill:#dcfce7,stroke:#22c55e
style Event fill:#fef3c7,stroke:#f59e0b

| Pattern | AWS Service | When to Use | Trade-off |
|---|---|---|---|
| Synchronous | API Gateway, ALB | User-facing APIs, need immediate response | Tight coupling, cascading failures |
| Async (queue) | SQS | Background jobs, email sending, order processing | Eventual consistency, harder to debug |
| Pub-Sub (events) | SNS, EventBridge | Fan-out notifications, event-driven architectures | Message ordering not guaranteed, at-least-once delivery |
| Streaming | Kinesis, MSK (Kafka) | Real-time analytics, log aggregation, CDC | Complex to operate, expensive at scale |
The Golden Rule of Communication
Use synchronous communication only when the caller needs the result immediately. Everything else should be async. This single decision prevents more outages than any other architectural choice.
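A minimal in-process sketch of the fire-and-forget idea, using a `BlockingQueue` as a stand-in for SQS (an assumption for illustration — a real system would call the AWS SDK): the producer returns immediately while a background worker drains the queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// In-process stand-in for the queue pattern above: publish() returns
// as soon as the message is enqueued; a worker processes it later.
public class FireAndForget {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    final AtomicInteger processed = new AtomicInteger();

    public FireAndForget() {
        worker.submit((Callable<Void>) () -> {
            while (true) {
                String msg = queue.take();   // blocks until a message arrives
                processed.incrementAndGet(); // stand-in for sending the email, etc.
            }
        });
    }

    // Producer side: returns immediately, never waits for the consumer.
    public void publish(String msg) {
        queue.offer(msg);
    }

    public void shutdown() {
        worker.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        FireAndForget bus = new FireAndForget();
        bus.publish("order-confirmation"); // caller does not wait for processing
        bus.publish("invoice-pdf");
        Thread.sleep(300);                 // give the worker time to drain
        System.out.println("processed: " + bus.processed.get());
        bus.shutdown();
    }
}
```

The key property is on the producer side: `publish` cannot fail because a downstream consumer is slow or down, which is exactly what breaks the cascading-failure chain of synchronous calls.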
4. Caching — The Single Biggest Performance Lever
Caching is the most effective way to improve system performance. A well-placed cache can reduce database load by 90% and cut response times from 100ms to 1ms.
flowchart LR
USER["User Request"]
subgraph L1["Layer 1: Edge Cache"]
CF["CloudFront CDN<br/>Static assets, API responses<br/>Global, ~10ms"]
end
subgraph L2["Layer 2: Application Cache"]
REDIS2["ElastiCache Redis<br/>Session data, hot queries<br/>Regional, ~1ms"]
end
subgraph L3["Layer 3: Database Cache"]
DAX["DAX<br/>(DynamoDB Accelerator)<br/>Table-level, ~μs"]
end
subgraph L4["Layer 4: Database"]
DB2["RDS / DynamoDB<br/>Source of truth<br/>~5-50ms"]
end
USER --> CF
CF -->|"Cache MISS"| REDIS2
REDIS2 -->|"Cache MISS"| DAX
DAX -->|"Cache MISS"| DB2
style L1 fill:#dbeafe,stroke:#3b82f6
style L2 fill:#fef3c7,stroke:#f59e0b
style L3 fill:#dcfce7,stroke:#22c55e
style L4 fill:#f1f5f9,stroke:#94a3b8

Cache Invalidation Strategies
| Strategy | How It Works | Best For |
|---|---|---|
| TTL (Time-to-Live) | Cache entry expires after N seconds | Data that's acceptable to be slightly stale (product catalog, user profiles) |
| Write-Through | Every write goes to cache AND database simultaneously | Data that must always be fresh (account balance, inventory count) |
| Write-Behind | Write to cache immediately, flush to database asynchronously | High-write workloads where slight lag is acceptable (analytics, metrics) |
| Cache-Aside (Lazy Loading) | Application checks cache first, loads from DB on miss, stores in cache | General purpose — most common pattern |
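As a contrast to cache-aside, here is a minimal write-through sketch. The two `HashMap`s are in-memory stand-ins for the cache and the database, purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Write-through sketch: every write updates the database AND the cache
// in the same operation, so reads can always trust the cache.
public class WriteThroughStore {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>();

    public void put(String key, String value) {
        database.put(key, value); // source of truth first
        cache.put(key, value);    // then keep the cache fresh
    }

    public String get(String key) {
        // Under write-through the cache is never stale for keys written
        // through this store; the database lookup is only a safety net.
        return cache.getOrDefault(key, database.get(key));
    }
}
```

This is why write-through suits account balances and inventory counts: the cache can never serve a value older than the last committed write, at the cost of slower writes.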
import java.time.Duration;
import org.springframework.data.redis.core.RedisTemplate;

// Cache-Aside pattern in Java (most common)
public class UserService {
    private final RedisTemplate<String, User> cache;
    private final UserRepository repository;

    public UserService(RedisTemplate<String, User> cache, UserRepository repository) {
        this.cache = cache;
        this.repository = repository;
    }

    public User getUser(String userId) {
        String key = "user:" + userId;
        User cached = cache.opsForValue().get(key);
        if (cached != null) {
            return cached; // cache HIT — sub-millisecond
        }
        User user = repository.findById(userId); // cache MISS — hit the database
        cache.opsForValue().set(key, user, Duration.ofMinutes(15)); // TTL bounds staleness
        return user;
    }

    public void updateUser(User user) {
        repository.save(user);
        cache.delete("user:" + user.getId()); // invalidate the stale entry
    }
}

A Complete Example: Scaling a Java Web App on AWS
Let's walk through how a real application evolves from a single server to a production-grade distributed system. Each step addresses a specific bottleneck.
Stage 1: The Monolith
Everything on one EC2 instance. Works for ~100 concurrent users.
graph LR
USER["Users<br/>(~100 concurrent)"]
EC2["EC2 Instance<br/>Java App + PostgreSQL"]
USER --> EC2
style EC2 fill:#fee2e2,stroke:#ef4444

Bottleneck: The app and database compete for CPU and memory on the same machine. A traffic spike kills both.
Stage 2: Separate the Database
Decouple compute from storage. Now they scale independently.
graph LR
USER["Users"]
EC2_2["EC2<br/>Java App"]
RDS["RDS PostgreSQL<br/>Multi-AZ, automated<br/>backups, failover"]
USER --> EC2_2 --> RDS
style RDS fill:#dcfce7,stroke:#22c55e

Gain: The database gets automated backups, failover, and independently scalable storage. The app server can be resized without affecting the database.
Bottleneck: Single app server. If it dies, the entire system is down.
Stage 3: Load Balancing + Auto Scaling
Multiple app servers behind a load balancer. No single point of failure.
graph LR
USER2["Users<br/>(~10K concurrent)"]
ALB["ALB<br/>Application<br/>Load Balancer"]
ASG["Auto Scaling Group"]
EC2A["EC2 (App)"]
EC2B["EC2 (App)"]
EC2C["EC2 (App)"]
RDS2["RDS PostgreSQL<br/>+ Read Replica"]
USER2 --> ALB
ALB --> ASG
ASG --> EC2A
ASG --> EC2B
ASG --> EC2C
EC2A --> RDS2
EC2B --> RDS2
EC2C --> RDS2
style ALB fill:#dbeafe,stroke:#3b82f6
style ASG fill:#fef3c7,stroke:#f59e0b

Key decisions:
- ALB (not NLB) because we need HTTP-level routing (path-based, host-based)
- Auto Scaling Group scales EC2 instances based on CPU or request count
- Read Replica handles read-heavy queries (product listings, search results)
Bottleneck: Every request hits the database. At 10K req/sec, the database becomes the bottleneck.
Stage 4: Add Caching
Redis absorbs 80–90% of read traffic. Database only handles writes and cache misses.
graph TB
USER3["Users<br/>(~100K concurrent)"]
CF["CloudFront CDN<br/>Static assets cached<br/>at 400+ edge locations"]
ALB2["ALB"]
APP["EC2 Fleet<br/>(Auto Scaling)"]
REDIS3["ElastiCache Redis<br/>Cluster mode,<br/>6 nodes, 3 shards"]
RDS3["RDS PostgreSQL<br/>Multi-AZ + 2 Read Replicas"]
USER3 --> CF
CF --> ALB2
ALB2 --> APP
APP -->|"80% cache HIT"| REDIS3
APP -->|"20% cache MISS"| RDS3
REDIS3 -.->|"Load on miss"| RDS3
style CF fill:#dbeafe,stroke:#3b82f6
style REDIS3 fill:#fef3c7,stroke:#f59e0b
style RDS3 fill:#dcfce7,stroke:#22c55e

Impact: Database queries drop from 100K/sec to 10K/sec. P99 latency drops from 200ms to 15ms.
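These impact figures follow from simple hit-rate arithmetic. The latencies below are the illustrative numbers used in this article, not measurements:

```java
// Effective read latency under a cache layer:
//   effective = hitRate * cacheLatency + (1 - hitRate) * dbLatency
public class CacheMath {
    public static double effectiveLatencyMs(double hitRate, double cacheMs, double dbMs) {
        return hitRate * cacheMs + (1 - hitRate) * dbMs;
    }

    public static long dbQueriesPerSec(long readsPerSec, double hitRate) {
        return Math.round(readsPerSec * (1 - hitRate)); // only misses reach the DB
    }

    public static void main(String[] args) {
        // 90% hit rate: 100K reads/sec at the app, only 10K reach the database
        System.out.println("DB load: " + dbQueriesPerSec(100_000, 0.90) + " queries/sec");
        System.out.println("Effective read latency: "
                + effectiveLatencyMs(0.90, 1.0, 50.0) + " ms");
    }
}
```

Notice the lever: going from a 90% to a 99% hit rate cuts database load by another 10x, which is why cache hit rate is usually the first metric to tune.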
Bottleneck: Synchronous processing. Sending a confirmation email during checkout adds 2 seconds to the response.
Stage 5: Async Processing
Move non-critical work to background queues. The user gets an instant response.
graph TB
USER4["Users"]
ALB3["ALB"]
APP2["EC2 Fleet"]
REDIS4["Redis Cache"]
RDS4["RDS"]
subgraph Async["Async Pipeline"]
SQS2["SQS Queue"]
LAMBDA2["Lambda Workers"]
SES["SES (Email)"]
S3_2["S3 (Files)"]
ANALYTICS["Analytics<br/>Pipeline"]
end
USER4 --> ALB3 --> APP2
APP2 --> REDIS4 --> RDS4
APP2 -->|"Fire-and-forget"| SQS2
SQS2 --> LAMBDA2
LAMBDA2 --> SES
LAMBDA2 --> S3_2
LAMBDA2 --> ANALYTICS
style Async fill:#f3e8ff,stroke:#a855f7

What goes async: Confirmation emails, invoice generation, image processing, analytics events, search index updates, notification pushes.
Result: Checkout API responds in 200ms instead of 2.5 seconds. Background work completes within minutes.
Architecture Summary
graph LR
S1["Stage 1<br/>Single Server<br/>~100 users"]
S2["Stage 2<br/>Managed DB<br/>~1K users"]
S3["Stage 3<br/>Load Balanced<br/>~10K users"]
S4["Stage 4<br/>Cached<br/>~100K users"]
S5["Stage 5<br/>Async<br/>~1M users"]
S1 -->|"Separate DB"| S2
S2 -->|"Add LB + ASG"| S3
S3 -->|"Add Redis + CDN"| S4
S4 -->|"Add SQS + Lambda"| S5
style S1 fill:#fee2e2,stroke:#ef4444
style S5 fill:#dcfce7,stroke:#22c55e

Key Insight
You don't start with Stage 5. Each architectural decision adds complexity and cost. You evolve the architecture when a specific bottleneck appears — not before. Premature optimization is the root of all evil in HLD.
Core Concepts You Must Know
Every HLD interview will test your understanding of these fundamental concepts:
CAP Theorem
In a distributed system, you can only guarantee two out of three:
graph TD
C["Consistency<br/>Every read gets the<br/>latest write"]
A["Availability<br/>Every request gets<br/>a response"]
P["Partition Tolerance<br/>System works despite<br/>network failures"]
C --- A
A --- P
P --- C
CP["CP Systems<br/>MongoDB, HBase, Redis<br/>(sacrifice availability)"]
AP["AP Systems<br/>DynamoDB, Cassandra<br/>(sacrifice consistency)"]
CA["CA Systems<br/>Traditional RDBMS<br/>(can't handle partitions)"]
C -.-> CP
P -.-> CP
A -.-> AP
P -.-> AP
C -.-> CA
A -.-> CA
style C fill:#dbeafe,stroke:#3b82f6
style A fill:#dcfce7,stroke:#22c55e
style P fill:#fef3c7,stroke:#f59e0b

In practice, network partitions will happen in any distributed system. So the real choice is between CP (consistent but sometimes unavailable) and AP (always available but sometimes stale).
| Use Case | Choose | Why |
|---|---|---|
| Bank transactions | CP (RDS) | A stale balance could cause overdrafts |
| Social media feed | AP (DynamoDB) | Seeing a post 2 seconds late is fine |
| Shopping cart | AP (DynamoDB) | Availability > consistency for user experience |
| Inventory count | CP (RDS) | Overselling is worse than temporary unavailability |
Horizontal vs Vertical Scaling
graph TB
subgraph Vertical["Vertical Scaling (Scale Up)"]
V1["4 CPU, 16GB RAM<br/>$100/month"]
V2["16 CPU, 64GB RAM<br/>$400/month"]
V3["64 CPU, 256GB RAM<br/>$1,600/month"]
V1 -->|"Upgrade"| V2 -->|"Upgrade"| V3
V4["⚠️ Hardware limit<br/>~448 vCPU max on AWS"]
end
subgraph Horizontal["Horizontal Scaling (Scale Out)"]
H1["Instance 1<br/>4 CPU"]
H2["Instance 2<br/>4 CPU"]
H3["Instance 3<br/>4 CPU"]
H4["Instance N<br/>4 CPU"]
HN["✅ No limit<br/>Add more instances"]
end
style Vertical fill:#fef3c7,stroke:#f59e0b
style Horizontal fill:#dcfce7,stroke:#22c55e

| Aspect | Vertical | Horizontal |
|---|---|---|
| Approach | Bigger machine | More machines |
| Limit | Hardware ceiling | Theoretically unlimited |
| Downtime | Yes (restart to resize) | No (add instances live) |
| Complexity | Low (same architecture) | High (need load balancer, stateless design) |
| Cost curve | Steep at the high end (the largest instances command a premium) | Linear (2x instances ≈ 2x cost) |
| Best for | Databases, legacy apps | Stateless web/API servers |
What You'll Learn in This Series
This series covers HLD from the ground up, with every concept mapped to AWS services.
graph LR
C0["Class 0<br/>What is HLD<br/>(you are here)"]
C1["Class 1<br/>Scalability<br/>Fundamentals"]
C2["Class 2<br/>Load Balancing<br/>& API Design"]
C3["Class 3-5<br/>Databases,<br/>Caching, Queues"]
C4["Class 6-8<br/>Architecture Patterns<br/>(Microservices, CQRS,<br/>Event-Driven)"]
C5["Class 9+<br/>HLD Problems<br/>(URL Shortener,<br/>Chat, News Feed)"]
C0 --> C1 --> C2 --> C3 --> C4 --> C5
style C0 fill:#3b82f6,stroke:#2563eb,color:#fff
style C5 fill:#22c55e,stroke:#16a34a,color:#fff

Scalability Fundamentals (Class 1)
Vertical vs horizontal scaling, stateless services, session management, AWS Auto Scaling Groups, and how to design services that can scale to millions of requests.
Load Balancing & API Design (Class 2)
ALB vs NLB vs API Gateway, routing strategies, rate limiting, API versioning, and REST vs gRPC vs GraphQL trade-offs.
Databases, Caching & Queues (Classes 3–5)
SQL vs NoSQL deep dive, sharding strategies, replication, consistent hashing, Redis patterns, SQS vs Kafka, and when to use each.
Architecture Patterns (Classes 6–8)
Microservices vs monolith, event-driven architecture, CQRS, saga pattern, circuit breaker, and service mesh — each with AWS implementation.
HLD Interview Problems (Classes 9+)
URL Shortener, Chat Application, News Feed, Notification Service, Video Streaming, Ride-Sharing — complete system designs with architecture diagrams, AWS service mappings, capacity estimation, and trade-off analysis.
How to Approach an HLD Interview
graph TD
subgraph Phase1["Minutes 0-5: Requirements"]
A1["Functional: What does the system DO?"]
A2["Non-functional: Scale, latency, availability"]
A3["Constraints: Budget, team size, timeline"]
end
subgraph Phase2["Minutes 5-10: Estimation"]
B1["Users: DAU, peak concurrent"]
B2["Traffic: Reads/sec, writes/sec"]
B3["Storage: Data size, growth rate"]
B4["Bandwidth: Upload/download"]
end
subgraph Phase3["Minutes 10-25: Core Design"]
C1["Draw high-level architecture"]
C2["Define API contracts"]
C3["Choose database + schema"]
C4["Define data flow"]
end
subgraph Phase4["Minutes 25-40: Deep Dive"]
D1["Address bottlenecks"]
D2["Add caching strategy"]
D3["Design for failure"]
D4["Scale specific components"]
end
subgraph Phase5["Minutes 40-45: Trade-offs"]
E1["What are the compromises?"]
E2["What breaks at 100x scale?"]
E3["What would you change with more time?"]
end
Phase1 --> Phase2 --> Phase3 --> Phase4 --> Phase5
style Phase1 fill:#dbeafe,stroke:#3b82f6
style Phase2 fill:#dcfce7,stroke:#22c55e
style Phase3 fill:#fef3c7,stroke:#f59e0b
style Phase4 fill:#f3e8ff,stroke:#a855f7
style Phase5 fill:#fce7f3,stroke:#ec4899

The Interviewer's Secret
HLD interviewers don't care about the "right" answer — there isn't one. They care about your thought process: Do you ask clarifying questions? Do you consider trade-offs? Can you identify bottlenecks? Can you evolve a design as requirements change?
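The estimation phase (minutes 5-10) boils down to back-of-envelope arithmetic. A sketch with made-up example inputs — none of these numbers describe a real system:

```java
// Back-of-envelope capacity estimation for the interview's estimation phase.
// Every input in main() is an assumed example value, not real data.
public class CapacityEstimate {
    static final long SECONDS_PER_DAY = 86_400;

    public static long avgRps(long dau, long requestsPerUserPerDay) {
        return dau * requestsPerUserPerDay / SECONDS_PER_DAY;
    }

    public static double storageGbPerDay(long dau, long requestsPerUserPerDay,
                                         double writeRatio, long bytesPerWrite) {
        return dau * requestsPerUserPerDay * writeRatio * bytesPerWrite / 1e9;
    }

    public static void main(String[] args) {
        long dau = 10_000_000; // assumed daily active users
        long reqPerUser = 20;  // assumed requests per user per day

        long avg = avgRps(dau, reqPerUser);
        long peak = avg * 3;   // common rule of thumb: peak is ~2-3x average

        System.out.println("avg req/sec:  " + avg);
        System.out.println("peak req/sec: " + peak);
        // Assume 10% of requests write ~1 KB each
        System.out.printf("new data/day: %.0f GB%n",
                storageGbPerDay(dau, reqPerUser, 0.10, 1_000));
    }
}
```

The point is not precision — it is deriving orders of magnitude (thousands of req/sec, tens of GB/day) that justify the architecture you draw in the next phase.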
What's Next
In the next class, we'll cover Scalability Fundamentals — vertical vs horizontal scaling, stateless services, session management, and how AWS Auto Scaling Groups work under the hood.