By अजय उपाध्याय
System Design Fundamentals: A Beginner's Guide to Building Scalable Systems

23 March 2026 · 10 min read
Tags: System Design, Software Architecture, Scalability, Backend Development, Distributed Systems, Load Balancing

TL;DR: System design is the practice of deciding how different parts of a software system fit together — where data is stored, how requests flow, and how the system handles growth. This guide covers the essential building blocks: load balancers, caches, databases, message queues, CDNs, and API design. Whether you are preparing for interviews or building real systems, these fundamentals apply everywhere.

What is System Design?

System design is structured problem-solving for software. You break a problem into parts, assign responsibilities to each part, and define how they communicate. Every application you use — Google Search, Netflix, WhatsApp — is a system made up of these fundamental components working together.

Google processes over 8.5 billion searches per day. Netflix serves 260+ million subscribers across 190 countries. These systems did not start complex — they evolved from simple architectures by applying the same principles covered in this guide.

The Building Blocks of System Design

Every scalable system is built from a combination of these core components:

  • Load Balancer — Distributes traffic across multiple servers. Analogy: a receptionist directing patients to available doctors.
  • Cache — Stores frequently accessed data in fast memory. Analogy: a sticky note on your desk with common phone numbers.
  • Database — Persistent storage for application data. Analogy: a filing cabinet with organized records.
  • CDN — Serves static files from servers close to users. Analogy: local branches of a national library.
  • Message Queue — Decouples services by buffering messages. Analogy: a mailbox that holds letters until the recipient reads them.
  • API Gateway — Single entry point for all client requests. Analogy: a front desk that routes visitors to the right department.
  • Reverse Proxy — Sits in front of servers, handling SSL termination and compression. Analogy: a security guard who also carries your bags.

Load Balancers

A load balancer distributes incoming requests across multiple servers so no single server gets overwhelmed.

Why it matters: Without load balancing, a single server handles all traffic. If it crashes, your entire application goes down. With load balancing, traffic is spread across multiple servers — if one fails, the others continue serving users.

Load Balancing Algorithms

  • Round Robin — Sends requests to servers in order (1, 2, 3, 1, 2, 3...)
  • Least Connections — Sends to the server with the fewest active connections
  • IP Hash — Routes the same user to the same server (useful for session persistence)
  • Weighted Round Robin — Assigns more traffic to more powerful servers

For example, an Nginx upstream can combine least connections with server weights, as in the configuration below:
# Nginx load balancer configuration example
upstream backend_servers {
    least_conn;
    server backend1.example.com:3000 weight=3;
    server backend2.example.com:3000 weight=2;
    server backend3.example.com:3000 weight=1;
}
 
server {
    listen 80;
    location / {
        proxy_pass http://backend_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
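
Two of the algorithms above, round robin and least connections, can be sketched in a few lines of TypeScript. This is an illustrative sketch, not a real load balancer: the `Server` shape and the backend names are made up for the example.

```typescript
// Hypothetical server shape for illustration.
interface Server {
  name: string;
  activeConnections: number;
}

// Round Robin: cycle through the servers in order (1, 2, 3, 1, ...).
function makeRoundRobin(servers: Server[]) {
  let index = 0;
  return (): Server => {
    const server = servers[index];
    index = (index + 1) % servers.length;
    return server;
  };
}

// Least Connections: pick the server with the fewest active connections.
function leastConnections(servers: Server[]): Server {
  return servers.reduce((best, s) =>
    s.activeConnections < best.activeConnections ? s : best
  );
}

const pool: Server[] = [
  { name: "backend1", activeConnections: 12 },
  { name: "backend2", activeConnections: 3 },
  { name: "backend3", activeConnections: 7 },
];

const next = makeRoundRobin(pool);
console.log(next().name, next().name, next().name, next().name);
// → backend1 backend2 backend3 backend1
console.log(leastConnections(pool).name); // → backend2
```

Real load balancers also run health checks and remove failed backends from rotation, which is what makes the failover described above work.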

Trade-off: Load balancers add a network hop (slight latency increase) but dramatically improve reliability and throughput.

Caching

Caching stores frequently accessed data in a fast storage layer (usually in-memory) to avoid expensive database queries or API calls.

According to research by Amazon, every 100ms of latency costs 1% in sales. Caching is one of the most effective ways to reduce response times.

Caching Strategies

  • Cache-Aside — The app checks the cache first; on a miss, it fetches from the DB and updates the cache. Best for: general-purpose use (the most common strategy).
  • Write-Through — Every write goes to the cache AND the database simultaneously. Best for: strong consistency needs.
  • Write-Behind — Writes go to the cache first, then asynchronously to the database. Best for: high write throughput.
  • Read-Through — The cache automatically fetches from the DB on a miss. Best for: simplified application code.
// Cache-Aside pattern example with Redis
import Redis from "ioredis";
 
const redis = new Redis();
 
async function getUserProfile(userId: string) {
  // Step 1: Check cache
  const cached = await redis.get(`user:${userId}`);
  if (cached) {
    return JSON.parse(cached);
  }
 
  // Step 2: Cache miss — fetch from database
  // (`db` is a placeholder for your data-access layer or ORM)
  const user = await db.users.findById(userId);
 
  // Step 3: Store in cache with 1-hour expiry
  await redis.set(`user:${userId}`, JSON.stringify(user), "EX", 3600);
 
  return user;
}

Trade-off: Caching improves read performance dramatically but introduces cache invalidation complexity — one of the hardest problems in computer science.
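
To make the invalidation problem concrete, here is a minimal sketch of invalidate-on-write: when the underlying data changes, the cached copy is deleted so the next read repopulates it. A `Map` stands in for Redis, and `fakeDb` is a made-up stand-in for a real database.

```typescript
// In-memory stand-ins for illustration only.
const cache = new Map<string, string>();
const fakeDb = new Map<string, { name: string }>([
  ["user:1", { name: "Ajay" }],
]);

function getUser(id: string) {
  const key = `user:${id}`;
  const cached = cache.get(key);
  if (cached) return JSON.parse(cached); // cache hit
  const user = fakeDb.get(key)!;         // cache miss: read from the "database"
  cache.set(key, JSON.stringify(user));  // populate for next time
  return user;
}

function updateUser(id: string, name: string) {
  const key = `user:${id}`;
  fakeDb.set(key, { name });
  cache.delete(key); // invalidate, so the next read fetches fresh data
}

getUser("1");                   // first read populates the cache
updateUser("1", "Upadhyay");    // write invalidates the stale entry
console.log(getUser("1").name); // → Upadhyay
```

The hard part in real systems is doing this across many cache nodes and many concurrent writers without races, which is why the trade-off above is worth taking seriously.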

Databases: SQL vs NoSQL

Choosing the right database is one of the most impactful system design decisions.

Feature by feature, SQL (PostgreSQL, MySQL) vs NoSQL (MongoDB, DynamoDB):

  • Data Model — SQL: tables with rows and columns. NoSQL: documents, key-value, graphs, or wide-column.
  • Schema — SQL: fixed and enforced. NoSQL: flexible, can vary per document.
  • Relationships — SQL: strong (JOINs, foreign keys). NoSQL: weak (denormalized, embedded documents).
  • Scaling — SQL: vertical (scale up the server). NoSQL: horizontal (add more servers).
  • ACID Compliance — SQL: full ACID support. NoSQL: varies (eventual consistency is common).
  • Best For — SQL: complex queries, transactions, relationships. NoSQL: high throughput, flexible data, rapid iteration.
  • Examples — SQL: banking, e-commerce, ERP systems. NoSQL: social media feeds, IoT data, content management.

The practical answer: Start with SQL (PostgreSQL is an excellent default). Move to NoSQL when you have a specific need — like storing millions of documents with varying structures, or when you need horizontal scaling beyond what a single database server can handle. If you go with MongoDB, make sure to secure your queries against NoSQL injection attacks.
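
To make the relationships difference concrete, here is the same user-and-posts data modeled both ways. The data and the `userWithPosts` function are made up for illustration:

```typescript
// Relational style: normalized tables, linked by a foreign key (userId).
const users = [{ id: 1, name: "Ajay" }];
const posts = [
  { id: 10, userId: 1, title: "System Design 101" },
  { id: 11, userId: 1, title: "Caching Patterns" },
];

// A JOIN reassembles the relationship at query time.
function userWithPosts(userId: number) {
  const user = users.find((u) => u.id === userId)!;
  return { ...user, posts: posts.filter((p) => p.userId === userId) };
}

// Document style: the relationship is embedded (denormalized) at write time.
const userDocument = {
  _id: 1,
  name: "Ajay",
  posts: [{ title: "System Design 101" }, { title: "Caching Patterns" }],
};

console.log(userWithPosts(1).posts.length); // → 2
console.log(userDocument.posts.length);     // → 2
```

The relational version pays a join cost on every read but stores each fact once; the document version reads fast but must keep the embedded copies in sync when data changes.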

Content Delivery Networks (CDNs)

A CDN is a network of servers distributed across the globe that caches and serves static content (images, CSS, JavaScript) from the server closest to the user.

Without CDN: A user in Tokyo requests an image from a server in New York — ~200ms round trip. With CDN: The same image is served from a CDN edge server in Tokyo — ~20ms round trip.

Major CDN providers include Cloudflare (serves ~20% of all web traffic), AWS CloudFront, and Vercel Edge Network (built into Next.js).

Trade-off: CDNs dramatically improve performance for static content but add complexity for dynamic content that changes frequently.

Message Queues

Message queues decouple services by allowing them to communicate asynchronously. Instead of Service A calling Service B directly (and waiting for a response), Service A puts a message in a queue, and Service B processes it when ready.

Without Queue:     User → API → Send Email → Wait... → Response (slow)
With Queue:        User → API → Queue Message → Response (fast)
                                    ↓
                              Email Worker → Send Email (async)

Popular message queues: RabbitMQ, Apache Kafka (handles trillions of messages per day at LinkedIn), Amazon SQS, Redis Pub/Sub.

// Message queue example with BullMQ (Redis-based)
import { Queue, Worker } from "bullmq";
 
// Producer: Add job to queue
const emailQueue = new Queue("emails");
 
// `ContactFormData` and `db` are placeholders for your own type and data layer
type ContactFormData = { email: string; name: string };
 
async function handleContactForm(data: ContactFormData) {
  // Save to database (fast)
  await db.contacts.create(data);
 
  // Queue email sending (non-blocking)
  await emailQueue.add("send-welcome", {
    to: data.email,
    name: data.name,
  });
 
  return { success: true }; // Respond immediately
}
 
// Consumer: Process jobs from queue
const worker = new Worker("emails", async (job) => {
  await sendEmail({
    to: job.data.to,
    subject: `Thanks, ${job.data.name}!`,
    template: "welcome",
  });
});

Trade-off: Message queues improve responsiveness and reliability but add infrastructure complexity and make debugging harder (messages can get lost or duplicated).

API Design

APIs define how different parts of your system (and external clients) communicate. The two dominant styles in 2026:

REST — Resource-based, uses HTTP methods (GET, POST, PUT, DELETE). Widely understood and well-tooled.

GraphQL — Query-based, clients request exactly the data they need. Reduces over-fetching.

// REST API endpoint
// GET /api/users/123
// Returns: { id: 123, name: "Ajay", email: "...", posts: [...] }
 
// GraphQL query — client chooses what it needs
// query {
//   user(id: 123) {
//     name
//     posts { title }
//   }
// }

Practical recommendation: Use REST for most applications. Consider GraphQL when you have complex, nested data requirements and multiple client types (web, mobile, third-party).

The CAP Theorem

The CAP theorem states that a distributed system can guarantee at most two of three properties:

  • Consistency — Every read receives the most recent write
  • Availability — Every request receives a response
  • Partition Tolerance — The system continues working despite network failures

Since network partitions are unavoidable in distributed systems, the real choice is between CP (consistency + partition tolerance) and AP (availability + partition tolerance).

CP systems (e.g., MongoDB with majority write concern, HBase): Prioritize correct data over availability. If a partition happens, some requests may fail.

AP systems (e.g., Cassandra, DynamoDB): Prioritize availability over consistency. Every request gets a response, but data might be slightly stale.

Practical example: A banking system needs CP — you cannot show incorrect balances. A social media feed can use AP — showing a slightly stale feed for a few seconds is acceptable.
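
Many AP systems let you tune this trade-off per request with quorum settings. A common rule of thumb in Cassandra/DynamoDB-style replication: with N replicas, a write acknowledged by W nodes and a read that consults R nodes are guaranteed to overlap on at least one node whenever R + W > N, so the read sees the latest write. A tiny sketch of that check:

```typescript
// Quorum overlap rule: strong consistency requires R + W > N.
function isStronglyConsistent(n: number, w: number, r: number): boolean {
  return r + w > n;
}

console.log(isStronglyConsistent(3, 2, 2)); // → true  (quorum reads and writes)
console.log(isStronglyConsistent(3, 1, 1)); // → false (fast, but reads may be stale)
```

Choosing W = 1 and R = 1 maximizes availability and speed; raising W and R buys consistency at the cost of latency and tolerance to node failures.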

Horizontal vs Vertical Scaling

Vertical Scaling (Scale Up): Add more CPU, RAM, or storage to your existing server. Like upgrading from a sedan to a truck.

Horizontal Scaling (Scale Out): Add more servers. Like adding more sedans to a fleet.

  • Complexity — Vertical: low (upgrade hardware). Horizontal: high (distributed system design).
  • Cost — Vertical: expensive at the high end. Horizontal: cost-effective with commodity hardware.
  • Limit — Vertical: hardware ceiling. Horizontal: virtually unlimited.
  • Downtime — Vertical: often requires a restart. Horizontal: zero downtime possible.
  • Data Consistency — Vertical: easy (single server). Horizontal: complex (distributed state).
  • Best For — Vertical: small-to-medium apps, databases. Horizontal: large-scale web apps, microservices.

Practical advice: Scale vertically first (it is much less complex). Switch to horizontal scaling when you hit the limits of a single machine or need high availability.
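
When you do scale the data tier horizontally, a common first step is hash-based sharding: hash a key and use it to pick a shard. The function names here are illustrative, not a library API:

```typescript
// Simple deterministic string hash for routing (not cryptographic).
function hashString(s: string): number {
  let h = 0;
  for (const ch of s) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

// Route a user to one of `shardCount` database shards.
function shardFor(userId: string, shardCount: number): number {
  return hashString(userId) % shardCount;
}

// The same key always lands on the same shard.
console.log(shardFor("user-42", 4) === shardFor("user-42", 4)); // → true
```

One caveat: naive modulo sharding reshuffles most keys when the shard count changes; consistent hashing is the standard fix for that.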

Putting It All Together

Here is how these components combine in a typical web application architecture:

User → CDN (static files)
     → Load Balancer
         → API Server 1 → Cache (Redis) → Database (Primary)
         → API Server 2 → Cache (Redis) → Database (Replica)
         → API Server 3 → Message Queue → Background Workers
  1. CDN serves static assets (images, CSS, JS)
  2. Load Balancer distributes API requests across servers
  3. API Servers handle business logic
  4. Cache stores hot data to reduce database load
  5. Database with read replicas for scalability
  6. Message Queue handles async tasks (emails, notifications)

FAQ

Do I need to know system design for job interviews?

Yes. According to interviewing.io, system design is the most weighted round in senior engineering interviews at FAANG companies. For junior roles, understanding the fundamentals (this guide) is enough. For senior roles, you need to design systems end-to-end.

Should I start with microservices or a monolith?

Start with a monolith. Microservices add significant operational complexity (deployment, monitoring, inter-service communication). Build a well-structured monolith first, then extract services as needed when you hit specific scaling or team-size bottlenecks.

How much traffic can a single server handle?

A well-optimized Node.js server can handle 10,000-50,000 requests per second for simple API endpoints. A PostgreSQL database can handle 10,000-20,000 queries per second depending on query complexity. These numbers cover the needs of most applications.

What is the best resource to learn system design?

Start with "Designing Data-Intensive Applications" by Martin Kleppmann — it is the gold standard. For interview prep, "System Design Interview" by Alex Xu is excellent. For free resources, the system design roadmap at roadmap.sh is comprehensive.

Resources

  • Designing Data-Intensive Applications — Martin Kleppmann
  • System Design Roadmap — roadmap.sh
  • System Design Primer — GitHub
  • The Complete Guide to System Design in 2026 — DEV Community
  • CAP Theorem Explained — IBM
  • Redis Documentation — Caching Patterns
