Top 30 System Design Interview Questions and Answers for All Levels

Prepare for your next system design interview with these 30 curated questions and answers. The guide progresses from basic concepts for freshers to advanced scenarios for experienced professionals (1-6+ years) and covers conceptual, practical, and scenario-based system design topics.

Basic System Design Questions (1-10)

1. What is High-Level Design (HLD) in system design?

High-Level Design (HLD) provides an overview of the system architecture, including major components like load balancers, application servers, databases, and data flows without implementation details.

2. What is Low-Level Design (LLD) in system design?

Low-Level Design (LLD) focuses on detailed implementation, including class diagrams, data models, APIs, design patterns, and component interactions.

3. Explain functional vs non-functional requirements in system design.

Functional requirements define what the system does (e.g., user login). Non-functional requirements define how it performs (e.g., latency under 200ms, 99.9% availability).

4. What is the role of a load balancer in system design?

A load balancer distributes incoming traffic across multiple application servers to improve scalability and fault tolerance and to prevent overload on any single server.
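
As a rough illustration of the distribution step, here is a minimal round-robin sketch in Python. The class name `RoundRobinBalancer` and the `mark_down`/`mark_up` methods are hypothetical names chosen for this example; real load balancers also weigh connection counts, latency, and active health probes.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation, skipping any marked unhealthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once per call so we never loop forever.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Calling `pick()` repeatedly cycles through the healthy servers, so a server marked down simply stops receiving traffic until it is marked up again.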

5. Differentiate between horizontal and vertical scaling.

Vertical scaling adds resources to a single server (e.g., more CPU/RAM). Horizontal scaling adds more servers and uses load balancers for distribution.

6. What is the CAP Theorem?

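The CAP theorem states that a distributed system cannot simultaneously guarantee Consistency (every read sees the most recent write), Availability (every request receives a non-error response), and Partition tolerance (the system keeps working when messages between nodes are lost or delayed). It is often summarized as "choose 2 of 3", but because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts.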

7. Explain strong consistency vs eventual consistency.

Strong consistency ensures that every read returns the most recent committed write. Eventual consistency allows replicas to disagree temporarily but guarantees that they converge once updates stop, which favors availability and lower latency.

8. What is database sharding?

Sharding partitions data across multiple database instances based on a shard key (e.g., user ID) to improve scalability and performance.
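
A minimal sketch of hash-based shard routing, assuming a fixed shard count. The names `NUM_SHARDS` and `shard_for` are made up for this example. A stable hash (here SHA-256) is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count for illustration


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Note that plain modulo routing reshuffles most keys when `num_shards` changes, which is one reason consistent hashing is often preferred for clusters that grow or shrink.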

9. What is caching and why use it?

Caching stores frequently accessed data in fast memory (e.g., Redis) to reduce database load and improve read latency.
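
The cache-aside pattern behind this answer can be sketched as follows. This is an in-process stand-in, not a Redis client: `LRUCache`, `FakeDatabase`, and `get_user` are hypothetical names, and the database is a plain dictionary that counts reads.

```python
from collections import OrderedDict


class LRUCache:
    """Small in-memory cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry


class FakeDatabase:
    """Stand-in for a slow database; counts how often it is read."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.rows.get(key)


def get_user(key, cache, db):
    """Cache-aside read: try the cache, fall back to the database."""
    value = cache.get(key)
    if value is None:
        value = db.fetch(key)
        if value is not None:
            cache.put(key, value)
    return value
```

Repeated reads of the same key hit the database only once; later reads are served from memory until the entry is evicted.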

10. What is replication in databases?

Replication keeps copies of data on multiple nodes for high availability, read scaling, and fault tolerance, using either a primary-replica (master-slave) or a multi-master topology.

Intermediate System Design Questions (11-20)

11. How do you estimate capacity for a system?

Calculate based on daily active users, requests per user, data size per request, and read/write ratios. Example: 1M users at 10 req/user/day gives 10M req/day, or about 116 req/s on average; provision servers for peak traffic rather than the average.
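
The arithmetic in this example can be written out as a short calculation. The peak factor of 3x and the per-server throughput of 100 req/s are assumptions chosen only to make the numbers concrete; measure both for a real system.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(users, requests_per_user_per_day, peak_factor, rps_per_server):
    """Back-of-the-envelope request rate and server count."""
    requests_per_day = users * requests_per_user_per_day
    average_rps = requests_per_day / SECONDS_PER_DAY
    peak_rps = average_rps * peak_factor
    return {
        "requests_per_day": requests_per_day,
        "average_rps": average_rps,
        "peak_rps": peak_rps,
        "servers": math.ceil(peak_rps / rps_per_server),
    }


# 1M users, 10 requests per user per day; peak factor and server
# throughput are illustrative assumptions, not values from the text.
estimate = estimate_capacity(1_000_000, 10, peak_factor=3.0, rps_per_server=100)
```

Under these assumptions the average load is roughly 116 req/s, the assumed peak is about 347 req/s, and four servers would cover it before adding headroom for failures.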

12. Design the APIs for a URL shortening service like Bitly.

Key APIs: POST /shorten (input: long URL, output: short URL), GET /{shortCode} (responds with an HTTP 301 or 302 redirect to the long URL). Use base62 encoding for short codes.
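
One common way to produce base62 short codes is to encode a unique numeric ID, such as a database sequence value. The sketch below assumes that approach; the alphabet order (digits, lowercase, uppercase) and the function names are choices made for this example.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (the ordering is an arbitrary choice here).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number):
    """Turn a non-negative integer ID into a short code."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code):
    """Recover the integer ID from a short code."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)
    return number
```

Seven base62 characters cover 62^7 (about 3.5 trillion) IDs, which is why short codes of that length are a popular choice.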

13. How would you handle rate limiting in system design?

Use a token bucket or leaky bucket algorithm keyed per user or IP. Store counters in Redis with expiration. Reject excess requests with HTTP 429 (Too Many Requests).
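
A minimal single-process token bucket is sketched below. It keeps state in memory rather than in Redis, so it only illustrates the algorithm; the `TokenBucket` class, its parameters, and the `status_for` helper are hypothetical. The injectable `clock` exists to make the behavior easy to test.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add tokens earned since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


def status_for(bucket):
    """HTTP status a handler might return for one request."""
    return 200 if bucket.allow() else 429
```

In a deployment with several app servers, one bucket per user or IP would live in a shared store so that all servers enforce the same limit.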

14. Explain read path vs write path in a scalable system.

Read path: Client → Load Balancer → App Server → Cache → DB Replica (on a cache miss). Write path: Client → Load Balancer → App Server → Master DB → Replicas (async).

15. What is an API Gateway?

An API Gateway acts as a single entry point for clients, handling authentication, rate limiting, request routing, and aggregation of responses from microservices.

16. How do you design database indexes?

Create indexes on frequently queried columns (e.g., user_id, timestamp). Use composite indexes for common filter combinations. Every index slows writes and uses storage, so monitor write overhead and drop indexes that queries no longer use.
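
To see whether a query actually uses an index, most databases expose the query plan. The sketch below uses SQLite from Python's standard library; the `events` table, its columns, and the index name `idx_events_user_time` are made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT, payload TEXT)"
)
# Composite index matching a common filter: one user, a time range.
conn.execute("CREATE INDEX idx_events_user_time ON events (user_id, created_at)")


def query_plan(sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


indexed_plan = query_plan(
    "SELECT payload FROM events WHERE user_id = ? AND created_at >= ?",
    (42, "2024-01-01"),
)
unindexed_plan = query_plan(
    "SELECT id FROM events WHERE payload = ?", ("hello",)
)
```

The first plan should name the composite index, while the second falls back to a table scan because no index covers `payload`.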

17. What are queues used for in system design?

Queues and message brokers (e.g., Kafka, RabbitMQ) decouple services for asynchronous processing, such as sending notifications or processing uploads. They absorb traffic bursts and let failed work be retried instead of lost.
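
The producer/worker decoupling can be sketched with Python's standard `queue` module standing in for a real broker. The function names and the `None` shutdown sentinel are conventions chosen for this example, not part of any broker API.

```python
import queue
import threading


def worker(tasks, delivered):
    """Consume events until the None sentinel arrives."""
    while True:
        event = tasks.get()
        if event is None:
            tasks.task_done()
            break
        delivered.append(f"sent:{event}")  # stand-in for email/SMS delivery
        tasks.task_done()


def run_demo(events):
    tasks = queue.Queue()
    delivered = []
    thread = threading.Thread(target=worker, args=(tasks, delivered))
    thread.start()
    for event in events:
        tasks.put(event)  # the producer returns immediately
    tasks.put(None)  # tell the worker to stop
    tasks.join()  # wait until every queued item is processed
    thread.join()
    return delivered
```

The producer never waits for delivery to finish, which is the property that lets an API server respond quickly while slow work happens in the background.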

18. Scenario: Design a notification system for Paytm users.

Components: API servers receive events → Pub/Sub pushes them to per-channel queues → Workers process and deliver via email/SMS/push. Use a sharded DB for user preferences.

19. How does consistent hashing work for sharding?

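Consistent hashing maps both keys and nodes onto a hash ring, and each key belongs to the first node found clockwise from its position. When a node is added or removed, only the keys in the neighboring ring segment are reassigned, so data movement stays small.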
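
A compact sketch of a hash ring with virtual nodes follows. The `HashRing` class, the choice of SHA-256, and the default of 100 virtual nodes per server are assumptions made for this example.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Several virtual points per node spread the load more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point at or after the key's position, wrapping around.
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

When a node joins, the only keys that change owner are those that now map to the new node; every other key stays where it was.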

20. What is a CDN and its role?

A Content Delivery Network (CDN) caches static content (images, videos) on edge servers worldwide to reduce latency and origin server load.

Advanced System Design Questions (21-30)

21. Scenario: Scale a messaging system for Zoho to 100M users.

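High-level: Load balancer → Stateless app servers → NoSQL store (sharded by chat ID) for messages. Deliver messages in real time over WebSockets, using pub/sub to route each message to the server that holds the recipient's connection. Cache recent messages.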

22. How do you achieve fault tolerance and disaster recovery?

Use multi-AZ deployment, replication, auto-scaling, health checks, circuit breakers, and regular backups with point-in-time recovery.
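
Of the techniques listed, the circuit breaker is the easiest to show in a few lines. The sketch below is a simplified version: the class name, the thresholds, and the injectable `clock` are assumptions for this example, and production libraries add features such as per-error policies and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when calls are rejected without reaching the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail immediately instead of piling up on a dependency that is already struggling.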

23. Design a file storage system like Dropbox for Salesforce.

Core: Clients upload files in chunks → Metadata in SQL → Blobs in object storage (sharded). Sync via versioned diffs. Deduplicate chunks by content hash so identical data is stored once.
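
Chunking and hash-based deduplication can be sketched as a content-addressed store. The `ChunkStore` class is hypothetical, a dictionary stands in for object storage, and the 4-byte chunk size is deliberately tiny so the behavior is visible; real systems use chunks in the megabyte range.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, for demonstration only


class ChunkStore:
    """Stores each distinct chunk once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blobs = {}  # digest -> chunk bytes (stand-in for object storage)

    def put_file(self, data, chunk_size=CHUNK_SIZE):
        """Split data into chunks and return the ordered list of digests."""
        manifest = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blobs.setdefault(digest, chunk)  # skip chunks already stored
            manifest.append(digest)
        return manifest

    def get_file(self, manifest):
        """Reassemble a file from its manifest."""
        return b"".join(self.blobs[digest] for digest in manifest)
```

The manifest is the kind of record that would live in the metadata database, while the chunks themselves go to blob storage.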

24. Explain leader election in distributed systems.

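Use a consensus algorithm such as Raft or Zab. Nodes vote for a leader; the elected leader handles writes and the other nodes replicate them. Leader failure is detected through missed heartbeats: when a timeout expires, the remaining nodes hold a new election.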

25. How to handle high write throughput?

Strategies: Sharding, write-ahead logging, batching writes, multi-master replication, append-only logs (e.g., Kafka).

26. Scenario: Design a recommendation system for Flipkart.

Offline: Batch ML jobs on user/item data → Store rankings in NoSQL. Online: the API reads candidate rankings from cache → Computes personalized ranks per request. Shard by user.

27. What are design patterns like Factory or Observer used for?

Factory creates objects without the caller naming the concrete class. Observer notifies registered dependents when a subject's state changes. Both reduce coupling and keep a low-level design modular.
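
Both patterns fit in a short sketch. All class and function names below (`EmailNotifier`, `SmsNotifier`, `make_notifier`, `Order`) are invented for this example.

```python
class EmailNotifier:
    def send(self, message):
        return f"email: {message}"


class SmsNotifier:
    def send(self, message):
        return f"sms: {message}"


_NOTIFIERS = {"email": EmailNotifier, "sms": SmsNotifier}


def make_notifier(channel):
    """Factory: callers ask for a channel, not a concrete class."""
    try:
        return _NOTIFIERS[channel]()
    except KeyError:
        raise ValueError(f"unknown channel: {channel}") from None


class Order:
    """Observer subject: notifies subscribers when its status changes."""

    def __init__(self):
        self.status = "created"
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def set_status(self, status):
        self.status = status
        for callback in self._observers:
            callback(status)
```

Adding a new channel means registering one more class in `_NOTIFIERS`, and adding a new reaction to status changes means one more `subscribe` call; neither requires editing `Order`.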

28. How do you optimize for low latency in global systems?

Use CDNs, edge computing, geo-replication, anycast routing, and client-side caching. Place read replicas in each region so reads are served close to the user.

29. Scenario: Design a ride-sharing geolocation service for Swiggy.

Store latitude/longitude in a geospatial index (e.g., S2 cells). Query nearby drivers with a radius search. Push real-time location updates via pub/sub.
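
The radius search itself can be illustrated with a brute-force haversine scan. This is a simplification and not an S2 implementation: a real service would use the geospatial index to narrow the candidates first and apply an exact distance check like this one afterwards. The function names, driver IDs, and coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearby_drivers(drivers, center, radius_km):
    """Return driver IDs within radius_km of center, nearest first."""
    hits = [(haversine_km(center, pos), driver_id) for driver_id, pos in drivers.items()]
    return [driver_id for distance, driver_id in sorted(hits) if distance <= radius_km]


# Hypothetical driver positions as (latitude, longitude).
drivers = {
    "d1": (12.9716, 77.5946),
    "d2": (12.9352, 77.6245),
    "d3": (12.2958, 76.6394),
}
```

Scanning every driver costs O(n) per query, which is exactly the cost a cell-based index avoids by looking only at the cells that overlap the search circle.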

30. Discuss trade-offs in choosing SQL vs NoSQL for an Atlassian-scale system.

SQL: ACID transactions and complex joins (e.g., Jira issues). NoSQL: horizontal scaling and flexible schemas (e.g., chats). Hybrid: SQL for metadata, NoSQL for blobs.