Top 30 System Design Interview Questions and Answers for All Levels

Prepare for your next system design interview with this comprehensive guide. These 30 questions progress from basic concepts to advanced scenarios, helping freshers, 1-3 year experienced professionals, and 3-6 year veterans master scalable system architecture.

Basic System Design Questions (1-10)

1. What is System Design and why is it important?

System Design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It ensures scalability, reliability, and maintainability for real-world applications[1][3].

2. Explain the difference between High-Level Design (HLD) and Low-Level Design (LLD).

HLD provides an overview of system architecture, components, and data flow. LLD details class diagrams, methods, data structures, and interactions between components[3].

3. What are Functional and Non-Functional Requirements in system design?

Functional requirements define what the system should do (features like user registration). Non-functional requirements specify how the system performs (scalability, latency, availability)[4][5].

4. What is the CAP Theorem?

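The CAP Theorem states that a distributed system cannot guarantee all three of the following at once: Consistency (every read sees the most recent write), Availability (every request receives a non-error response), and Partition tolerance (the system keeps operating when the network drops or delays messages between nodes). Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts[1][3].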

5. Differentiate between Horizontal and Vertical Scaling.

Vertical scaling adds more resources to existing servers (more CPU/RAM). Horizontal scaling adds more servers and distributes load across them, better for high traffic[3].

6. What is a Load Balancer and its role in system design?

A Load Balancer distributes incoming traffic across multiple servers to ensure no single server is overwhelmed, improving availability and scalability[1].
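
One of the simplest distribution strategies is round robin. The sketch below is a minimal illustration only: the class name `RoundRobinBalancer` and the backend names are hypothetical, and a real load balancer would also track health checks and connection counts.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order (illustrative only)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def next_backend(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_backend() for _ in range(4)]
```

After three requests the rotation wraps, so the fourth request goes back to the first backend.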

7. Explain Monolithic vs Microservices architecture.

Monolithic: Single deployable unit with all components tightly coupled. Microservices: Independent, loosely coupled services that communicate via APIs, enabling independent scaling[3].

8. What is Caching and when should you use it?

Caching stores frequently accessed data in fast storage (like Redis/Memcached) to reduce database load and improve response times for read-heavy operations[1][3].
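
A common pattern for read-heavy workloads is cache-aside: check the cache first and load from the database only on a miss. The sketch below assumes an in-memory dict standing in for Redis/Memcached; `CacheAside`, `load_from_db`, and the TTL value are hypothetical names chosen for illustration.

```python
import time


class CacheAside:
    """Cache-aside with a per-entry TTL, using a dict in place of Redis."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit
        value = self._load(key)  # cache miss: fall through to the database
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        # Call this after a write so the next read reloads fresh data.
        self._entries.pop(key, None)
```

Repeated reads of the same key reach the loader only once until the entry expires or is invalidated.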

9. What is Database Sharding?

Sharding divides a large database into smaller, distributed partitions (shards) across multiple servers based on a shard key, enabling horizontal scaling[1][3].
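
As a rough illustration, hash-based sharding maps a shard key to a shard number. The function name `shard_for` and the key format are assumptions made for this sketch; range-based and directory-based sharding are equally valid choices.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a cryptographic hash so the result is stable across processes,
    # unlike Python's built-in hash(), which is randomized per run for strings.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for("user:42", 4)
```

Note that plain modulo placement remaps most keys when the number of shards changes, which is the problem consistent hashing (question 16) addresses.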

10. Difference between SQL and NoSQL databases?

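SQL databases (MySQL, PostgreSQL) use structured schemas and support ACID transactions. NoSQL databases (Cassandra, DynamoDB) use flexible schemas and are built to scale horizontally, often trading strong consistency for eventual consistency, although many offer tunable or strongly consistent reads[1].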

Intermediate System Design Questions (11-20)

11. How do you approach a system design interview question?

1. Clarify requirements (functional/non-functional)
2. Define scope and assumptions
3. High-level design with components
4. Deep dive into critical components
5. Discuss trade-offs and scaling[2][4][5].

12. What is an API Gateway?

API Gateway acts as a single entry point for clients, handling authentication, rate limiting, request routing, and aggregating responses from multiple microservices[3].

13. Explain Read Path vs Write Path in system design.

Read Path: Client → Load Balancer → App Servers → Cache, falling through to the Database on a cache miss. Write Path: Client → Load Balancer → App Servers → Database → Cache invalidation[1].

14. What is Database Replication and its types?

Replication creates database copies for redundancy and read scaling. Master-Slave (also called primary-replica): The master handles writes and the slaves serve reads. Master-Master: Both nodes handle reads and writes, so conflicting writes must be resolved[1].

15. Design a URL Shortening Service like Bitly (High-level).

Components: Load Balancer → App Servers → Redis (for counter) → MySQL (URL mappings). Generate a unique numeric ID for each URL from the counter and base62-encode it to produce the short code; because every ID is unique, short codes never collide[6].
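
The base62 step can be sketched as follows. The alphabet order (digits, then lowercase, then uppercase) and the function names are assumptions made for this example; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```python
import string

# 10 digits + 26 lowercase + 26 uppercase = 62 symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the counter value from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

With this alphabet, seven characters cover 62^7 (about 3.5 trillion) distinct IDs.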

16. How does Consistent Hashing work for load distribution?

Consistent Hashing maps data and nodes to a hash ring. When nodes are added/removed, only a subset of keys need remapping, minimizing data movement[3].
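
A minimal sketch of a hash ring with virtual nodes is shown below. `HashRing`, the `vnodes` parameter, and the node names are hypothetical; the lookup rebuilds a position list on every call for clarity rather than speed.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hash ring where each node owns several points (virtual nodes)."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        positions = [pos for pos, _ in self._ring]
        # First point clockwise from the key's hash, wrapping past the end.
        idx = bisect.bisect_right(positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

Removing a node only reassigns the keys that node owned; every other key keeps its current node.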

17. What is Rate Limiting and common algorithms?

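Rate Limiting controls request frequency per user/IP. Algorithms: Token Bucket (tokens refill at a fixed rate and each request spends one, so short bursts up to the bucket size are allowed), Leaky Bucket (requests drain at a constant rate), Fixed Window Counter (requests are counted per fixed time window)[3].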
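
A token bucket can be sketched in a few lines. The class below is a single-process illustration with hypothetical names (`TokenBucket`, `rate_per_sec`, `capacity`); a production limiter would keep one bucket per user or IP in shared storage.

```python
import time


class TokenBucket:
    """Token bucket: refills continuously, spends one token per request."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self._rate = rate_per_sec
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add the tokens earned since the last call, capped at capacity.
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The injectable `clock` is there only so the behaviour can be checked without sleeping.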

18. Explain the role of Message Queues in system design.

Message Queues (RabbitMQ, Kafka) decouple services, handle async processing, provide reliability through retries, and enable load smoothing[1].

19. How would you design file storage like Dropbox (Paytm scenario)?

For Paytm’s file sharing: Clients → Load Balancer → App Servers → CDN (static files) → Object Storage (S3-like) → Metadata DB (MySQL). Use chunked uploads for large files[1][5].
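
Chunked upload can be illustrated by splitting a payload into fixed-size pieces and hashing each one, so that a failed piece can be retried on its own. The function name `split_into_chunks` and the chunk size are assumptions for this sketch; the upload call itself is omitted.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int):
    """Yield (offset, sha256_hex, chunk) for each fixed-size piece of data."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield offset, hashlib.sha256(chunk).hexdigest(), chunk


chunks = list(split_into_chunks(b"abcdefghij", 4))
```

The per-chunk hash lets the server verify each piece and skip chunks it already holds.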

20. What are Circuit Breakers in distributed systems?

Circuit Breakers prevent cascading failures by stopping requests to failing services. States: Closed (normal), Open (fail fast), Half-Open (test recovery)[3].
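
The three states can be sketched as a small wrapper around a function call. The names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are hypothetical, and the sketch is single-threaded.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = "closed"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self._clock() - self._opened_at < self._timeout:
                raise CircuitOpenError("failing fast")
            self.state = "half_open"  # timeout elapsed: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, opens the breaker.
            if self.state == "half_open" or self._failures >= self._threshold:
                self.state = "open"
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self.state = "closed"
        return result
```

A successful call in the half-open state closes the breaker again; a failed one reopens it and restarts the timeout.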

Advanced System Design Questions (21-30)

21. Design a Messaging System like WhatsApp (Zoho scenario).

Components: Load Balancer → WebSocket servers → NoSQL DB (Cassandra) → Message Queue → Push Notification Service. Use pub/sub for real-time delivery[1].

22. How do you handle Database Schema Evolution in production?

Strategies: Backward/forward compatible changes, feature flags, dual writes (old+new schema), gradual migration, blue-green deployment[3].

23. Design a News Feed System (Flipkart recommendations).

For Flipkart: Fan-out on write for active users, fan-out on read for others. Use Redis for ranking, Cassandra for storage, ML service for personalization[1].

24. Explain Leader Election in distributed systems (Atlassian use case).

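Leader Election selects one coordinator node among many. Approaches: the Bully Algorithm, the leader election built into consensus protocols such as Raft, and coordination services such as ZooKeeper (which is a service rather than an algorithm). Used for locking and coordination in Atlassian tools[3].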

25. How to design a system with Strong vs Eventual Consistency?

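Strong consistency: Every read sees the latest write, typically achieved with consensus protocols such as Paxos or Raft, with 2PC used for atomic commits across nodes. Eventual consistency: Replicas converge over time once writes stop (Dynamo, Cassandra). Choose based on use case: for example, strong consistency for account balances and eventual consistency for like counts[1].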

26. Design a Ride-Sharing Service like Uber (Swiggy delivery adaptation).

Swiggy delivery: Driver App → Matching Service → Geospatial Index (GeoHash) → Order Service → Payment → ETA Calculator. Real-time updates via WebSockets[7].
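
To illustrate the geospatial index, the sketch below encodes a latitude/longitude pair as a GeoHash string using the standard bit-interleaving scheme. The function name `geohash_encode` and the default precision are assumptions; nearby points usually share a prefix, which is what makes prefix lookups useful for finding nearby drivers.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a coordinate as a GeoHash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

A shorter hash covers a larger cell, so truncating a GeoHash widens the search area.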

27. What is the Two-Phase Commit (2PC) protocol?

2PC ensures atomicity across distributed transactions: Phase 1 (Prepare – can commit?), Phase 2 (Commit/Rollback). Coordinator manages voting[3].
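
The two phases can be sketched with in-memory participants. `Participant` and `two_phase_commit` are hypothetical names, and the sketch ignores timeouts, crashes, and durable logging, which are the hard parts of a real 2PC implementation.

```python
class Participant:
    """In-memory stand-in for a resource manager taking part in 2PC."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self._can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self._can_commit else "aborted"
        return self._can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:  # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:  # phase 2: roll back everywhere
        p.rollback()
    return False
```

A single "no" vote aborts the transaction on every participant, which is what gives the protocol its atomicity.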

28. How do you estimate capacity for a system (Salesforce scenario)?

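For Salesforce: Estimate DAU, QPS, and storage needs. Example: 1M DAU with 10% active in the peak hour gives 100K active users. At one request per user that is 100,000 / 3,600 ≈ 28 RPS; reaching 28K RPS requires each active user to send about 1,000 requests in that hour. At 1 KB per request, 28K RPS is about 28 MB/s of bandwidth[3][8].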
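
The arithmetic is easier to check when every input is explicit. In the sketch below, the number of requests per active user is an assumption that the estimate does not state; the function name `estimate_load` is hypothetical, and 1 KB is taken as 1,000 bytes.

```python
def estimate_load(dau, active_fraction_per_hour, requests_per_active_user, bytes_per_request):
    """Return (requests per second, bytes per second) for the peak hour."""
    active_users = dau * active_fraction_per_hour
    rps = active_users * requests_per_active_user / 3600  # 3,600 seconds per hour
    return rps, rps * bytes_per_request


# One request per active user in the hour.
light_rps, light_bandwidth = estimate_load(1_000_000, 0.10, 1, 1000)

# About 1,000 requests per active user in the hour (assumed).
heavy_rps, heavy_bandwidth = estimate_load(1_000_000, 0.10, 1000, 1000)
```

The three-orders-of-magnitude gap between the two results shows why the per-user request rate should be stated up front in an interview.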

29. Design a Content Delivery Network (CDN) architecture (Adobe use case).

Adobe CDN: Origin Server → Regional Edge Servers → Client. Use DNS for geo-routing, cache invalidation via purge API, HTTPS termination at edge[3].

30. How would you design a distributed counter for Oracle analytics?

Oracle scenario: Use Redis with Lua scripts for atomic increments, sharding counters across multiple Redis instances, leader election for coordination, periodic merging[1][3].
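
The sharding and merging idea can be sketched without Redis. In the example below a plain list stands in for the separate Redis instances, so it shows only the structure, not atomicity or networking; `ShardedCounter` and `num_shards` are hypothetical names.

```python
import random


class ShardedCounter:
    """Spreads increments across shards and sums them on read."""

    def __init__(self, num_shards=8):
        if num_shards <= 0:
            raise ValueError("num_shards must be positive")
        # Each slot stands in for a counter key on a separate instance.
        self._shards = [0] * num_shards

    def increment(self, amount=1):
        # Picking a random shard spreads write load across instances.
        idx = random.randrange(len(self._shards))
        self._shards[idx] += amount

    def total(self):
        # The merge step: sum every shard to get the global count.
        return sum(self._shards)


counter = ShardedCounter(num_shards=4)
for _ in range(1000):
    counter.increment()
```

Writes stay cheap because each touches one shard; reads pay the cost of summing, which is why totals are often merged periodically rather than on every request.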