Top 30 System Design Interview Questions and Answers for All Levels

Basic System Design Concepts (Questions 1-10)

These foundational questions help freshers and beginners understand core system design principles like scalability, components, and basic trade-offs.

1. What is System Design and why is it important?

System Design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It ensures the system is scalable, reliable, and maintainable under real-world loads.[3]

2. Explain the difference between High-Level Design (HLD) and Low-Level Design (LLD).

HLD provides an overview of system architecture, components, and data flow. LLD focuses on detailed implementation like class diagrams, database schemas, and APIs.[3]

3. What are functional and non-functional requirements in system design?

Functional requirements define what the system should do (e.g., user login). Non-functional requirements cover how the system performs (e.g., latency under 200ms, 99.9% uptime).[4]

4. What is horizontal vs vertical scaling?

Vertical scaling adds more resources to a single server (e.g., more CPU/RAM). Horizontal scaling adds more servers and distributes load using load balancers.[3]

5. What is a Load Balancer and its role in system design?

A Load Balancer distributes incoming traffic across multiple servers to ensure no single server is overwhelmed, improving availability and scalability.[1]
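
As a rough illustration, round-robin is one of the simplest distribution strategies a load balancer can use. The sketch below assumes a fixed list of healthy backends; the class name and the server names are hypothetical, and real load balancers add health checks and weighting on top of this.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through a fixed list of backend servers in order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def next_server(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```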

6. Differentiate between SQL and NoSQL databases.

SQL databases (e.g., MySQL) use structured schemas and ACID transactions for consistency. NoSQL databases (e.g., Cassandra) use flexible or schemaless data models and scale horizontally, often trading strong consistency for eventual consistency.[1]

7. What is Caching and why use it?

Caching stores frequently accessed data in fast memory (e.g., Redis) to reduce database load and improve response times.[1]
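
One common way to apply this is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. The sketch below uses an in-memory dictionary as a stand-in for Redis; `TTLCache`, `read_through`, and the loader function are hypothetical names for illustration, not a real client API.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as Redis, with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def read_through(cache, key, load_from_db):
    """Cache-aside read: try the cache, load from the database on a miss."""
    value = cache.get(key)
    if value is None:  # assumption: None is never a legitimate stored value
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

With a 60-second TTL, repeated reads of the same key inside that window reach the database only once.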

8. Explain the CAP Theorem.

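The CAP Theorem states that a distributed system can guarantee at most two of Consistency, Availability, and Partition tolerance. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency (CP) and availability (AP). Many large-scale systems favor availability and accept eventual consistency.[1]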

9. What is sharding in databases?

Sharding divides a large database into smaller, distributed partitions (shards) based on a shard key to improve scalability and performance.[1]
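
A minimal sketch of hash-based shard selection, assuming a string shard key and a fixed shard count (both the function name and the sample keys are hypothetical). A stable hash from `hashlib` is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


for user_id in ["user-1", "user-2", "user-3"]:
    print(user_id, "-> shard", shard_for(user_id, 4))
```

Note that plain modulo sharding remaps most keys when the shard count changes, which is the problem consistent hashing is meant to reduce.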

10. What is replication in databases?

Replication creates copies of data across multiple database nodes for high availability, fault tolerance, and read scaling (read replicas).[1]

Intermediate System Design Questions (Questions 11-20)

These questions target candidates with 1-3 years of experience, focusing on practical components and trade-offs.

11. How would you design a URL shortening service like Bit.ly?

Use a load balancer → API servers → a counter that issues unique IDs → encode each ID (e.g., in base62) to produce the short code → store the short code to long URL mapping in a database. Cache popular URLs in Redis for fast redirects.[7]
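
One way to turn a counter value into a short code is base62 encoding. This is a sketch under that assumption; the alphabet order and function names are illustrative choices, not something the original answer specifies.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Encode a non-negative counter value as a base62 short code."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Recover the counter value from a short code."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(62))    # 10
print(decode_base62("10"))  # 62
```

Because the counter is unique, the codes are unique too, so no collision check is needed on insert.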

12. Design a notification system for a SaaS platform like Zoho.

Clients send notifications via API → queue (e.g., Pub/Sub) → worker processes deliver via email/SMS/push. Use database for user preferences and retry failed deliveries.[1]

13. What is an API Gateway and when to use it?

API Gateway acts as a single entry point for client requests, handling authentication, rate limiting, and routing to microservices.[3]

14. How do you handle rate limiting in system design?

Use token bucket or leaky bucket algorithms at API Gateway. Track requests per user/IP in Redis with expiration (e.g., 100 req/min).[3]
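
A minimal single-process sketch of the token bucket algorithm follows. In a real deployment the bucket state would typically live in a shared store such as Redis, keyed per user or IP; here it is kept in memory, and the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add tokens earned since the last call, capped at capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Roughly 100 requests per minute, with bursts of up to 100.
limiter = TokenBucket(capacity=100, refill_per_second=100 / 60)
print(limiter.allow())  # True
```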

15. Explain read path vs write path in a scalable system.

Read path: Client → Load Balancer → Cache (hit) or Database replica. Write path: Client → Load Balancer → API Server → Primary Database → Replicate.[1]

16. Design a file storage system like Dropbox for Atlassian users.

Upload via API → chunk files → store in object storage → metadata in SQL database. Use CDN for downloads and sync via WebSockets.[6]

17. What strategies ensure high availability (99.99% uptime)?

Use multi-AZ deployment, auto-scaling groups, health checks, database replication, and circuit breakers for fault isolation.[2]

18. How do you estimate capacity for a system?

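Calculate daily active users × operations/user to get daily volume, divide by 86,400 seconds for the average rate, then multiply by a peak factor. Example: 1M DAU × 10 ops/user = 10M ops/day ≈ 116 req/sec on average; with a 2x peak factor, plan for about 231 req/sec.[3]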
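
The same back-of-the-envelope estimate can be written as a small helper, which makes it easy to try other inputs. The function name is hypothetical, and the model assumes load is spread evenly across the day before the peak factor is applied.

```python
SECONDS_PER_DAY = 86_400


def estimate_rps(dau, ops_per_user, peak_factor):
    """Return (average, peak) requests per second for the given load."""
    average = dau * ops_per_user / SECONDS_PER_DAY
    return average, average * peak_factor


avg_rps, peak_rps = estimate_rps(dau=1_000_000, ops_per_user=10, peak_factor=2)
print(f"average: {avg_rps:.1f} req/sec, peak: {peak_rps:.1f} req/sec")
```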

19. What is eventual consistency vs strong consistency?

Strong consistency ensures every read sees the latest committed write. Eventual consistency allows replicas to be temporarily out of sync, in exchange for higher availability and lower latency, and guarantees they converge once writes stop.[1]

20. Design a simple e-commerce checkout flow for Flipkart.

Cart API → inventory check (optimistic locking) → payment gateway → order service → queue for fulfillment. Use saga pattern for distributed transactions.[2]
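
The optimistic locking step in the inventory check can be sketched as a versioned compare-and-set with retries. The in-memory `InventoryStore` below is a hypothetical stand-in for a database table with a `version` column; in SQL this would usually be an `UPDATE ... WHERE version = ?` that reports whether a row was changed.

```python
class InventoryStore:
    """In-memory stand-in for a table with (sku, quantity, version)."""

    def __init__(self):
        self._rows = {}  # sku -> (quantity, version)

    def put(self, sku, quantity):
        self._rows[sku] = (quantity, 0)

    def read(self, sku):
        return self._rows[sku]

    def compare_and_set(self, sku, new_quantity, expected_version):
        quantity, version = self._rows[sku]
        if version != expected_version:
            return False  # someone else updated the row first
        self._rows[sku] = (new_quantity, version + 1)
        return True


def reserve(store, sku, amount, max_retries=3):
    """Reserve stock without holding a lock; retry on version conflicts."""
    for _ in range(max_retries):
        quantity, version = store.read(sku)
        if quantity < amount:
            return False  # not enough stock
        if store.compare_and_set(sku, quantity - amount, version):
            return True
    raise RuntimeError("too much contention, giving up")


store = InventoryStore()
store.put("sku-42", 5)
print(reserve(store, "sku-42", 3))  # True
print(reserve(store, "sku-42", 3))  # False
```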

Advanced System Design Scenarios (Questions 21-30)

These scenario-based questions are for candidates with 3-6 years of experience, testing deep trade-offs and real-world constraints.

21. Design a ride-sharing system like Uber at Swiggy scale.

Match riders/drivers via geohashing → WebSocket for real-time location → sharded SQL for trips → pub/sub for notifications. Handle surges with dynamic pricing.[7]

22. How would you scale a social media feed like Instagram for Adobe?

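Use fan-out on write (push model) for typical users: each new post is written to followers' precomputed feeds (NoSQL). Use fan-out on read (pull model) for celebrities with millions of followers, merging their posts at read time to avoid write amplification. Cache feeds, use a CDN for images, and use a timeline service for ranking.[1]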

23. Design a real-time chat system for Paytm users.

WebSocket connections → message broker (Kafka) → sharded NoSQL per chat room → read replicas. Handle offline delivery via queues.[1]

24. What is consistent hashing and why use it for sharding?

Consistent hashing maps data/keys to nodes on a ring, minimizing data movement when nodes are added/removed. Used in Cassandra for dynamic scaling.[3]
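
A minimal sketch of a hash ring with virtual nodes is shown below. The class name, the `vnodes` count, and the choice of SHA-256 are assumptions for illustration; this is not how Cassandra implements its token ring.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Place a string on the ring using a stable hash."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str):
        # Each node gets several virtual positions to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove_node(self, node: str):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}
ring.add_node("node-d")
moved = [k for k in keys if ring.node_for(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

Only keys that now belong to the new node change owner; every other key stays where it was.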

25. Design a recommendation engine for Salesforce.

Offline: ML batch jobs on user data → store vectors in vector DB. Online: API queries nearest neighbors → rank and cache results.[2]

26. How to handle database hotspots in a high-traffic system?

Identify via monitoring → use composite shard keys → cache hotspots → denormalize data → read replicas for hot reads.[1]

27. Design a video streaming service like Netflix for Oracle scale.

Adaptive bitrate → CDN edge caches → origin servers with peer-to-peer assists. Metadata in SQL, video chunks in object storage.[2]

28. Explain circuit breaker pattern with an example.

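In microservices, if a downstream service fails repeatedly, the circuit opens so that callers fail fast and use a fallback instead of waiting on timeouts. After a cool-down period, a trial request is allowed through to check whether the service has recovered. Example: when calls to the payment service keep failing, route requests to a queue for later processing.[3]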
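
The pattern can be sketched as a small wrapper around the downstream call. The thresholds, class names, and the single-trial half-open behavior below are assumptions for illustration; production libraries usually add per-error policies and thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            half_open = self._opened_at is not None
            if half_open or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open)
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would catch `CircuitOpenError` and apply the fallback, such as enqueueing the payment request for later processing.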

29. How to design a global distributed system with low latency?

Multi-region deployment → anycast DNS → regional data centers → geo-replication → client-side routing to nearest region.[1]

30. Scenario: Your messaging system at SAP handles 1B messages/day but spikes to 10x. How to scale?

Add auto-scaling workers → partition queues by user ID → use multi-master replication → throttle non-critical notifications → monitor bottlenecks with distributed tracing.[2]
