Basic System Design Concepts (Questions 1-10)
These foundational questions help freshers and beginners understand core system design principles like scalability, components, and basic trade-offs.
1. What is System Design and why is it important?
System Design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It ensures the system is scalable, reliable, and maintainable under real-world loads.[3]
2. Explain the difference between High-Level Design (HLD) and Low-Level Design (LLD).
HLD provides an overview of system architecture, components, and data flow. LLD focuses on detailed implementation like class diagrams, database schemas, and APIs.[3]
3. What are functional and non-functional requirements in system design?
Functional requirements define what the system should do (e.g., user login). Non-functional requirements cover how the system performs (e.g., latency under 200ms, 99.9% uptime).[4]
4. What is horizontal vs vertical scaling?
Vertical scaling adds more resources to a single server (e.g., more CPU/RAM). Horizontal scaling adds more servers and distributes load using load balancers.[3]
5. What is a Load Balancer and its role in system design?
A Load Balancer distributes incoming traffic across multiple servers to ensure no single server is overwhelmed, improving availability and scalability.[1]
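One common distribution strategy is round robin. The sketch below is a minimal in-process illustration, not how any particular load balancer product is implemented; the class name `RoundRobinBalancer` and the backend names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._cycle = itertools.cycle(self.backends)

    def mark_down(self, backend):
        # Typically driven by a failed health check.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_four = [lb.next_backend() for _ in range(4)]
```

Real load balancers usually offer other strategies as well, such as least connections or weighted routing.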
6. Differentiate between SQL and NoSQL databases.
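SQL databases (e.g., MySQL) enforce a fixed schema and support ACID transactions, which favors consistency. NoSQL databases (e.g., Cassandra) use flexible data models (key-value, document, wide-column, graph) and are typically built to scale horizontally, often trading strong consistency for availability through eventual consistency.[1]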
7. What is Caching and why use it?
Caching stores frequently accessed data in fast memory (e.g., Redis) to reduce database load and improve response times.[1]
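A common way to apply this is the cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache. The sketch below uses an in-memory dictionary with a time-to-live as a stand-in for a cache such as Redis; `TTLCache`, `get_user`, and `load_from_db` are hypothetical names.

```python
import time


class TTLCache:
    """In-memory stand-in for an external cache, with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(user_id, value)
    return value
```

The TTL bounds how stale a cached entry can get. Writes still need a strategy of their own, such as invalidating the key when the underlying row changes.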
8. Explain the CAP Theorem.
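The CAP Theorem states that a distributed system can guarantee at most two of Consistency, Availability, and Partition tolerance. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency (CP) and availability (AP). Many large-scale systems choose AP and accept eventual consistency, while systems that cannot tolerate stale data choose CP.[1]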
9. What is sharding in databases?
Sharding divides a large database into smaller, distributed partitions (shards) based on a shard key to improve scalability and performance.[1]
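As a rough illustration of routing by shard key, the sketch below hashes the key and takes it modulo the shard count. The function name `shard_for` and the key format are hypothetical, and real systems may use range-based or directory-based sharding instead.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index in [0, num_shards)."""
    # Use a stable hash: Python's built-in hash() is salted per process,
    # so it would route the same key differently after a restart.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for("user:42", num_shards=4)
```

The weakness of plain modulo routing is that changing `num_shards` remaps most keys. Consistent hashing (question 24) is the usual way to limit that movement.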
10. What is replication in databases?
Replication creates copies of data across multiple database nodes for high availability, fault tolerance, and read scaling (read replicas).[1]
Intermediate System Design Questions (Questions 11-20)
These questions target candidates with 1-3 years of experience, focusing on practical components and trade-offs.
11. How would you design a URL shortening service like Bit.ly?
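Use a load balancer → API servers → a counter that generates unique IDs → encode each ID (e.g., in base62) to form the short code → store the short code to long URL mapping in a database. Hashing the long URL is an alternative way to derive the code, but it requires collision handling. Cache popular URLs in Redis for fast redirects.[7]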
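One way to turn a counter value into a short code is base62 encoding. This is a minimal sketch under that assumption; the alphabet ordering and the function names are illustrative choices, not a description of how Bit.ly works.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_base62(n: int) -> str:
    """Encode a non-negative counter value as a base62 short code."""
    if n < 0:
        raise ValueError("counter value must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Recover the counter value from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


code = encode_base62(125)
```

Seven base62 characters cover 62^7 (about 3.5 trillion) distinct IDs, which is why short codes of that length are usually sufficient.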
12. Design a notification system for a SaaS platform like Zoho.
Clients send notifications via API → queue (e.g., Pub/Sub) → worker processes deliver via email/SMS/push. Use database for user preferences and retry failed deliveries.[1]
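The retry step can be sketched as exponential backoff with jitter, so that failed deliveries do not all retry at the same moment. The function name `deliver_with_retry`, its parameters, and the default delays are assumptions for illustration; `send` stands in for whatever email, SMS, or push client a worker uses.

```python
import random
import time


def deliver_with_retry(send, message, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call send(message) until it succeeds or attempts run out.

    Returns True on success and False if every attempt failed, at which
    point a caller might park the message for manual inspection.
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:
            if attempt == max_attempts - 1:
                break  # no point sleeping after the final attempt
            # The delay cap doubles each attempt; the actual wait is a
            # random value below the cap ("full jitter").
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting.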
13. What is an API Gateway and when to use it?
API Gateway acts as a single entry point for client requests, handling authentication, rate limiting, and routing to microservices.[3]
14. How do you handle rate limiting in system design?
Use token bucket or leaky bucket algorithms at API Gateway. Track requests per user/IP in Redis with expiration (e.g., 100 req/min).[3]
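A token bucket can be sketched in a few lines. This version keeps state in process memory for a single client; a production setup as described above would keep one bucket per user or IP in a shared store such as Redis. The class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For the 100 req/min example, one option is `TokenBucket(capacity=100, refill_per_second=100 / 60)`, which permits a burst of 100 requests and then roughly 1.67 requests per second.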
15. Explain read path vs write path in a scalable system.
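Read path: Client → Load Balancer → API Server → Cache (on a hit) or Database replica (on a miss). Write path: Client → Load Balancer → API Server → Primary Database → replication to replicas, with the affected cache entries invalidated or updated.[1]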
16. Design a file storage system like Dropbox for Atlassian users.
Upload via API → chunk files → store in object storage → metadata in SQL database. Use CDN for downloads and sync via WebSockets.[6]
17. What strategies ensure high availability (99.99% uptime)?
Use multi-AZ deployment, auto-scaling groups, health checks, database replication, and circuit breakers for fault isolation.[2]
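To make an uptime target concrete, it helps to convert it into a downtime budget and to see how redundancy changes the figure. The helper names below are hypothetical, and the redundancy formula assumes instances fail independently, which real deployments only approximate.

```python
def downtime_budget_minutes(availability_percent, days=365):
    """Minutes of allowed downtime over `days` for a given uptime target."""
    return (1 - availability_percent / 100) * days * 24 * 60


def parallel_availability(instance_availability, replicas):
    """Availability of N redundant instances, assuming independent failures.

    The system is down only when every replica is down at once.
    """
    return 1 - (1 - instance_availability) ** replicas


budget = downtime_budget_minutes(99.99)
combined = parallel_availability(0.99, replicas=2)
```

A 99.99% target leaves about 52.6 minutes of downtime per year, and two independent instances at 99% each combine to 99.99%. That is the arithmetic behind deploying across multiple availability zones.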
18. How do you estimate capacity for a system?
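Multiply daily active users × operations per user to get the daily volume, divide by 86,400 seconds for the average rate, then multiply by a peak factor. Example: 1M DAU × 10 ops/user = 10M ops/day ≈ 116 req/sec on average; with a 2x peak factor, plan for about 231 req/sec.[3]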
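The same estimate can be scripted so that each assumption stays explicit and easy to change. This is a back-of-the-envelope sketch; the function name `estimate_rps` is illustrative, and the inputs are the figures from the example (1M DAU, 10 ops/user, 2x peak).

```python
SECONDS_PER_DAY = 86_400


def estimate_rps(daily_active_users, ops_per_user, peak_factor):
    """Return (average, peak) requests per second."""
    ops_per_day = daily_active_users * ops_per_user
    average_rps = ops_per_day / SECONDS_PER_DAY
    # The peak factor scales the rate, not the daily total.
    peak_rps = average_rps * peak_factor
    return average_rps, peak_rps


average, peak = estimate_rps(1_000_000, 10, peak_factor=2)
print(f"average: {average:.0f} req/s, peak: {peak:.0f} req/s")
```

Traffic is rarely spread evenly across the day, so size capacity against the peak figure, not the average.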
19. What is eventual consistency vs strong consistency?
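Strong consistency ensures that every read returns the most recent committed write. Eventual consistency allows replicas to be temporarily out of date, with the guarantee that they converge once updates stop, in exchange for higher availability and lower latency.[1]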
20. Design a simple e-commerce checkout flow for Flipkart.
Cart API → inventory check (optimistic locking) → payment gateway → order service → queue for fulfillment. Use saga pattern for distributed transactions.[2]
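The optimistic-locking step in the inventory check can be illustrated with a version number on each row: an update succeeds only if the version is unchanged since it was read. The in-memory `Inventory` class below is a hypothetical stand-in that shows the check only. In a relational database the comparison and the update would typically be a single conditional UPDATE statement so that they happen atomically.

```python
class StaleVersionError(Exception):
    """Raised when the row changed after the caller read it."""


class Inventory:
    def __init__(self):
        self._rows = {}  # sku -> {"stock": int, "version": int}

    def add(self, sku, stock):
        self._rows[sku] = {"stock": stock, "version": 1}

    def read(self, sku):
        row = self._rows[sku]
        return row["stock"], row["version"]

    def reserve(self, sku, quantity, expected_version):
        row = self._rows[sku]
        # Reject the write if someone else updated the row first.
        if row["version"] != expected_version:
            raise StaleVersionError(sku)
        if row["stock"] < quantity:
            raise ValueError("insufficient stock")
        row["stock"] -= quantity
        row["version"] += 1


inventory = Inventory()
inventory.add("sku-1", stock=1)
_, version_seen_by_a = inventory.read("sku-1")
_, version_seen_by_b = inventory.read("sku-1")
inventory.reserve("sku-1", 1, version_seen_by_a)  # first buyer wins
```

A caller that hits `StaleVersionError` re-reads the row and retries, which prevents two checkouts from both claiming the last unit.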
Advanced System Design Scenarios (Questions 21-30)
These scenario-based questions are for candidates with 3-6 years of experience, testing deep trade-offs and real-world constraints.
21. Design a ride-sharing system like Uber at Swiggy scale.
Match riders/drivers via geohashing → WebSocket for real-time location → sharded SQL for trips → pub/sub for notifications. Handle surges with dynamic pricing.[7]
22. How would you scale a social media feed like Instagram for Adobe?
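Use fan-out on write (push model) for ordinary users: when they post, write the post into each follower's precomputed feed (NoSQL). Use fan-out on read (pull model) for celebrities, whose posts are merged into feeds at read time, to avoid millions of writes per post. Cache feeds, use a CDN for images, and use a timeline service for ranking.[1]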
23. Design a real-time chat system for Paytm users.
WebSocket connections → message broker (Kafka) → sharded NoSQL per chat room → read replicas. Handle offline delivery via queues.[1]
24. What is consistent hashing and why use it for sharding?
Consistent hashing maps data/keys to nodes on a ring, minimizing data movement when nodes are added/removed. Used in Cassandra for dynamic scaling.[3]
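The ring can be sketched with a sorted list and binary search. This is a simplified illustration, not Cassandra's implementation; the class name `HashRing`, the virtual-node count, and the choice of SHA-256 are assumptions.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Stable 64-bit position on the ring for a node label or key."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per server smooth the spread
        self._ring = []       # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring entry at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_position(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

After `node-d` joins, only the keys it takes over change owner, and every moved key lands on the new node. With modulo-based sharding, most keys would move instead.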
25. Design a recommendation engine for Salesforce.
Offline: ML batch jobs on user data → store vectors in vector DB. Online: API queries nearest neighbors → rank and cache results.[2]
26. How to handle database hotspots in a high-traffic system?
Identify via monitoring → use composite shard keys → cache hotspots → denormalize data → read replicas for hot reads.[1]
27. Design a video streaming service like Netflix for Oracle scale.
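Encode each video at multiple bitrates for adaptive bitrate streaming → serve chunks from CDN edge caches → fall back to origin servers on a cache miss, optionally with peer-to-peer assistance. Keep metadata in SQL and video chunks in object storage.[2]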
28. Explain circuit breaker pattern with an example.
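In microservices, if a downstream service fails repeatedly, the circuit opens so that callers fail fast and use a fallback instead of waiting on timeouts. After a cool-down period, a trial request is let through, and the circuit closes again if it succeeds. Example: calls to the payment service keep failing → the circuit opens and payment requests are routed to a queue for later processing.[3]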
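The pattern can be sketched as a small state machine wrapped around the downstream call. This is a minimal single-threaded illustration; the class name `CircuitBreaker`, the threshold, and the timeout values are assumptions, and production libraries add locking and metrics on top.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                return fallback()  # open: fail fast, skip the real call
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open)
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In the payment example, `func` would be the call to the payment service and `fallback` would enqueue the request for later processing.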
29. How to design a global distributed system with low latency?
Multi-region deployment → anycast DNS → regional data centers → geo-replication → client-side routing to nearest region.[1]
30. Scenario: Your messaging system at SAP handles 1B messages/day but spikes to 10x. How to scale?
Add auto-scaling workers → partition queues by user ID so consumers can scale in parallel → use multi-master replication for write throughput → throttle non-critical notifications during the spike → monitor bottlenecks with distributed tracing.[2]