System Design
1. What is System Design?
System Design is the process of planning how different parts of a system (like
databases, servers, and user interfaces) will work together to meet the system’s
goals. It focuses on creating a structure that is efficient, scalable, reliable, and easy to
maintain.
2. Horizontal vs. Vertical Scaling?
- Horizontal Scaling: Adding more machines or servers to distribute the load. For
example, adding more servers to handle more users.
- Vertical Scaling: Upgrading a single machine by adding more resources (like CPU or
RAM) to handle more load.
Horizontal is better for handling a growing number of users, while vertical is limited
by the capacity of one machine.
3. What is Capacity Estimation?
Capacity Estimation is the process of determining the resources (like servers, storage,
bandwidth) needed for a system to handle expected traffic or load. It helps in
planning the system's size and scaling requirements to ensure smooth performance
without overloading.
4. What is HTTP?
HTTP (Hypertext Transfer Protocol) is a communication protocol used for transferring
data between a web browser (client) and a web server. It is the foundation of data
exchange on the web, enabling the loading of web pages, images, videos, etc. HTTP
works by sending requests from the client to the server and receiving responses with
the requested data.
5. What is the Internet TCP/IP stack?
The Internet TCP/IP stack is a set of communication protocols used to connect
devices over the internet. It has four layers:
1. Application Layer: Handles high-level protocols like HTTP, FTP, and SMTP (e.g., web
browsing, email).
2. Transport Layer: Manages data transfer between devices, using protocols like TCP
(reliable) and UDP (fast but less reliable).
3. Internet Layer: Routes data between devices using IP addresses (e.g., IPv4, IPv6).
4. Link Layer: Handles the physical connection between devices (e.g., Ethernet, Wi-Fi).
Each layer serves a specific role to ensure smooth data communication.
6. What happens when you enter Google.com?
When you enter google.com in your browser, the following steps occur:
1. DNS Resolution: The browser translates google.com into an IP address using a
Domain Name System (DNS) server.
2. Establish Connection: The browser establishes a connection to the server using the
IP address. This typically involves a TCP handshake.
3. Send HTTP Request: The browser sends an HTTP request to the server, asking for
the Google homepage.
4. Server Response: The server processes the request and sends back the HTML, CSS,
JavaScript, and other resources needed to render the page.
5. Rendering the Page: The browser receives the data and renders the Google
homepage, displaying it to you.
6. Additional Requests: If the page includes images or scripts, the browser makes
additional requests to fetch those resources.
This entire process happens very quickly, typically in well under a second.
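The first and third steps above can be sketched in Python with only the standard library: `socket` performs the DNS lookup, and the raw HTTP/1.1 request text is what the browser sends over the connection. The hostname and path below are illustrative.

```python
import socket

def resolve(hostname):
    # Step 1: DNS resolution - translate a hostname into an IP address.
    return socket.gethostbyname(hostname)

def build_http_request(hostname, path="/"):
    # Step 3: the plain-text HTTP/1.1 request the browser would send
    # over the TCP connection established in step 2.
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
```

In practice the browser also negotiates TLS (for https) before sending the request, which this sketch omits.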
7. What are Relational Databases?
Relational databases are structured databases that store data in tables with rows and
columns. Each table represents a different entity, and relationships between tables are
established using foreign keys. They use Structured Query Language (SQL) for
querying and managing data. Examples include MySQL, PostgreSQL, and Oracle
Database. Relational databases ensure data integrity and support complex queries.
8. What are Database Indexes?
Database indexes are data structures that improve the speed of data retrieval
operations on a database table. They work like a book's index, allowing the database
to find rows more quickly without scanning the entire table. Indexes can be created
on one or more columns and help optimize queries by reducing the amount of data
the database needs to process. However, they can slow down write operations (like
inserts, updates, and deletes) since the index also needs to be updated.
9. What are NoSQL databases?
NoSQL databases are non-relational databases designed to handle large volumes of
unstructured or semi-structured data. They provide flexible schemas, allowing for
varied data types and structures. NoSQL databases are typically scalable and can
distribute data across multiple servers. Common types include document stores (e.g.,
MongoDB), key-value stores (e.g., Redis), column-family stores (e.g., Cassandra), and
graph databases (e.g., Neo4j). They are often used in big data applications, real-time
web apps, and scenarios requiring high availability and scalability.
10. What is a Cache?
A cache is a temporary storage layer that holds frequently accessed data to speed up
retrieval. By storing copies of data closer to the processing unit, it reduces the time
needed to fetch data from slower storage (like databases). Caches can be used in
various contexts, such as CPU caches, web caches (for storing web pages), and
application-level caches (like Redis or Memcached). Using a cache improves
performance and reduces load on underlying data sources.
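A minimal sketch of the cache-aside pattern in Python: check the cache first, fall back to the slow data source on a miss, then store the result for next time. The loader function stands in for something slow, such as a database query.

```python
class CacheAside:
    """Cache-aside: consult the cache before the underlying data source."""

    def __init__(self, loader):
        self.loader = loader   # function that fetches from the slow store
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = self.loader(key)   # slow path, e.g. a database query
        self.store[key] = value    # populate the cache for next time
        return value
```

Real caches (Redis, Memcached) add expiry and eviction on top of this basic idea.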
11. What is Thrashing?
Thrashing is a situation in a computer system where excessive paging or swapping
occurs between main memory and disk storage. It happens when a system spends
more time managing memory than executing processes, leading to significantly
decreased performance. This usually occurs when too many processes are competing
for limited memory resources, causing the operating system to constantly swap
pages in and out of memory instead of executing instructions. As a result, the system
becomes slow and unresponsive.
12. What are Threads?
Threads are the smallest units of processing that can be scheduled and executed by
an operating system. They allow multiple sequences of instructions to run
concurrently within a single process, sharing the same memory space. This enables
efficient multitasking and improves the performance of applications, especially on
multi-core processors. Threads can be used for tasks like handling user input,
performing background operations, or managing network communications, allowing
for a more responsive user experience.
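A small Python sketch of threads running concurrently within one process: because they share the same memory space, a lock is needed to coordinate updates to shared state.

```python
import threading

def worker(n, results, lock):
    # Each thread computes independently, then takes the lock to update
    # shared state, since all threads share the process's memory.
    value = n * n
    with lock:
        results.append(value)

lock = threading.Lock()
results = []
threads = [threading.Thread(target=worker, args=(n, results, lock))
           for n in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for all threads to finish
```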
13. What is Load Balancing?
Load balancing is the method of distributing network traffic or application requests
across multiple servers to ensure no single server is overwhelmed. This helps
improve system performance, reliability, and availability. Load balancers can direct
traffic based on factors like server capacity, response time, or proximity to the user.
By spreading the workload, load balancing enhances user experience and minimizes
downtime, making it essential for large-scale web applications and services.
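One of the simplest balancing strategies, round-robin, can be sketched in a few lines of Python; the server names are placeholders.

```python
import itertools

class RoundRobinBalancer:
    """Send each new request to the next server in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)
```

Production load balancers layer health checks, weighting, and session affinity on top of a rotation like this.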
14. What is Consistent Hashing?
Consistent hashing is a technique used in distributed systems to efficiently distribute
data across a dynamic set of nodes (servers). It allows for minimal disruption when
nodes are added or removed. In consistent hashing, both the nodes and the data are
mapped to a fixed-size circular space (often called a hash ring). When a new node is
added, only a subset of data items needs to be redistributed, reducing the impact on
the overall system.
This method is commonly used in distributed caching and data storage systems to
ensure that data is evenly distributed and to maintain performance even as the
system scales.
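A minimal hash ring in Python, assuming MD5 as the hash function and 100 virtual nodes per server (both are illustrative choices, not requirements of the technique):

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Map each node to many points on the ring (virtual nodes)
        # so that data spreads evenly across nodes.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise to the first node point at or after the key's hash.
        idx = bisect.bisect(self.keys, self._hash(key)) % len(self.keys)
        return self.ring[idx][1]
```

Adding or removing a node only moves the keys that fall between that node's points and its neighbors', which is the "minimal disruption" property described above.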
15. What is Sharding?
Sharding is a database partitioning technique that divides a large dataset into
smaller, more manageable pieces called shards. Each shard is stored on a separate
server or database instance, allowing for improved performance, scalability, and
availability. Sharding helps distribute the load across multiple servers, reducing
bottlenecks and enabling the system to handle more requests simultaneously.
Sharding can be done based on various criteria, such as range (dividing data by
value), hash (using a hash function to assign data), or directory-based (using a lookup
table to determine shard location). This technique is commonly used in large-scale
applications to optimize data storage and retrieval.
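Hash-based sharding in particular can be sketched as a single Python function; the key format and shard count below are assumptions for illustration.

```python
import hashlib

def shard_for(key, num_shards):
    # Hash-based sharding: a stable hash of the key picks the shard,
    # so the same key always lands on the same shard.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_shards
```

Note that changing num_shards remaps most keys, which is exactly the problem consistent hashing (above) is designed to avoid.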
16. What are Bloom Filters?
Bloom filters are probabilistic data structures used to test whether an element is a
member of a set. They are efficient in terms of space and time, allowing for quick
membership checks. A Bloom filter can yield false positives (indicating an element is
in the set when it isn't) but never false negatives (indicating an element isn't in the
set when it is). They use multiple hash functions to map elements to a fixed-size bit
array, setting bits to 1 for added elements. Bloom filters are commonly used in
applications like caching, databases, and network systems.
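A toy Bloom filter in Python, using a salted SHA-256 digest in place of truly independent hash functions (an illustrative simplification):

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, item):
        # Derive several bit positions from one digest by salting the input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))
```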
17. What is Data Replication?
Data replication is the process of copying and maintaining data in multiple locations
or systems to ensure consistency, availability, and reliability. It helps protect against
data loss, improves access speed by allowing users to retrieve data from the nearest
location, and provides redundancy in case of system failures. There are different
types of data replication, including:
1. Synchronous Replication: Data is copied to the secondary location at the same
time as it is written to the primary location, ensuring real-time consistency.
2. Asynchronous Replication: Data is copied after it has been written to the primary
location, which may lead to slight delays in consistency but improves performance.
Data replication is commonly used in distributed databases, cloud storage, and
backup systems.
18. How are NoSQL databases optimized?
NoSQL databases are optimized through various techniques to enhance
performance, scalability, and flexibility. Key optimization strategies include:
1. Schema Flexibility: NoSQL databases often use a flexible schema, allowing for the
storage of different data types and structures without predefined schemas, making it
easier to adapt to changing data needs.
2. Data Partitioning: Data is distributed across multiple servers (sharding) to balance
the load and improve access speed. This allows the system to handle large volumes
of data and user requests efficiently.
3. Replication: Data is replicated across multiple nodes to ensure high availability and
fault tolerance. This allows for quick recovery in case of hardware failures and
improved read performance.
4. Caching: Frequently accessed data is cached in memory to reduce read times and
improve response rates.
5. Indexing: NoSQL databases use various indexing techniques to speed up query
performance, such as secondary indexes, full-text search indexes, and geospatial
indexes.
6. Optimized Data Models: NoSQL databases are designed with specific use cases in
mind, such as document-oriented, key-value, column-family, or graph databases,
allowing for tailored optimizations based on access patterns.
7. Eventual Consistency: Many NoSQL systems adopt an eventual consistency model,
allowing for faster write operations while still ensuring that all replicas will converge
to the same state over time.
These optimization techniques help NoSQL databases efficiently handle large
volumes of unstructured or semi-structured data while providing high availability and
low latency.
19. What are Location-based Databases?
Location-based databases are systems that store, manage, and query data
based on geographical locations. They enable applications to use location
information to enhance user experiences or provide relevant data. Key features
include:
1. Geospatial Data: They store data points with geographic coordinates (latitude and
longitude) and can handle various types of geospatial data, such as points, lines, and
polygons.
2. Spatial Indexing: These databases use spatial indexes (like R-trees) to optimize
querying and retrieval of location-based data, allowing for efficient searches within
specified geographic areas.
3. Proximity Queries: Users can perform queries based on proximity, such as finding
nearby places, calculating distances, or searching within a specific radius.
4. Geofencing: Location-based databases can support geofencing, which triggers
actions when a user enters or leaves a defined geographical area.
5. Mapping and Visualization: They often integrate with mapping services to visualize
data on maps, helping users understand spatial relationships.
Location-based databases are commonly used in applications like ride-sharing,
navigation, location-based marketing, and social networking to provide relevant
services based on a user's geographic location.
20. Database Migrations?
Database migrations are processes that manage changes to a database schema over
time. They are used to update the structure of a database while preserving existing
data. Key points about database migrations include:
1. Schema Changes: Migrations facilitate modifications to the database schema, such
as adding or removing tables, columns, and indexes.
2. Version Control: Migrations track changes to the database schema, allowing
developers to maintain a history of modifications and apply changes consistently
across different environments.
3. Automation: Many development frameworks provide tools for automating
migration processes, making it easier to apply updates and manage changes in a
systematic way.
4. Data Transformation: Migrations can also include scripts for transforming existing
data to fit new schema requirements or to migrate data from one format to another.
5. Rollback Capabilities: Migrations often support the ability to revert changes if
needed, allowing developers to restore the database to a previous state in case of
errors.
Overall, database migrations are essential for maintaining the integrity and
consistency of a database as applications evolve.
21. What is Data Consistency?
Data consistency refers to the accuracy and reliability of data across a database. It
ensures that data remains valid and correct, especially after transactions. Consistency
is crucial for maintaining data integrity and preventing anomalies.
22. Data Consistency Levels?
1. Strong consistency: Guarantees that all reads return the most recent write for a
given piece of data, ensuring immediate visibility of updates across the system.
2. Eventual consistency: Allows for temporary discrepancies between replicas. All
updates will eventually propagate, ensuring that all nodes converge to the same state
over time.
3. Causal consistency: Ensures that operations that are causally related are seen by
all nodes in the same order, allowing for some concurrency while maintaining causal
relationships.
4. Read your writes: Guarantees that a user will see their own writes immediately,
but others may see the update later, depending on the consistency level.
23. Transaction Isolation Levels?
1. Read uncommitted: Allows dirty reads, meaning a transaction can read data
modified by other transactions that have not yet been committed.
2. Read committed: Prevents dirty reads, ensuring that a transaction only reads data
that has been committed by other transactions.
3. Repeatable read: Ensures that if a transaction reads the same row multiple times,
it will see the same data throughout its duration. However, it does not prevent
phantom reads.
4. Serializable: The highest isolation level, ensuring complete isolation from other
transactions. It prevents dirty reads, non-repeatable reads, and phantom reads,
making transactions appear as if they were executed sequentially.
These consistency and isolation levels help balance performance and data integrity in
database systems.
24. What is a Message Queue?
A message queue is a communication mechanism that allows applications to send
and receive messages asynchronously. It acts as a temporary storage area where
messages are held until they can be processed by the receiving application. This
decouples the sender and receiver, enabling them to operate independently and
improving the scalability and reliability of systems.
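The producer/consumer decoupling can be sketched in-process with Python's queue module, a stand-in for a real broker such as RabbitMQ or Kafka; the message contents and sentinel are illustrative.

```python
import queue
import threading

# A producer enqueues work; a consumer thread processes it later.
# The queue decouples the two: neither waits on the other directly.
tasks = queue.Queue()
processed = []

def consumer():
    while True:
        msg = tasks.get()
        if msg is None:          # sentinel value: shut down
            break
        processed.append(msg.upper())
        tasks.task_done()

t = threading.Thread(target=consumer)
t.start()
for msg in ["hello", "world"]:
    tasks.put(msg)               # producer returns immediately
tasks.put(None)
t.join()
```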
25. What is the Publisher-Subscriber Model?
The publisher-subscriber model is a messaging pattern where publishers send
messages without knowing who will receive them, while subscribers express interest
in specific messages or topics. When a message is published, it is delivered to all
subscribers that have registered to receive messages on that topic. This model
promotes loose coupling between components, allowing for greater flexibility and
scalability in applications.
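A minimal in-process broker in Python illustrates the pattern: publishers and subscribers only know topic names, never each other. Topic names below are arbitrary.

```python
from collections import defaultdict

class Broker:
    """Tiny publish-subscribe broker: messages on a topic are delivered
    to every handler registered for that topic."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher never learns who (if anyone) received the message.
        for handler in self.subscribers[topic]:
            handler(message)
```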
26. What are Event-Driven Systems?
Event-driven systems are architectures that respond to events or changes in state
rather than relying on a predefined sequence of operations. In these systems,
components communicate by producing and consuming events, enabling
asynchronous processing. This model allows applications to react in real-time to user
actions, system changes, or external inputs, enhancing responsiveness and
adaptability.
27. Database as a Message Queue?
A database can be used as a message queue by leveraging its ability to store and
manage messages. In this approach, applications can write messages to a specific
table and read them later for processing. While this can provide some message
queuing functionality, it may not offer the same performance, scalability, or reliability
as dedicated message queue systems, which are designed specifically for this
purpose. Using a database as a message queue may also lead to challenges like data
locking and increased complexity in managing message states.
28. What is a Single Point of Failure?
A single point of failure (SPOF) refers to a component or system whose failure would
lead to the failure of the entire system or service. Identifying and eliminating SPOFs is
critical in designing resilient systems, as it helps ensure that if one component fails,
the system can continue to operate without interruption.
29. What are Containers?
Containers are lightweight, portable, and self-sufficient units that package software
and its dependencies, allowing it to run consistently across different environments.
They provide isolation and ensure that applications run reliably regardless of the
underlying infrastructure. Containers are commonly used in microservices
architectures for efficient resource utilization and simplified deployment.
30. What are Service Discovery and Heartbeats?
Service discovery is the process by which services in a distributed system find and
communicate with each other. It typically involves maintaining a registry of available
services and their locations. Heartbeats are periodic signals sent by services to
indicate that they are alive and functioning. They help detect failures and ensure that
service instances are available for communication.
31. How to Avoid Cascading Failures?
To avoid cascading failures, systems can implement several strategies, including
isolating components, using circuit breakers to prevent overload, implementing rate
limiting to control traffic, and employing load balancing to distribute requests evenly.
Additionally, monitoring and alerting can help identify issues before they lead to
larger failures.
32. Anomaly Detection in Distributed Systems?
Anomaly detection in distributed systems involves identifying unusual patterns or
behaviors in system metrics or logs that could indicate problems, such as
performance degradation or security breaches. Techniques include statistical
methods, machine learning algorithms, and monitoring tools that analyze data in
real-time to detect deviations from normal behavior.
33. Distributed Rate Limiting?
Distributed rate limiting is a technique used to control the rate of requests sent to a
service across multiple instances in a distributed environment. It ensures that no
single instance is overwhelmed by too many requests and helps maintain service
availability. Techniques include token buckets, leaky buckets, or centralized services
that track and enforce rate limits across all service instances.
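The token-bucket technique mentioned above can be sketched in Python; the injectable clock is just a convenience for testing, not part of the algorithm.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: tokens refill at a fixed rate and each
    request spends one; an empty bucket means the request is rejected."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens added per second
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def allow(self):
        # Refill tokens for the time elapsed since the last check,
        # capped at the bucket's capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a distributed setting the token count would live in shared storage (e.g. Redis) so all instances enforce the same limit.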
34. What is Distributed Caching?
Distributed caching is a technique that stores frequently accessed data across
multiple cache servers to improve application performance and reduce latency. By
distributing the cache, it allows for quicker data retrieval and can handle higher loads
than a single cache instance. It also helps in scaling applications by reducing the load
on the primary data store.
35. What are Content Delivery Networks?
Content Delivery Networks (CDNs) are networks of distributed servers that deliver
web content to users based on their geographic location. CDNs cache static content,
such as images and videos, closer to users to reduce latency and improve load times.
They also enhance availability and reliability by providing redundancy and mitigating
the impact of traffic spikes or outages.
36. Write Policies?
Write policies determine how data is written to the cache in a caching system.
Common write policies include write-through, where data is written to both the
cache and the underlying data store simultaneously; write-back, where data is
initially written only to the cache and later flushed to the data store; and write-around, where data is written directly to the data store, bypassing the cache.
37. Replacement Policies?
Replacement policies dictate how cached data is managed when the cache reaches
its capacity. Common replacement policies include Least Recently Used (LRU), which
evicts the least recently accessed data; First-In-First-Out (FIFO), which removes the
oldest data; and Least Frequently Used (LFU), which removes data that is accessed
the least often. These policies help optimize cache usage and maintain performance.
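LRU in particular falls out almost directly from Python's OrderedDict, which remembers insertion order; the capacity and keys below are illustrative.

```python
from collections import OrderedDict

class LRUCache:
    """Least-Recently-Used cache: on overflow, evict the entry that
    was accessed longest ago."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)   # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```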
38. Microservices vs. Monoliths?
Microservices are an architectural style that structures an application as a collection of small,
loosely coupled services, each responsible for a specific function and independently
deployable. Monoliths, on the other hand, are traditional applications where all
components are interconnected and run as a single unit. Microservices offer greater
flexibility, scalability, and resilience, while monoliths can be simpler to develop and
deploy initially but may become difficult to manage as they grow.
39. How are Monoliths Migrated?
Monoliths can be migrated to microservices through several approaches, including
the "Strangler Fig" pattern, where new features are developed as microservices while
gradually replacing existing monolith functionalities. Other strategies include
breaking the monolith into smaller, manageable parts, refactoring components, and
using domain-driven design to identify bounded contexts that can be extracted as
independent services.
40. How are APIs Designed?
APIs are designed by defining the endpoints, request and response formats, and the
data structures used for communication. Key considerations include ensuring RESTful
principles, maintaining clear documentation, and providing versioning for backward
compatibility. Designing APIs also involves considering security measures, error
handling, and performance optimization to facilitate smooth interactions between
different systems.
41. What are Asynchronous APIs?
Asynchronous APIs allow clients to send requests and continue processing without
waiting for the server to respond. Instead of blocking, the client receives a response
at a later time, often through callbacks, webhooks, or polling. This model improves
performance and user experience by enabling non-blocking operations, particularly
in scenarios where long processing times are expected.
42. What is OAuth?
OAuth is an open standard for access delegation commonly used for token-based
authentication and authorization. It allows third-party applications to access user
data without exposing their credentials. OAuth enables users to grant limited access
to their resources hosted on one service to another service, facilitating secure
interactions and integrations.
43. What is Token-Based Authentication?
Token-based authentication is a method where users receive a token upon successful
login, which they then include in subsequent requests to access protected resources.
This approach decouples authentication from the application and allows for stateless
session management. Tokens are typically signed and can carry claims about the user,
enabling authorization without the need for repeated logins.
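The sign-and-verify idea behind such tokens (greatly simplified relative to a real JWT library) can be sketched with Python's hmac module; the secret key and claim names are placeholders.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # placeholder for a server-side signing key

def issue_token(claims):
    # Sign the claims so the server can later verify them statelessly.
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify_token(token):
    # Recompute the signature; any tampering with the payload or the
    # signature makes verification fail.
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Real token formats (e.g. JWT) add a header, expiry claims, and support for asymmetric signatures on top of this scheme.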
44. Access Control Lists and Rule Engines?
Access Control Lists (ACLs) define permissions for users or groups regarding specific
resources, specifying who can access what and in what manner. Rule engines are
systems that evaluate conditions and trigger actions based on defined rules, often
used in complex authorization scenarios. Together, ACLs and rule engines help
manage permissions and enforce security policies in applications.
45. Pull vs. Push?
Pull and push are two communication models used in data transfer. In the pull
model, a client requests data from a server when needed, meaning the client actively
fetches updates. In contrast, the push model allows the server to send data to clients
automatically without a request, keeping them updated in real-time. The choice
between pull and push depends on application requirements and resource efficiency.
46. Memory vs. Latency?
Memory refers to the storage capacity available in a system, which can affect the
amount of data that can be processed simultaneously. Latency, on the other hand, is
the time delay between a request and its corresponding response. While having
more memory can help reduce latency by allowing faster access to data, latency can
also be influenced by other factors, such as network speed and processing time.
47. Throughput vs. Latency?
Throughput is the amount of data processed or transmitted within a given time
frame, often measured in bits per second. Latency is the delay before a transfer of
data begins following a request. High throughput indicates efficient data handling,
while low latency signifies quick response times. An optimal system aims for both
high throughput and low latency to ensure performance.
48. Consistency vs. Availability?
Consistency and availability are two key principles of the CAP theorem in distributed
systems. Consistency ensures that all nodes have the same data at any given time,
while availability guarantees that every request receives a response, even if some of
the nodes are not up to date. Balancing consistency and availability can be
challenging, especially in systems with high scalability demands.
49. Latency vs. Accuracy?
Latency refers to the time it takes for data to travel from the source to the
destination, while accuracy indicates how close a data measurement is to the true
value. In some systems, there may be a trade-off between latency and accuracy,
where faster responses (lower latency) could compromise the accuracy of the data.
It's essential to find a balance based on application requirements.
50. SQL vs. NoSQL Databases?
SQL databases are relational databases that use structured query language (SQL) to
define and manipulate data, emphasizing consistency and structured data models.
NoSQL databases, in contrast, are non-relational and designed to handle
unstructured or semi-structured data, offering more flexibility and scalability. SQL
databases are often chosen for complex queries and transactions, while NoSQL
databases are preferred for high scalability and handling large volumes of diverse
data.
Practical Questions:
51. System Design of a Live-Streaming App?
52. System Design of Instagram?
53. System Design of Tinder?
54. System Design of WhatsApp?
55. System Design of TikTok?
56. System Design of an Online Coding Judge - Part 1?
57. System Design of an Online Coding Judge - Part 2?
58. System Design of UPI Payments?
59. System Design of IRCTC?
60. System Design of Netflix Video Onboarding Pipeline?
61. System Design of Doordash?
62. System Design of Amazon Online Shops?
63. System Design of Google Maps?
64. System Design of Gmail?
65. System Design of a Chess Website?
66. System Design of Uber?
67. System Design of Google Docs?