Distributed systems are by now commonplace, yet they remain a difficult area of research. This is partly explained by the many facets of such systems and the inherent difficulty of isolating these facets from each other. In this article, we provide a brief overview of distributed systems: what they are, their general design goals, and some of the most common types.
What is a Distributed System?
A distributed system is a collection of computer programs that use computational resources across multiple, separate computation nodes to achieve a common, shared goal. The term is closely related to distributed computing, and distributed databases are one common example. The nodes communicate and synchronize over a common network. They typically represent separate physical hardware devices but can also be separate software processes or other nested, encapsulated systems. Distributed systems aim to remove bottlenecks and central points of failure from a system.
Distributed computing systems have the following characteristics:
- Resource sharing – Nodes can share hardware, software, or data
- Simultaneous processing – Multiple machines can run the same function at the same time
- Scalability – The computing and processing capacity can scale up as needed when extended to additional machines
- Error detection – Failures of individual nodes can be detected by the rest of the system
- Transparency – A node can access and communicate with other nodes in the system without needing to know where they are located
How does it work?
Distributed systems have evolved over time, and today's most common implementations are largely designed to operate over the internet and, more specifically, in the cloud. The work begins with a task, such as rendering a video to create a finished product ready for release.
The application managing this task (for example, a video editor on a client computer) splits the job into pieces. In this simple example, the algorithm gives one frame of the video to each of a dozen different computers (or nodes) to render. Once a node completes its frame, the managing application gives that node a new frame to work on. This process continues until every frame is rendered and all the pieces are put back together.
A system like this doesn't have to stop at just 12 nodes: the job may be distributed among hundreds or even thousands of nodes, turning a task that might have taken days for a single computer to complete into one that is finished in a matter of minutes.
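The frame-by-frame hand-off described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: threads on one machine stand in for the nodes, and `render_frame` and `render_video` are hypothetical names, with `render_frame` returning a label instead of doing any real rendering.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number: int) -> str:
    """Hypothetical stand-in for the expensive per-frame rendering work."""
    return f"frame-{frame_number:04d}"


def render_video(num_frames: int, num_nodes: int = 12) -> list[str]:
    """Split the job into frames, farm them out, and reassemble in order."""
    # Each worker thread plays the role of one node. When a worker finishes
    # a frame, the pool hands it the next frame that is still waiting.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        # map() yields results in submission order, so the frames come back
        # in sequence no matter which worker finished first.
        return list(pool.map(render_frame, range(num_frames)))


if __name__ == "__main__":
    frames = render_video(num_frames=48)
    print(len(frames), frames[0], frames[-1])  # 48 frame-0000 frame-0047
```

In a real deployment the workers would be separate machines reached over the network, and the managing application would also have to cope with nodes that fail or respond slowly.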
Types of distributed systems
Distributed systems generally fall into one of four different basic architecture models:
- Client-server: Clients contact the server for data, then format it and display it to the end-user. The end-user can also make a change from the client-side and commit it back to the server to make it permanent.
- Three-tier: Information about the client is stored in a middle tier rather than on the client to simplify application deployment. This architecture model is most common for web applications.
- n-tier: Generally used when an application or server needs to forward requests to additional enterprise services on the network.
- Peer-to-peer: There are no additional machines used to provide services or manage resources. Responsibilities are uniformly distributed among machines in the system, known as peers, which can serve as either clients or servers.
Advantages of distributed systems
- All the nodes in the distributed system are connected, so nodes can easily share data with other nodes.
- More nodes can easily be added to the distributed system, i.e. it can be scaled as required.
- Failure of one node does not lead to the failure of the entire distributed system; the other nodes can still communicate with each other.
- Resources like printers can be shared with multiple nodes rather than being restricted to just one.
Disadvantages of distributed systems
- Software for distributed systems is more difficult to develop than software for a single machine.
- Security poses a problem because data is easier to access when resources are shared across multiple systems.
- Network saturation can hinder data transfer, i.e. if there is lag in the network, users will have trouble accessing data.
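The idea that the failure of one node need not take down the whole system can be illustrated with a small failover sketch. It assumes every replica holds the same data and that a failed node is simulated by a flag rather than a real network error; `Node`, `NodeDown`, and `read_with_failover` are hypothetical names chosen for this example.

```python
class NodeDown(Exception):
    """Raised when a simulated node cannot be reached."""


class Node:
    """A simulated server node holding a copy of the shared data."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = data
        self.healthy = healthy

    def get(self, key):
        if not self.healthy:
            raise NodeDown(self.name)
        return self.data[key]


def read_with_failover(nodes, key):
    """Try each replica in turn and return the first successful answer."""
    for node in nodes:
        try:
            return node.name, node.get(key)
        except NodeDown:
            continue  # this replica is unavailable, so try the next one
    raise NodeDown("all replicas are unavailable")


if __name__ == "__main__":
    shared = {"printer": "room-101"}
    replicas = [Node("a", shared, healthy=False), Node("b", shared), Node("c", shared)]
    print(read_with_failover(replicas, "printer"))  # ('b', 'room-101')
```

The client only sees an error when every replica is down. The sketch leaves out what makes this hard in practice, such as keeping the replicas' data in sync and telling a slow node apart from a dead one.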
How are distributed systems used?
Distributed systems are used when a workload is too great for a single computer or device to handle. They’re also helpful in situations when the workload is subject to change, such as e-commerce traffic on Cyber Monday. Today, virtually every internet-connected web application that exists is built on top of some form of a distributed system.
Some of the most common examples:
- Telecommunications networks (including cellular networks and the fabric of the internet)
- Graphical and video-rendering systems
- Scientific computing, such as protein folding and genetic research
- Airline and hotel reservation systems
- Multiuser video conferencing systems
- Cryptocurrency processing systems (e.g. Bitcoin)
- Peer-to-peer file-sharing systems (e.g. BitTorrent)
- Distributed community compute systems
- Multiplayer video games
- Global, distributed retailers and supply chain management (e.g. Amazon)
Distributed systems are not without challenges. The complex architectural design, construction, and debugging processes required to create an effective distributed system can be overwhelming.
Three more challenges you may encounter include:
- Scheduling: A distributed system has to decide which jobs need to run, when they should run, and where they should run. Schedulers ultimately have limitations, which can lead to underutilized hardware and unpredictable runtimes.
- Latency: The more widely your system is distributed, the more latency you can experience with communications. This often leads to teams making tradeoffs between availability, consistency, and latency.
- Observability: Gathering, processing, presenting, and monitoring hardware usage metrics for large clusters is a significant challenge.
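To make the scheduling challenge concrete, here is a minimal sketch of one simple placement strategy: give each job to whichever node currently has the least work. It assumes job durations are known in advance and all nodes are equally fast, which real schedulers usually cannot rely on; `schedule` and its parameters are hypothetical names for this example.

```python
import heapq


def schedule(job_durations: dict[str, float], num_nodes: int) -> dict[int, list[str]]:
    """Greedily place each job on the node with the least work so far."""
    # Heap of (total assigned time, node id); the least-loaded node is on top.
    loads = [(0.0, node_id) for node_id in range(num_nodes)]
    heapq.heapify(loads)
    placement = {node_id: [] for node_id in range(num_nodes)}

    # Placing the longest jobs first tends to balance the load better.
    for job, duration in sorted(job_durations.items(), key=lambda kv: -kv[1]):
        load, node_id = heapq.heappop(loads)
        placement[node_id].append(job)
        heapq.heappush(loads, (load + duration, node_id))
    return placement


if __name__ == "__main__":
    jobs = {"a": 4, "b": 3, "c": 2, "d": 1}
    print(schedule(jobs, num_nodes=2))  # {0: ['a', 'd'], 1: ['b', 'c']}
```

Even this toy version shows the limitation mentioned above: a greedy choice is not always the best one, and once durations are uncertain or nodes differ in speed, some hardware will sit idle while other nodes are overloaded.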
Distributed systems are among the most significant drivers of modern computing because they provide scalable, improved performance. They are an essential component of wireless networks, cloud computing, and the internet. Since they can draw on the resources of other devices and processes, distributed systems offer some features that would be hard or even impossible to develop on a single system, and they can achieve high reliability by combining the power of multiple machines.