Lesson 1.3: Architectural models

The architecture of a system is its structure in terms of separately specified components and their interrelationships. The overall goal is to ensure that the structure will meet present and likely future demands on it. Major concerns are to make the system reliable, manageable, adaptable and cost-effective. The architectural design of a building has similar aspects – it determines not only its appearance but also its general structure and architectural style (gothic, neo-classical, modern) and provides a consistent frame of reference for the design.

Architectural elements

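Understanding the fundamental building blocks of a distributed system means answering four questions:

What are the entities that are communicating in the distributed system?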
How do they communicate, or, more specifically, what communication paradigm is used?
What (potentially changing) roles and responsibilities do they have in the overall architecture?
How are they mapped on the physical distributed infrastructure (what is their placement)?

Communicating Entities

In distributed systems, communicating entities are the fundamental units that interact to achieve a common goal. These entities can be viewed from two perspectives:

System-oriented (low-level, infrastructure-focused).
Problem-oriented (high-level, abstraction-focused).

1. System-Oriented Perspective

The primary communicating entities are processes (or threads), which execute programs and exchange messages.

Processes:
- Independent instances of running programs.
- Communicate via interprocess communication (IPC) mechanisms (e.g., sockets, RPC).
- Example: A web server process communicating with a database process.
Threads:
- Lightweight sub-processes within a process.
- Share memory but coordinate via message passing or synchronization primitives.
- Example: Multiple threads in a server handling concurrent client requests.
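
As a rough illustration of threads sharing memory inside one server process, the sketch below has several worker threads update a shared counter guarded by a lock. The names RequestCounter, handle_requests, and run_server are hypothetical, and the "requests" are simulated rather than read from a network.

```python
import threading


class RequestCounter:
    """Shared state touched by several worker threads (hypothetical example)."""

    def __init__(self):
        self.handled = 0
        # Synchronization primitive guarding the shared memory.
        self._lock = threading.Lock()

    def record(self):
        # Without the lock, concurrent increments could be lost.
        with self._lock:
            self.handled += 1


def handle_requests(counter, n_requests):
    """Simulate one worker thread serving n_requests client requests."""
    for _ in range(n_requests):
        counter.record()


def run_server(n_threads=4, n_requests=1000):
    """Start the workers, wait for them, and return the total handled."""
    counter = RequestCounter()
    workers = [
        threading.Thread(target=handle_requests, args=(counter, n_requests))
        for _ in range(n_threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.handled
```

Separate processes would not share the counter this way; they would have to exchange messages through an IPC mechanism instead.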

2. Problem-Oriented Perspective

Higher-level abstractions simplify distributed system design:

Abstraction	Description	Key Features	Example
Objects	Encapsulate data and behavior; accessed through interfaces (e.g., Java RMI remote objects).	Object-oriented design (OOD), interface definition language (IDL)	Bank account object with a deposit() method.
Components	Like objects, but they also declare the interfaces they depend on (e.g., COM, Enterprise JavaBeans, CORBA Component Model).	Contractual interfaces, third-party reuse, deployment support	Payment gateway component in e-commerce.
Web Services	Self-contained services accessed through web standards (e.g., SOAP, REST).	XML/JSON messages, identified by URIs, cross-organizational	Google Maps API for location data.

3. Key Communication Paradigms

Remote Procedure Call (RPC): Objects/components invoke methods across nodes.
Message Passing: Processes exchange messages directly through send and receive operations (e.g., TCP or UDP sockets, MPI).
Publish-Subscribe: Decoupled entities communicate via events (e.g., Kafka).

4. Placement & Roles

Roles: Entities can be clients, servers, peers, or brokers.
Placement: Mapping to physical nodes affects performance (e.g., edge computing for low latency).

Example: Smart Home System:

Processes: Thermostat controller (client) + Cloud server (server).
Abstraction: Thermostat as a web service with a REST API.
Paradigm: HTTP messages over Wi-Fi.

Communication Paradigms

Distributed systems rely on structured communication methods to coordinate between entities. These paradigms fall into three broad categories:

1. Interprocess Communication (IPC)

Definition: Low-level message exchange between processes.
Characteristics:

Uses message-passing primitives (e.g., sockets, pipes).
Supports multicast (one-to-many messaging).
Directly accesses network protocols (e.g., TCP/IP via socket APIs).
Use Case: Embedded systems, real-time applications where performance is critical.

Example:

A sensor node sending temperature data to a central server via UDP.
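
The sensor example could look roughly like the following sketch, which uses Python's standard socket module. The function names, the "sensor_id:value" text format, and the use of the loopback address are assumptions made for illustration only.

```python
import socket


def start_collector(host="127.0.0.1"):
    """Create the central server's socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    sock.settimeout(2.0)  # avoid blocking forever if a datagram is lost
    return sock


def send_reading(address, sensor_id, celsius):
    """Sensor side: send one datagram, with no connection and no delivery guarantee."""
    payload = f"{sensor_id}:{celsius:.1f}".encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


def receive_reading(sock):
    """Server side: receive one datagram and decode the assumed text format."""
    data, _sender = sock.recvfrom(1024)
    sensor_id, value = data.decode("utf-8").split(":")
    return sensor_id, float(value)
```

A caller would create the collector first, pass collector.getsockname() to send_reading, and then call receive_reading. Because UDP gives no delivery guarantee, a real deployment has to tolerate lost or reordered readings.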

2. Remote Invocation

Definition: High-level abstractions for calling remote operations.

Type	Description	Key Features	Example
Request-Reply	Basic two-way message exchange (client → server → client).	Simple, stateless; used in HTTP/1.1.	Embedded systems, REST APIs.
Remote Procedure Call (RPC)	Calls remote functions as if they were local.	Hides network complexity; supports transparency (location/access).	gRPC, XML-RPC.
Remote Method Invocation (RMI)	Object-oriented RPC (invokes methods on remote objects).	Preserves object identity; tight language integration (e.g., Java RMI).	Enterprise JavaBeans (EJB).

Trade-offs:

RPC/RMI simplify development but introduce overhead (serialization, network latency).
Request-reply is lightweight but lacks advanced features.
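
To make the RPC idea concrete, here is a minimal sketch using the XML-RPC modules from Python's standard library. The add procedure, the helper names, and the loopback address are assumptions for illustration; the point is that the client calls proxy.add(a, b) as if it were local, while serialization and the HTTP request-reply exchange happen underneath.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure exposed to remote callers (hypothetical example)."""
    return a + b


def start_rpc_server():
    """Serve 'add' on a free loopback port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def remote_add(port, a, b):
    """Client side: the call looks local, but arguments travel as XML over HTTP."""
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.add(a, b)
```

The chosen port is available as server.server_address[1]. Every call pays for marshalling and a network round trip, which is the overhead noted in the trade-offs above.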

3. Indirect Communication

Definition: Decoupled interaction via intermediaries.

Technique	Description	Decoupling	Example
Group Communication	One-to-many messaging to a group (membership managed dynamically).	Space: Senders unaware of recipients.	Chat applications, IoT device coordination.
Publish-Subscribe	Producers publish events; subscribers receive based on interests.	Space + Time: No direct sender-receiver link.	Stock market feeds, IoT event systems.
Message Queues	Producers send to queues; consumers pull messages (point-to-point).	Space + Time: Producers and consumers need not know each other or run at the same time (e.g., batch jobs).	RabbitMQ, AWS SQS.
Tuple Spaces	Shared memory-like storage for structured data (read/write by pattern).	Space + Time: Persistent storage.	Scientific workflows, parallel computing.
Distributed Shared Memory (DSM)	Abstracts shared memory across nodes.	Space: Processes see a unified address space.	High-performance computing (HPC).
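
Reading and writing "by pattern" in a tuple space can be sketched as follows. This is a simplified, single-process, non-blocking illustration: the TupleSpace class and its method names are hypothetical, None is used as the wildcard, and real tuple-space systems typically block until a matching tuple appears.

```python
class TupleSpace:
    """In-memory tuple space sketch; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []

    def write(self, tup):
        """Add a tuple; the writer does not know who will read it."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple, or None."""
        for i, tup in enumerate(self._tuples):
            if self._matches(pattern, tup):
                return self._tuples.pop(i)
        return None
```

Because tuples stay in the space until taken, a consumer can start long after the producer has finished, which is the time decoupling listed in the table.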

Advantages:

Scalability: No direct dependencies between entities.
Flexibility: Supports async, event-driven architectures.

Example:

Publish-Subscribe: A weather station publishing data to subscribers (e.g., apps, dashboards).
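
The weather station example might be sketched like this, with an in-memory broker standing in for real middleware. The Broker class and the topic names are hypothetical. Delivery here is synchronous, so the sketch shows only space decoupling (the publisher never references its subscribers); time decoupling would additionally require the broker to store events.

```python
class Broker:
    """Minimal in-memory publish-subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register interest in a topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, event):
        """Deliver the event to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

For instance, an app and a dashboard could each subscribe to "weather/temperature", and the station would publish readings to that topic without knowing either of them exists.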

Comparison of Paradigms

Paradigm	Coupling	Scalability	Complexity	Use Case
IPC	Tight (direct)	Low	Low	Real-time systems.
Remote Invocation	Moderate (RPC/RMI)	Medium	Medium	Client-server apps.
Indirect	Loose	High	High	Large-scale, dynamic systems.

Key Takeaways

✅ IPC is foundational but low-level: the application handles addressing, message formats, and failures itself.
✅ Remote Invocation (RPC/RMI) balances abstraction and performance.
✅ Indirect Communication enables scalability and fault tolerance via decoupling.

Design Choice:

Need low latency? Use IPC or request-reply.
Need abstraction? Use RPC/RMI.
Need scalability? Use publish-subscribe or queues.

💡 Tip: Combine paradigms (e.g., RPC + message queues) for hybrid architectures.

Roles and Responsibilities in Distributed Systems

In distributed systems, processes (or objects/components/services) interact by taking on specific roles that define the system's architecture. The two primary architectural styles are:

1. Client-Server Architecture

Definition: A centralized model where processes assume distinct roles as clients (requesters) or servers (providers).
Characteristics:

Servers: Manage shared resources (e.g., files, databases).
Clients: Access resources by sending requests to servers.
Hierarchy: Servers can act as clients to other servers (e.g., a web server querying a DNS server).
Examples:
Web Browsers (Client) ↔ Web Servers (Server)
Search Engines:
- Act as servers when answering user queries.
- Act as clients when their web crawlers request pages from other web servers for indexing.

Pros:

Simple to design and manage.
Clear separation of concerns.

Cons:

Scalability limitations (bottlenecks at central servers).

2. Peer-to-Peer (P2P) Architecture

Definition: A decentralized model where all processes (peers) have equal roles, sharing resources directly.

Characteristics:

No central server: Peers collaborate symmetrically.
Scalability: Resources grow with the number of participants.
Fault tolerance: No single point of failure.

Examples:

BitTorrent: File sharing across peers.
Blockchain Networks: Distributed ledger maintained by nodes.

Pros:

Highly scalable.
Resilient to failures.

Cons:

Complex to manage (e.g., coordination, security).
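
The symmetric roles in a P2P system can be illustrated with a small in-process simulation in which every peer both serves and fetches data. The Peer class, its methods, and the chunk layout are hypothetical, and direct method calls stand in for network messages.

```python
class Peer:
    """A peer that acts as both client and server (in-process simulation)."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}      # chunk index -> bytes held locally
        self.neighbours = []  # other peers this peer knows about

    def connect(self, other):
        """Create a symmetric link: neither side is 'the server'."""
        if other not in self.neighbours:
            self.neighbours.append(other)
            other.neighbours.append(self)

    def serve(self, index):
        """Server role: hand out a chunk if it is held locally."""
        return self.chunks.get(index)

    def fetch(self, index):
        """Client role: ask neighbours for a missing chunk and keep a copy."""
        if index in self.chunks:
            return self.chunks[index]
        for peer in self.neighbours:
            data = peer.serve(index)
            if data is not None:
                self.chunks[index] = data  # this peer can now serve it too
                return data
        return None
```

Each successful fetch adds another copy to the network, which is one way to read the claim that resources grow with the number of participants.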

Placement Strategies

How entities map to physical infrastructure affects performance, reliability, and security. Key strategies:

Strategy	Description	Example
Multiple Servers	Partition or replicate services across servers.	Web servers hosting different sites; NIS replicating password files.
Caching	Store copies of frequently accessed data closer to clients.	Browser caches; web proxy servers.
Mobile Code	Download and execute code locally (e.g., applets).	Stock trading applets fetching real-time data.
Mobile Agents	Programs that migrate between nodes to perform tasks.	Automated price comparison agents.

Trade-offs:

Caching improves speed but risks stale data.
Mobile code/agents reduce network traffic but pose security risks.
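
The caching trade-off can be seen in a small time-to-live (TTL) cache sketch. The TTLCache class and its parameters are hypothetical; the fetch argument stands in for a request to the origin server, and the clock is injectable so the behaviour can be exercised without waiting.

```python
import time


class TTLCache:
    """Cache that serves a stored copy until its time-to-live expires."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # function that contacts the origin server
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}       # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            # Fast path: no trip to the origin, but the copy may be stale.
            return entry[0]
        value = self._fetch(key)
        self._entries[key] = (value, now + self._ttl)
        return value
```

If the origin changes while an entry is still within its TTL, clients keep seeing the old value until the entry expires. A shorter TTL reduces staleness at the cost of more requests to the origin.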

Key Takeaways

✅ Client-server: Best for structured, centralized services (e.g., web apps).
✅ Peer-to-peer: Ideal for scalable, decentralized systems (e.g., file sharing).
✅ Placement matters: Optimize for performance (caching), scalability (multiple servers), or flexibility (mobile code and agents).

Design Choice:

Need central control? Use client-server.
Need scalability? Use P2P.
Need low latency? Use caching or mobile code.

💡 Tip: Hybrid architectures (e.g., edge computing) combine these paradigms for balanced performance.

Architectural Patterns in Distributed Systems

Architectural patterns provide reusable solutions to common design challenges in distributed systems. Below are key patterns:

1. Layering (Vertical Organization)

Definition: A system is partitioned into hierarchical layers, where each layer uses services from the layer below.
Purpose:

Encapsulates complexity (lower layers hide implementation details).
Promotes modularity and interoperability.

Example:

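Network time service based on the Network Time Protocol (NTP), viewed as layers from lowest to highest (not to be confused with NTP's own stratum hierarchy):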
- Layer 1: Physical clock synchronization.
- Layer 2: Time server coordination.
- Layer 3: Client-facing time API.

Key Terms:

Platform: Hardware/OS layers (e.g., Intel x86/Linux).
Middleware: Masks heterogeneity (e.g., RPC, event notification).
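
The rule that each layer uses only the services of the layer below can be sketched with three small classes. All names here are hypothetical, and the tick-based clock is an assumption chosen to show how middleware can hide a platform-specific detail from the application.

```python
class Platform:
    """Lowest layer: stands in for hardware/OS, exposing a raw tick counter."""

    def __init__(self, ticks):
        self._ticks = ticks

    def read_clock_ticks(self):
        return self._ticks


class Middleware:
    """Middle layer: hides the platform's tick rate behind a uniform unit."""

    def __init__(self, platform, ticks_per_second):
        self._platform = platform
        self._ticks_per_second = ticks_per_second

    def current_seconds(self):
        return self._platform.read_clock_ticks() / self._ticks_per_second


class Application:
    """Top layer: talks only to the middleware, never to the platform."""

    def __init__(self, middleware):
        self._middleware = middleware

    def timestamp_label(self):
        return f"t={self._middleware.current_seconds():.3f}s"
```

Swapping in a platform with a different tick rate changes only the Middleware configuration; the Application code stays the same.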

2. Vertical Distribution (Multi-Tier Architecture)

Definition: Splits functionality into logical tiers, each running on separate nodes.

Tier	Role	Example
Client Tier	UI rendering (thin/fat client).	Web browser, mobile app.
Logic Tier	Business rules (middleware).	API servers, microservices.
Data Tier	Persistent storage.	Databases, file systems.

Communication Flow:

Client → Logic Tier (HTTP/RPC) → Data Tier (SQL/NoSQL)

Use Case: E-commerce apps (UI ↔ Backend ↔ Database).
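
A three-tier split for the e-commerce use case might be sketched as below. The class names, the products table, and the price format are assumptions for illustration. All three tiers run in one process here, with an in-memory SQLite database as the data tier, whereas a real deployment would place them on separate nodes and connect them over HTTP/RPC and a database protocol.

```python
import sqlite3


class DataTier:
    """Persistent storage (here: an in-memory SQLite database)."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE products (name TEXT PRIMARY KEY, price_cents INTEGER)"
        )

    def add_product(self, name, price_cents):
        self._db.execute("INSERT INTO products VALUES (?, ?)", (name, price_cents))
        self._db.commit()

    def price_of(self, name):
        row = self._db.execute(
            "SELECT price_cents FROM products WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]


class LogicTier:
    """Business rules; the only tier that talks to the data tier."""

    def __init__(self, data):
        self._data = data

    def order_total(self, name, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        price = self._data.price_of(name)
        if price is None:
            raise KeyError(name)
        return price * quantity


class ClientTier:
    """Presentation only: formats what the logic tier returns."""

    def __init__(self, logic):
        self._logic = logic

    def render_total(self, name, quantity):
        cents = self._logic.order_total(name, quantity)
        return f"{quantity} x {name}: ${cents // 100}.{cents % 100:02d}"
```

The client tier never issues SQL and the data tier never formats output, so each tier can be replaced or scaled without touching the others.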

3. Horizontal Distribution

Definition: Scales by adding more instances of the same tier (parallelism).

Approaches:

Load Balancing: Distributes requests across servers (e.g., NGINX).
Sharding: Partitions data across nodes (e.g., MongoDB shards).

Example:

Web Servers: Multiple instances handle user requests concurrently.
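
Both approaches can be sketched in a few lines. The server names, the RoundRobinBalancer class, and the shard_for function are hypothetical, and hash-modulo sharding is only one possible partitioning scheme (it remaps most keys when the shard count changes, which is why production systems often use other schemes).

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def route(self, request):
        # The request content is ignored: every instance can serve any request.
        return next(self._servers)


def shard_for(key, n_shards):
    """Map a key to a shard using a stable hash, so lookups find the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards
```

Load balancing assumes the instances are interchangeable, while sharding assigns each piece of data to one specific node, so the two are often combined: balanced stateless web servers in front of a sharded data store.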

4. Thin vs. Fat Clients

Aspect	Thin Client	Fat Client
Processing	Minimal (relies on server).	Heavy (local computation).
Network Use	High (constant server calls).	Low (caches data locally).
Example	Server-rendered web pages, remote desktop terminals.	Desktop apps (Photoshop, games).

Trade-offs:

Thin: Easy updates, cross-platform, but latency-sensitive.
Fat: Offline-capable, but complex to maintain.

Key Takeaways

✅ Layering abstracts complexity (e.g., OSI model).
✅ Multi-tier separates concerns (UI/logic/data).
✅ Horizontal scaling handles load (add more servers).
✅ Thin clients reduce deployment headaches; fat clients cut network round trips and keep working offline.

Design Choice:

Need scalability? Use horizontal distribution.
Need maintainability? Use thin clients + multi-tier.
Need offline functionality? Use fat clients.

💡 Tip: Combine patterns (e.g., horizontally scaled microservices with thin clients).