Lesson 2.3: Synchronizing physical clocks


Clock Synchronization in Distributed Systems

1. The Core Problem

Every computer in a distributed system has its own physical clock that drifts over time due to:

  • Hardware limitations: Quartz oscillators have inherent inaccuracies
  • Environmental factors: Temperature changes affect clock rates
  • Manufacturing variances: No two clocks are identical

The drift is mathematically expressed as:

$(1-\rho)(t'-t) \le H_i(t') - H_i(t) \le (1+\rho)(t'-t)$

Where:

  • $H_i$ is the hardware clock
  • $\rho$ is the maximum drift rate (e.g., $10^{-6}$ for quartz)
  • $t$ and $t'$ are real times
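
As a rough illustration of the drift bound above, the sketch below computes the worst-case divergence between two clocks that both respect the drift rate ρ, and the longest resynchronization interval that keeps them within a target skew. The function names and sample values are assumptions chosen for illustration; they are not part of the lesson.

```python
def max_skew(rho: float, elapsed_s: float) -> float:
    """Worst-case divergence between two clocks that each drift at most rho.

    One clock may run fast by rho while the other runs slow by rho,
    so they can separate by up to 2 * rho seconds per second of real time.
    """
    return 2 * rho * elapsed_s


def resync_interval(rho: float, max_skew_s: float) -> float:
    """Longest real-time interval between syncs that keeps skew within max_skew_s."""
    return max_skew_s / (2 * rho)


if __name__ == "__main__":
    rho = 1e-6  # drift rate quoted above for quartz
    print(max_skew(rho, 86_400))       # worst-case skew after one day, in seconds
    print(resync_interval(rho, 1e-3))  # seconds between syncs for a 1 ms bound
```

With ρ = 10^-6, two such clocks can drift apart by roughly 0.17 s per day, which is why periodic resynchronization is needed.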

2. Synchronization Types

Clock synchronization in distributed systems ensures that different computers (nodes) maintain consistent time measurements despite hardware and network variability. There are two primary synchronization types: external synchronization and internal synchronization.

External Synchronization
Ensures all local clocks match an authoritative external time source (e.g., UTC) within a bounded deviation D:

$|S(t) - C_i(t)| < D \quad \text{for all } i \in \{1, 2, \ldots, N\}$

Key Characteristics:

  • Requires connection to external time sources like:
    • GPS (μs accuracy)
    • NTP servers (ms accuracy)
    • Atomic clocks (ns accuracy)
  • Critical for applications needing absolute time:
    • Financial transaction logging
    • Regulatory compliance systems
    • Scientific experiment coordination

Implementation Methods:

  1. Network Time Protocol (NTP):

    • Hierarchical stratum model (Stratum 0-15)
    • Compensates for network latency using four timestamps (client send $T_1$, server receive $T_2$, server send $T_3$, client receive $T_4$): $T_{corrected} = T_{client} + \frac{(T_2-T_1)-(T_4-T_3)}{2}$
    • Typical accuracy: 1-50ms
  2. Precision Time Protocol (PTP):

    • Hardware timestamping for μs accuracy
    • Master-slave architecture
    • Requires specialized network hardware
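
The NTP latency compensation listed above can be sketched as follows. The sketch assumes the usual four-timestamp convention (client send `t1`, server receive `t2`, server send `t3`, client receive `t4`) and that the network delay is roughly symmetric. The function name is hypothetical and this is not an NTP implementation.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Estimate clock offset and round-trip delay from one request/response pair.

    t1, t4 are read from the client clock; t2, t3 from the server clock.
    offset: amount to add to the client clock to match the server.
    delay:  time spent on the network (server processing time excluded).
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Assumed scenario: client is 0.5 s behind the server, 20 ms each way,
    # 1 ms of server processing.
    offset, delay = ntp_offset_and_delay(100.000, 100.520, 100.521, 100.041)
    print(f"offset={offset:.3f}s delay={delay:.3f}s")
```

If the two directions of the path have different delays, the estimate is off by half the asymmetry, which is one reason accuracy over the Internet stays in the millisecond range.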

Internal Synchronization
Maintains consistency between system clocks without UTC reference:

$|C_i(t) - C_j(t)| < D' \quad \text{for all } i, j \in \{1, 2, \ldots, N\}$

Key Characteristics:

  • Focuses on relative time consistency
  • Essential for:
    • Distributed transaction ordering
    • Event causality tracking
    • Consistent snapshots

Implementation Methods:

  • Berkeley Algorithm
  • Gossip-based Protocols
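
The two definitions in this section (external bound D, internal bound D') can be checked directly against a set of clock readings. The sketch below is a minimal illustration with hypothetical function names and made-up readings. It also shows a useful consequence: clocks that are externally synchronized within D are internally synchronized within 2D.

```python
from itertools import combinations


def externally_synchronized(source: float, clocks: list[float], bound: float) -> bool:
    """True if every clock is within `bound` of the external source S(t)."""
    return all(abs(source - c) < bound for c in clocks)


def internally_synchronized(clocks: list[float], bound: float) -> bool:
    """True if every pair of clocks agrees to within `bound`."""
    return all(abs(a - b) < bound for a, b in combinations(clocks, 2))


if __name__ == "__main__":
    utc = 100.0                          # assumed reading of the external source
    clocks = [99.996, 100.003, 100.004]  # assumed node readings at the same instant
    d = 0.005
    print(externally_synchronized(utc, clocks, d))   # within D of the source
    print(internally_synchronized(clocks, 2 * d))    # hence within 2D of each other
    print(internally_synchronized(clocks, d))        # but not necessarily within D
```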

3. Clock Correctness

Essential Properties:

  1. Bounded Drift: $\left|\frac{dC}{dt} - 1\right| < \rho$. Ensures clocks don't run too fast or too slow.

  2. Monotonicity: $t' > t \Rightarrow C(t') > C(t)$. Critical for operations like file timestamps.
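
One common way to preserve monotonicity when a clock is found to be ahead is to apply the correction gradually (slewing) instead of setting the clock back. The sketch below illustrates the idea under simple assumptions: the class name, the `max_slew` parameter, and the hardware readings are all hypothetical, and it does not model how any particular operating system does this.

```python
class SlewedClock:
    """Software clock C = H + offset, corrected gradually so it never runs backward."""

    def __init__(self, max_slew: float = 0.0005):
        if not 0 <= max_slew < 1:
            raise ValueError("max_slew must be in [0, 1) to keep the clock advancing")
        self.offset = 0.0         # correction currently applied to hardware readings
        self.pending = 0.0        # correction still waiting to be applied
        self.max_slew = max_slew  # max correction per second of hardware time
        self.last_hw = None

    def adjust(self, correction: float) -> None:
        """Request a correction in seconds (negative means the clock is ahead)."""
        self.pending += correction

    def read(self, hw_time: float) -> float:
        """Return the software time for a strictly increasing hardware reading."""
        if self.last_hw is not None:
            limit = self.max_slew * (hw_time - self.last_hw)
            step = max(-limit, min(limit, self.pending))
            self.offset += step
            self.pending -= step
        self.last_hw = hw_time
        return hw_time + self.offset


if __name__ == "__main__":
    clock = SlewedClock(max_slew=0.5)  # exaggerated rate so the effect is visible
    print(clock.read(0.0))
    clock.adjust(-1.0)                 # clock is 1 s ahead; slow it down instead of stepping back
    print([clock.read(t) for t in (1.0, 2.0, 3.0)])
```

Because each read advances by at least `(1 - max_slew)` times the elapsed hardware time, successive readings keep increasing while the correction is absorbed.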

Failure Modes:

  • Crash Failure: Clock stops (e.g., power loss)
  • Arbitrary Failure: Violates monotonicity (e.g., Y2K bug)

4. Synchronization Algorithms

Cristian's Algorithm (Basic Time Service):

Work Flow

  1. Client sends request at $T_1$
  2. Server replies with its clock value $S(T_s)$; the client receives the reply at $T_2$
  3. Client calculates adjusted time: $T_{adjusted} = S(T_s) + \frac{T_2-T_1}{2}$

Algorithm

  1. The client process sends a request for the server's clock time to the Clock Server at time $T_0$.
  2. The Clock Server receives the request and responds with its current clock time, $T_{Server}$.
  3. The client process receives the response at time $T_1$ and calculates the synchronized client clock time, assuming the request and the response spend equally long in transit: $T_{client} = T_{Server} + \frac{T_1 - T_0}{2}$ (Here $T_0$ and $T_1$ play the roles of $T_1$ and $T_2$ in the work flow above.)
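
The steps above can be sketched in a few lines. This is a minimal illustration, not a production client: `request_server_time` is a hypothetical callable standing in for the network round trip to the Clock Server, and the injected `local_clock` exists only so the example can be run deterministically.

```python
import time
from typing import Callable


def cristian_sync(
    request_server_time: Callable[[], float],
    local_clock: Callable[[], float] = time.monotonic,
) -> tuple[float, float]:
    """Return (estimated current server time, measured round-trip time)."""
    t0 = local_clock()                   # step 1: request sent
    server_time = request_server_time()  # step 2: server replies with its clock value
    t1 = local_clock()                   # step 3: response received
    rtt = t1 - t0
    # Assume the reply spent half of the round trip in transit.
    return server_time + rtt / 2, rtt


if __name__ == "__main__":
    # Simulated exchange: the client clock reads 10.0 when sending and 10.2 when
    # receiving, and the server reports 50.0.
    ticks = iter([10.0, 10.2])
    estimate, rtt = cristian_sync(lambda: 50.0, local_clock=lambda: next(ticks))
    print(f"estimate={estimate:.3f} rtt={rtt:.3f}")
```

The estimate is only as good as the symmetry assumption, so a client may reasonably discard samples with an unusually large round-trip time and retry.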

Berkeley Algorithm (Internal Sync):

Work Flow

  1. Master collects all clock values $C_1(t), \ldots, C_N(t)$
  2. Computes average: $\bar{C}(t) = \frac{1}{N+1}\left(C_{master} + \sum_{i=1}^N C_i(t)\right)$
  3. Sends individual corrections: $\Delta_i = \bar{C}(t) - C_i(t)$

Algorithm

  1. One node is chosen as the master from the pool of nodes in the network, using a leader election algorithm. The rest of the nodes act as slaves.
  2. The master node periodically polls the slave nodes and fetches their clock times using Cristian's algorithm.
  3. The master node calculates the average of the differences between the clock times received and its own system clock. It then adjusts its own clock by this average and sends each slave the amount by which that slave's clock must be adjusted, rather than an absolute time, so that message transmission delay does not add further error.
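
The averaging step of the Berkeley algorithm can be sketched as below. The sketch assumes the master has already collected the slave readings and compensated for round-trip time. It leaves out refinements such as discarding readings that are clearly faulty. The function name and the sample values are illustrative assumptions.

```python
def berkeley_corrections(
    master_time: float, slave_times: list[float]
) -> tuple[float, list[float]]:
    """Return (master_correction, per-slave corrections).

    Each correction is the amount to ADD to that node's clock so that
    all clocks move to the average of the master and slave readings.
    """
    all_times = [master_time] + slave_times
    average = sum(all_times) / len(all_times)
    return average - master_time, [average - t for t in slave_times]


if __name__ == "__main__":
    # Readings in minutes past midnight: master 3:00, slaves 3:25 and 2:50.
    master_delta, slave_deltas = berkeley_corrections(180.0, [205.0, 170.0])
    print(master_delta, slave_deltas)
```

In this example the average is 3:05, so the master advances by 5 minutes, the first slave must lose 20 minutes, and the second must gain 15.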

Network Time Protocol (NTP)

Cristian’s method and the Berkeley algorithm are intended primarily for use within intranets. The Network Time Protocol (NTP) [Mills 1995] defines an architecture for a time service and a protocol to distribute time information over the Internet.

Features of NTP:

  • NTP servers have access to highly precise atomic clocks and GPS clocks.
  • It uses Coordinated Universal Time (UTC) to synchronize CPU clock time.
  • It is designed to limit vulnerabilities in the exchange of time information, for example by authenticating time sources.
  • It provides consistent timekeeping for file servers.

Working of NTP:

NTP is an application-layer protocol. It uses a hierarchy of time sources and synchronizes servers across strata. At the top are highly accurate reference clocks, such as atomic or GPS clocks, called stratum 0. Stratum 1 servers are attached directly to them, stratum 2 servers synchronize with stratum 1, and so on down the hierarchy. These servers provide accurate date and time so that communicating hosts stay in sync with each other.

Advantages of NTP:

  • It provides Internet-wide synchronization between devices.
  • It supports security mechanisms that depend on accurate time.
  • It is used in authentication systems such as Kerberos.
  • Consistent timestamps across hosts help when troubleshooting problems.
  • It is used by networked file systems, which are otherwise difficult to keep synchronized.

Disadvantages of NTP:

  • When the servers are down, synchronization is affected for hosts that depend on them.
  • NTP distributes UTC only, so misconfigured time zones on hosts can still cause conflicting local times.
  • Some loss of time accuracy is unavoidable because of variable network delay.
  • Heavy NTP packet traffic can degrade synchronization.
  • Synchronization can be manipulated by an attacker, for example through spoofed time servers.

5. Practical Considerations

Accuracy vs. Cost Tradeoff:

Method            Accuracy   Cost       Use Case
GPS               1 μs       High       Financial systems
NTP               1-50 ms    Low        Enterprise networks
PTP (IEEE 1588)   1 μs       Moderate   Industrial systems

Implementation Challenges:

  1. Network latency variability affects sync accuracy
  2. Security risks from fake time servers
  3. Temperature-induced drift in data centers: $\rho_{effective} = \rho_{base} + 0.001\,\Delta T$

6. Real-World Impact

Consequences of Poor Sync:

  • Database replication conflicts
  • Incorrect transaction ordering
  • Unreliable system logs

Best Practices:

  1. Use hierarchical time sources (stratum servers)
  2. Implement multiple sync protocols as fallback
  3. Monitor clock drift continuously: $\text{Drift Alert Threshold} = \rho \times \text{Uptime}$
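
The drift alert threshold above can be turned into a simple monitoring check. The sketch below is illustrative only: the function names are hypothetical, and it assumes the observed offset comes from some external measurement, such as a comparison against a reference server.

```python
def drift_alert_threshold(rho: float, uptime_s: float) -> float:
    """Largest offset (seconds) explainable by drift alone: rho * uptime."""
    return rho * uptime_s


def drift_exceeds_threshold(observed_offset_s: float, rho: float, uptime_s: float) -> bool:
    """True if the measured offset is larger than the expected drift allows."""
    return abs(observed_offset_s) > drift_alert_threshold(rho, uptime_s)


if __name__ == "__main__":
    rho = 1e-6       # assumed maximum drift rate of the hardware clock
    uptime = 3600.0  # one hour, in seconds
    print(drift_alert_threshold(rho, uptime))           # expected worst-case drift
    print(drift_exceeds_threshold(0.010, rho, uptime))  # 10 ms offset after one hour
    print(drift_exceeds_threshold(0.002, rho, uptime))  # 2 ms offset after one hour
```

An offset well beyond the threshold suggests something other than ordinary drift, such as a faulty oscillator or a bad time source.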

💡 Critical Insight: The choice between external and internal synchronization depends on whether you need absolute time (for compliance) or just consistent ordering (for coordination).

© 2025 2023 Sanjeeb KC. All rights reserved.