Internet DRAFT - draft-ietf-ntp-ntpv4-algorithms
draft-ietf-ntp-ntpv4-algorithms
NTP WG W. Kasch, Ed.
Internet-Draft J. Burbank, Ed.
Expires: May 13, 2006 JHU/APL
D. Mills
U. Del.
November 9, 2005
The Network Time Protocol Version 4 Algorithm Specification
draft-ietf-ntp-ntpv4-algorithms-01
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 13, 2006.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract
The Network Time Protocol (NTP) is widely used to synchronize
computer clocks in the Internet. This memorandum describes the
algorithms used by Version 4 of the NTP (NTPv4) to calculate time
values.
Kasch, et al. Expires May 13, 2006 [Page 1]
Internet-Draft NTPv4 Algorithms Specification November 2005
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Clock Filter Algorithm . . . . . . . . . . . . . . . . . . . . 6
3. Clock Selection Algorithm . . . . . . . . . . . . . . . . . . 8
4. Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . 9
5. Clock Combining Algorithm . . . . . . . . . . . . . . . . . . 10
6. Polling Algorithm . . . . . . . . . . . . . . . . . . . . . . 11
7. Clock Discipline Algorithm . . . . . . . . . . . . . . . . . . 17
7.1. Poll Interval Control . . . . . . . . . . . . . . . . . . 20
7.2. State Machine . . . . . . . . . . . . . . . . . . . . . . 20
8. Security Considerations . . . . . . . . . . . . . . . . . . . 22
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22
11.1. Normative References . . . . . . . . . . . . . . . . . . . 22
11.2. Informative References . . . . . . . . . . . . . . . . . . 23
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24
Intellectual Property and Copyright Statements . . . . . . . . . . 25
Kasch, et al. Expires May 13, 2006 [Page 2]
Internet-Draft NTPv4 Algorithms Specification November 2005
1. Introduction
The Network Time Protocol Version 3 (NTPv3) specified in [1] has been
widely used to synchronize computer clocks in the global Internet.
It provides comprehensive mechanisms to access national time and
frequency dissemination services, organize the NTP subnet of servers
and clients and adjust the system clock in each participant. In most
places of the Internet of today, NTP provides accuracies of 1-50 ms,
depending on the characteristics of the synchronization source and
network paths.
NTP is designed for use by clients and servers with a wide range of
capabilities and over a wide range of network jitter and clock
frequency wander characteristics. Many users of NTP in the Internet
of today use a software distribution available from www.ntp.org. The
distribution, which includes the full suite of NTP options,
mitigation algorithms and security schemes, is a relatively complex,
real-time application. While the software has been ported to a wide
variety of hardware platforms ranging from personal computers to
supercomputers, its sheer size and complexity is not appropriate for
many applications. This facilitated the development of the Simple
Network Time Protocol Version 4 (SNTPv4) as described in [2].
Since the standardization of NTPv3, there has been significant
development which has led to Version 4 of the Network Time Protocol
(NTPv4). This document describes NTPv4, which introduces new
functionality to NTPv3 as described in RFC 1305, and functionality
expanded from that of SNTPv4 as described in RFC 2030 (SNTPv4 is a
subset of NTPv4).
When operating with current and previous versions of NTP and SNTP,
NTPv4 requires no changes to the protocol or implementations now
running or likely to be implemented specifically for future NTP or
SNTP versions. The NTP and SNTP packet formats are the same and the
arithmetic operations to calculate the client time, clock offset and
round trip delay are the same. To a NTP or SNTP server, NTP and SNTP
clients are indistinguishable; to a NTP or SNTP client, NTP and SNTP
servers are indistinguishable.
NTP usually operates simultaneously with multiple servers and may
have multiple clients of its own. NTP employs several algorithms
that together allow the calculation of time from messages that come
from an NTP or SNTP server. The overall organization of the
algorithms is illustrated in Figure 1. For every server there are
two processes, a peer process which receives and processes each
packet, and a companion poll process which sends packets to the
server at programmed intervals. State variables and data
measurements are maintained separately for each pair of processes in
Kasch, et al. Expires May 13, 2006 [Page 3]
Internet-Draft NTPv4 Algorithms Specification November 2005
a block of memory called the peer variables. The peer and poll
processes together with their variables collectively belong to an
association. Associations can be either temporary or permanent.
Permanent associations are described as persistent, while temporary
associations are referred to as preemptable or ephemeral.
..........................................................
. Remote .. Peer/Poll .. System .
. Servers .. Processes .. Process .
. .. .. .
.----------..-------------..-------------- .
.| |->| |..| | .
.|Server 1|..|Peer/Poll 1|->| | .
.| |<-| |..| | ............
.----------..-------------..| | . Clock .
.Discipline.
. .. ^ ..| | .. Process .
. .. | ..| | .. .
.----------..-------------..| | |-----------|.. .
.| |->| |..| Selection |->| ..-------- .
.|Server 2|..|Peer/Poll 2|->| and | | Combining |->| Loop | .
.| |<-| |..| Clustering | | Algorithm |..|Filter| .
.----------..-------------..| Algorithms |->| |.-----------
. .. ^ ..| | |-----------|. |
. .. | ..| | . |
.----------..-------------..| | . |
.| |->| |..| | . |
.|Server 3|..|Peer/Poll 3|->| | . |
.| |<-| |..| | . |
.----------..-------------..|------------| . |
....................^..................................... |
| |
| \|/
| ...............
| . /-----\ .
'----------------------------------<-| VFO |-<-.
. \-----/ .
. Clock Adjust.
. Process .
...............
Figure 1 NTPv4 Algorithm Interactions
As each NTP packet arrives, the server time is compared to the system
clock and an offset specific to that server is determined. The
system process refines these offsets using the selection, clustering
and combining algorithms and delivers a correction to the clock
discipline process, which functions as a lowpass filter to smooth the
data and close the feedback loop. The clock adjust process runs at
Kasch, et al. Expires May 13, 2006 [Page 4]
Internet-Draft NTPv4 Algorithms Specification November 2005
one-second intervals to amortize the corrections in small adjustments
that approximate a continuous, monotonic clock. The output of the
combining algorithm represents the best estimate of the system clock
offset relative to the server ensemble. The discipline algorithm
adjusts the frequency of the variable frequency oscillator (VFO) to
minimize this offset. Finally, the timestamps of each server are
compared to timestamps derived from the VFO in order to calculate the
server offsets and close the feedback loop.
Depending on whether an NTP host is acting as server or client, or
whether the host is an SNTP or full NTP host, the subset of
algorithms it employs varies. The relationship between host role/
type and algorithm employment is summarized in Table 1.
Table 1. Relationship between Algorithms and NTP Host Type/Role
=====================================================================
| Algorithm | Applicability |
---------------------------------------------------------------------
| Clock Filter | Required by all NTP Servers |
---------------------------------------------------------------------
| Clock Selection | Applies to NTP hosts utilizing more than one |
| source |
---------------------------------------------------------------------
| Clustering | Applies to NTP hosts utilizing more than one |
| | source |
---------------------------------------------------------------------
| Clock Combining | Applies to NTP hosts utilizing more than one |
| | source |
---------------------------------------------------------------------
| Polling | Applies to all NTP hosts. |
---------------------------------------------------------------------
| Clock Discipline | Not required by SNTP clients. Applies to all|
| | other NTP hosts. |
---------------------------------------------------------------------
This document is organized as follows. Section 2 describes the clock
filter algorithm. Section 3 describes the clock selection algorithm.
Section 4 describes the clustering algorithm. Section 5 describes
the clock combining algorithm. Section 6 describes the polling
algorithm. Section 7 describes the clock discipline algorithm.
Sections 8 and 9 presents Security Considerations and IANA
Considerations, respectively. Much of the information contained
within this document is based on material from. [3]
NTPv4 is hereafter referred to simply as NTP, unless explicitly
noted.
The remainder of this document contains numerous variables and
Kasch, et al. Expires May 13, 2006 [Page 5]
Internet-Draft NTPv4 Algorithms Specification November 2005
mathematical expressions. Those variables take the form of Greek
characters. Those Greek characters are spelled out by their full
name, with capitolized variables referring to the upper case Greek
character. For example Delta refers to the uppercase Greek
character, where delta refers to the lowercase Greek character.
Furthermore, subscripts are denoted with a '_' separating the
variable name and the subscript. For example 'theta_i' refers to the
variable lowercase Greek character theta with subscript i, or
phenotically 'theta sub i.'
2. Clock Filter Algorithm
The NTP clock filter algorithm selects the most appropriate sample
data while rejecting noise spikes due to packet collisions and
network congestion. The clock offset (theta) and roundtrip delay
(delta) samples are computed from the four most recent timestamps.
Without making any assumptions about the delay distributions, but
assuming the frequency difference or skew between the server and peer
clocks can be neglected, let (theta, delta) represent the offset and
delay when the path is otherwise idle; thus (theta, delta) represents
the true offset and delay values. The clock filter algorithm
essentially acts as an accurate estimator and produces an estimate of
the time, known as (theta_hat, delta_hat), from a sample sequence
(theta_i, delta_i), where i denotes a particular sample at some time,
collected for the path over an appropriate interval under ambient
traffic conditions.
The design of the clock filter algorithm was suggested by the
observation that packet switching networks are most often operated
well below the knee of the throughput-delay curve, which means that
packet queues are mostly small with relatively infrequent bursts. In
addition, the routing algorithm most often operates to minimize the
number of packet-switch hops and thus the number of queues. Not only
is the probability that an NTP packet finds a busy queue in one
direction relatively low, but the probability of packets from a
single exchange finding busy queuesin both directions is even lower.
Therefore, the best offset samples should occur with the lowest
delays.
Upon arrival of an NTP packet resulting from some poll interval at
time t=0, a shift register containing four variables (theta_i,
delta_i, e_i, t_i) is populated with the 0th sample, (theta_0,
delta_0, e_0, t_0). Here, e is the error (in seconds), which is
initially set to precision and grown at a rate r=15 ppm for each
epoch. If a packet has not arrived for three successive poll
intervals, then the sample (0, 0, 16, t) is shifted into the
register, where t is the last current known time. Missing data
Kasch, et al. Expires May 13, 2006 [Page 6]
Internet-Draft NTPv4 Algorithms Specification November 2005
samples that force this condition are never used in subsequent filter
calculations, but do prevent very old (i.e. stale) samples from being
used.
Next, the register contents are copied to a temporary list and sorted
by the metric lambda designed to avoid missing data and devalued
samples older than the compromise Allan intercept sigma_y(x) = 1500
s. The Allan intercept is the intersection coordinate (x, y) of the
phase and frequency lines. It characterizes each particular timing
source and clock oscillator. A useful statistic is the x value,
which specifies the optimum time constant for the particular source
and oscillator combination. The x value ranges from about 500 s to
2000 s. Above this value the performance is limited by oscillator
wander, while below this value the performance is limited by system
jitter. For comparison, the NTPv4 clock discipline time constant is
about 1000 s at a poll interval of 64 s. The y statistic represents
the best stability that can be achieved for the particular source and
oscillator, but is not useful for performance optimization. For this
reason, the term Allan intercept applies to the x value at the
intercept point.
If e_j = infinity, then lambda_j = infinity; else, if t_j-t >
sigma_y(x) then lambda_j = K_d + e_j; else, lambda_j = delta_j, where
K_d = 1 s is the selection threshold. The algorithm essentially
sorts the data by exchanging sets; however, an exchange is not made
unless to do so would reduce the metric by at least the value of the
precision. In other words, it does not make sense to change the
order in the list, which might result in the loss of otherwise good
samples, unless the metric change is significant. The first entry
(theta_0, delta_0, e_0, t_0) on the temporary list represents the
lowest delay sample, which is used to update the peer offset theta =
theta_0 and peer delay delta = d_0. The peer dispersion e is
calculated from the temporary list:
e=sum from k=0 to k=n-1 of [e_k/(2^(k+1))].
Finally, the temporary list is trimmed by discarding all entries
where lambda_j = infinity and all but the first devalued entry
lambda_j >= K_d, if one is present, leaving m (0 <= m < n) surviving
entries on the list. The peer jitter psi is used by the clustering
algorithm as a quality metric and in the computation of the expected
error:
psi=[ (1 / (m-1) ) * (sum from k = 1 to k = m-1 of [ (theta_k -
theta_0)^2) ]) ^ (1/2) ) ].
A 'popcorn spike' is a transient outlier, usually only a single
sample, that is typical of congested Internet paths. The popcorn
Kasch, et al. Expires May 13, 2006 [Page 7]
Internet-Draft NTPv4 Algorithms Specification November 2005
spike suppressor is designed to detect and remove them. Let
theta_prime be the peer offset determined by the previous message and
psi the current peer jitter. If |theta - theta_prime| > (K_s * psi),
where K_s is a tuning parameter that defaults to 3, the sample is a
popcorn spike and is discarded.
Note that the peer jitter will increase to protect a legitimate step
change.
As demonstrated by simulation and practical experience, it is prudent
to avoid using samples more than once. Let t_p be the epoch the peer
variables were last updated and t_0 the epoch of the first sample on
the temporary list. If t_0 <= t_p, the new sample is a duplicate or
earlier than the last one used. If this is true, the algorithm exits
without updating the system clock; otherwise, t_p = t_0 and the
offset can be used to update the system clock. The components of the
five-tuple (theta, delta, e, psi, t_p) are called the peer variables.
3. Clock Selection Algorithm
In order to provide reliable synchronization, NTP uses multiple
redundant servers and multiple disjoint network paths whenever
possible. When a number of associations are established, it is not
clear beforehand which are truechimers and which are falsetickers. A
'truechimer' is a clock that maintains timekeeping accuracy to a
previously published (and trusted) standard, while a 'falseticker' is
a clock that do not maintain that level of timekeeping accuracy.
Crucial to the success of this approach is a robust algorithm which
finds and discards the falsetickers from the raw server population,
since the timekeeping accuracy of a particular server may not be
known a priori. The clock selection algorithm determines from among
all associations a suitable subset of truechimers capable of
providing the most accurate and trustworthy time using principles
similar to. [4]
The true offset theta of a correctly operating clock relative to UTC
must be contained in a computable range, called the confidence
interval, equal to the root distance defined below. Marzullo and
Owicki devised an algorithm designed to find the intersection
interval containing the correct time given the confidence intervals
of m clocks, of which no more than f are considered incorrect. The
algorithm finds the smallest intersection interval containing points
in at least (m - f) of the given confidence intervals. [5]
The clock selection algorithm operates as follows:
Kasch, et al. Expires May 13, 2006 [Page 8]
Internet-Draft NTPv4 Algorithms Specification November 2005
1. For each of m associations, construct a correctness interval
[(theta - rootdist()), (theta + rootdist())].
2. Select the lowpoint, midpoint and highpoint of these
intervals. Sort these values in a list from lowest to highest.
Set the number of falsetickers f = 0.
3. Set the number of midpoints d = 0. Set c = 0. Scan from
lowest endpoint to highest. Add one to c for every lowpoint,
subtract one for every highpoint, add one to d for every midpoint.
If c >= m - f, stop; set l = current lowpoint
4. Set c = 0. Scan from highest endpoint to lowest. Add one to
c for every highpoint, subtract one for every lowpoint, add one to
d for every midpoint. If c >= m - f, stop; set u = current
highpoint.
5. Is d = f and l < u?
if yes, then follow step 5y, else, follow step 5n.
5y. Success: the intersection interval is [l, u].
5n. Add one to f. Is f < (m / 2)? If yes, then go to step 3
again. If no, then go to step 6.
6. Failure; a majority clique could not be found. Stop
algorithm.
4. Clustering Algorithm
NTP configurations usually include several servers in order to
provide sufficient redundancy for the selection algorithm to
determine which are truechimers and which are not. When a sizeable
number of servers are present, the individual clock offsets for each
are not always the same, even if each server is closely synchronized
to UTC by one means or another. Small systematic differences in the
order of a millisecond or two are usually due to interface and
network latencies. Larger differences are due to asymmetric delays
and in the extreme due to asymmetric satellite/landline delays.
The clustering algorithm sifts the truechimers of the selection
algorithm to identify the survivors providing the best accuracy. In
principle, the sift could result in a single survivor and its offset
estimate used to discipline the system clock; however, a better
estimate usually results if the offsets of a number of survivors are
averaged together. So, a balance must be struck between reducing the
Kasch, et al. Expires May 13, 2006 [Page 9]
Internet-Draft NTPv4 Algorithms Specification November 2005
selection jitter by casting off outlyers and improving the offset
estimate by including more survivors in the average.
The clustering algorithm steps follow:
1. Let (theta, phi, Lambda) represent a candidate peer with
offset theta, jitter j and a weight factor Lambda = stratum *
MAXDIST + rootdist().
2. Sort the candidates by increasing Lambda. Let n be the number
of candidates and NMIN the minimum number of survivors.
3. For each candidate compute the selection jitter jsubS (RMS
peer offset differences between this and all other candidates).
4. Select j_max as the candidate with maximum j_S.
5. Select j_min as the candidate with minimum j_S.
If yes, go to step 6y. If no, go to step 6n.
6y. Done. The remaining cluster survivors are correct. The
survivors are in the v. structure sorted by Lambda.
6n. Delete the outlyer candidate with j_max; reduce n by one, and
go back to step 3.
5. Clock Combining Algorithm
The selection and clustering algorithms operate to select a single
system peer based on stratum and root distance. The result is that
the NTP subnet forms a logical tree with the primary servers at the
root and other servers at increasing stratum levels toward the
leaves. However, since each server on the tree ordinarily runs the
NTP protocol with several other servers at equal or lower stratum,
these servers can provide diversity paths for backup and cross
checking. While these other paths are not ordinarily used directly
for synchronization, it is possible that increased accuracy can be
obtained by averaging their offsets according to appropriately chosen
weights.
The result of the clustering algorithm is a set of survivors (there
must be at least one) that represent truechimers, or correct clocks.
If only one peer survives or if the prefer peer is among the
survivors, that peer becomes the system peer and the combining
algorithm is not used. Otherwise, the final clock correction is
determined by the combining algorithm.
Kasch, et al. Expires May 13, 2006 [Page 10]
Internet-Draft NTPv4 Algorithms Specification November 2005
Let the three-tuple (theta_i, psi_i, Lambda_i) represent the peer
offset, peer jitter, and root distance for the ith survivor. Then
the combined peer offset and peer jitter is, respectively:
T = (a*sum) over all i of [theta_i / Lambda_i] and psi_r = (a * sum)
over all i of [(psi_i)^2 / Lambda_i]^(1/2),
where a is a normalization constant:
a=1/[sum over all i of [1 / Lambda_i].
The result T is the system offset processed by the clock discipline
algorithm. Note that the root distance cannot be less than the
precision in order to avoid divide exceptions.
Let psi_s represent the selection jitter associated with the system
peer and psi_r as above. Then the system jitter is defined as:
sj=[(psi_r)^2 + (psi_s)^2]^(1/2).
The system jitter represents the best estimate of error in computing
the clock offset. It is interpreted as the expected error statistic
available to application program.
6. Polling Algorithm
The poll process determines whether and when to send a poll message
to the server. Ordinarily, polls are sent at regular intervals
determined by the clock discipline time constant. In some cases
where justified by network load, performance can be improved and
network jitter reduced by sending several messages instead of just
one. This can be done when the server is unreachable, when it is
reachable or both. The most common cases where this is advisable is
when using very large poll intervals in the order of several hours or
more
The poll interval starts out normally at about one minute. If the
offset is less than a tuning constant times the system jitter for
some number of polls, it is increased, but usually not above 1024
seconds Otherwise, it is decreased, but usually not below 64 seconds.
The limits can be changed to a lower limit of 16 seconds and/or to an
upper limit of 36 hours. In order to minimize network traffic, when
a server has not been heard for some time, the poll interval is
increased in stages to 1024 seconds.
The poll process sends packets to the server at designated intervals
tau and updates the "reach" register which establishes whether the
Kasch, et al. Expires May 13, 2006 [Page 11]
Internet-Draft NTPv4 Algorithms Specification November 2005
server is reachable. Table 2 shows the poll process routines and
Table 3 the variables shared by the process routines, including
poll(), peer_xmit(), fast_xmit() and poll_update().
Table 2. Poll Process Routines
=====================================================================
| Name | Description | Related routines |
---------------------------------------------------------------------
| poll | poll | *clock_adjust, clock_filtert, |
| | | peer_xmit, poll_update |
---------------------------------------------------------------------
| poll_update | poll update | *packet, *poll |
---------------------------------------------------------------------
| peer_xmit | peer transmit | *poll, md5 |
---------------------------------------------------------------------
| fast_xmit | fast transmit | *receive, md5 |
---------------------------------------------------------------------
Table 3. Poll Process Variables
=====================================================================
| Name | Process | Variable Description |
---------------------------------------------------------------------
| hpoll | poll | host poll interval |
| hmode | poll | host mode |
| count | poll | burst counter |
| reach | poll | "reach" register |
| unreach | poll | unreach counter |
| t | local clock | current time |
| tau | local clock | poll interval |
| rho | system | system peer |
| M_BCST | parameter | broadcast server |
| M-BCLN | parameter | broadcast client |
| B_BURST | peer flag | burst enable |
| B_IBURST | peer flag | initial burst enable |
| B_COUNT | parameter | pkts in a burst |
---------------------------------------------------------------------
The poll() routine is described in Figure 2. Each time the poll()
routine is called, the reach variable is shifted left by one bit.
When a packet is accepted by the packet() routine in the peer process
the rightmost bit is set to one. As long as reach is nonzero, the
server is considered reachable. However, if the rightmost three bits
become zero, indicating that packets from the server have not been
received for at least three poll intervals, a sample with MAXDIST
dispersion is shifted in the clock filter. This causes the server to
be devalued in the mitigation process. The unreach counter
increments at each poll interval; it is reset to zero if the reach
register is nonzero. If the counter exceeds the UNREACH parameter,
Kasch, et al. Expires May 13, 2006 [Page 12]
Internet-Draft NTPv4 Algorithms Specification November 2005
the poll exponent is incremented for each succeeding poll. This
reduces useless network load in case of server failure. The poll()
routine can operate in three modes.
Ordinarily, polls are sent at the interval selected by hpoll and
ppoll poll exponents assigned. However, if the iburst feature is
enabled and the server is not reachable, a burst of eight polls is
sent at two-second intervals. Alternatively or in addition, if the
burst feature is enabled and the server is reachable, a burst of
eight polls is sent as with iburst. This is especially useful at
very large poll intervals of many hours. The remaining routines are
straightforward. The poll() routine calls the peer_xmit() routine
when an association has been mobilized. The receive() routine calls
fast_xmit() when a client mode packet is received. Both cases are
shown in Figure 3. These routines copy values from the association
(peer_xmit()) or from the arriving packet (fast_xmit()) as shown in
the accompanying tables. The poll_update() routine shown in Figure 4
determines the next poll interval or burst interval. Variable names
in both routines are referenced in Tables 4 and 5, respectively.
--------------- -----------------
| poll() |-->| hmode=M_BCST? |
--------------- -----------------
if hmode=M_BCST == YES:
---------------
|s.rho = NULL?|
---------------
if s.rho=NULL == YES:
--------------- ---------------
|poll_update()|-->| exit() |
--------------- ---------------
if s.rho=NULL == NO:
---------------
|hmode=M_BCLN?|
---------------
if hmode=M_BCLN == YES:
--------------- ---------------
|poll_update()|-->| exit() |
--------------- ---------------
if hmode_M_BCLN == NO:
--------------- --------------- ---------------
| peer_xmit() |-->|poll_update()|-->| exit() |
--------------- --------------- ---------------
if hmode=M_BCST == NO:
---------------
| burst = 0? |
---------------
if burst = 0 == YES:
--------------- ---------------
Kasch, et al. Expires May 13, 2006 [Page 13]
Internet-Draft NTPv4 Algorithms Specification November 2005
| reach <<=1 |-->| reach = 0? |
--------------- ---------------
if reach = 0 == YES:
----------------------
| unreach > UNREACH? |
----------------------
if unreach > UNREACH == YES:
--------------- --------------- -----go to-----
| hpoll++ |-->| unreach++ |-->|hmode=M_BCLN?|
--------------- --------------- ---------------
(go to hmode=M_BCLN routine earlier in the figure)
if unreach > UNREACH == NO:
---------------
| B_IBURST? |
---------------
if B_IBURST == YES:
---------------
| fit()? |
---------------
if fit() == YES:
---------------- -----go to-----
| burst=BCOUNT |-->| unreach++ |
---------------- ---------------
if fit() == NO:
-----go to-----
| unreach++ |
---------------
if B_IBURST == NO:
-----go to-----
| unreach++ |
---------------
if reach = 0 == NO:
---------------
| unreach = 0?|
---------------
if unreach = 0 == YES:
--------------------
| reach & 0x7 = 0? |
--------------------
if reach & 0x7 = 0 == YES:
---------------------------
| clock_filter(0,0,inf,t) |
---------------------------
\|/
-----------------
| hpoll = c,tau |
-----------------
\|/
Kasch, et al. Expires May 13, 2006 [Page 14]
Internet-Draft NTPv4 Algorithms Specification November 2005
-----------------
| B_BURST? |
-----------------
if B_BURST == YES:
---------------
| unreach = 0?|
---------------
if unreach = 0 == YES:
--------------- -----go to-----
|burst=BCOUNT |-->|hmode=M_BCLN?|
--------------- ---------------
if unreach = 0 == NO:
-----go to-----
|hmode=M_BCLN?|
---------------
if B_BURST == NO:
-----go to-----
|hmode=M_BCLN?|
---------------
if unreach = 0 == NO:
-----go to-----
| hpoll=c,tau |
---------------
if burst = 0 == NO:
--------------- -----go to-----
| burst-- |-->|hmode=M_BCLN?|
--------------- ---------------
END OF ROUTINE
Figure 2. Poll Routine
Table 4. Peer Fast Transmit Table
=====================================================================
| Variable Name | Description |
---------------------------------------------------------------------
| T1 | Origin Timestamp in NTP packet field |
| T2 | Receive Timestamp in NTP packet field |
| T3 | ??? |
| mac | ??? |
---------------------------------------------------------------------
Kasch, et al. Expires May 13, 2006 [Page 15]
Internet-Draft NTPv4 Algorithms Specification November 2005
---------------------------
| peer_xmit(),fast_xmit() |
---------------------------
\|/
---------------------------
| *copy header |
---------------------------
\|/
---------------------------
| T1, T2 |
---------------------------
\|/
---------------------------
| T3 = clock |
---------------------------
\|/
---------------------------
| mac = md5() |
---------------------------
\|/
---------------------------
| xmitpacket() |
---------------------------
\|/
---------------------------
| exit |
---------------------------
Figure 3. Peer Fast Transmit
Table 5. Poll Process Variables
=====================================================================
| Name | Process | Variable Description |
---------------------------------------------------------------------
| ppoll | peer | peer poll interval |
| hpoll | poll | host poll interval |
| burst | poll | burst counter |
| timer | poll | poll timer |
| BTIME | parameter | burst time |
| MINPOLL | parameter | minimum poll interval |
| MAXPOLL | parameter | maximum poll interval |
---------------------------------------------------------------------
Kasch, et al. Expires May 13, 2006 [Page 16]
Internet-Draft NTPv4 Algorithms Specification November 2005
---------------------------
| poll_update() |
---------------------------
\|/
---------------------------
| burst = 0? |
---------------------------
if burst = 0 == YES:
--------------------------------------
|hpoll=max(min(MAXPOLL,poll),MINPOLL)|
--------------------------------------
\|/
--------------------------------------
|timer=(max(min(ppoll,hpoll),MINPOLL)|
--------------------------------------
\|/
--------------------------------------
| exit |
--------------------------------------
if burst = 0 == NO:
--------------------------------------
| timer running? |
--------------------------------------
if timer running == YES:
--------------------------------------
| exit |
--------------------------------------
if timer running == NO:
--------------------------------------
| timer = BTIME |
--------------------------------------
\|/
--------------------------------------
| exit |
--------------------------------------
END OF ROUTINE
Figure 4. Poll Update
7. Clock Discipline Algorithm
The clock discipline algorithm synchronizes the computer clock with
respect to the best time value from each server and the best
combination of servers. This algorithm automatically adapts to
changes in operating environment without manual configuration or
real-time management functions. The clock discipline algorithm is
implemented as the feedback control system shown in Figure 5.
Kasch, et al. Expires May 13, 2006 [Page 17]
Internet-Draft NTPv4 Algorithms Specification November 2005
---------
thetar + | \ +----------------+
NTP --------->| Phase \ V_d | | V_s
thetac - | Detector ------>| Clock Filter |-----+
+-------->| / | | |
| | / +----------------+ |
| --------- |
| |
----- |
/ \ |
| VFO | |
\ / |
----- +-------------------------------------+ |
^ | Loop Filter | |
| | | |
| | +---------+ x +-------------+ | |
| | | |<-----| | | |
+------|-| Clock | y | Phase/Freq |<---|------+
| | Adjust |<-----| Prediction | |
| | | | | |
| +---------+ +-------------+ |
| |
+-------------------------------------+
Figure 5. Clock Discipline Algorithm
The variable theta_r represents the combined server reference phase
and theta_c the control phase of the VFO. Each update received from
a server produces a signal V_d representing the instantaneous phase
difference theta_r - theta_c. The clock filter for each server
functions as a tapped delay line, with the output taken at the tap
selected by the clock filter algorithm. The selection, clustering
and combining algorithms combine the data from multiple filters to
produce the signal V_s. The loop filter, with impulse response F(t),
produces the signal V_c which controls the VFO frequency omega_c.
Thus, its phase theta_c follows:
theta_c = integral over t of (omega_c(t) dt)
which closes the loop. The V_c signal is generated by an adjustment
process which runs at intervals of one second in the NTP daemon or
one tick in the kernel. are set to 0,
The NTPv4 discipline includes both PLL and FLL capabilities. The
selection of which mode to use, FLL or PLL and in what combination,
is made on the basis of the poll exponent tau. In the NTPv4 design,
PLL mode is used at smaller values of tau, while FLL mode is used at
larger values. In between, a combination of PLL and FLL modes is
Kasch, et al. Expires May 13, 2006 [Page 18]
Internet-Draft NTPv4 Algorithms Specification November 2005
used. This improves the clock accuracy and stability, especially for
poll intervals larger than the Allan intercept.
x <------(Phase Correction)<--.
|
y_FLL |
.-(FLL Predict)<-------+<--V_s
| |
\|/ |
y <--(Sum) |
^ |
| |
'-(PLL Predict)<-------'
y_PLL
Figure 6. FLL/PLL Prediction Functions
In PLL mode y is a time integral over all past values of V_s, so the
PLL frequency adjustment required is:
y_PLL = [ (V_s * mu) / ( (64 * T_c) ^ 2) ].
where T_c is the time constant. In FLL mode, is an average of past
frequency changes, as computed from V_s and mu. The goal of the
algorithm is to reduce V_s to zero; so, to the extent this has been
successful in the past, previous values can be assumed zero and the
average becomes:
y_FLL = [ (V_s - x) / (8 * mu) ].
where x is the residual phase error computed by the clock adjust
process.
Finally, in both PLL and FLL modes set the phase to x = V_s and
frequency y = [y + y_PLL + y_FLL].
Once each second the adjustment process computes a phase increment z
= [ x / (16 * T_c) ] and new phase adjustment x = x - z. The phase
increment z is passed to the kernel time adjustment function. This
continues until the next update which recomputes x and y.
A key factor in the performance of the PLL/FLL hybrid algorithm are
the weight factors for the y_PLL and y_FLL adjustments, which depend
on the poll exponent tau which in turn determines the time constant
T_c = (2 ^ tau), in seconds. PLL contributions should dominate at
the lower values of tau, while FLL contributions should dominate at
the higher values. The clock discipline algorithm response times to
several PPM deviation examples is presented in . [3]
Kasch, et al. Expires May 13, 2006 [Page 19]
Internet-Draft NTPv4 Algorithms Specification November 2005
7.1. Poll Interval Control
The NTPv4 algorithm aims to set the averaging time somewhere near the
Allan intercept. A key to this strategy is the measured clock jitter
and oscillator wander statistics. The clock jitter is estimated from
phase differences psi_c = ( <Delta_x^2> ^ (1/2) ), where the brackets
indicate an exponential average. The oscillator wander is estimated
from frequency differences psi_f = (T_c * <Delta_y^2> ^ (1/2) ). As
the poll exponent tau increases, it is expected that psisubc will
decrease and psisubf will increase, depending on the relative
contributions of phase noise and frequency noise.
In the NTPv4 algorithm at each update a counter is incremented by one
if x is within the bound |x|< (4 * psi_c), where the constant 4 is
determined by experiment, and decremented by one otherwise.
In order to avoid needless hunting, a degree of hysteresis is built
into the scheme. If the counter reaches an upper limit 30, tau is
increased by one; if it reaches a lower limit -30, tau is reduced by
two. In either case the counter is reset to zero. Under normal
conditions tau increases in stages from a default lower limit of 6
(64 s) to a default upper limit of 10 (1024 seconds). However, if
the wander increases because the oscillator frequency is deviating
too fast, tau is quickly reduced. Once the oscillator wander
subsides, tau is slowly increased again. Under typical operating
conditions, tau hovers close to the maximum; but, on occasions of a
heat spike when the oscillator wanders more than about 1 PPM, it
quickly drops to lower values until the wander subsides.
7.2. State Machine
The clock discipline must operate over an extremely wide range of
network jitter and oscillator wander conditions without manual
intervention or prior configuration. As determined by past
experience and experiment, the data grooming algorithms work well to
sift good data from bad, especially under conditions of light to
moderate network and server loads. The PLL/FLL hybrid algorithm may
perform poorly and even become unstable under heavy network loading.
The state machine functions to bypass some discipline functions under
conditions of hardware or software failure, severe time or frequency
transients and especially when the poll interval is relatively large.
Under normal conditions the NTP discipline algorithm writes the
current frequency offset to a file at hourly intervals. Once the
file is written and the daemon is restarted after reboot, for
example, it initializes the frequency offset from the file, which
avoids the training time, possibly several hours, to determine the
intrinsic frequency offset when the daemon is started for the first
Kasch, et al. Expires May 13, 2006 [Page 20]
Internet-Draft NTPv4 Algorithms Specification November 2005
time. When toll charges accrue for every NTP message, as in a
telephone modem service, it is important to determine the presence of
a possibly large intrinsic frequency offset, especially if the
interval between telephone calls must be 15 minutes or more. For
instance, without the state machine it might take many calls spaced
at 15 minutes until the frequency offset is determined and the call
spacing can be increased. With the state machine it usually takes
only two calls to complete the process.
The clock state machine transition function is shown in Table 6. It
determines the action and next state when an update with specified
offset occurs in a given state shown in the first column. The second
column shows what happens if the offset is less than the step
threshold, the third when the step threshold is exceeded but not the
stepout threshold and the third when both thresholds are exceeded.
The state machine responds to the current state and event to cause
the action shown.
Table 6. Clock State Machine Transition Function
=====================================================================
| State | abs(T) < STEP | abs(T) > STEP | Comments |
---------------------------------------------------------------------
| NSET | > FREQ; adjust | > FREQ; step | no frequency |
| | time | time | file |
---------------------------------------------------------------------
| FSET | > SYNC; adjust | > SYNC; step | frequency file |
| | time | time | |
---------------------------------------------------------------------
| SPIK | > SYNC; adjust | if (<900 s)>SPIK | outlier detected |
| | freq, adjust time | else SYNC; step | |
| | | freq; step time | |
---------------------------------------------------------------------
| FREQ | if (<900 s)> FREQ | if (<900 s)>FREQ | initial frequency |
| | else >SYNC; step | else >SYNC; step | |
| | freq, adjust time | freq, adjust time | |
---------------------------------------------------------------------
| SYNC | >SYNC; adjust freq| if (<900 s)>SPIK | normal operation |
| | adjust time | else >SYNC; step | |
| | | freq; step time | |
---------------------------------------------------------------------
The actions listed in the state diagram include adjust-frequency,
step-frequency, adjust-time and step-time actions. The normal action
in the SYNC state is to adjust both frequency and time. The step-
time action is to set the system clock, while the step-frequency
action is to calculate the frequency offset directly. This has to be
done carefully to avoid contamination of the frequency estimate by
the phase adjustment since the last update.
Kasch, et al. Expires May 13, 2006 [Page 21]
Internet-Draft NTPv4 Algorithms Specification November 2005
The machine can be initialized in two states, FSET if the frequency
file is present or NSET if it has not yet been created. If the file
is not present, this may be the first time the discipline has ever
been activated, so it may have to quickly determine the oscillator
intrinsic frequency offset. It is important to realize that a number
of NTP messages may be exchanged before the mitigation algorithms
determine a reliable time offset and call the clock discipline
algorithm.
When the first valid offset arrives in the NSET state, (1) the time
is stepped to that offset, if necessary, (2) the watchdog counter is
started and (3) the machine exits to the FREQ state. Subsequently,
updates will be ignored until the stepout threshold has been reached,
at which time the frequency is stepped, the time is stepped if
necessary, and the machine exits to SYNC state. When the first valid
offset arrives in the FSET state, the frequency has already been
initialized, so the machine does the same things as in the NSET
state, but exits to the SYNC state.
In the SYNC state the machine watches for outliers above the step
threshold. If one is found, the machine exits to SPIK state and
starts the watchdog timer. If another offset less than the step
threshold is found, the counter is stopped and the machine exits to
the SYNC state. If the watchdog timer reaches the stepout threshold,
the time and frequency are both stepped as required and the machine
exits to the SYNC state.
8. Security Considerations
There are no security considerations.
9. IANA Considerations
There are no IANA considerations.
10. Acknowledgements
11. References
11.1. Normative References
Kasch, et al. Expires May 13, 2006 [Page 22]
Internet-Draft NTPv4 Algorithms Specification November 2005
11.2. Informative References
[1] Mills, D., "Network Time Protocol (Version 3) Specification,
Implementation", RFC 1305, March 1992.
[2] Mills, D., "Simple Network Time Protocol (SNTP) Version 4 for
IPv4, IPv6 and OSI", RFC 2030, October 1996.
[3] "D. Mills, "Computer Network Time Synchronization: The Network
Time Protocol", DRAFT.", .
[4] "Dolev, D., Halpern, J., Simons, B., and Strong R., "Dynamic
Fault-Tolerant Clock Synchronization," JACM 42, January 1995,
pp. 143-185.", .
[5] "Berthaud, J.M., "Time Synchronization over Networks using
Convex Closures," IEEE/ACM Transactions on Networking, April
2000, pp. 265-277.", .
[6] "Burbank, J., Martin, J., and Mills, D., Network Time Protocol
Version 4 Protocol Specification,
<draft-ietf-ntp-ntpv4-proto-01.txt>, October 2005, work in
progress.", .
Kasch, et al. Expires May 13, 2006 [Page 23]
Internet-Draft NTPv4 Algorithms Specification November 2005
Authors' Addresses
William Kasch (editor)
The Johns Hopkins University Applied Physics Lab
11100 Johns Hopkins Road
Laurel, Maryland 20723-6099
United States
Phone: +1 443 778 7463
Email: william.kasch@jhuapl.edu
Jack Burbank (editor)
The Johns Hopkins University Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, Maryland 20723-6099
United States
Phone: +1 443 778 7127
Email: jack.burbank@jhuapl.edu
Dr. David L. Mills
University of Delaware
Newark, Delaware 19716
United States
Phone: +1 302 831 8247
Email: mills@udel.edu
Kasch, et al. Expires May 13, 2006 [Page 24]
Internet-Draft NTPv4 Algorithms Specification November 2005
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.
Kasch, et al. Expires May 13, 2006 [Page 25]