State machines for large scale computer software and systems
Yodaiken, Victor
2016-08-04
Abstract. This note shows how the behavior and architecture of large scale discrete state systems found in computer software and hardware can be specified and analyzed using a particular class of primitive recursive functions. First, the utility of the method is illustrated by a number of small examples and a longer specification and verification of the "Paxos" distributed consensus algorithm [Lam01]. These "sequence maps" are then shown to provide an alternative representation of deterministic state machines and algebraic products of state machines. Distributed and composite systems, parallel and concurrent computation, and real-time behavior can all be specified naturally with these methods, which require neither extensions to the classical state machine model nor any axiomatic methods or other techniques from "formal methods". Compared to state diagrams or tables or the standard set-tuple-transition-maps, sequence maps are more concise and better suited to describing the behavior and compositional architecture of computer systems. Staying strictly within the boundaries of classical deterministic state machines and primitive recursion anchors the methods to the algebraic structures of automata and semigroups and makes the specifications faithful to engineering practice. While sequence maps are reasonably obvious representations of state machines, they have not previously been employed as is done here, and the techniques introduced here for applying, defining and composing them are novel.

Keywords: automata, recursion, specification, concurrency

arXiv:1608.01712v8 [cs.FL] 7 Jun 2023 (Preprint)

1. Introduction

This note has three objectives:

• To introduce a method involving certain recursive functions for specifying and verifying large scale discrete state systems.
• To illustrate the application of the method to analysis of problems in computer systems design, including network algorithms and real-time.

• To precisely define the class of recursive functions of interest and show how they relate to classical state machines, automata products, and algebraic automata theory.

The problem addressed here is not how to validate code, but how to understand and validate designs prior to writing code. Certainly, there is an extensive literature on this topic (see [BR23] for one survey) but the approach taken here is unusual. Some of this is due to the author's experience designing, writing code, and managing design and development projects for operating systems, real-time, and other "systems" [BBG83, YB97, DMY99, DY16] and interest in solving problems specific to that field. And some is due to the mathematical approach, which is minimalistic. The work here is based on classical deterministic state machines and recursive functions. There are no extensions or additions to the state machines, no infinite sequences, not even any non-determinism, and no use of the axiomatic or formal methods that have been nearly ubiquitous in the computer science "specification" literature since the 1970s. See section 5 for some discussion of the motivations. Section 2 briefly describes the method and provides examples of applications ranging from simple counters to a PID controller to communicating threads. Section 3 applies the method to a specification and verification of the notoriously hard to understand "Paxos" distributed consensus algorithm, as an attempt to validate use outside of "toy" examples. Section 4 defines the class of primitive recursive sequence functions and shows precisely how they relate to Moore type automata and Moore machine products. Readers who are not persuaded by the appeals to intuition in the first two sections might want to read section 4 first. Section 5 discusses motivation, background and related work in more detail.
Correspondence and offprint requests to: Victor Yodaiken, [email protected]

There is no discussion here of automated or mechanical proofs, only because that would be a secondary step that is beyond the scope of this paper.

2. Introduction to applications and method

The basic idea here is to specify the behavior of discrete state systems with maps on finite sequences of events. The intended meaning is that f(w) is the output of a system or component described by f in the state reached by following w from the initial state. These maps capture the input/output behavior of Moore machines, state machines that have an output value associated with each state [Moo64, HU79].

Input sequence w → MooreMachine → Output x

One type of function composition modifies the output F(w) = h(f(w)) or combines systems that "see" the same events F(w) = h(f_1(w), ..., f_n(w)) and operate in parallel. Systems constructed by interconnecting components can be specified via function composition using dependent sequence variables: F(w) = (f_1(u_1), ..., f_n(u_n)) where u_i = u_i(w) and u_i : A* → A_i* may depend on both w and the recursive values of other components. The intended meaning is that f_i(u_i) is the output of the system or component specified by f_i when f_i is connected within an enclosing system via u_i. Sequence maps with this type of composition capture the "architecture" of Moore machine products [Har64] where component machines change state in parallel or concurrently. As shown in the examples below, every type of communication from signals on wires to synchronous messages can be described in terms of products. Crucially, products of Moore machines are just plain Moore machines and composite sequence maps are just sequence maps - the product operation is a closed one.

          → u_1 → M_1 → x_1
          → u_2 → M_2 → x_2
input →     ...                 → Output (x_1, ..., x_n)
          → u_n → M_n → x_n
          ⇑ feedback ⇐ (x_1, ..., x_n)

Sequence maps of both kinds can be concisely defined with "primitive recursion on words" [Pet82]. Let ǫ be the empty sequence. Then f(ǫ) is the output in the initial state. Let w·a be the sequence obtained by appending event a to finite event sequence w on the right. Then f(ǫ) = c and f(w·a) = g(f(w), a) defines f in every reachable state. (All sequences in this note are finite.) To illustrate:

Counter(ǫ) = 0 and Counter(w·a) = Counter(w) + 1 mod k

counts events mod k for positive integer k. Counter is really a family of state systems with k as a parameter. The equations U(ǫ) = 0 and U(w·a) = U(w) + 1 define an unbounded counter. A wary reader might note here that, despite all the protestations about classical automata theory, U does not describe a finite state Moore machine. More on this in sections 4 and 5, but while finiteness is a critical and necessary property of systems, sometimes we don't know or care about the bound and sometimes it's useful to have possibly unbounded imaginary systems to help measure or constrain behavior. A "saturation" counter with an alphabet of numbers that counts consecutive non-zero inputs could be defined by

RCounter(ǫ) = 0,  RCounter(w·a) = { 0 if a = 0; min{RCounter(w) + 1, k − 1} otherwise }

To raise an alarm if the counter reaches saturation let:

Alarm(w) = { 1 if RCounter(w) = k − 1; 0 otherwise }

Suppose for some 0 ≤ e < k, C is an approximate mod k counter with bound e if and only if for all w: |C(w) − Counter(w)| ≤ e. Any sequence map that is a solution to this inequality has a useful property, but we are not required to specify exactly what the value is in any particular state. Sequence maps specify deterministic systems but systems of interest are often not completely specified.
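The counter definitions above are just folds over the event sequence. A minimal Python sketch (the function names and the choice k = 4 are mine, for illustration only):

```python
from functools import reduce

k = 4  # modulus / saturation bound (illustrative choice)

def counter(w):
    """Counter(ǫ) = 0, Counter(w·a) = Counter(w) + 1 mod k."""
    return reduce(lambda s, a: (s + 1) % k, w, 0)

def rcounter(w):
    """Saturating count of consecutive non-zero inputs."""
    return reduce(lambda s, a: 0 if a == 0 else min(s + 1, k - 1), w, 0)

def alarm(w):
    """1 exactly when RCounter has reached saturation k - 1."""
    return 1 if rcounter(w) == k - 1 else 0
```

With k = 4, counter counts every event mod 4, while alarm(w) = 1 after three or more consecutive non-zero inputs.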
Rather than positing some irreducible "non-determinism", we can think of these maps as solutions to constraints which may have many different solutions (see Broy [Bro10] for a similar, more elaborated discussion). We can even leave the input alphabet fully or partially unspecified when convenient. The inputs for a control system could be samples of a signal over some range. A simple controller that produces a control signal might have a fixed set point κ_0 and an error: E(ǫ) = 0 and E(w·x) = κ_0 − x where x is a signal sample. The change in the last error signal (the "derivative") is D(ǫ) = 0 and D(w·x) = E(w) − (κ_0 − x). The sum of all errors is S(ǫ) = 0 and S(w·x) = S(w) + (κ_0 − x), although we would probably want, at some point of development, to force a bound on this sum or to show that it is bounded by some external condition such as an oscillation around the x-axis. Then the control signal might be calculated using three parameters as follows:

Control(w) = κ_1·E(w) + κ_2·D(w) + κ_3·S(w)

Perhaps there should be an alarm if the controller does not keep the error within some bound κ_4 for k consecutive samples, and a reset counter can be connected to the error calculation to trigger on that. Define:

u(ǫ) = ǫ,  u(w·x) = { u(w)·1 if |x − κ_0| ≥ κ_4; u(w)·0 otherwise }

In this definition, 0 and 1 are being used as event symbols when they are appended to the sequence. Then Alarm(u(w)) = 1 when there have been too many out of range inputs in a row. If each event corresponds to a fixed time sample of the input signal, calculating time is straightforward: the alarm condition becomes true only if the input signal has been outside of the required error bound for k or more time units, and U(w) is how much time has passed since the initial state. If there is a physics model of the plant, then U(w) should approximate the real valued time variable. For example, we could easily require that if t is a real-valued time variable in seconds then U(w) ≈ t, perhaps so that it is within 10 picoseconds or whatever level of precision is appropriate. Suppose there are n control systems embedded within a system with n inputs so that events are vectors ~x = (x_1, ..., x_n). Say

u_i(ǫ) = ǫ,  u_i(w·~x) = u_i(w)·(~x)_i

Then Control_i(w) = Control(u_i(w)) is a controller that operates as specified on the indicated signal and Alarm(u(u_i(w))) is the alarm for controller i. More generally, the usual format for interconnection is F(w) = (f_1(u_1), ..., f_n(u_n)) where u_i = u_i(w) and u_i(ǫ) = ǫ and u_i(w·a) = u_i(w)·g_i(F(w), a). The idea is that if f_i specifies some state system which will be a component, then f_i(u_i(w)) is the output of that component when it is enclosed in the system defined by the connectors u_1, ..., u_n. For example, it may be that u_3(w·a) = u_3(w)·g_3(f_3(u_3(w)), f_1(u_1(w)), f_4(u_4(w)), a). The alphabet of the composite system does not need to be the same as the ones for the components, which may all have different event alphabets: the connector translates as well as interconnecting. Consider a system of connected processes with synchronous messages. The specification can be split into a specification of the processes P : B* → X for process event alphabet B, followed by a specification of the interconnects u_x : A* → B* for composite system event alphabet A. The idea is that P_x(u_x(w)) is the output of process x in the state determined by the combination of w and the interconnects. Possibly the processes are defined identically and just differ on what input they receive, or they all could be different from each other.
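The controller equations above fold into a single pass over the sample sequence, and the vector connectors u_i are plain projections. A Python sketch, with illustrative gains and set point of my own choosing (none of these values come from the paper):

```python
from functools import reduce

KAPPA0 = 10.0                 # set point (illustrative)
K1, K2, K3 = 0.5, 0.1, 0.01   # illustrative gains

def eds(w):
    """Fold samples into (E, D, S): last error, change in error
    D(w·x) = E(w) - (κ0 - x), and sum of errors."""
    def step(state, x):
        e_prev, _, s = state
        e = KAPPA0 - x
        return (e, e_prev - e, s + e)
    return reduce(step, w, (0.0, 0.0, 0.0))

def control(w):
    """Control(w) = κ1·E(w) + κ2·D(w) + κ3·S(w)."""
    e, d, s = eds(w)
    return K1 * e + K2 * d + K3 * s

def project(i, w):
    """Connector u_i for vector events: u_i(ǫ) = ǫ, u_i(w·~x) = u_i(w)·(~x)_i."""
    return [x[i] for x in w]
```

Then control(project(0, samples)) plays the role of Control_0 on a stream of vector events.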
Suppose there is a set of process identifiers and each process has output either (y) to request a message from process y, or (y, v) to send value v to process y, or step to do some internal processing with no I/O in this state. The events then are pairs (y, v) for a message v from process y, events (y) for success in sending a message to process y, and step for some internal computation.

Inputs: B            Outputs: X
step                 step
(y)       → P →      (y)
(y, v)               (y, v)

Connector u_x : A* → B* for each component P_x can be sketched out by u_x(ǫ) = ǫ and

u_x(w·a) = { u_x(w)·step   if P_x(u_x(w)) = step;
             u_x(w)·(y, v) if P_x(u_x(w)) = (y) and P_y(u_y(w)) = (x, v);
             u_x(w)·(y)    if P_x(u_x(w)) = (y, v) for any v and P_y(u_y(w)) = (x);
             u_x(w)        otherwise }

This specification provides synchronous communication so that a process waiting for a message from process y or trying to send a message to process y will make no progress until process y has a matching output - if ever. Such systems are known for their propensity to deadlock: e.g. if P_x(u_x(w)) = (y) and P_y(u_y(w)) = (x), or if larger cycles are created. Some solutions require strict ordering of interactions, but it's more common to add timeouts. Suppose we have a map τ : A → R associating each discrete system event in A with a duration in seconds. Define Blocked_x(ǫ) = 0 and

Blocked_x(w·a) = { τ(a) + Blocked_x(w) if P_x(u_x(w)) = (y) and P_y(u_y(w)) ≠ (x, v) for any v,
                                        or P_x(u_x(w)) = (y, v) for some v and P_y(u_y(w)) ≠ (x);
                   0 otherwise }

Then we could add a new timeout event to B, a constant κ, and an additional case to u_x where u_x(w·a) = u_x(w)·timeout if Blocked_x(w) > κ. By way of contrast, the network in section 3 is completely asynchronous.

3. The Paxos protocol

The first part of this section is a general packet network connecting network agents. The second part adds constraints to the agents so they obey the Paxos protocol.
The goal is to formalize the exposition in [Lam01] with at least a little more precision.

3.1. Packet network

Consider a networked agent which could be a device, a program, or an operating system, and only changes state when it receives or sends a message. For simplicity, I will assume that these agents are sequential - they can receive a single message or send a single message each step but can't, for example, receive and send at the same time. Each agent that will be connected on the network has a unique identifier - also for simplicity. There is a set of messages Messages, a set of identifiers Ids, and a map (not dependent on events) source : Messages → Ids which is intended to tag a message with the identifier of the agent that sent it. The output of such a device is either a message to send or something else which indicates it has no message to send. The input alphabet must contain Messages and an additional event xmit ∉ Messages that indicates to the agent that the message it was sending on output has transmitted.

Definition 3.1. G : A* → X is a Messages, Ids network agent with identifier i ∈ Ids only if: source : Messages → Ids, Messages ⊂ A, xmit ∈ A, and G(w) ∈ Messages only if source(G(w)) = i (it correctly labels the source of messages it sends).

Definition 3.2. The cumulative set of messages received in the state determined by q ∈ A* is given by:

Received(ǫ) = ∅,  Received(q·a) = { Received(q) ∪ {a} if a ∈ Messages; Received(q) otherwise }

Definition 3.3. The cumulative set of messages transmitted by G in the state determined by q is given by:

Sent(G, ǫ) = ∅,  Sent(G, q·a) = { Sent(G, q) ∪ {G(q)} if G(q) ∈ Messages and a = xmit; Sent(G, q) otherwise }

The first argument is necessary because, given the same sequence of inputs, different agents might transmit different messages.

Lemma 1. If m ∈ Sent(G, q) then source(m) = i.

The proof is simple, but I'm going to use the same inductive proof method multiple times so it's worth going into a bit of detail. By the definition of Sent, Sent(G, ǫ) = ∅ so the claim is trivially true for a sequence of length 0. Suppose the claim is true for q. If m ∈ Sent(G, q) then source(m) doesn't change with any event, so the claim is true for q·a. If m ∉ Sent(G, q·a) the claim is also trivially true. But if m ∉ Sent(G, q) and m ∈ Sent(G, q·a) then, by the definition of Sent, G(q) = m which, by definition 3.1, means source(G(q)) = i.

Definition 3.4. A standard packet network consists of a collection of Messages, Ids agents, G_i for i ∈ Ids where each G_i has identifier i, and connection maps u_i : B* → A* so that:

1. Each component starts in the initial state and advances by 0 or 1 steps on each network event: u_i(ǫ) = ǫ and u_i(w·b) = u_i(w) or u_i(w·b) = u_i(w)·a for some a ∈ A.

2. And a message can only be delivered if it was previously sent: u_i(w·b) = u_i(w)·m for some m ∈ Messages only if, for j = source(m), m ∈ Sent(G_j, u_j(w)).

3. And a message is transmitted only if it is being output by the agent: u_i(w·b) = u_i(w)·xmit if and only if G_i(u_i(w)) ∈ Messages.

The network can lose or reorder messages, but can never deliver spurious messages:

Lemma 2. If m ∈ Received(u_i(w)) then, for j = source(m), m ∈ Sent(G_j, u_j(w)).

Proof: Trivially true for ǫ. Suppose it is true for w. If m ∈ Received(u_i(w)) then by the inductive hypothesis m ∈ Sent(G_j, u_j(w)) and then, by definition, m ∈ Sent(G_j, u_j(w·b)). If m ∉ Received(u_i(w)) but m ∈ Received(u_i(w·b)), it must be that u_i(w·b) = u_i(w)·m, which by definition 3.4 means m ∈ Sent(G_j, u_j(w)).

This network is not all that much - in fact, it is almost exactly the network that was specified for a famous impossibility "theorem" in [LM86] that says there is no algorithm for detecting whether an agent has stopped forever from its inputs and outputs.
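Definitions 3.2 and 3.3 are again simple folds over the event sequence. A Python sketch of the bookkeeping, under encodings of my own choosing (messages as (source, payload) tuples, transmission as the event 'xmit'; none of this is part of the specification):

```python
def is_msg(a):
    """Events are either message tuples (source, payload) or 'xmit'."""
    return isinstance(a, tuple)

def received(q):
    """Received(ǫ) = ∅; Received(q·a) adds a when a ∈ Messages."""
    return {a for a in q if is_msg(a)}

def sent(G, q):
    """Sent(G, ǫ) = ∅; Sent(G, q·a) adds G(q) when G(q) ∈ Messages
    and a = xmit."""
    out = set()
    for i, a in enumerate(q):
        m = G(list(q[:i]))      # the agent's output in the prior state
        if a == 'xmit' and is_msg(m):
            out.add(m)
    return out

# An agent with identifier 'i' that offers one message until it transmits:
def G_example(q):
    return None if 'xmit' in q else ('i', 'hello')
```

Note that a delivery event adds to received but never to sent; only an xmit event while a message is on output adds to sent.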
The trick is that, so far, there is no notion of time for agents, so arbitrary pauses are possible and not detectable.

3.2. Paxos

Paxos [Lam01] is a 2 phase commit protocol with a twist [GL04]. A "proposer" agent first requests permission to use a sequence number by sending a "prepare" message. If and when a quorum of acceptors agree by sending back "prepare accept" messages, the proposer can send a "proposal" with a value and the same sequence number. When or if a majority of acceptors agree to the proposal, the proposal has "won". The 2 phase twist is that during the prepare accept phase, the proposer can be forced to adopt a proposal value already tried by some lower numbered proposal. This is the most complex part of the protocol (see rule 10.j below) and it produces a result that multiple proposers can "win" a consensus, but they must all end up using the same value. A Paxos group consists of a standard network with Messages and Ids so that:

1. P ⊂ Ids is a set of "proposers" and C ⊂ Ids is a set of "acceptors".

2. T : X → {0, 1, 2, 3, 4} defines the Paxos "type" or role of each possible output from an agent. Outputs of type 0 are not relevant to this Paxos group, 1 is a "prepare" message, 2 is a "prepare accept", 3 is a "proposal", and 4 is a "proposal accept".

3. seq : Messages → N associates each message of non-zero type with a positive "sequence number".

4. π : N → P associates each sequence number with a proposer that is allowed to use that number for proposals.

5. val : Messages → V, where V is a set of values, is defined on every proposal message (type 3).

6. prior : Messages → Messages is defined on every prepare accept message (type 2) so that if prior(m) ≠ 0 then T(prior(m)) = 3, as determined in rule 9.e below.

7. ⌊|C|/2⌋ < κ ≤ |C| is the "quorum size" (must be more than 1/2 of the acceptors).
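Item 7 is what makes the safety argument below work: since κ > ⌊|C|/2⌋, any two quorums of acceptors must share at least one member. A small Python check of that pigeonhole fact (illustrative code, not part of the specification):

```python
from itertools import combinations

def quorums(acceptors, kappa):
    """All κ-element subsets of the acceptor set C."""
    return [set(q) for q in combinations(sorted(acceptors), kappa)]

def all_pairs_intersect(acceptors, kappa):
    """True when every pair of κ-quorums shares at least one acceptor."""
    qs = quorums(acceptors, kappa)
    return all(a & b for a in qs for b in qs)
```

With five acceptors, κ = 3 satisfies ⌊5/2⌋ = 2 < 3 and every pair of quorums intersects; κ = 2 violates the bound and admits disjoint quorums such as {1, 2} and {3, 4}.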
A proposal message "wins" if the proposing agent has received proposal accept messages from a majority of acceptors.

Definition 3.5. Wins(i, q, p) if and only if T(p) = 3, p ∈ Sent(G_i, q) (the site has transmitted this message) and: {source(m) : m ∈ Received(q), T(m) = 4 and seq(m) = seq(p)} has κ or more elements.

The theorem we want to prove is the following - it is a network property, so the event sequence is u_i(w), not some arbitrary q ∈ A*.

Theorem 1. If Wins(i, u_i(w), p) and Wins(j, u_j(w), p′) then val(p) = val(p′).

The Paxos algorithm can be expressed in 4 rules of the form "T(G(q)) = t only if conditions". These rules control when an agent can transmit a message with type other than 0. All four rules are local to agents - they do not require anything of the network.

8. Prepare. T(G_i(q)) = 1 only if π(seq(G_i(q))) = source(G_i(q)) and there is no m ∈ Sent(G_i, q) with T(m) = 1 and seq(G_i(q)) < seq(m).

9. Prepare Accept. T(G_i(q)) = 2 only if

(a) i ∈ C

(b) and there is some m ∈ Received(q) with T(m) = 1 and seq(m) = seq(G_i(q))

(c) and there is no m ∈ Sent(G_i, q) where T(m) = 2 and seq(m) > seq(G_i(q))

(d) and there is no m ∈ Sent(G_i, q) where T(m) = 4 and seq(m) ≥ seq(G_i(q))

(e) and for J = {p ∈ Received(q) : T(p) = 3 and ∃m ∈ Sent(G_i, q), T(m) = 4, seq(m) = seq(p)}, either (J = ∅ and prior(G_i(q)) = 0) or prior(G_i(q)) is the highest numbered element of J. That is: prior(G_i(q)) is 0 when G_i has accepted no proposals and is otherwise the highest numbered proposal accepted by G_i in the q determined state.

10. Proposal. T(G_i(q)) = 3 only if

(f) there is some m ∈ Sent(G_i, q) with T(m) = 1 and seq(m) = seq(G_i(q))

(g) and there is no m ∈ Sent(G_i, q) with T(m) = 1 and seq(m) > seq(G_i(q))

(h) and there is no p ∈ Sent(G_i, q) with T(p) = 3 and seq(p) ≥ seq(G_i(q))

(i) and {source(m) : m ∈ Received(q), T(m) = 2, seq(m) = seq(G_i(q))} has κ or more elements

(j) and if K is the set of prior proposals carried in the prepare accept messages received for seq(G_i(q)):

K(i, q) = {prior(m) : m ∈ Received(q), T(m) = 2, prior(m) ≠ 0, seq(m) = seq(G_i(q))}

and K is not empty, then val(G_i(q)) is the same as the value of the highest numbered element of K.

11. Proposal Accept. T(G_i(q)) = 4 only if

(a) i ∈ C

(b) there is some p ∈ Received(q) with T(p) = 3 and seq(p) = seq(G_i(q))

(c) and there is no m ∈ Sent(G_i, q) where T(m) ∈ {2, 4} and seq(m) > seq(G_i(q)).

3.3. Proof sketch

In the interests of brevity, some of the details are in the appendix 6. The theorem is proved by creating a list in sequence number order: L(w) = p_0, p_1, ..., p_n where p_0 is the winning proposal with the least sequence number of any winning proposal and the other elements are all the proposals p with seq(p) > seq(p_0) so that for some j, p ∈ Sent(G_j, u_j(w)). No proposals with a number less than seq(p_0) can have won, because p_0 is picked to be the winning proposal with the least sequence number. Every winning proposal must have been sent by some G_j from the definition of Wins. So the list contains every winning proposal and perhaps some that didn't win. By rule 10.h, no proposer ever sends two different proposals with the same number. By rule 10.f a proposer can only send a proposal if it has previously sent a prepare message with the same number, and by rule 8 that number must map to the id of the proposer under π, so no two proposer agents can use the same proposal number. From these considerations the elements of the list never have duplicate numbers and can be strictly sorted by sequence number. The mechanism by which values are set is via the "prior" proposals attached to prepare accepts (messages of type 2).

Lemma 3. Acceptors only send prepare accept messages that either have no prior proposal or where the sequence number of the prior proposal is less than the sequence number of the prepare accept.
m ∈ Sent(G_i, q) with T(m) = 2 only if either prior(m) = 0 or seq(prior(m)) < seq(m).

For proof: suppose prior(m) ≠ 0 and G_i(q) = m. Then by rule 9, source(m) ∈ C and, by rule 9.d, there is no m′ ∈ Sent(G_i, q) where T(m′) = 4 and seq(m′) ≥ seq(m). But by rule 9.e, if prior(m) ≠ 0 then there is some m′ ∈ Sent(G_i, q) so that T(m′) = 4 and seq(m′) = seq(prior(m)). It follows that seq(prior(m)) < seq(m).

Lemma 4. If an acceptor has sent both a prepare accept message and a proposal accept message where the proposal accept message has a smaller sequence number, then the prepare accept message must carry a prior proposal with a sequence number at least as great as that of the proposal accept.

If m_x ∈ Sent(G_i, q) and m_y ∈ Sent(G_i, q) and T(m_x) = 4 and T(m_y) = 2 and seq(m_x) < seq(m_y), then seq(prior(m_y)) ≥ seq(m_x).

Proof by induction on sequence length. The lemma is trivially true for length 0 since Sent(G_i, ǫ) = ∅ by definition 3.3. Assume the lemma is true for sequence z and consider z·a. There are 4 cases for each z and a:

1. m_x ∈ Sent(G_i, z) and m_y ∈ Sent(G_i, z), in which case by the induction hypothesis seq(m_x) ≤ seq(prior(m_y)), and these are not state dependent, so the inequality holds in the state determined by z·a.

2. m_x ∉ Sent(G_i, z·a) or m_y ∉ Sent(G_i, z·a), in which case the lemma is trivially true in the state determined by z·a.

3. m_x ∉ Sent(G_i, z) or m_y ∉ Sent(G_i, z), and m_x ∈ Sent(G_i, z·a) and m_y ∈ Sent(G_i, z·a). By the definition of Sent, one of m_x ∈ Sent(G_i, z) or m_y ∈ Sent(G_i, z), because at most one element is added to Sent(G_i, z) by the event a. So there are two subcases:

(a) If m_y ∈ Sent(G_i, z) and G_i(z) = m_x then, by rule 11.c, T(G_i(z)) ≠ 4, contradicting the hypothesis. This case cannot happen.

(b) If m_x ∈ Sent(G_i, z) and G_i(z) = m_y then, by rule 11.b, there is some p ∈ Received(z) so that T(p) = 3 and seq(p) = seq(m_x). The set J in rule 9.e then contains at least p, so seq(prior(m_y)) ≥ seq(p) = seq(m_x).

Lemma 5. If Wins(i, u_i(w), p) and G_j(u_j(w)) = p′ for some p′ where T(p′) = 3 and seq(p′) > seq(p), then there is an m ∈ Received(u_j(w)) with T(m) = 2 and seq(m) = seq(p′) and seq(prior(m)) ≥ seq(p).

This is a ridiculously long chain of assertions, but almost all the work is done. From the definition of Wins, site i has received proposal accept messages for p from κ or more acceptors. From rule 10.i, site j must have received prepare accept messages for seq(p′) from κ or more acceptors. Since κ is more than half, there must be at least one agent x that has transmitted both a proposal accept for p and a prepare accept m for seq(p′). By lemma 4, seq(prior(m)) ≥ seq(p).

Now we can prove that for every p_i on the list L(w), val(p_i) = val(p_0). This is obviously true for i = 0. Suppose it is true for the first n elements of the list and consider p_n, the next element (counting from 0). Let source(p_n) = x. We can show that for any w, G_x(u_x(w)) = p_n implies that there is some m ∈ Received(u_x(w)) with T(m) = 2 and seq(m) = seq(p_n) so that seq(p_0) ≤ seq(prior(m)) < seq(m), and there is no m ∈ Received(u_x(w)) with T(m) = 2 and seq(m) = seq(p_n) and seq(prior(m)) ≥ seq(m) = seq(p_n). It follows that if G_x(u_x(w)) = p_n then, by rule 10.j, K is not the empty set, and the element of K with the highest sequence number is some proposal p so that seq(p_0) ≤ seq(p) < seq(p_n). By rule 10.j, p = prior(m) for some prepare accept m ∈ Received(u_x(w)), which means that for c = source(m), m ∈ Sent(G_c, u_c(w)). But by rule 9.e, p = prior(m) must mean that p ∈ Received(u_c(w)), which means p was sent by source(p), which means that p is on the list p_0, ..., p_{n−1}, which means that val(p) = val(p_0) by the induction hypothesis - so val(p_n) = val(p_0).

3.3.1. Discussion

The specification here is compact but more detailed than the original one in "Paxos Made Simple". For example, agents are defined to output at most one message in each state - something that appears to be assumed but not stated in the original specification. If G(q) could be a set (for example, if the agent is multi-threaded without appropriate locking), an agent could satisfy the specification and send an accept for p and a prepare accept for seq(p) at the same time - so that the prepare accept did not include a prior proposal with a number greater than or equal to seq(p). Or consider what happens if an agent transmits a proposal before any proposal has won, after receiving κ prepare accepts, and then receives an additional prepare accept with a higher numbered prior proposal. The "Paxos made simple" specification does not account for this possibility and would possibly permit the agent to send a second proposal with a different value but the same sequence number, but rule 10.h forbids it here. This is the kind of detail that it's better to nail down in the specification before writing code. For contrast, see proofs using formal methods and proof checkers in [CLS16] and [GS21]. There is no objective criterion for deciding which approach is "better" other than practice, but the proofs in those works are for machine checking, not human understanding. The proof only proves the system is "safe" (at most one value can win). Proving liveness (that some value will win) is a problem, because neither the network as specified (which doesn't need to ever deliver any messages) nor Paxos (which can spin issuing new higher numbered proposals that block others, indefinitely) is live as specified. It would be possible to fix the specification to rate limit the agents, require eventual delivery from the network, and put timeouts on retries (as done in implementations such as [CGR07]).

4. Primitive Recursive Sequence Functions and State Machines

Definition 4.1. The standard representation of a Moore type state machine [HU79] is a sextuple: M = (A, S, σ_0, X, δ, λ) where A is a set of discrete events (or "input alphabet"), S is a state set, σ_0 ∈ S is the start state, X is a set of outputs, and δ : S × A → S and λ : S → X are, respectively, the transition map and the output map.

The state set and alphabet are usually required to be finite, but here we sometimes don't need that restriction. Moore machines can be viewed as a way of defining or implementing maps f : A* → X. Sequence primitive recursion [Pet82] is a generalization of arithmetic primitive recursion [Pet67] so that for some set A, the set A* of finite sequences over A (or the free monoid over A [Hol83, Pin86]) takes the place of the natural numbers in arithmetic primitive recursion. The empty sequence ǫ takes the place of 0, and "+1" is generalized to w·a, which signifies "append a to sequence w on the right". Applying the idea of primitive recursion on sequences to state systems is straightforward.

Definition 4.2. A map f : A* → X is sequence primitive recursive (s.p.r.) if and only if

• There is a constant c ∈ X and map g : X × A → X so that f(ǫ) = c and f(w·a) = g(f(w), a) for all w ∈ A* and a ∈ A.

• Or f(w) = h(f_1(w)) for some h : X_1 → X and sequence primitive recursive map f_1 : A* → X_1.

The relationship between Moore machines and sequence primitive recursive functions is analyzed in the rest of this section.

Lemma 6. Each s.p.r. map f : A* → X is associated with a triple (c, h, g), called the s.p.r. basis of f, so that g : Y × A → Y for some set Y (which is essentially a state set) and

f(w) = h(f′(w)) for f′(ǫ) = c and f′(w·a) = g(f′(w), a)

Proof: by induction on the number of times the second type of composition is used.
If none, then f(ǫ) = c and f(w·a) = g(f(w), a), so (c, ι, g) is the basis, with ι(x) = x the identity map. If f(w) = h(f_1(w)) where f_1 is s.p.r., let (c_1, h_1, g_1) be the basis of f_1, justified by the induction hypothesis, so that f_1(w) = h_1(f′_1(w)) where f′_1(ǫ) = c_1 and f′_1(w·a) = g_1(f′_1(w), a). Then since h(f_1(w)) = h(h_1(f′_1(w))), let H(x) = h(h_1(x)), and then (c_1, H, g_1) is the basis of f.

Definition 4.3. A sequence primitive recursive map f : A* → X is finite state only if it has some s.p.r. basis (c, h, g) where g has a finite image (range).

S.p.r. maps generally have more than one basis: in particular, a finite state s.p.r. function with basis (c, h, g) may have "duplicate" elements of the image of g. As a very simple example from section 2, Counter(ǫ) = 0 and Counter(w·a) = Counter(w) + 1 mod k counts inputs mod k and is finite state for each value of the k parameter. A basis is (0, ι, g) where ι(x) = x is the identity map and g(n, a) = n + 1 mod k. Each Moore machine tuple is associated with a unique sequence primitive recursive map.

Definition 4.4. The characteristic sequence map of Moore type state machine M = (A, S, σ_0, X, δ, λ) is: f_M(w) = λ(f′_M(w)) where f′_M(ǫ) = σ_0 and f′_M(w·a) = δ(f′_M(w), a).

A primitive recursive basis for this map is (σ_0, λ, δ). If S is finite then δ must have a finite image. Each s.p.r. function is associated with a Moore machine.

Definition 4.5. If f : A* → X has basis (c, h, g) with g : Y × A → Y and c ∈ Y, then M = (A, Y, c, X, g, h), where X = {h(s) : s ∈ Y}, is the (c, h, g) Moore machine.

If f is s.p.r. with basis (c, h, g) and M is the (c, h, g) Moore machine tuple, then the characteristic map of M is the original s.p.r. map f. As a consequence, s.p.r. maps constitute an alternative representation of the same things that Moore machine tuples represent.
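The basis-to-map construction of Lemma 6 and Definition 4.4 is a one-line fold over the input sequence. A minimal Python sketch (the helper name spr and the example parameters are mine, for illustration):

```python
def spr(basis):
    """Build the s.p.r. map f(w) = h(f'(w)) from a basis (c, h, g),
    where f'(ǫ) = c and f'(w·a) = g(f'(w), a)."""
    c, h, g = basis
    def f(w):
        state = c
        for a in w:           # run the Moore machine (A, Y, c, X, g, h)
            state = g(state, a)
        return h(state)
    return f

# The mod-k Counter of section 2, with basis (0, identity, +1 mod k):
k = 3
counter = spr((0, lambda y: y, lambda y, a: (y + 1) % k))
```

Running counter on any sequence of four events yields 4 mod 3 = 1, matching the recursive definition.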
Instead of using the primitive recursive reduction of a map to a state machine tuple, the connection can be made via the Myhill equivalence [MS59]. For any map f : A* → X, the equivalence classes can be the states of the state machine determined by f. Let w ∼ z if and only if f(w concat q) = f(z concat q) for all q ∈ A*. Let [w]_f = {z : z ∼ w} and then put S_f = {[w]_f : w ∈ A*}. Make [ε]_f the start state, and let δ([w]_f, a) = [w · a]_f and λ([w]_f) = f(w).

4.1. Composition

Several apparently more complex types of maps can be shown to be sequence primitive recursive.

Definition 4.6. If there are sequence primitive recursive maps f_i : A* → X_i for i = 1, . . . , n then let:

    f(w) = (f_1(w), . . . , f_n(w))

Claim: f is sequence primitive recursive. This describes a system constructed by connecting multiple components that change state in parallel, without communicating.

                     w → f_1 → x_1
                     w → f_2 → x_2
    input sequence w      . . .        Output x = (x_1, . . . , x_n)
                     w → f_n → x_n

Let (c_i, h_i, g_i) be a basis for each f_i. Define

    G((y_1, . . . , y_n), a) = (g_1(y_1, a), . . . , g_n(y_n, a))    (1)

    c = (c_1, . . . , c_n).

Then let r(ε) = c and r(w · a) = G(r(w), a). Clearly r is s.p.r. Now we prove r(w) = (f′_1(w), . . . , f′_n(w)):

    r(ε) = (c_1, . . . , c_n)
         = (f′_1(ε), . . . , f′_n(ε))    (2)

Suppose r(w) = (f′_1(w), . . . , f′_n(w)). Then

    r(w · a) = G(r(w), a)
             = G((f′_1(w), . . . , f′_n(w)), a)
             = (g_1(f′_1(w), a), . . . , g_n(f′_n(w), a))    (3)
             = (f′_1(w · a), . . . , f′_n(w · a))

Let H((y_1, . . . , y_n)) = (h_1(y_1), . . . , h_n(y_n)). Then H(r(w)) = f(w), so f is s.p.r.

Definition 4.7. If f_1 is sequence primitive recursive, then let f(ε) = κ and f(w · a) = g((f(w), f_1(w)), a).

Claim: f is sequence primitive recursive. Here a new s.p.r. sequence map is being constructed to depend both on w and on the values of f_1(w).

    Input sequence w → f_1 → x_1;  (w, x_1) → New → x.

If we have n s.p.r. sequence maps f_1, . . . ,
f_n, with each f_i : A_i* → X_i, the system output when they are connected is in the set X = X_1 × · · · × X_n, and a "connector" is a map γ_i : X × A → A_i* so that γ_i(x, a) is the sequence of events produced for component i when the system input is a and the outputs of all the components are given by x. The event alphabets of the components can all be different or the same, and the composite alphabet can also be different or the same, depending only on the γ_i.

There must be an s.p.r. basis (c_1, h_1, g_1) for f_1 so that f_1(w) = h_1(f′_1(w)) where f′_1(ε) = c_1 and f′_1(w · a) = g_1(f′_1(w), a). Let

    H((x, y)) = x
    G((x, y), a) = (g((x, h_1(y)), a), g_1(y, a))    (4)
    r(ε) = (κ, c_1) and r(w · a) = G(r(w), a)

Clearly, r is s.p.r. Claim: r(w) = (f(w), f′_1(w)). In that case H(r(w)) = f(w), which proves f is s.p.r.

r(ε) = (κ, c_1) = (f(ε), f′_1(ε)). Suppose r(w) = (f(w), f′_1(w)); then

    r(w · a) = G(r(w), a)
             = G((f(w), f′_1(w)), a)
             = (g((f(w), h_1(f′_1(w))), a), g_1(f′_1(w), a))    (5)
             = (g((f(w), f_1(w)), a), f′_1(w · a))
             = (f(w · a), f′_1(w · a))

End proof.

                     w → u_1 → f_1 → x_1
                     w → u_2 → f_2 → x_2
    input sequence w      . . .             → Output (x_1, . . . , x_n)
                     w → u_n → f_n → x_n
                     ⇑ ⇐ feedback (x_1, . . . , x_n) ⇐

Definition 4.8. For i = 1, . . . , n, given f_i : A_i* → X_i and γ_i : X × A → A_i* where X = X_1 × . . . × X_n, say the general product is given by:

    f(w) = (f_1(u_1(w)), . . . , f_n(u_n(w)))

    and u_i(ε) = ε, u_i(w · a) = u_i(w) concat γ_i(f(w), a) for i = 1, . . . , n

where concat is the usual concatenation of finite sequences.

Theorem 2. If f_1, . . . , f_n are s.p.r. in a product of the type of definition 4.8, then f is s.p.r.

Proof: Each f_i has a basis (c_i, h_i, g_i) with g_i : Y_i × A_i → Y_i and f_i(q) = h_i(f′_i(q)) where

    f′_i : A_i* → Y_i and f′_i(ε) = c_i and f′_i(w · a) = g_i(f′_i(w), a)

Let H(y_1, . . . , y_n) = (h_1(y_1), . . . , h_n(y_n)), so f(w) = H(f′_1(u_1(w)), . . . , f′_n(u_n(w))). The goal is to define an s.p.r.
map r : A* → Y_1 × . . . × Y_n so that r(w) = (f′_1(u_1(w)), . . . , f′_n(u_n(w))), which implies H(r(w)) = f(w). This will prove f is s.p.r.

Because each γ_i is sequence valued, it is useful to extend each g_i to sequences,

    g*_i : Y_i × A_i* → Y_i. Let g*_i(y, ε) = y and g*_i(y, q · a) = g_i(g*_i(y, q), a).

Then let:

    F′_i(ε) = c_i and F′_i(q · a) = g*_i(F′_i(q), ε · a).

Clearly F′_i(q) = f′_i(q), so f_i(q) = h_i(F′_i(q)).

    For y = (y_1, . . . , y_n) let G(y, a) = (g*_1(y_1, γ_1(H(y), a)), . . . , g*_n(y_n, γ_n(H(y), a)))

Let r(ε) = (c_1, . . . , c_n) and r(w · a) = G(r(w), a). By construction r is s.p.r.

Claim: r(w) = (F′_1(u_1(w)), . . . , F′_n(u_n(w))). Proof by induction on w:

    r(ε) = (c_1, . . . , c_n)
         = (F′_1(ε), . . . , F′_n(ε))
         = (F′_1(u_1(ε)), . . . , F′_n(u_n(ε)))    (6)

Inductive hypothesis: r(w) = y = (F′_1(u_1(w)), . . . , F′_n(u_n(w))). Then

    r(w · a) = G(r(w), a)
             = (g*_1(y_1, γ_1(H(y), a)), . . . , g*_n(y_n, γ_n(H(y), a)))
             = (g*_1(F′_1(u_1(w)), γ_1(f(w), a)), . . . , g*_n(F′_n(u_n(w)), γ_n(f(w), a)))
             = (F′_1(u_1(w · a)), . . . , F′_n(u_n(w · a)))

QED

The proof here is not complicated, but I originally produced it by going via the Moore machine representation covered in section 4.2, as it was easier to visualize. In that proof, first each component map is converted to a Moore machine (not necessarily finite state), then the machines are multiplied out in the general product, and then the result is converted back to an s.p.r. sequence map.

4.2. Moore machine products

The general product of Moore machines [Har64] (and later [Géc86, Yod91]) has a state set constructed as the cross product of the state sets of the factor machines and has a connector map for each component, φ_i : X_1 × · · · × X_n × B → A_i. Compare to definition 4.8.

Definition 4.9. The general product M = (B, S, σ_0, δ, X, λ) of (not necessarily finite) Moore type state machines M_1, . . . ,
M_n, with M_i = (A_i, S_i, σ_{i,0}, δ_i, X_i, λ_i), and connectors φ_1, . . . , φ_n with φ_i : X × B → A_i, where X = X_1 × . . . × X_n, is given by:
• B, the composite event alphabet,
• S = S_1 × . . . × S_n,
• σ_0 = (σ_{1,0}, . . . , σ_{n,0}), the initial state,
• δ((s_1, . . . , s_n), b) = (δ_1(s_1, a_1), . . . , δ_n(s_n, a_n)) where a_i = φ_i((λ_1(s_1), . . . , λ_n(s_n)), b),
• λ((s_1, . . . , s_n)) = (λ_1(s_1), . . . , λ_n(s_n)).

If each M_i is finite state, then by necessity the product is finite state. The connectors can be extended to produce sequences on each step, just as with the general product of sequence primitive recursive functions. Since M is a Moore type state machine tuple, we know that it has a characteristic map. If f_i is a characteristic map for each M_i, then

    f(w) = (f_1(u_1(w)), . . . , f_n(u_n(w)))

where each u_i(ε) = ε and u_i(w · b) = u_i(w) · φ_i((f_1(u_1(w)), . . . , f_n(u_n(w))), b), is the characteristic map for M.

In the construction of the general product of s.p.r. maps, if each γ_i depends only on the event argument a, then the product reduces to a "direct" or "cross" product and the state machines are not interconnected. If each γ_i((x_1, . . . , x_n), a) depends only on a and x_1, . . . , x_i, then the product reduces to a "cascade" product [HS66, Hol83, Pin86, Mal10], in which information only flows in a linearly ordered pipeline through the factors. "Cascade" products channel information flow in one direction and correspond to pipelines and similar processing systems (including many kinds of digital circuits). The network in section 3 is an easy example of a system where a cascade decomposition will not reflect system architecture.

Moore machines distinguish between internal state and externally visible state (output) in a way that corresponds to "information hiding" [Par72]. This is why the connectors of the general product of definition 4.8 depend on the outputs of the components, not on their interior state sets.
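One step of this product can be sketched in Python (the names are ours; as a simplification each connector returns a single event rather than a sequence). The example pairs a mod-2 counter with a "follower" component that is fed the counter's output through its connector, illustrating feedback through outputs rather than internal state:

```python
# One transition of the general product of Definition 4.9 (simplified:
# each connector phi_i yields one event, not a sequence of events).
def general_product_step(states, b, deltas, lams, phis):
    # Connectors see only the tuple of component *outputs*, never the states.
    outs = tuple(lam(s) for lam, s in zip(lams, states))
    return tuple(
        delta(s, phi(outs, b))
        for delta, s, phi in zip(deltas, states, phis)
    )

# Component 1: mod-2 counter; component 2: follower whose state becomes
# whatever event its connector delivers.
delta1 = lambda s, a: (s + 1) % 2 if a == "inc" else s
delta2 = lambda s, a: a            # state := event received
lam = lambda s: s                  # both outputs equal the state
phi1 = lambda outs, b: b           # counter sees the composite event
phi2 = lambda outs, b: outs[0]     # follower is fed the counter's output

states = (0, 0)
trace = [states]
for b in ["inc", "inc", "inc"]:
    states = general_product_step(
        states, b, (delta1, delta2), (lam, lam), (phi1, phi2))
    trace.append(states)
# The follower lags the counter by one step:
# trace == [(0, 0), (1, 0), (0, 1), (1, 0)]
```

Because `phi2` reads `outs[0]` computed from the pre-transition states, the follower always holds the counter's previous output, which is exactly the one-step delay the synchronous product imposes on feedback.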
The decomposition structure of systems can then provide insight into modularity in terms of how much of the internal state of components must be communicated to other components. It is relatively simple to show that any finite state map with n states can be constructed from ⌈log₂ n⌉ single-bit state maps, but at the expense of making all state visible. The ratio of the size of the output set to the size of the set A*/∼ indicates the extent of information hiding.

The two-sided variant of the Myhill equivalence noted above is given by w ≡ z iff, for all finite sequences q, q′ over the alphabet of events, f(q concat w concat q′) = f(q concat z concat q′). Then it is easy to show that the equivalence classes constitute a monoid under concatenation of representative sequences:

    {w}_f ◦ {z}_f = {w concat z}_f.

This monoid is finite if and only if f is finite state. The correspondence between recursive function structure and monoid product structure follows for cascade decomposition. Whether there are interesting aspects of monoid structure that correspond to feedback product decomposition is an open question.

5. Related work

"Many modeling languages are designed by initially thinking about syntax, but should it not be exactly the other way around?" [BR23]

This work is based on three lines of research: temporal logic, algebraic automata theory, and primitive recursion on words.

1. The project began with an effort to apply temporal logic [Ram83, MP79, Pnu85, Lam94] to operating and real-time systems, because it seemed to be able to express interesting properties like "every process will eventually run", "eventually process p runs", or "some process is always active". Temporal logic borrowed from Kripke [Kri63] a semantics consisting of directed graphs of "worlds", where each "world" is a map interpreting formal variables and/or propositions, and temporal quantifiers can be understood as assertions about how the worlds change along paths through the graph.
A proposition such as Always P is true if and only if, in all worlds reachable from the current world, the symbol P evaluates to "true". Computer science researchers took the world graphs to be state machines and the "worlds" to be "states", where states are maps assigning values to state dependent variables. This is an intuitively appealing approach because state machines offer a natural semantics for discrete state systems, and state machines lend themselves to automated checking (e.g. in [CGK+18] and [Lam94]). One of the limitations of this approach, however, is that parallel state change and causality are difficult, the latter because the state machines are unlabeled, with each transition considered to correspond to an advance in time. "What caused the change" has to be encoded in the assignment maps, and concurrency is represented indirectly via non-determinism and interleaving. When two components are composed, the assignment maps need to be merged and the domains made disjoint to prevent conflicting assignments to the same formal variable. On each state change, the state machine must in a non-deterministic way "choose" to change the state of some component via changes to the state variable symbols that belong to that component. Systems with components that change state at different rates are even more of a problem (for example, see state "stuttering" in [Lam18]). Mathematically, the composed machines are not state machines. Instead they are state machines plus assignment maps plus rules for merging assignment maps, and so on. This is also true for statecharts [Har84] and similar formalisms. In [Yod91] and [VY91] I tried to address some of these issues by introducing the automata products as semantics for a formal language and bringing in operators to allow quantification over the compositional structure and to make events visible. This approach was still limited both by the lack of precision in temporal quantifiers and the nature of a formal logic.
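The Kripke-style rule mentioned above, that Always P holds iff P evaluates to true in every world reachable from the current one, amounts to a plain reachability check; a minimal sketch over a hypothetical world graph (all names here are illustrative):

```python
from collections import deque

# "Always P" at `world`: P must hold at every world reachable from it
# (including the current world), found by breadth-first search.
def always(p, edges, world):
    seen, frontier = {world}, deque([world])
    while frontier:
        w = frontier.popleft()
        if not p(w):
            return False
        for nxt in edges.get(w, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True

edges = {"s0": ["s1"], "s1": ["s2"], "s2": ["s1"]}   # assumed example graph
p = lambda w: w != "s2"                              # P fails only at s2

assert always(p, edges, "s0") is False   # s2 is reachable from s0
assert always(lambda w: True, edges, "s0") is True
```

Note that this check says nothing about which event caused each transition: the edges are unlabeled, which is precisely the difficulty with causality discussed above.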
Instead of "next state", an "after [a]" modifier had to be introduced. But these modifiers were hard to adapt to compositional systems, where an event a in the enclosing system can trigger different events (or no event) in components. In any case, using temporal logic to describe real systems one often encounters a need for additional temporal qualifiers: perhaps "until" or "while" or "before". The key to the Paxos verification above, for example, is a proof that one type of message must be sent before another is sent. Similarly, when coming to grips with actual systems, "eventually" as a limit to infinity of some sort doesn't correspond to anything concrete in systems of interest. We usually don't care if a process runs after an infinite number of steps or moments, or even after some unknown and unbounded delay. What seemed more interesting in detailed examples is "happens within some time limit t", or within some number of events, or after no more than t "tick" events, or something else depending on the system. All these things are easily addressed directly in terms of sequences and maps, and the utility of a layer of abstraction was not obvious. Perhaps with enough experience with complex systems this question could be revisited. Additionally, outside of an axiomatic closed system, it is easier to use results from other fields. For example, assuming a probability distribution for network packet loss in the example of section 3 would not require any changes to the methods or specification. Compare, for example, what is needed to access results from control theory in e.g. [UM18] with the simple connection between a real-valued time variable and the s.p.r. map T(w) in section 2. The absence of a formal specification language is one of the key differences between this work and [Yod91].
Of course, there is nothing to prevent an axiomatic formalization of the methods provided here, and perhaps that will be useful (especially for automated proof or proof checking systems).

2. In the algebraic automata literature, Hartmanis defined a Moore machine product for concurrent systems in a 1962 paper (in a 1964 collection [Har64]) that is similar to the product in Gecseg's monograph [Géc86] and that is essentially the same thing as the "general" recursive function composition used here. Products of state machines have two advantages: they provide a direct and widely applicable model of interconnection and parallel state change, and they are algebraically closed in the sense that a product of automata is itself just an automaton, with no extensions. These products provide a general model of concurrent or parallel computation without requiring that any "primitive" method of communication be assumed to be fundamental, or that concurrent events be interleaved nondeterministically, or that an axiomatic or other external specification of communication such as in [Bro10] or [Har84] be employed. Hartmanis [HS66] and later researchers in algebraic automata theory [Gin68, Hol83, Pin86] mostly focused on "loop free products" for computer science applications. The loop free product, also called the "cascade" product, as the name implies, imposes a linear order on the factor machines and only permits communication from lower to higher. Those researchers were mostly interested in factoring state machines, not in composing state systems. That is, they started with small complete state machines given via state tables or state diagrams and looked at how to factor them into simpler elements. This line of research was also motivated by the decomposition theorem of Krohn-Rhodes. Each state machine defines a semigroup via congruence classes on finite sequences, as shown by Nerode and Myhill (p.
70–72 of [MS59]), and the Krohn-Rhodes theorem showed that loop free automata factorization induces a semigroup factorization, which has remarkable mathematical properties connecting semigroup factorization to group theory and normal subgroups. Certain circuits and pipelined computation can be modeled in terms of cascades, but factorization via cascades does not correspond to system architecture where component connection permits "feedback" (loops). The problem remained of how to work with large scale state machines and complicated, possibly multi-level, automata products.

3. The insight that deterministic state machines determine maps f : A* → X is reasonably obvious. Arbib notes that state machines with output can be considered maps A* → X* ([Arb69], p. 7), and similar maps are used in algebraic automata theory [Gin68, Pin86, Hol83], basically to define equivalences on state machines that "do the same thing". Here the map is used directly as the specification of the system, in place of state tables or diagrams or the usual state machine tuple representation of definition 4.1. And the scaling problem is addressed here by using primitive recursion on words (finite sequences) to define and compose maps. Primitive recursion on words appears to have been first described as a generalization of arithmetic primitive recursion by Rozsa Peter [Pet82], although it is present in a more abstract form in Eilenberg and Elgot [EE70]. The connection between recursion on words and state machines is perhaps obvious, but it is not, as far as I can tell, mentioned in the computer science literature prior to [Yod91]. One advance since that paper is the development of the primitive recursive basis, which permits reduction of complex maps to simpler ones. "Bisimulation" [Mil79] for the so-called "process algebra" is derived from the concepts of machine homomorphism and "covering" in [Gin68], which involve treating state machines as maps. See [Par81].

6.
Appendix: Some Details of the Paxos safety proof

Lemma 7. If m ∈ Sent(G_i, q) and (T(m) = 2 or T(m) = 4) then i ∈ C.

This follows directly from rules 9 and 11 and from lemma 1.

Lemma 8. If p ∈ Sent(G_j, q) and T(p) = 3 then π(seq(p)) = j.

Proof: G_j(q) = p implies, by rule 10.f, that there is some m ∈ Sent(G_j, q) where seq(m) = seq(p) and T(m) = 1. And rule 8 requires that G_j(z) = m only if π(seq(m)) = source(m), and lemma 1 requires that source(m) = j. Since seq(m) = seq(p), π(seq(p)) = π(seq(m)) = source(m) = j.

Lemma 9. If p ∈ Sent(G_j, q) where T(p) = 3, and p′ ∈ Sent(G_j, q) where T(p′) = 3, and seq(p) = seq(p′), then p = p′.

Proof: Suppose, without loss of generality, that p′ ∈ Sent(G_j, z) and G_j(z) = p; then by rule 10.h, p = p′.

It follows that:

Lemma 10. If p ∈ Sent(G_j, u_j(w)) where T(p) = 3, and p′ ∈ Sent(G_i, u_i(w)) where T(p′) = 3, then j = i or seq(p) ≠ seq(p′).

Proof: If seq(p) = seq(p′) then, by lemma 8, j = π(seq(p)) = π(seq(p′)) = i.

Lemma 11. If a site has sent a proposal accept, it has received a matching proposal (with the same sequence number): if m ∈ Sent(G_i, q) and T(m) = 4, then there is some p with T(p) = 3, seq(p) = seq(m), and p ∈ Received(q).

Proof by induction on prefixes of q. Initially, m ∉ Sent(G_i, ε). Suppose m ∈ Sent(G_i, z · a) but m ∉ Sent(G_i, z); then G_i(z) = m (by the definition of txd), which means by rule 11.b the matching proposal must have been received.

Lemma 12. If m ∈ Sent(G_i, q) and T(m) = 2, then either source(prior(m)) = 0 or seq(prior(m)) < seq(m).

Proof: Suppose G_i(q) = m and T(m) = 2. By rule 9.e, if prior(m) ≠ 0 then prior(m) = p_c so that

    p_c ∈ {p : p ∈ Received(q), T(p) = 3 and ∃m_c ∈ Sent(G_i, q), T(m_c) = 4, seq(m_c) = seq(p)}

So m_c ∈ Sent(G_i, q), which implies, by rule 9.d, that seq(m_c) < seq(m). Since seq(prior(m)) = seq(p_c) = seq(m_c), it follows that seq(prior(m)) < seq(m).

References

[Arb69] Michael A. Arbib. Theories of Abstract Automata (Prentice-Hall Series in Automatic Computation). Prentice-Hall, Inc., USA, 1969.
[BBG83] Anita Borg, Jim Baumbach, and Sam Glazer. A message system supporting fault tolerance.
In Proceedings of the Ninth ACM Symposium on Operating Systems Principles, SOSP '83, pages 90–99, New York, NY, USA, 1983. Association for Computing Machinery.
[BR23] Manfred Broy and Bernhard Rumpe. Development use cases for semantics-driven modeling languages. Commun. ACM, 66(5):62–71, April 2023.
[Bro10] Manfred Broy. A logical basis for component-oriented software and systems engineering. Comput. J., 53(10):1758–1782, 2010.
[CGK+18] E.M. Clarke, O. Grumberg, D. Kroening, D. Peled, and H. Veith. Model Checking, second edition. Cyber Physical Systems Series. MIT Press, 2018.
[CGR07] Tushar Deepak Chandra, Robert Griesemer, and Joshua Redstone. Paxos made live - an engineering perspective (2006 invited talk). In Proceedings of the 26th Annual ACM Symposium on Principles of Distributed Computing, 2007.
[CLS16] Saksham Chand, Yanhong A. Liu, and Scott D. Stoller. Formal verification of multi-paxos for distributed consensus. CoRR, abs/1606.01387, 2016.
[DMY99] Cort Dougan, Paul Mackerras, and Victor Yodaiken. Optimizing the idle task and other mmu tricks. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI '99, pages 229–237, USA, 1999. USENIX Association.
[DY16] Cort Dougan and Victor Yodaiken. Method, time consumer system, and computer program product for maintaining accurate time on an ideal clock, August 2016.
[EE70] S. Eilenberg and Calvin Elgot. Recursiveness. Academic Press, New York, 1970.
[Géc86] Ferenc Gécseg. Products of Automata, volume 7 of EATCS Monographs on Theoretical Computer Science. Springer, Berlin, 1986.
[Gin68] A. Ginzburg. Algebraic Theory of Automata. Academic Press, New York, 1968.
[GL04] Jim Gray and Leslie Lamport. Consensus on transaction commit. Computing Research Repository, cs.DC/0408036, 2004.
[GS21] Aman Goel and Karem A. Sakallah. Towards an automatic proof of lamport's paxos.
In 2021 Formal Methods in Computer Aided Design (FMCAD), pages 112–122, 2021.
[Har64] J. Hartmanis. Loop-free structure of sequential machines. In E.F. Moore, editor, Sequential Machines: Selected Papers, pages 115–156. Addison-Wesley, Reading, MA, 1964.
[Har84] D. Harel. Statecharts: A visual formalism for complex systems. Technical report, Weizmann Institute, 1984.
[Hol83] W.M.L. Holcombe. Algebraic Automata Theory. Cambridge University Press, 1983.
[HS66] J. Hartmanis and R. E. Stearns. Algebraic Structure Theory of Sequential Machines. Prentice-Hall, Englewood Cliffs, N.J., 1966.
[HU79] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, MA, 1979.
[Kri63] S. Kripke. Semantical considerations on modal logic. Acta Philosophica Fennica, 16:83–94, 1963.
[Lam94] L. Lamport. The temporal logic of actions. ACM Transactions on Programming Languages and Systems (TOPLAS), 16(3):872–923, May 1994.
[Lam01] Leslie Lamport. Paxos made simple. ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001), pages 51–58, December 2001.
[Lam18] Leslie Lamport, 2018.
[LM86] N. A. Lynch and M. Merritt. Introduction to the theory of nested transactions. Technical Report TR-367, Laboratory for Computer Science, MIT, 1986.
[Mal10] Oded Maler. On the Krohn-Rhodes Cascaded Decomposition Theorem, pages 260–278. Springer-Verlag, Berlin, Heidelberg, 2010.
[Mil79] R. Milner. A Calculus of Communicating Systems, volume 92 of Lecture Notes in Computer Science. Springer-Verlag, 1979.
[Moo64] E.F. Moore, editor. Sequential Machines: Selected Papers. Addison-Wesley, Reading, MA, 1964.
[MP79] Z. Manna and A. Pnueli. The modal logic of programs. In Proceedings of the 6th International Colloquium on Automata, Languages, and Programming, volume 71 of Lecture Notes in Computer Science, pages 385–408, New York, 1979. Springer-Verlag.
[MS59] M.O. Rabin and Dana Scott.
Finite automata and their decision problems. IBM Journal of Research and Development, 3(2):114–125, April 1959.
[Par72] D. L. Parnas. On the criteria to be used in decomposing systems into modules. Commun. ACM, 15(12):1053–1058, December 1972.
[Par81] David Michael Ritchie Park. Concurrency and automata on infinite sequences. In Theoretical Computer Science, 1981.
[Pet67] Rozsa Peter. Recursive Functions. Academic Press, New York, 1967.
[Pet82] Rozsa Peter. Recursive Functions in Computer Theory. Ellis Horwood Series in Computers and Their Applications, Chichester, 1982.
[Pin86] J.E. Pin. Varieties of Formal Languages. Plenum Press, New York, 1986.
[Pnu85] A. Pnueli. Applications of temporal logic to the specification and verification of reactive systems: a survey of current trends. In J.W. de Bakker, editor, Current Trends in Concurrency, volume 224 of Lecture Notes in Computer Science. Springer-Verlag, 1985.
[Ram83] Krithivasan Ramamritham. Correctness of a distributed transaction system. Information Systems, 8(4):309–324, 1983.
[UM18] Dogan Ulus and Oded Maler. Specifying timed patterns using temporal logic. In Proceedings of the 21st International Conference on Hybrid Systems: Computation and Control (Part of CPS Week), HSCC '18, pages 167–176, New York, NY, USA, 2018. Association for Computing Machinery.
[VY91] Victor Yodaiken and Krithi Ramamritham. Mathematical models of real-time scheduling. In Real Time Computing: Formal Specifications and Methods, pages 55–86. Kluwer, 1991.
[YB97] Victor Yodaiken and Michael Barabanov. Real-Time. In USENIX 1997 Annual Technical Conference (USENIX ATC 97), Anaheim, CA, January 1997. USENIX Association.
[Yod91] Victor Yodaiken. Modal functions for concise definition of state machines and products. Information Processing Letters, 40(2):65–72, October 1991.