Finite Transducers

Lectures given by Prof. Jacques Sakarovitch.

Convention We will denote an alphabet with A, the free monoid (the set of words) over A by A* and the empty word by 1A*.
Definition A finite automaton on A is a directed graph with edges labelled by the letters from A. It is described by the 5-tuple ๐’œ๏ธ€=โŸจA,Q,I,E,TโŸฉ, where Q is the set of states (vertices), IโŠ‚Q is the set of initial states, EโŠ‚Qร—Aร—Q is the set of transitions and TโŠ‚Q is the set of final states. The accepted language is the set
L(๐’œ๏ธ€)โ‰”{wโˆˆA*|โˆƒiโˆˆI,tโˆˆT:iโ†’๐’œ๏ธ€wt}.
Note Let A*,B* be two free monoids. Then A*ร—B* is also a monoid, but not free (for example, given aโˆˆA and bโˆˆB, the elements (a,1A*) and (1B*,b) commute).
Definition A finite transducer on A is a directed graph with edges labelled by pairs from A*ร—B*. It is described by the 5-tuple ๐’ฏ๏ธ€=โŸจA*ร—B*,Q,I,E,TโŸฉ, where Q is the set of states (vertices), IโŠ‚Q is the set of initial states, EโŠ‚Qร—A*ร—B*ร—Q is the set of transitions and TโŠ‚Q is the set of final states. We define
|๐’ฏ๏ธ€|โ‰”{(u,v)|โˆƒiโˆˆI,tโˆˆT:iโ†’๐’ฏ๏ธ€(u,v)T}.
Definition A relation ฮธ from A* to B* is defined by its graph ฮธ^โŠ‚A*ร—B* and denoted by ฮธ:A*โ†’B*, with
ฮธ(u)โ‰”{vโˆˆB*|(u,v)โˆˆฮธ^}.
Its inverse is ฮธโˆ’1:B*โ†’A* with the same graph and
ฮธโˆ’1(v)โ‰”{uโˆˆA*|(u,v)โˆˆฮธ^}.
Definition additivity Let LโŠ‚A*. Then ฮธ(L)โ‰”โ‹ƒuโˆˆLฮธ(u).
Definition The complement of a relation ฮธ is the relation โˆฮธ such that โˆฮธ^=โˆฮธ^=(A*ร—B*)โˆ–ฮธ^.
Definition Let ๐’ฏ๏ธ€ be a finite transducer. Then |๐’ฏ๏ธ€| is its realised relation.
Note The domain and range of |๐’ฏ๏ธ€| are regular languages and the relation |๐’ฏ๏ธ€|โˆ’1 is also realised by a transducer (with the labels swapped).
Note Cโ‰”(Aร—{1})โˆช({1}ร—B) is a generating set of the monoid A*ร—B*. So is Dโ‰”(Aร—{1})โˆช({1}ร—B)โˆช(Aร—B).
Definition A transducer is normalised if its labels are in C and subnormalised if its labels are in D.
Theorem Every transducer is equivalent to a normalised transducer.
Proof We replace the edge (u,v) with the sequence of edges (u1,1),โ€ฆ,(u|u|,1),(1,v1),โ€ฆ,(1,v|v|).
Definition An automaton over a monoid M is proper if no edge is labelled by 1M (spontaneous transition).
Theorem Every automaton is equivalent to a proper automaton.
Proof Shown in the main lecture. It uses the backward closure and the forward closure.
Definition Let KโŠ‚A* be a regular language. We define the relation zK:A*โ†’A*,
zK(u)โ‰”{{u}ifuโˆˆK,โˆ…otherwise.
Note Obviously this relation is accepted by a finite transducer.
Note We can extend the notion of a finite transducer to arbitrarily many free monoids.
Note We can also have right automata, which read from right to left. We can model a right automaton as the transpose of a left automaton. However, this does not preserve certain properties (such as determinism), so it is often useful to treat them as a separate concept.
Definition The composition of relations ฮธ:A*โ†’B* and ฯƒ:B*โ†’C* is the relation ฯƒโˆ˜ฮธ:A*โ†’C* defined as (ฯƒโˆ˜ฮธ)(u)โ‰”ฯƒ(ฮธ(u)), or equivalently
ฯƒโˆ˜ฮธ^โ‰”{(u,w)โˆˆA*ร—C*|โˆƒvโˆˆB*:(u,v)โˆˆฮธ^โˆง(v,w)โˆˆฯƒ^}.
Definition The composition product of two subnormalised transducers ๐’ฏ๏ธ€=โŸจA*ร—B*,Q,I,E,TโŸฉ,๐’ฎ๏ธ€=โŸจB*ร—C*,R,J,F,UโŸฉ is the transducer ๐’ฏ๏ธ€โ‹ˆ๐’ฎ๏ธ€=โŸจA*ร—C*,Qร—R,Iร—J,G,Tร—UโŸฉ with Gโ‰”G1โˆชG2โˆชG3,
G1โ‰”{(p,r)โ†’(x,y)(q,s)|xโˆˆAโˆช{1},yโˆˆCโˆช{1},p,qโˆˆQ,r,sโˆˆR|โˆƒbโˆˆB:pโ†’(a,b)qโˆˆEโˆงrโ†’(b,y)sโˆˆF},
G2โ‰”{(p,r)โ†’(a,1)(q,r)|aโˆˆA,p,qโˆˆQ,rโˆˆR|pโ†’(a,1)qโˆˆE},
G3โ‰”{(p,r)โ†’(1,c)(p,s)|cโˆˆC,pโˆˆQ,r,sโˆˆR|rโ†’(1,c)sโˆˆE}.
Note The composition product might result in spontaneous transitions, which can be eliminated with backward closure.
Theorem Elgotโ€“Megei The relation realised by the composition product of two transducers is the composition of their relations.
Note Let M be a monoid. Then ๐’ซ๏ธ€(M) with the operations โˆช,โ‹… is a semiring.
Definition Let M be a monoid and PโŠ‚M. Then we define P0โ‰”{1M},Pn+1โ‰”Pnโ‹…P,P*โ‰”โ‹ƒnโˆˆโ„•0Pn.
Lemma Let M be a monoid and PโŠ‚M. Then P*=โŸจPโŸฉ.
Definition A rational expression over a monoid M is an expression built inductively as: The subset denoted by a rational expression is A subset PโŠ‚M is rational if it is denoted by a rational expression. The set of all rational subsets is denoted RatM.
Note The name โ€œrationalโ€ comes from the fact that X*=1+XX*, which could be rewritten as
X*(1โˆ’X)=(1โˆ’X)X*=1.
So although 1โˆ’X does not actually mean anything, we can imagine it as being the multiplicative inverse of X*. This is analogous to the geometric series identity
11โˆ’x=โˆ‘n=0โˆžxn.
Theorem A subset of a monoid M is rational if and only if it is accepted by a finite automaton over M.
Proof Analogous to the one for regular languages.
(โ‡’)
Operations on standard automata.
(โ‡)
State elimination method.
Note Unlike regular languages, rational sets are not closed under intersection. For example, let
|๐’ฑ๏ธ€1|โ‰”{(๐–บn๐–ปm,๐–ผn)|n,mโˆˆโ„•},
|๐’ฑ๏ธ€2|โ‰”{(๐–บn๐–ปm,๐–ผm)|n,mโˆˆโ„•}.
Both of these can be recognised by a finite transducer, but their intersection is
|๐’ฑ๏ธ€1|โˆฉ|๐’ฑ๏ธ€2|={(๐–บn๐–ปn,๐–ผn)|nโˆˆโ„•},
which is clearly not rational.
Corollary Rational sets are not closed under complement.
Theorem The complement if the identity is a rational expression.
Corollary Any function ฮธ is a rational expression.
Proof
โˆฮธ^=โˆDomฮธร—B*โˆชฯ‡B*โˆ˜ฮธ^.
Theorem Rabin & Scott, 1959 Given R,SโˆˆRatA*ร—B* with |A|,|B|โ‰ฅ2, it is undecidable whether RโˆฉS=โˆ….
Theorem Fischer & Rosenberg, 1968 If |A|,|B|โ‰ฅ2, it is undecidable whether two transducers over A*ร—B* are equivalent.
Theorem Gibbons & Ryther, 1986 Given R,SโˆˆRat{๐–บ,๐–ป}*ร—{๐–ผ}, it is decidable whether RโˆฉS=โˆ….
Theorem Ibarra, 1978; Liskovik, 1979 It is undecidable whether two transducers over {๐–บ,๐–ป}*ร—{๐–ผ} are equivalent.
Problem Post correspondence problem Let B be a set with at least 2 elements and U={u1,โ€ฆ,uk},V={v1,โ€ฆ,vk}โŠ‚B*. Does there exist a sequence of indices i1,โ€ฆ,ip such that ui1โ‹ฏuip=vi1โ‹ฏvip?
Theorem The Post correspondence problem is recursively undecidable.
Note Take the sets U,V from the Post correspondence problem and define morphisms ฯ„U,ฯ„V:[k]*โ†’B* as ฯ„U(i)โ‰”ui,ฯ„V(i)โ‰”vi. Then the Post correspondence problem can be reformulated as deciding whether โˆƒwโˆˆ[k]*:ฯ„U(w)=ฯ„V(w), or equivalently, ฯ„^Uโˆฉฯ„^Vโ‰ โˆ….
Note Let A={๐–บ,๐–ป}. Define the injective morphism ฮบ:[k]*โ†’A*,ฮบ(i)โ‰”๐–บi๐–ป. Then
ฯ„Uโˆ˜ฮบโˆ’1^โˆฉฯ„Vโˆ˜ฮบโˆ’1^=โˆ…โŸบฯ„^Uโˆฉฯ„^V=โˆ….
This somehow implies that the equivalence of rational sets is undecidable.
Theorem Let RโŠ‚RatA*ร—B* with |A|,|B|โ‰ฅ2. Then it is undecidable whether R=A*ร—B*.
Proof
โˆ(ฯ„Uโˆ˜ฮบโˆ’1)^โˆฉโˆ(ฯ„Vโˆ˜ฮบโˆ’1)^=A*ร—B*โŸบฯ„^Uโˆฉฯ„^V=โˆ….
Note Let AโˆฉB=โˆ… and Zโ‰”AโˆชB. Define the projections ฯ€A:Z*โ†’A*,ฯ€B:Z*โ†’B* the obvious way. Then for a given ฮธ:A*โ†’B* we have
ฮธโˆˆRatAร—ร—B*โŸบโˆƒKโˆˆRatZ*:ฮธ=ฯ€Bโˆ˜ZKโˆ˜ฯ€Aโˆ’1.
Definition Let ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ,โ„ฌ=โŸจR,J,F,UโŸฉ be two finite automata. A graph-morphism is a function ฯ†:Qโ†’R such that We denote ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ.
Theorem If ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ, then |๐’œ๏ธ€|โŠ‚|โ„ฌ|.
Theorem The projection from ๐’œ๏ธ€ร—โ„ฌ to ๐’œ๏ธ€ is a graph-morphism.
Theorem If ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ, then also ฯ†:๐’œ๏ธ€๐–ณโ†ฌโ„ฌ๐–ณ.
Definition A graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ is conformal if every computation in โ„ฌ is the image of a computation in ๐’œ๏ธ€.
Definition Given an automaton ๐’œ๏ธ€, denote by ๐’œ๏ธ€n the normalised version of ๐’œ๏ธ€ with two new subliminal states i,t. A graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ can be naturally extended to a morphism ฯ†n:๐’œ๏ธ€nโ†ฌโ„ฌn.
Definition The outgoing bouquet of a state p in an automaton ๐’œ๏ธ€ is
Out๐’œ๏ธ€(p)โ‰”{eโˆˆE|e=(p,a,q)}.
The ingoing bouquet of a state q in an automaton ๐’œ๏ธ€ is
In๐’œ๏ธ€(q)โ‰”{eโˆˆE|e=(p,a,q)}.
Definition A graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ is locally outsurjective/outbijective if for all pโˆˆ๐’œ๏ธ€n, the morphism ฯ†n:Out๐’œ๏ธ€n(p)โ†’Outโ„ฌn(ฯ†(p)) is surjective/bijective.
Definition Let ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ,โ„ฌ=โŸจR,J,F,UโŸฉ be two finite automata. A morphism from ๐’œ๏ธ€ to โ„ฌ is a graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ which is surjective and locally outsurjective. If there exists a morphism from ๐’œ๏ธ€ to โ„ฌ, โ„ฌ is a quotient of ๐’œ๏ธ€.
Theorem If โ„ฌ is a quotient of ๐’œ๏ธ€, then |๐’œ๏ธ€|=|โ„ฌ|.
Definition A graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ is locally insurjective/inbijective if for all pโˆˆ๐’œ๏ธ€n, the morphism ฯ†n:In๐’œ๏ธ€n(p)โ†’Inโ„ฌn(ฯ†(p)) is surjective/bijective.
Definition Let ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ,โ„ฌ=โŸจR,J,F,UโŸฉ be two finite automata. A comorphism from ๐’œ๏ธ€ to โ„ฌ is a graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ which is surjective and locally insurjective. If there exists a comorphism from ๐’œ๏ธ€ to โ„ฌ, โ„ฌ is a coquotient of ๐’œ๏ธ€.
Definition Let ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ,โ„ฌ=โŸจR,J,F,UโŸฉ be two finite automata. A covering from ๐’œ๏ธ€ to โ„ฌ is a graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ which is surjective and locally outbijective.
Definition Let ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ,โ„ฌ=โŸจR,J,F,UโŸฉ be two finite automata. A cocovering from ๐’œ๏ธ€ to โ„ฌ is a graph-morphism ฯ†:๐’œ๏ธ€โ†ฌโ„ฌ which is surjective and locally inbijective.
Definition An automaton ๐’œ๏ธ€ is unambiguous if every accepted word has only one computation.
Theorem If ฯ†:๐’œ๏ธ€โ†’โ„ฌ is a morphism, then there exists a subautomaton ๐’ž๏ธ€ of ๐’œ๏ธ€ such that ฯ†:๐’ž๏ธ€โ†’โ„ฌ is a covering.
Definition A state q in a subnormalised transducer ๐’ฏ๏ธ€ is stalled/tape-consistent if, if an ingoing transition is labeled in {1}ร—A (resp. Aร—{1}), then all outgoing transitions are labeled in {1}ร—A (resp. Aร—{1}).
Definition A subnormalised transducer is synchronous if every state is stalled. A relation ฮธ:A*โ†’B* is synchronous if it is realised by a synchronous transducer. The set of synchronous relations is denoted SynA*ร—B*.
Note This corresponds to a Turing machine with two tapes where both tape heads only move forward and simultaneously.
Theorem The inverse of a synchronous relation is a synchronous relation.
Theorem The universal relation is synchronous.
Theorem SynA*ร—B* is an effective Boolean algebra.
Theorem A synchronous transducer can be determinised.
Definition A transducer ๐’ฏ๏ธ€ is homogeneous if for every state p, either:
Theorem Every synchronous transducer has a homogeneous covering.
Definition A transducer ๐’ฏ๏ธ€ is complete if:
Theorem Every homogenous synchronous transducer can be made complete.
Theorem It is recursively undecidable if a rational relation is synchronous.
Definition The matrix representation of an automaton ๐’œ๏ธ€=โŸจQ,I,E,TโŸฉ is the morphism ฮผ:A*โ†’๐”นQร—Q defined as
(ฮผ(a))p,qโ‰”[pโ†’aq].
We will also represent I as a row vector and T as a column vector. Then we can write ๐’œ๏ธ€=โŸจI,ฮผ,TโŸฉ.
Lemma For every wโˆˆA* and p,q,
ฮผ(w)=1โŸบpโ†’๐’œ๏ธ€wq.
Corollary
L(๐’œ๏ธ€)={wโˆˆ๐’œ๏ธ€*|Iโ‹…ฮผ(w)โ‹…T=1}.
Definition A language LโŠ‚A* is recognisable if there exists a finite monoid M, a morphism ฮฑ:A*โ†’M and PโŠ‚M such that L=ฮฑโˆ’1(P).
Theorem A language is recognisable if and only if it is accepted by a finite automaton.
Proof
(โ‡)
Let ๐’œ๏ธ€=โŸจI,ฮผ,TโŸฉ be an automaton accepting L, represented as a matrix. Take
Pโ‰”{mโˆˆ๐”นQร—Q|Iโ‹…mโ‹…T=1}.
Then L=ฮผโˆ’1(P).
(โ‡’)
Let ๐’œ๏ธ€โ‰”M,{1M},E,P with transitions
Eโ‰”{mโ†’amโ‹…ฮฑ(a)|mโˆˆM,aโˆˆA}.
Note that this automaton is deterministic and for all wโˆˆ๐’œ๏ธ€*,
1Mโ†’๐’œ๏ธ€wฮฑ(w).
Note This construction could in theory be used to determinise an automaton, but itโ€™s even worse than the subset construction because it may produce up to 2|Q|2 states.
Note An automaton defined on a non-free monoid does not necessarily have a matrix representation because not every function defined on a generating set can be extended to a morphism.
Definition A finite real-time transducer is a tuple ๐’ฏ๏ธ€=โŸจA*ร—B*,Q,I,E,TโŸฉ, where Q is a finite set, EโŠ‚Qร—(Aร—๐’ซ๏ธ€(B*))ร—Q and I,T:Qโ†’๐’ซ๏ธ€(B*). Note that the subsets of B* may be infinite!
Theorem A relation ฮธ:A*โ†’B* is rational if and only if it is realised by a finite real-time transducer.
Theorem Let Eโˆˆ๐’ซ๏ธ€(A)Qร—Q. Then the entries of E* belong to the rational closure of the entries of E.
Theorem If ฮผ:A*โ†’(RatB*)Qร—Q is a morphism, then ฮผ(RatA*)โŠ‚(RatB*)Qร—Q.