On Two Letters versus Three - Cornell University
[Pages:7]On Two Letters versus Three
Dexter Kozen Department of Computer Science
Cornell University Ithaca, New York 14853-7501, USA
kozen@cs.cornell.edu
February 4, 2002
Abstract
If A is a context-free language over a two-letter alphabet, then the set of all words obtained by sorting words in A and the set of all permutations of words in A are context-free. This is false over alphabets of three or more
letters. Thus these problems illustrate a difference in behavior between twoand three-letter alphabets.
The following problem appeared on a recent exam at Cornell:
Let be a finite alphabet with a fixed total ordering on the letters.
For a string x 2 , let sort x be the string obtained by sorting the letters in increasing order. For example, if a < b < c, then sort abacbaa = aaaabbc. For A , let sortA = fsortx j x 2 Ag.
Of the following three statements, two are false and one is true. Give counterexamples for the two false ones and a proof of the true one.
(i) If A is regular, then so is sort A. (ii) If A is context-free, then so is sort A. (iii) If A is context-sensitive, then so is sort A. One might also ask the same questions about perm A, the set of all permutations of words in A.
Of course, it is (i) and (ii) that are false, since
sort (abc) = perm (abc) \ a b c = fanbncn j n 0g:
1
Interestingly, (ii) is true for both sort and perm over a two-letter alphabet. This is
quite surprising: whereas a two-letter alphabet is exponentially more succinct than a one-letter alphabet, one does not normally think of a break in behavior between two- and three-letter alphabets. In many applications, three letters (or for that matter any fixed finite number of letters) can be coded into two with only a linear loss of efficiency. Not so, apparently, in this case.
In this short note we give an elementary proof of these facts. The proof for
sort is a fairly straightforward construction relying on Parikh's theorem and Pilling normal form, but the proof for perm is somewhat more involved, requiring a bit of
linear algebra over integer lattices.
Let = fa1 : : : adg, and let : ! Nd be the Parikh map
(x) =def (#a1(x) : : : #ad(x)) where #a(x) is the number of a's in x. Define
Theorem 1 For d sort A.
(A) =def f (x) j x 2 Ag perm A =def ;1( (A)) sort A =def perm A \ a1 ad :
2, if A is a context-free language, then so are perm A and
This is trivial for d = 1 and false for d 3. The interesting case is d = 2.
Lemma 1 It suffices to prove Theorem 1 for A regular. When manipulating regular expressions, we can also use the commutativity axiom xy = yx.
Proof. This is a consequence of Parikh's theorem (the commutative image of
any context-free language is the commutative image of some regular set), observing
that the definitions of perm A and sort A depend only on the commutative image
(A) of A.
2
Lemma 2
x y1 : : :
It
yk
suffices
2.
to
prove
Theorem
1
for
A
of
the
form
xy1
yk , where
Proof. Under commutativity, every regular expression is equivalent to a sum of
expressions of this form. This is known as Pilling normal form (see [1]).
2
2
Here is a direct construction for sort A. This result will also follow from the result for perm A by intersecting with a b , but the proof for perm A is somewhat
harder.
Without loss of generality, assume A is of the form of Lemma 2. Let m = #a(x), n = #b(x), mi = #a(yi), and ni = #b(yi), 1 i k. A context-free grammar for sort A is
S ! amT1bn Ti ! amiTibni j Ti+1 1 i k ; 1 Tk ! amkTkbnk j ":
For perm A, we will need to use some linear algebra on integer lattices.
Lemma 3 Let y1 : : : yn be nontrivial. The following are equivalent:
(i) (y1) : : : (yn) are linearly dependent over Q.
(ii) (y1) : : : (yn) are linearly dependent over Z.
(iii) There exists a partition of y1 : : : yn into two nonempty disjoint sets y1 : : : yk
and yk+1 : : : yn (renumbering if necessary) and coefficients ai 2 N, 1
i
and
Qn,kis=u1cyhiatih=at
Qnonit=akl+l 1ayi ia=i.
0,
1
i
k, not all ai = 0, k + 1
i
n,
The property in observation that
w(iiei)carengnaortdhinagvethQeki=va1nyiiasihi=ng1owf itthheaci o2efNficuiennletsssfoalllloawi s=fr0o.m
the
The following lemma gives a stronger version of Pilling normal form.
Lemma 4 (Conway [1, Theorem 2, p. 92]) Any regular subset of Nd can be writ-
ten as a sum of terms of the form xy1 yn with (y1) : : : (yn) linearly inde-
pendent over Q.
aQkni+=P1kr+:o1:o:yf.iaai Snwu=pitph0o.saeUi s2i(nyg1N)th, e:1:K: lee(niyena)lgaernbe,ralniniodeteanarlltylitiaed1sep:e:n:deankt.
Let
=0
Qki=1
and
yiai
not
=
all
y = (nX;1 yi)(yn)
i=0
x1 xn = (x1 xn) (Xk Y xj )
i=1 1 j k j6=i
3
(the second one requires commutativity), rewrite y1
where
= Yk aX i;1 yij
i=1 j=0
and then (y1a1) (ykak ) as (y1a1 ykak) (Xk i)
i=1
yk as (y1a1)
(ykak) ,
where
i = Y(yjaj ) 1 i k:
j6=i
Note contains no starred terms, so it can be expressed as a finite sum of products
of the yi. Then y1 yn can be written as a sum of terms of the form
u(y1a1 ykak) iyk+1 yn: Now we can replace Qki=1 yiai with Qni=k+1 yiai to get
u(yka+k+11 ynan) iyk+1 yn:
Since this is contained in y1 yn, we have u 2 y1 yn, thus
u(yka+k+11 ynan) iyk+1 yn
u i0yk+1 yn
where
i0 = Y yj 1 i k:
j6=i
Thus form
the but
original with one
tfeerwmerxsyt1arredyyni.
can
be
written
as
a
sum
of
terms
of
the
same
We can continue decreasing the number of starred terms inductively until the
yi are linearly independent.
2
By this lemma, to prove Theorem 1 for the case perm A, it suffices to consider A of the form xu or xu v , where (u) and (v) are linearly independent. Note
that the dimension is at most two since we are over a two-letter alphabet. We
can get rid of the x without loss of generality by jxj applications of the following
lemma:
4
Lemma 5 Let a 2 . If A is context-free, then so is fxay j xy 2 Ag. It follows that if perm A is context-free, then so is perm aA, since perm aA = fxay j xy 2 perm Ag.
Proof. Consider a Chomsky normal form grammar for perm A. For every
nonterminal X, add a new nonterminal Xa, which is meant to generate all the
strings that X generates but with an extra a somewhere. For every production X !
Y Z, add the productions Xa ! YaZ j Y Za. For every production X ! b, add
the
Xa
productions Xa ! ! a. The new start
ba j ab.
symbol
For every production X ! ", add the production
is Sa, where S was the old start symbol.
2
Now we show that perm u v is context-free. (We leave the easier case,
perm u , as an exercise u2, #a(v) = v1, #a(v)
for the interested reader.) Suppose #a(u) = = v2; thus (u) = (u1 u2) and (v) = (v1
vu21), .#Abr(ruan) g=e
(u) and (v) in a 2 2 matrix
A =def
u1 v1 u2 v2
with positive determinant = u1v2 ; u2v1 > 0. (The sign of the determinant is determined by the orientation of u and v; exchange if necessary to make it positive.) The adjoint (pseudo-inverse) of A is
A0 =def
v2 ;v1 ;u2 u1
and satisfies the property
AA0 = A0A = 0 0 :
Now we give a nondeterministic one-way automaton with an integer counter
accepting perm u v . The machine actually keeps three counters, c1 c2 c3, but
the counters c1 and c3
control. The counter
hold only finitely many values and can be
c2 holds an integer. We can simulate this
stored in the finite with a pushdown
automaton with a single-letter stack, keeping the sign in the finite control.
The automaton starts in the state c1 = c2 = c3 = 0 and takes the following actions on each input symbol: on input a,
c1 := (c1 + v2) mod c2 := c2 ; u2 c3 := min(c3 + 1 v1)
5
and on input b,
c1 := (c1 ; v1) mod c2 := c2 + u1:
In addition, it may nondeterministically choose to take the following reset step
whenever c3 = v1 without reading an input symbol.
c2 := c2 ; c3 := 0:
Thus after scanning a prefix y of the input string,
c1 = (v2#a(y) ; v1#b(y)) mod c2 = ;u2#a(y) + u1#b(y) ; q
(1)
where q is the number of resets that have occurred, and c3 contains the number of a's seen since the last reset, up to a maximum of v1. The automaton accepts if c1 = c2 = 0.
Now we show that the automaton accepts perm u v . For s t 2 Z2, note that As = t iff A0t = s. Applying this with s = (p q) and t = (#a(x) #b(x)), we
have
#a(x) = u1p + v1q #b(x) = u2p + v2q
(2)
iff
;uv22##aa((xx))
; +
uv11##bb((xx))
= =
p q:
(3)
This implies that the following are equivalent:
(i) x 2 perm u v (ii) there exist p q 2 N such that x 2 perm upvq (iii) there exist p q 2 N satisfying either of the equivalent conditions (2) or (3).
Now suppose x 2 perm u v and condition (iii) holds with p q 2 N. Let
the automaton choose to perform the reset step at its earliest opportunity while
scanning x (i.e., as soon as the counter c3 reaches v1), but only q times. It has the
6
opportunity to perform a reset at least q times, since by (2), #a(x) the final values of c1 and c2 are
(v2#a(x) ; v1#b(x)) mod = 0 ;u2#a(x) + u1#b(x) ; q = 0
v1q. By (1),
respectively, so the machine accepts.
Conversely, suppose the machine accepts. Let q be the number of times the reset occurred. By (1), there exists p 2 Z such that (3) holds, and we need only show that p 0. Since the reset occurred q times, we have #a(x) v1q. Then
u1v2p = p + u2v1p = v2#a(x) ; v1#b(x) + u2v1p v2v1q ; v1(u2p + v2q) + u2v1p = 0:
But u1v2 = + u2v1 > 0, therefore p 0.
Acknowledgements
Thanks to Juris Hartmanis, Jon Kleinberg, and Lillian Lee for valuable comments. This work was supported in part by NSF grant CCR-0105586 and by ONR Grant N00014-01-1-0968. The views and conclusions contained herein are those of the author and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of these organizations or the US Government.
References
[1] John Horton Conway. Regular Algebra and Finite Machines. Chapman and Hall, London, 1971.
7
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- letters and sounds phase three amazon web services
- phonics intervention strategy sound elkonin boxes
- making three letter words homeschool creations
- new collins scrabble words initiation kit
- on two letters versus three cornell university
- stage 2 three letter words with short vowel sounds
- welcome to word ladders
- over 200 emotions alpha sorted inverse studio
- wrs materials information booklet wilson language
Related searches
- cornell university data analytics program
- cornell university data analytics certificate
- cornell university business analytics
- cornell university business
- cornell university johnson business school
- cornell university college of business
- cornell university college report
- cornell university reputation
- cornell university data analytics
- cornell university dyson business school
- cornell university johnson
- cornell university johnson school