


Module 1

Theory of computation

The theory of computation is the branch of computer science that deals with whether and how efficiently problems can be solved on a model of computation, using an algorithm. The field is divided into two major branches: computability theory and complexity theory, but both branches deal with formal models of computation.

In order to perform a rigorous study of computation, computer scientists work with a mathematical abstraction of computers called a model of computation. There are several models in use, but the most commonly examined is the Turing machine. A Turing machine can be thought of as a desktop PC with a potentially infinite memory capacity, though it can only access this memory in small discrete chunks.

Computer scientists study the Turing machine because it is simple to formulate, can be analyzed and used to prove results, and because it represents what many consider the most powerful possible "reasonable" model of computation. It might seem that the potentially infinite memory capacity is an unrealizable attribute, but any decidable problem solved by a Turing machine will always require only a finite amount of memory. So in principle, any problem that can be solved (decided) by a Turing machine can be solved by a computer that has a bounded amount of memory.

1.1 Computability theory

Computability theory deals primarily with the question of whether a problem is solvable at all on a computer. The statement that the halting problem cannot be solved by a Turing machine is one of the most important results in computability theory, as it is an example of a concrete problem that is both easy to formulate and impossible to solve using a Turing machine. Much of computability theory builds on the halting problem result.

The next important step in computability theory was Rice's theorem, which states that for every non-trivial property of partial functions, it is undecidable whether a Turing machine computes a partial function with that property.

Computability theory is closely related to the branch of mathematical logic called recursion theory, which removes the restriction of studying only models of computation which are close to physically realizable. Many mathematicians and computational theorists who study recursion theory refer to it as computability theory. There is no real difference between the fields other than whether a researcher working in this area has his or her office in a computer science or a mathematics department.

1.2 Complexity theory

Complexity theory considers not only whether a problem can be solved at all on a computer, but also how efficiently the problem can be solved. Two major aspects are considered: time complexity and space complexity, which measure, respectively, how many steps it takes to perform a computation and how much memory is required to perform that computation.

In order to analyze how much time and space a given algorithm requires, computer scientists express the time or space required to solve the problem as a function of the size of the input problem. For example, finding a particular number in a long list of numbers becomes harder as the list of numbers grows larger. If we say there are n numbers in the list, then if the list is not sorted or indexed in any way we may have to look at every number in order to find the number we're seeking. We thus say that in order to solve this problem, the computer needs to perform a number of steps that grows linearly in the size of the problem.

To simplify this analysis, computer scientists have adopted Big O notation, which allows functions to be compared in a way that ensures that particular aspects of a machine's construction do not need to be considered, but rather only the asymptotic behavior as problems become large. So in our previous example we might say that the problem requires O(n) steps to solve.
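For concreteness, the unsorted-list search described above can be sketched in a few lines of Python (the function name and the convention of returning -1 on failure are our own choices):

    def linear_search(numbers, target):
        # Scan left to right; in the worst case all n elements are
        # examined, which is why the search takes O(n) steps.
        for index, value in enumerate(numbers):
            if value == target:
                return index      # found: report the position
        return -1                 # not found after n comparisons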

Perhaps the most important open problem in all of computer science is the question of whether a certain broad class of problems, denoted NP, can be solved efficiently. This is discussed further under the complexity classes P and NP.

1.3 Set Theory

        A set is a group of elements having a property that characterizes those elements. One way to define a set is to enumerate its elements completely. All the elements belonging to the set are explicitly given. Example:   A = {1,2,3,4,5}. An alternate way is to give the properties that characterize the elements of the set. Example: B = {x | x is a positive integer less than or equal to 5}. Some sets can also be defined recursively.
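Both styles of definition can be mirrored in Python's set notation. In this sketch, range(1, 100) is an arbitrary stand-in for "the positive integers", since a comprehension needs a finite pool of candidates to filter:

    # Enumeration: every element is listed explicitly.
    A = {1, 2, 3, 4, 5}

    # Characterizing property, mirroring
    # B = {x | x is a positive integer less than or equal to 5}.
    B = {x for x in range(1, 100) if x <= 5}

    print(A == B)   # True: both notations describe the same set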

     

1.3.1 Set terminology

        Belongs To

            x ∈ B  means that  x  is an element of the set B. Using this notation we can specify the set {0,1,2,3,4,5}, call it Z, by writing

                     Z = {x | x ∈ N ∧ x ≤ 5} where N represents the set of natural numbers.

            It is read as "the set of natural numbers that are less than or equal to 5".

 

       Subset

            Let A and B be two sets.

            A is a subset of B, if every element of A is an element of B.

            A is a subset of B is represented as A ⊆ B.

            Note:  If A is a subset of B and B is a subset of A, then A = B. Also, if A is a subset of, but not equal to, B, it is

                        represented as A ⊂ B.

        Universal Set

            The set U of all the elements we might ever consider in the discourse is called the universal set.

        Complement

            If A is a set, then the complement of A is the set consisting of all elements of the universal set

            that are not in A. It is denoted by A' or Ā. Thus

            A' = { x | x ∈ U ∧ x ∉ A } ,   where   ∉  means " is not an element of ".

            Example:

                 If U is the set of natural numbers and A = { 1,2,3 } , then A' = { x | x ∈ U ∧ x > 3 } .

 

       Set Operations

            The operations that can be performed on sets are:

           1. Union

                    If A and B are two sets, then the union of A and B is the set that contains all the elements that are in

                    A or in B, including the ones in both A and B. It is denoted by A ∪ B.

                   Example:  If   A = {1,2,3} and B = {3,4,5}

                                    then A ∪ B = {1,2,3,4,5}

            2. Difference

                    If A and B are two sets, then the difference of A from B is the set that consists of the elements of A

                    that are not in B. It is denoted by A - B.

                     Example:  If A = {1,2,3} and B = {3,4,5}

                                    then A - B = {1,2}

                     Note that in general A - B ≠ B - A .

                    For A and B of the above example B - A = {4,5} .

            3. Intersection

                    If A and B are two sets, then the intersection of A and B is the set that consists of the elements in

                     both A and B. It is denoted by A ∩ B.

                     Example:  If A = {1,2,3,8} and B = {3,4,5,8}

                                     then A ∩ B = {3,8}.

        Disjoint sets

             A and B are said to be disjoint if they contain no elements in common

              i.e. A ∩ B = ø, where ø is the Empty set.

             Example:   A = { 1,2,3,4,5 }    and     B = { 6,8,9 } are disjoint.
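All of these operations map directly onto Python's built-in set type, as the following short sketch illustrates (the example sets are taken from above):

    A = {1, 2, 3, 8}
    B = {3, 4, 5, 8}

    print(A | B)             # union: {1, 2, 3, 4, 5, 8}
    print(A - B)             # difference: {1, 2}
    print(B - A)             # {4, 5}: difference is not symmetric
    print(A & B)             # intersection: {3, 8}
    print(A.isdisjoint(B))   # False, since A and B share 3 and 8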

        Following is a list of some standard Set Identities

         A, B, C represent arbitrary sets and ø is the empty set and U is the Universal Set.

    The Commutative laws:

             A ∪ B = B ∪ A

             A ∩ B = B ∩ A

    The Associative laws:

             A ∪ (B ∪ C) = (A ∪ B) ∪ C

             A ∩ (B ∩ C) = (A ∩ B) ∩ C

    The Distributive laws:

             A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

             A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

    The Idempotent laws:

             A ∪ A = A

             A ∩ A = A

    The Absorptive laws:

             A ∪ (A ∩ B) = A

             A ∩ (A ∪ B) = A

    The De Morgan laws:

             (A ∪ B)' = A' ∩ B'

             (A ∩ B)' = A' ∪ B'

    Other laws involving Complements:

             ( A' )' = A

             A ∩ A' = ø

             A ∪ A' = U

    Other laws involving the empty set:

             A ∪ ø = A

             A ∩ ø = ø

    Other laws involving the Universal Set:

             A ∪ U = U

             A ∩ U = A
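These identities are easy to spot-check mechanically. The following Python sketch verifies a few of them for one particular choice of A and B, with an arbitrary ten-element set standing in for the universal set U:

    A, B = {1, 2, 3}, {3, 4, 5}
    U = set(range(10))           # a small stand-in for the universal set

    def complement(S):
        # Complement relative to U.
        return U - S

    # The De Morgan laws
    assert complement(A | B) == complement(A) & complement(B)
    assert complement(A & B) == complement(A) | complement(B)

    # The Absorptive laws
    assert A | (A & B) == A
    assert A & (A | B) == A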

          

1.3.2 Generalized Set Operations

        Union, intersection and Cartesian product of sets are associative. For example, (A ∪ B) ∪ C = A ∪ (B ∪ C) holds. To denote either of these expressions we often use A ∪ B ∪ C. This can be generalized for the union of any finite number of sets as A1 ∪ A2 ∪ ... ∪ An, which we write as

          ⋃_{i=1}^{n} Ai

        This generalized union of sets can be rigorously defined as follows:

        Definition (⋃_{i=1}^{n} Ai) :

        Basis Clause: For n = 1 ,  ⋃_{i=1}^{1} Ai = A1.

        Inductive Clause:   ⋃_{i=1}^{n+1} Ai = (⋃_{i=1}^{n} Ai) ∪ A(n+1)

        Similarly the generalized intersection ⋂_{i=1}^{n} Ai and generalized Cartesian product ⨯_{i=1}^{n} Ai can be defined.

        Based on these definitions, De Morgan's law on set union and intersection can also be generalized as follows:

        Theorem (Generalized De Morgan)

             (⋃_{i=1}^{n} Ai)' = ⋂_{i=1}^{n} Ai' ,     and

             (⋂_{i=1}^{n} Ai)' = ⋃_{i=1}^{n} Ai'
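As an illustration, the generalized operations and the generalized De Morgan laws can be spot-checked in Python: reduce folds the binary union or intersection over a list of sets, and the assertions test the theorem for one arbitrary choice of sets and universal set U:

    from functools import reduce

    sets = [{1, 2}, {2, 3}, {3, 4}]   # A1, A2, A3
    U = set(range(10))                # an assumed universal set

    union_all = reduce(lambda acc, s: acc | s, sets, set())
    inter_all = reduce(lambda acc, s: acc & s, sets)

    # Complement of the union = intersection of the complements, and dually.
    assert U - union_all == reduce(lambda acc, s: acc & s, [U - s for s in sets])
    assert U - inter_all == reduce(lambda acc, s: acc | s, [U - s for s in sets])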

1.4 Relations                

                        Let A and B be sets. A binary relation from A into B is any subset of the Cartesian product A x B.

Example 1: Let's assume that a person owns three shirts and two pairs of slacks. More precisely, let A = {blue shirt, black shirt, mint green shirt} and B = {gray slacks, tan slacks}. Then certainly A x B is the set of all possible combinations (six) of shirts and slacks that the individual can wear. However, the individual may wish to restrict himself to combinations which are color coordinated, or "related". This may not be all possible pairs in A x B but will certainly be a subset of A x B. For example, one such subset may be { (blue shirt, gray slacks), (black shirt, tan slacks), (mint green shirt, tan slacks) }.

Example 2: Let A = {2, 3, 5, 6} and define a relation R from A into A by (a, b) ∈ R if and only if a divides evenly into b. So, R = {(2, 2), (3, 3), (5, 5), (6, 6), (2, 6), (3, 6)}. A typical element in R is an ordered pair (x, y). In some cases R can be described by actually listing the pairs which are in R, as in the previous example. This may not be convenient if R is relatively large. Other notations are used depending on past practice.
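The relation of Example 2 can also be generated mechanically. A brief Python sketch (the variable names are our own):

    A = {2, 3, 5, 6}

    # R = {(a, b) in A x A | a divides evenly into b}
    R = {(a, b) for a in A for b in A if b % a == 0}

    print(sorted(R))
    # [(2, 2), (2, 6), (3, 3), (3, 6), (5, 5), (6, 6)]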

Consider the following relations on real numbers.

                R = { (x, y) | y is the square of x }   and   S = { (x, y) | x ≤ y }

[Figure: arrow diagrams of functions from a set A to a set B; the rightmost function shown is a bijection.]

1.5 Functions

Inverse

Let f be a bijection from a set A to a set B. Then the function g is called the inverse function of f, and it is denoted by f⁻¹, if for every element y of B, g(y) = x, where f(x) = y. Note that such an x is unique for each y because f is a bijection. For example, the rightmost function in the above figure is a bijection and its inverse is obtained by reversing the direction of each arrow. Example: The inverse function of f(x) = 2x from the set of natural numbers N to the set of non-negative even numbers E is f⁻¹(x) = x/2 from E to N. It is also a bijection. A function is a relation. Therefore one can also talk about composition of functions.
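For a bijection on a finite set, this inversion amounts to reversing every (x, y) pair, as in the following Python sketch (the restriction of f to a finite slice of N is our own choice, so that the dictionaries stay small):

    # f(x) = 2x on a finite slice of N, stored as a dictionary.
    f = {x: 2 * x for x in range(5)}           # {0: 0, 1: 2, 2: 4, 3: 6, 4: 8}

    # f is a bijection onto its range, so reversing every pair gives f^-1.
    f_inverse = {y: x for x, y in f.items()}   # {0: 0, 2: 1, 4: 2, 6: 3, 8: 4}

    assert all(f_inverse[f[x]] == x for x in f)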

Composite function

Let g be a function from a set A to a set B, and let f be a function from B to a set C. Then the composition of functions f and g, denoted by f ∘ g, is the function from A to C that satisfies    (f ∘ g)(x) = f( g(x) )   for all x in A. Example: Let  f(x) = x² , and  g(x) = x + 1 . Then f( g(x) ) = ( x + 1 )².
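A minimal Python sketch of composition (the helper name compose is our own):

    def compose(f, g):
        # Return the composite f o g, i.e. x -> f(g(x)).
        return lambda x: f(g(x))

    f = lambda x: x ** 2     # f(x) = x^2
    g = lambda x: x + 1      # g(x) = x + 1

    fg = compose(f, g)
    print(fg(3))             # (3 + 1)^2 = 16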

1.6 Primitive recursive function

The primitive recursive functions are defined using primitive recursion and composition as central operations and are a strict subset of the recursive functions (recursive functions are also known as computable functions). The term was coined by Rózsa Péter. In computability theory, primitive recursive functions are a class of functions which form an important building block on the way to a full formalization of computability. These functions are also important in proof theory.

Many of the functions normally studied in number theory, and approximations to real-valued functions, are primitive recursive, such as addition, division, factorial, exponential, finding the nth prime, and so on (Brainerd and Landweber, 1974). In fact, it is difficult to devise a function that is not primitive recursive, although some are known (see the section on Limitations below). The set of primitive recursive functions is known as PR in complexity theory. Every primitive recursive function is a general recursive function.

The primitive recursive functions are among the number-theoretic functions which are functions from the natural numbers (nonnegative integers) {0, 1, 2 , ...} to the natural numbers which take n arguments for some natural number n. Such a function is called n-ary.

The basic primitive recursive functions are given by these axioms:

1. Constant function: The 0-ary constant function 0 is primitive recursive.

2. Successor function: The 1-ary successor function S, which returns the successor of its argument (see Peano postulates), is primitive recursive.

3. Projection function: For every n ≥ 1 and each i with 1 ≤ i ≤ n, the n-ary projection function P_i^n, which returns its i-th argument, is primitive recursive.

More complex primitive recursive functions can be obtained by applying the operators given by these axioms:

1. Composition: Given f, a k-ary primitive recursive function, and k m-ary primitive recursive functions g1,...,gk, the composition of f with g1,...,gk, i.e. the m-ary function h(x1,...,xm) = f(g1(x1,...,xm),...,gk(x1,...,xm)), is primitive recursive.

2. Primitive recursion: Given f, a k-ary primitive recursive function, and g, a (k+2)-ary primitive recursive function, the (k+1)-ary function defined as the primitive recursion of f and g, i.e. the function h where h(0,x1,...,xk) = f(x1,...,xk) and h(S(n),x1,...,xk) = g(h(n,x1,...,xk),n,x1,...,xk), is primitive recursive.

The primitive recursive functions are the basic functions and those which can be obtained from the basic functions by applying the operators a finite number of times.
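As an illustration only, the basic functions and the two operators can be transcribed into Python. All the names below (S, P, compose, prim_rec) are our own, compose here is variadic (it takes any number of inner functions g1,...,gk), and prim_rec uses a loop instead of literal recursion, which computes the same values without deep call stacks:

    def S(n):
        # Successor function.
        return n + 1

    def P(i, n):
        # The n-ary projection P_i^n: returns the i-th of n arguments (1-indexed).
        return lambda *args: args[i - 1]

    def compose(f, *gs):
        # Composition: h(x1,...,xm) = f(g1(x1,...,xm), ..., gk(x1,...,xm)).
        return lambda *xs: f(*(g(*xs) for g in gs))

    def prim_rec(f, g):
        # Primitive recursion: h(0, x...) = f(x...) and
        # h(n+1, x...) = g(h(n, x...), n, x...).
        def h(n, *xs):
            value = f(*xs)
            for i in range(n):
                value = g(value, i, *xs)
            return value
        return h

The examples in the next section reuse these helpers.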

Role of the projection functions

The projection functions can be used to avoid the apparent rigidity in terms of the arity of the functions above; by using compositions with various projection functions, it is possible to pass a subset of the arguments of one function to another function. For example, if g and h are 2-ary primitive recursive functions then f(a, b, c) = g(h(c, a), h(a, b)) is also primitive recursive. One formal definition using projection functions is f(a, b, c) = g(h(P_3^3(a, b, c), P_1^3(a, b, c)), h(P_1^3(a, b, c), P_2^3(a, b, c))).

Converting predicates to numeric functions

In some settings it is natural to consider primitive recursive functions that take as inputs tuples that mix numbers with truth values { t = true, f = false }, or which produce truth values as outputs (see Kleene [1952 pp. 226-227]). This can be accomplished by identifying the truth values with numbers in any fixed manner. For example, it is common to identify the truth value t with the number 1 and the truth value f with the number 0. Once this identification has been made, the characteristic function of a set A, which literally returns 1 or 0, can be viewed as a predicate that tells whether a number is in the set A. Such an identification of predicates with numeric functions will be assumed for the remainder of this section.

1.6.1 Examples

Most number-theoretic functions which can be defined using recursion on a single variable are primitive recursive. Basic examples include the addition and "limited subtraction" functions.

Addition

Intuitively, addition can be recursively defined with the rules:

add(0,x)=x,

add(n+1,x)=add(n,x)+1.

In order to fit this into a strict primitive recursive definition, define:

add(0, x) = P_1^1(x) ,

add(S(n), x) = S(P_1^3(add(n, x), n, x)).

Here P_1^3 is the projection function that takes 3 arguments and returns the first one.

P_1^1 is simply the identity function; its inclusion is required by the definition of the primitive recursion operator above; it plays the role of f. The composition of S and P_1^3, which is primitive recursive, plays the role of g.
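Using the prim_rec, compose, S and P helpers sketched earlier, this definition transcribes almost symbol for symbol:

    # add(0, x) = P11(x);  add(S(n), x) = S(P13(add(n, x), n, x))
    add = prim_rec(P(1, 1), compose(S, P(1, 3)))

    print(add(3, 4))   # 7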

Subtraction

Because primitive recursive functions use natural numbers rather than integers, and the natural numbers are not closed under subtraction, a limited subtraction function is studied in this context. This limited subtraction function sub(a,b) returns b − a if this is nonnegative and returns 0 otherwise.

The predecessor function acts as the opposite of the successor function and is recursively defined by the rules:

pred(0)=0,

pred(n+1)=n.

These rules can be converted into a more formal definition by primitive recursion:

pred(0)=0,

pred(S(n)) = P_2^2(pred(n), n).

The limited subtraction function is definable from the predecessor function in a manner analogous to the way addition is defined from successor:

sub(0, x) = P_1^1(x),

sub(S(n), x) = pred(P_1^3(sub(n, x), n, x)).

Here sub(a,b) corresponds to b-a; for the sake of simplicity, the order of the arguments has been switched from the "standard" definition to fit the requirements of primitive recursion. This could easily be rectified using composition with suitable projections.
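With the same helpers, pred and sub transcribe directly:

    # pred(0) = 0;  pred(S(n)) = P22(pred(n), n)
    pred = prim_rec(lambda: 0, P(2, 2))

    # sub(0, x) = P11(x);  sub(S(n), x) = pred(P13(sub(n, x), n, x))
    sub = prim_rec(P(1, 1), compose(pred, P(1, 3)))

    print(pred(5))     # 4
    print(sub(2, 7))   # 5: sub(a, b) computes b - a
    print(sub(9, 7))   # 0: limited subtraction never goes below zero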

Other primitive recursive functions include exponentiation and primality testing. Given primitive recursive functions e, f, g, and h, a function which returns the value of g when e≤f and the value of h otherwise is primitive recursive.

Operations on integers and rational numbers

By using Gödel numbers, the primitive recursive functions can be extended to operate on other objects such as integers and rational numbers. If integers are encoded by Gödel numbers in a standard way, the arithmetic operations including addition, subtraction, and multiplication are all primitive recursive. Similarly, if the rationals are represented by Gödel numbers then the field operations are all primitive recursive.

1.6.2 Relationship to recursive functions

The broader class of partial recursive functions is defined by introducing an unbounded search operator. The use of this operator may result in a partial function, that is, a relation which has at most one value for each argument but, unlike a total function, does not necessarily have any value for an argument (see domain). An equivalent definition states that a partial recursive function is one that can be computed by a Turing machine. A total recursive function is a partial recursive function which is defined for every input.

Every primitive recursive function is total recursive, but not all total recursive functions are primitive recursive. The Ackermann function A(m,n) is a well-known example of a total recursive function that is not primitive recursive. There is a characterization of the primitive recursive functions as a subset of the total recursive functions using the Ackermann function. This characterization states that a function is primitive recursive if and only if there is a natural number m such that the function can be computed by a Turing machine that always halts within A(m,n) or fewer steps, where n is the sum of the arguments of the primitive recursive function.
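For reference, a naive Python transcription of the Ackermann recurrence; it is only safe to run for tiny arguments, since both the values and the recursion depth explode:

    def ackermann(m, n):
        # Total recursive but not primitive recursive: it eventually
        # outgrows every primitive recursive function.
        if m == 0:
            return n + 1
        if n == 0:
            return ackermann(m - 1, 1)
        return ackermann(m - 1, ackermann(m, n - 1))

    print(ackermann(2, 3))   # 9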

1.6.3 Limitations

Primitive recursive functions tend to correspond very closely with our intuition of what a computable function must be. Certainly the initial functions are intuitively computable (in their very simplicity), and the two operations by which one can create new primitive recursive functions are also very straightforward. However the set of primitive recursive functions does not include every possible computable function — this can be seen with a variant of Cantor's diagonal argument. This argument provides a computable function which is not primitive recursive. A sketch of the proof is as follows:

The primitive recursive functions can be computably enumerated. This numbering is unique on the definitions of functions, though not unique on the actual functions themselves (as every function can have an infinite number of definitions; consider simply composing with the identity function). The numbering is computable in the sense that it can be defined under formal models of computability such as μ-recursive functions or Turing machines; but an appeal to the Church-Turing thesis is likely sufficient.

Now consider a matrix where the rows are the primitive recursive functions of one argument under this numbering, and the columns are the natural numbers. Then each element (i, j) corresponds to the ith unary primitive recursive function being calculated on the number j. We can write this as fi(j).

Now consider the function g(x) = S(fx(x)). g lies on the diagonal of this matrix and simply adds one to the value it finds. This function is computable (by the above), but clearly no primitive recursive function exists which computes it as it differs from each possible primitive recursive function by at least one value. Thus there must be computable functions which are not primitive recursive.

This argument can be applied to any class of computable (total) functions that can be enumerated in this way, as explained in the article Machines that always halt. Note however that the partial computable functions (those which need not be defined for all arguments) can be explicitly enumerated, for instance by enumerating Turing machine encodings.

Other examples of total recursive but not primitive recursive functions are known:

• The function which takes m to Ackermann(m,m) is a unary total recursive function that is not primitive recursive.

• The Paris–Harrington theorem involves a total recursive function which is not primitive recursive. Because this function is motivated by Ramsey theory, it is sometimes considered more “natural” than the Ackermann function.

1.6.4 Some common primitive recursive functions

The following examples and definitions are from Kleene (1952) pp. 223-231; many appear with proofs. Most also appear with similar names, either as proofs or as examples, in Boolos-Burgess-Jeffrey 2002 pp. 63-70; they add #22, the logarithm lo(x, y) or lg(x, y), depending on the exact derivation.

In the following we observe that primitive recursive functions can be of four types:

1. functions ("number-theoretic functions" for short): from { 0, 1, 2, ... } to { 0, 1, 2, ... },

2. predicates: from { 0, 1, 2, ...} to truth values { t =true, f =false },

3. propositional connectives: from truth values { t, f } to truth values { t, f },

4. representing functions: from truth values { t, f } to { 0, 1, 2, ... }. Many times a predicate requires a representing function to convert the predicate's output { t, f } to { 0, 1 } (note the order: "t" to "0" and "f" to "1" matches with ~sg( ) defined below). By definition a function φ(x) is a "representing function" of the predicate P(x) if φ takes only values 0 and 1 and produces 0 when P is true.

In the following the mark " ' ", e.g. a' , is the primitive mark meaning "the successor of", usually thought of as " +1", e.g. a +1 =def a'. The functions 16-21 and #G are of particular interest with respect to converting primitive recursive predicates to, and extracting them from, their "arithmetical" form expressed as Gödel numbers.

1. Addition: a+b

2. Multiplication: a×b

3. Exponentiation: ab,

4. Factorial a! : 0! = 1, a'! = a!×a'

5. pred(a): Decrement: "predecessor of a", defined as "If a > 0 then a - 1 else 0."

6. Proper subtraction: a ∸ b, defined as "If a ≥ b then a - b else 0."

7. Minimum (a1, ... an)

8. Maximum (a1, ... an)

9. Absolute value: | a - b | =def (a ∸ b) + (b ∸ a)

10. ~sg(a): NOT[signum(a)]: If a = 0 then ~sg(a) = 1 else if a > 0 then ~sg(a) = 0

11. sg(a): signum(a): If a = 0 then sg(a) = 0 else if a > 0 then sg(a) = 1

12. "b divides a" [ a | b ]: If the remainder ( a, b )=0 then [ a | b ] else b does not divide a "evenly"

13. Remainder ( a, b ): the leftover if b does not divide a "evenly". Also called MOD(a, b)

14. a = b: sg(| a - b |)

15. a < b: sg( a' ∸ b )

16. Pr(a): a is a prime number: Pr(a) =def a > 1 & NOT(∃c)_{1<c<a} [ c | a ]