Regular Expressions

Section 1.3 (also 1.1, 1.2)

CSC 4170

Theory of Computation

Regular operations 1.3.a

Union: L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2}

{Good,Bad} ∪ {Boy,Girl} =

{0,00,000,…} ∪ {1,11,111,…} =

L ∪ ∅ =

Concatenation: L1 ∘ L2 = {xy | x ∈ L1 and y ∈ L2}

{Good,Bad} ∘ {Boy,Girl} =

{0,00,000,…} ∘ {1,11,111,…} =

L ∘ ∅ =

Star: L* = {x1…xk | k ≥ 0 and each xi ∈ L}

{Boy,Girl}* =

{0,00,000,…}* =

∅* =
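The three operations can be tried out on small finite languages. The following is a minimal Python sketch, not part of the original slides: languages are modeled as sets of strings, and because L* is usually infinite, the hypothetical helper `star` only builds the slice with at most `max_k` factors.

```python
def union(l1, l2):
    """All strings that are in l1 or in l2."""
    return l1 | l2

def concat(l1, l2):
    """All strings xy with x in l1 and y in l2."""
    return {x + y for x in l1 for y in l2}

def star(lang, max_k):
    """Strings x1...xk with 0 <= k <= max_k and each xi in lang.

    This is only a finite slice of L*; max_k is an assumption of this sketch.
    """
    result = {""}          # k = 0 contributes the empty string
    layer = {""}
    for _ in range(max_k):
        layer = concat(layer, lang)
        result |= layer
    return result

print(sorted(union({"Good", "Bad"}, {"Boy", "Girl"})))
print(sorted(concat({"Good", "Bad"}, {"Boy", "Girl"})))
print(sorted(star({"Boy", "Girl"}, 2)))
print(concat({"0", "00"}, set()))   # concatenating with the empty language
print(star(set(), 3))               # star of the empty language
```

Note that concatenating anything with the empty language gives the empty language, while the star of the empty language still contains the empty string (the case k = 0).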

Regular expressions 1.3.b

We say that R is a regular expression (RE) iff R is one of the following:

1. a, where a is a symbol of the alphabet

2. ε

3. ∅

4. (R1)∪(R2), where R1 and R2 are REs

5. (R1)∘(R2), where R1 and R2 are REs

6. (R1)*, where R1 is an RE

What language is represented by the expression:

1. {a}

2. {ε}

3. ∅

4. The union of the languages represented by R1 and R2

5. The concatenation of the languages represented by R1 and R2

6. The star of the language represented by R1

Conventions: The symbol ∘ is often omitted in RE. Some parentheses can be omitted. The precedence order for the operators is:

* (highest), ∘ (medium), ∪ (lowest)
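Python's `re` module follows the same precedence, which makes it a convenient place to experiment. In this sketch (an illustration, not part of the slides) union is written `|`, concatenation is juxtaposition, and `matches` is a hypothetical helper that tests whether a whole string belongs to the language of a pattern.

```python
import re

def matches(pattern, s):
    """True iff the whole string s is in the language of the pattern."""
    return re.fullmatch(pattern, s) is not None

# ab*|c is read as (a(b*))|c: star first, then concatenation, then union.
print(matches("ab*|c", "abbb"))
print(matches("ab*|c", "c"))
print(matches("ab*|c", "abab"))

# Parentheses override the default grouping.
print(matches("(ab)*|c", "abab"))
print(matches("a(b*|c)", "ac"))
```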

Regular languages 1.3.c

A language is said to be regular iff it can be represented by a regular expression.

Language Expression

{11}

{Boy, Girl, Good, Bad}

{ε,0,00,000,0000,…}

{0,00,000,0000,…}

{ε,01,0101,010101,01010101,…}

{x | x = 0^k where k is a multiple of 2 or 3}

{x | x is divisible by 8}

{x | x MOD 4 = 3}
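A candidate expression for a language can be sanity-checked by brute force over all short strings. The sketch below is one possible set of answers, not the slides' own; the patterns use Python `re` syntax (`|` for union), the helpers `matches`, `words` and `agrees` are hypothetical, and the last two rows assume that x is written as a binary numeral, possibly with leading zeros.

```python
import re
from itertools import product

def matches(pattern, s):
    return re.fullmatch(pattern, s) is not None

def words(alphabet, max_len):
    """All strings over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# (alphabet, candidate expression, membership test for the intended language)
rows = [
    ("0", "0*", lambda w: True),
    ("0", "00*", lambda w: len(w) >= 1),
    ("01", "(01)*", lambda w: w == "01" * (len(w) // 2)),
    ("0", "(00)*|(000)*", lambda w: len(w) % 2 == 0 or len(w) % 3 == 0),
    # Assumption: binary numerals, leading zeros allowed.
    ("01", "(0|1)*000|0|00", lambda w: w != "" and int(w, 2) % 8 == 0),
    ("01", "(0|1)*11", lambda w: w != "" and int(w, 2) % 4 == 3),
]

def agrees(alphabet, pattern, in_language, max_len=8):
    """Compare the pattern with the membership test on every short string."""
    return all(matches(pattern, w) == in_language(w)
               for w in words(alphabet, max_len))

for alphabet, pattern, in_language in rows:
    print(pattern, agrees(alphabet, pattern, in_language))
```

Agreement on short strings is evidence, not a proof, that the expression represents the language.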

Exercising reading regular expressions 1.3.d

Expression Language

0*10*

(Good ∪ Bad)(Boy ∪ Girl)

(Tom ∪ Bob)_is_(good ∪ bad)

{Name_is_adjective | Name is an uppercase letter followed by zero or more lowercase letters, and adjective is a lowercase letter followed by zero or more lowercase letters}

(0 ∪ 1)*101(0 ∪ 1)*

((0 ∪ 1)(0 ∪ 1))*
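One way to practice reading an expression is to list the short strings it matches and look for the pattern. This is a small sketch using Python's `re` module (union written `|`); the helpers `words` and `language` are hypothetical names introduced here.

```python
import re
from itertools import product

def words(alphabet, max_len):
    """All strings over the alphabet of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def language(pattern, alphabet, max_len):
    """The strings of length <= max_len in the language of the pattern."""
    return [w for w in words(alphabet, max_len) if re.fullmatch(pattern, w)]

print(language("0*10*", "01", 3))             # exactly one 1
print(language("(0|1)*101(0|1)*", "01", 4))   # contains 101 as a substring
print(language("((0|1)(0|1))*", "01", 2))     # even length
```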

Regular languages and DFA-recognizable languages are the same 1.3.e

Theorem 1.54* A language is regular if and only if some NFA (DFA) recognizes it. In other words,

a) [The “only if” part] For every regular expression there is an NFA that recognizes exactly the language represented by that expression.

b) [The “if” part] For every NFA there is a regular expression that represents exactly the language recognized by that NFA.

Constructing an NFA from a regular expression: Base cases 1.3.f

Case of a, where a is a symbol of the alphabet.

Case of ε

Case of ∅

Constructing an NFA from a regular expression: Case of union 1.3.g

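Case of (R1)∪(R2), where R1 and R2 are REs

First, construct NFAs N1 and N2 from R1 and R2:

[Figure: NFAs N1 and N2 with start states s1 and s2]

Then, combine them in the following way:

[Figure: N1 and N2, with start states s1 and s2, combined into a single NFA]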

Constructing an NFA from a regular expression: Case of concatenation 1.3.h

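Case of (R1)∘(R2), where R1 and R2 are REs

First, construct NFAs N1 and N2 from R1 and R2:

[Figure: NFAs N1 and N2 with start states s1 and s2]

Then, combine them in the following way:

[Figure: N1 and N2, with start states s1 and s2, combined into a single NFA]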

Constructing an NFA from a regular expression: Case of star 1.3.i

Case of (R1)*, where R1 is an RE

First, construct an NFA N1 from R1:

[Figure: NFA N1 with start state s1]

Then, extend it in the following way:

[Figure: N1, with start state s1, extended into an NFA for the star]
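The base cases and the three inductive cases can be sketched in Python. This follows the standard textbook constructions, which the slide figures presumably depict: union adds a new start state with ε-arrows to s1 and s2; concatenation adds ε-arrows from the accept states of N1 to s2; star adds a new accepting start state with an ε-arrow to s1, plus ε-arrows from the accept states back to s1. The class `NFA` and all function names are assumptions of this sketch, and ε is encoded as the empty string.

```python
from itertools import count

_fresh = count()  # source of unique state names (a detail of this sketch)

class NFA:
    def __init__(self, start, accepts, delta):
        self.start = start
        self.accepts = set(accepts)
        self.delta = delta  # (state, symbol or "" for epsilon) -> set of states

def _merge(*deltas):
    merged = {}
    for d in deltas:
        for key, targets in d.items():
            merged.setdefault(key, set()).update(targets)
    return merged

def symbol(a):                      # case of a
    s, t = next(_fresh), next(_fresh)
    return NFA(s, {t}, {(s, a): {t}})

def epsilon():                      # case of the empty string
    s = next(_fresh)
    return NFA(s, {s}, {})

def empty():                        # case of the empty language
    return NFA(next(_fresh), set(), {})

def union(n1, n2):
    s = next(_fresh)
    delta = _merge(n1.delta, n2.delta, {(s, ""): {n1.start, n2.start}})
    return NFA(s, n1.accepts | n2.accepts, delta)

def concat(n1, n2):
    links = {(f, ""): {n2.start} for f in n1.accepts}
    return NFA(n1.start, n2.accepts, _merge(n1.delta, n2.delta, links))

def star(n1):
    s = next(_fresh)
    links = {(f, ""): {n1.start} for f in n1.accepts}
    delta = _merge(n1.delta, links, {(s, ""): {n1.start}})
    return NFA(s, n1.accepts | {s}, delta)

def _closure(nfa, states):
    """All states reachable from the given ones by epsilon-arrows alone."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in nfa.delta.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, word):
    current = _closure(nfa, {nfa.start})
    for ch in word:
        moved = set()
        for q in current:
            moved |= nfa.delta.get((q, ch), set())
        current = _closure(nfa, moved)
    return bool(current & nfa.accepts)

def any_bit():
    # Build a fresh copy each time so that state names never clash.
    return union(symbol("0"), symbol("1"))

# NFA for (0 ∪ 1)*101(0 ∪ 1)*
n = concat(star(any_bit()),
           concat(symbol("1"), concat(symbol("0"),
                  concat(symbol("1"), star(any_bit())))))
print(accepts(n, "0010100"), accepts(n, "1001"))
```

The resulting automata are far from minimal; the point of the construction is only that some NFA exists for every regular expression.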

Constructing an NFA from a regular expression: An example 1.3.j

#(0 1)*(0 1)*

[Figure: the NFA for this expression, built step by step from NFAs for its subexpressions]

GNFA 1.3.k

[Figure: a GNFA whose transitions are labeled with regular expressions (great, (great)*, grand, mother ∪ father), processing the input greatgreatgreatgrandfather]

About ∅-transitions 1.3.l

[Figure: the same GNFA, with transitions labeled great, (great)*, grand, mother ∪ father]

Adding or removing ∅-transitions does not change the recognized language

The same GNFA simplified 1.3.m

[Figure: the simplified GNFA, with transitions labeled great, grand, mother ∪ father]

Ripping a state out 1.3.n

[Figure: the GNFA after ripping out a state; transition labels include (great)*grand and mother ∪ father]

Eliminating parallel transitions 1.3.o

[Figure: the GNFA after eliminating parallel transitions; transition labels: ε ∪ (great)*grand, mother ∪ father]

Again ripping out 1.3.p

(ε ∪ (great)*grand)(mother ∪ father)

How, exactly, to do ripping out 1.3.q1

Assume we are ripping out the state r from a GNFA that has no parallel transitions.

Let L be the label of the loop from r to r (if there is no loop, then L = ∅).

[Figure: state r with a loop labeled L, incoming transitions labeled R and S, and an outgoing transition labeled T]

1. For every pair s1, s2 of states such that there is an R1-labeled transition from s1 to r and an R2-labeled transition from r to s2, add an R1L*R2-labeled transition from s1 to s2;

2. Delete r together with all its incoming and outgoing transitions.

[Figure: after ripping out r, the transitions through r are replaced by direct transitions labeled RL*T and SL*T]
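The two ripping steps can be sketched in Python. The representation is an assumption of this sketch: a GNFA is a list of (source, label, target) triples, labels are written in Python `re` syntax, and a missing loop is treated as L = ∅, whose star contributes only the empty string. The function name `rip` and the state names are hypothetical.

```python
import re

def rip(transitions, r):
    """Rip state r out of a GNFA that has no parallel transitions."""
    loops = [lab for (s, lab, t) in transitions if s == r and t == r]
    # No loop means L is the empty language, and its star is {epsilon},
    # so nothing needs to be inserted between R1 and R2.
    loop_star = f"(?:{loops[0]})*" if loops else ""
    incoming = [(s, lab) for (s, lab, t) in transitions if t == r and s != r]
    outgoing = [(lab, t) for (s, lab, t) in transitions if s == r and t != r]
    # Step 2: drop r together with all its incoming and outgoing transitions.
    kept = [(s, lab, t) for (s, lab, t) in transitions if r not in (s, t)]
    # Step 1: add an R1 L* R2 transition for every pair (s1, s2).
    added = [(s1, f"(?:{r1}){loop_star}(?:{r2})", s2)
             for (s1, r1) in incoming for (r2, s2) in outgoing]
    return kept + added

# The situation from the picture, with R = a, S = b, L = c, T = d.
gnfa = [("p", "a", "r"), ("q", "b", "r"), ("r", "c", "r"), ("r", "d", "t")]
for s, label, t in rip(gnfa, "r"):
    print(s, label, t)
```

The new transitions may run parallel to existing ones, which is why ripping is followed by the elimination of parallel transitions.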

How, exactly, to eliminate parallel transitions 1.3.r

Whenever you see parallel transitions labeled with R1 and R2, replace them by a transition labeled with R1 ∪ R2.

[Figure: two parallel transitions labeled R1 and R2 replaced by a single transition labeled R1 ∪ R2]

Repeat until there are no parallel transitions remaining.
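A minimal Python sketch of this step, under the assumption that a GNFA is stored as a list of (source, label, target) triples with labels in Python `re` syntax (union written `|`, ε written as the empty pattern). The function name and the example labels are illustrative only.

```python
import re

def eliminate_parallel(transitions):
    """Merge all transitions sharing source and target into one union label."""
    merged = {}
    for s, lab, t in transitions:
        if (s, t) in merged:
            merged[(s, t)] = f"(?:{merged[(s, t)]})|(?:{lab})"
        else:
            merged[(s, t)] = lab
    return [(s, lab, t) for (s, t), lab in merged.items()]

# Two parallel transitions from s to m, one labeled epsilon (empty pattern).
gnfa = [("s", "", "m"), ("s", "(?:great)*grand", "m"), ("m", "mother|father", "f")]
simplified = eliminate_parallel(gnfa)
for s, label, t in simplified:
    print(s, repr(label), t)
```

Processing the list once is enough here, because every group of parallel transitions is folded into a single label as it is encountered.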

From NFA to RE 1.3.s

[Figure: an example NFA with transitions labeled a and b]

From NFA to RE: Step 1 1.3.t

Step 1: If there are incoming arrows to the start state, or the start state is an accept state, then add a new start state and connect it with an ε-arrow to the old start state.

[Figure: the example NFA after Step 1]

From NFA to RE: Step 2 1.3.u

Step 2: If there is more than one accept state, or none, or there is an accept state that has outgoing arrows, then add a new accept state, make all the old accept states non-accept states, and connect each of them with an ε-arrow to the new accept state.

[Figure: the example NFA after Step 2]

From NFA to RE: Step 3 1.3.v

Step 3: Eliminate all parallel transitions.

[Figure: the example NFA after Step 3]

From NFA to RE: Step 4 1.3.w1

Step 4: While there are internal states (states that are neither the start nor the accept state), do the following:

Step 4.1: Select an internal state and rip it out;

Step 4.2: Eliminate all parallel transitions.

[Figure: the example GNFA as its internal states are ripped out one at a time. Labels appearing along the way include a(baa)*, b(baa)*, b(baa)*ab, a(baa)*ab and b ∪ a(baa)*ab; the last remaining arrow is labeled ((b ∪ a(baa)*ab)(b(baa)*ab)*(ε ∪ b(baa)*)) ∪ (a(baa)*)]

From NFA to RE: Step 5 1.3.x

Step 5: Return the label of the only remaining arrow (if there is no arrow, return ∅).

Claim: The resulting RE represents exactly the language recognized by the original NFA. This completes the proof of Theorem 1.54.
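The five steps can be put together in a short Python sketch and checked by brute force. Several simplifications are assumptions of this sketch rather than the slides' procedure: the new start and accept states are added unconditionally, parallel transitions are merged as soon as they arise, the input NFA has single-character symbols and no ε-arrows, and the state names START and ACCEPT are assumed unused. The example NFA is hypothetical, not the one from the slides.

```python
import re
from itertools import product

def add(g, s, t, label):
    """Add a transition, merging it with a parallel one by union."""
    g[(s, t)] = f"(?:{g[(s, t)]})|(?:{label})" if (s, t) in g else label

def nfa_to_re(states, transitions, start, accepts):
    g = {}                                  # (source, target) -> label
    for s, a, t in transitions:             # step 3 happens inside add()
        add(g, s, t, re.escape(a))
    add(g, "START", start, "")              # step 1, done unconditionally
    for f in accepts:                       # step 2, done unconditionally
        add(g, f, "ACCEPT", "")
    for r in states:                        # step 4: rip every internal state
        loop = f"(?:{g[(r, r)]})*" if (r, r) in g else ""
        ins = [(s, lab) for (s, t), lab in g.items() if t == r and s != r]
        outs = [(t, lab) for (s, t), lab in g.items() if s == r and t != r]
        g = {k: v for k, v in g.items() if r not in k}
        for s1, r1 in ins:
            for s2, r2 in outs:
                add(g, s1, s2, f"(?:{r1}){loop}(?:{r2})")
    return g.get(("START", "ACCEPT"))       # step 5; None stands for ∅

def nfa_accepts(transitions, start, accepts, word):
    """Direct simulation of the NFA, used only to check the result."""
    current = {start}
    for ch in word:
        current = {t for (s, a, t) in transitions if s in current and a == ch}
    return bool(current & set(accepts))

# A hypothetical two-state NFA over {a, b}.
states = ["p", "q"]
transitions = [("p", "a", "q"), ("q", "b", "p"), ("p", "b", "p"), ("q", "b", "q")]
expression = nfa_to_re(states, transitions, "p", ["q"])

def all_words(max_len):
    return ("".join(w) for n in range(max_len + 1)
            for w in product("ab", repeat=n))

agree = all((re.fullmatch(expression, w) is not None)
            == nfa_accepts(transitions, "p", ["q"], w)
            for w in all_words(7))
print(agree)
```

The expression produced this way is correct but heavily parenthesized; simplifying it by hand, as on the slides, gives a much shorter result.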


