Definition 2.1: Alphabet and Language
- Sigma
- An alphabet Sigma is a nonempty, finite set of symbols.
- L
- A language L over an alphabet Sigma is a collection of strings of elements of Sigma. The empty string lambda is a string with no symbols at all.
- Sigma*
- The set of all possible finite strings of elements of Sigma is denoted by Sigma*. Lambda is an element of Sigma*.
Definition 2.2: Context-free grammar
Context-free grammar G is a quadruple
G = (V, T, P, S)
where
V is a finite set of variable symbols,
T is a finite set of terminal symbols disjoint from V,
P is a finite set of rewriting rules (productions) of the formA --> w where A in V, w in (V union T)*
S is an element of V called the start symbol.
Definition 2.3: Generation of a Language from the Grammar
Let G be a grammar. Then the set
L(G) = {w in T* | S ==>* w}
is the language generated by G.
A language L is context-free iff there is a context-free grammar G such that L = L(G).
If w in L(G), then the sequence
S ==> w1 ==> w2 ==> ... ==> wn ==> w
is a derivation of the sentence w and the wi are called sentential form
Let G = (V, T, P, S) be a context-free grammar. A derivation tree has the following properties.
- The root is labeled S.
- Every interior vertex has a label from V.
- If a vertex has label A in V, and its children are labeled (from left to right) a1, ..., an, then P must contain a production of the form
A --> a1...an
- Every leaf has a label from T union {lambda}.
A context-free grammar G is said to be ambiguous if there exists some w in L(G) which has two distinct derivation trees.
A push-down automaton M is a 7-tuple (Q, Sigma, Tau, delta, q0, Z0, F)
Q is a finite set of states Sigma is a finite alphabet called the input alphabet Tau is a finite alphabet called the stack alphabet delta is a transition function from Q× (Sigma union {e})× Tau to finite subsets of Q× Tau* q0 in Q is the initial state Z0 in Tau is called the start symbol F a subset of Q; the set of accepting states
Definition 2.7: Regular expressions and Regular languages
Regular Expression E Language Denoted L(E) ø ø The empty set; language lambda {lambda} empty string; language which consists of the empty string a { a } a; the language which consists of just a (E ·F) {uv | u in L(E) and v in L(F) } concatenation; (E|F) {u | u in L(E) or u in L(F) } alternation; union of L(E) and L(F) (E*) {u1u2...un| ui in L(E) 0 <= i <=n, n >=0 } any sequence from E
Definition 2.8: Finite State Automaton
A finite state automaton or fsa is defined by the quintuple
M = (Q, Sigma, delta, q0, F),
where
Q is a finite set of internal states
Sigma is a finite set of symbols called the input alphabet
delta: Q × Sigma --> 2Q is a total function called the transition function
q0 in Q is the initial state
F a subset of Q is the set of final states