languages


1.

a: b c d e

the item sequence b c d e can be replaced with the single item a



2.

each item (b, c etc.) can be a token from the input file, the rule itself, or another rule

the rule can contain an item representing itself

a rule may have multiple alternative item sequences


a : a c d e
  : g h c 


b: ...
 : ...
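
a minimal sketch of how such rules could be held as data, assuming a dict from rule name to a list of alternatives. rule a is copied from above; rule g and the helper name classify are hypothetical, added only so that an item referring to another rule exists.

```python
# Grammar as a dict: rule name -> list of alternatives, each a list of items.
# Rule "a" is taken from the notes; rule "g" is hypothetical (an assumption).
GRAMMAR = {
    "a": [["a", "c", "d", "e"], ["g", "h", "c"]],
    "g": [["h", "h"]],
}

def classify(grammar, rule, item):
    """Say what an item inside `rule` refers to."""
    if item == rule:
        return "self"    # the rule contains an item representing itself
    if item in grammar:
        return "rule"    # the item is defined by another rule
    return "token"       # not defined by any rule: it comes from the input file

for alternative in GRAMMAR["a"]:
    print([(item, classify(GRAMMAR, "a", item)) for item in alternative])
```

any item that never appears as a rule name is treated as a token here, which is a convention of this sketch rather than something the notation enforces.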





3.

for software languages, each valid input file should map to exactly one parse tree, not none or several.
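
to see what "multiple ones" means, the sketch below counts the parse trees an ambiguous grammar allows. it assumes the grammar a : a x a, a : a y a, a : N, where N is a placeholder operand token; the function name count_trees is hypothetical.

```python
def count_trees(tokens):
    """Count parse trees for  a : a x a | a y a | N  (assumed grammar)."""
    if len(tokens) == 1:
        return 1 if tokens[0] == "N" else 0
    total = 0
    for i, tok in enumerate(tokens):
        # try every operator as the root of the tree
        if tok in ("x", "y") and 0 < i < len(tokens) - 1:
            total += count_trees(tokens[:i]) * count_trees(tokens[i + 1:])
    return total

print(count_trees("N x N y N".split()))  # prints 2
```

the two trees are (N x N) y N and N x (N y N); without extra information the grammar alone does not say which one is meant.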

resolving ambiguity



a : a x a	precedence: 1		replace this sequence with 'a' before the other sequence 
  : a y a 	precedence: 2



a : a x a	assoc: l		group repeated operators left to right: a x a x a is (a x a) x a
  : a y a 	assoc: r		group repeated operators right to left: a y a y a is a y (a y a)
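
one way to apply both annotations is precedence climbing. the sketch below is a minimal version, assuming x has precedence 1 and assoc l, and y has precedence 2 and assoc r (the two snippets above combined). it keeps the numbering used above, where precedence 1 is reduced first. the names OPS and parse_expr are hypothetical.

```python
# operator -> (precedence, associativity); precedence 1 is reduced first,
# so it binds tighter than precedence 2 (same numbering as the notes)
OPS = {"x": (1, "l"), "y": (2, "r")}

def parse_expr(tokens):
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in OPS:
            raise SyntaxError("operand expected at position %d" % pos)
        pos += 1
        return tokens[pos - 1]

    def expr(max_prec):
        # parse operators whose precedence number is <= max_prec
        nonlocal pos
        left = operand()
        while pos < len(tokens) and tokens[pos] in OPS:
            op = tokens[pos]
            prec, assoc = OPS[op]
            if prec > max_prec:
                break  # looser operator: leave it for the caller
            pos += 1
            # left assoc: the right side may not contain the same level
            # right assoc: the right side may contain the same level
            right = expr(prec - 1 if assoc == "l" else prec)
            left = (left, op, right)
        return left

    tree = expr(max(prec for prec, _ in OPS.values()))
    if pos != len(tokens):
        raise SyntaxError("unexpected token at position %d" % pos)
    return tree

print(parse_expr("1 y 2 x 3".split()))  # ('1', 'y', ('2', 'x', '3'))
print(parse_expr("1 x 2 x 3".split()))  # (('1', 'x', '2'), 'x', '3')
print(parse_expr("1 y 2 y 3".split()))  # ('1', 'y', ('2', 'y', '3'))
```

each input now yields exactly one tree, which is the point of the annotations.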




4.

parsing


LL

	start at the topmost symbol, determine which of the rule's alternatives to follow, and descend through each item down to the lowest level, a terminal symbol
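
	a minimal recursive-descent sketch of this top-down approach. the grammar (list, items, item) is hypothetical, chosen so that one token of lookahead is enough to pick the alternative; it is not taken from these notes.

```python
def parse_ll(tokens):
    """Top-down parse of the assumed grammar:
         list  : "(" items ")"
         items : item items | (empty)
         item  : list | NAME
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError("expected %r at position %d" % (tok, pos))
        pos += 1

    def parse_list():            # topmost symbol
        expect("(")
        items = parse_items()
        expect(")")
        return items

    def parse_items():           # the repetition is written as a loop
        result = []
        while peek() not in (")", None):
            result.append(parse_item())
        return result

    def parse_item():            # the next token decides the alternative
        nonlocal pos
        if peek() == "(":
            return parse_list()
        name = peek()            # lowest level: a terminal symbol
        pos += 1
        return name

    tree = parse_list()
    if pos != len(tokens):
        raise SyntaxError("trailing input at position %d" % pos)
    return tree

print(parse_ll("( a ( b c ) )".split()))  # ['a', ['b', 'c']]
```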


shift-reduce

	push input items onto a stack (shift) and replace multiple items on top of the stack with a single item (reduce) when a rule's sequence is recognised
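
	a minimal sketch of the stack mechanics, assuming the grammar a : a ";" N, a : N. it reduces greedily whenever the top of the stack matches a rule. real shift-reduce parsers (LR, LALR) choose between shifting and reducing with a table and lookahead; the greedy policy is a simplification that happens to work for this small grammar. the names RULES and shift_reduce are hypothetical.

```python
# (left-hand side, right-hand side); longer sequences are tried first
RULES = [
    ("a", ["a", ";", "N"]),
    ("a", ["N"]),
]

def shift_reduce(tokens):
    stack = []
    trace = []
    for tok in tokens:
        stack.append(tok)                      # shift
        trace.append(("shift", list(stack)))
        reduced = True
        while reduced:                         # reduce as long as a rule fits
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    del stack[-len(rhs):]      # pop the recognised sequence
                    stack.append(lhs)          # push the single item
                    trace.append(("reduce", list(stack)))
                    reduced = True
                    break
    return stack, trace

stack, trace = shift_reduce(["N", ";", "N", ";", "N"])
for step in trace:
    print(step)
print(stack)  # ['a']
```

	the input is accepted when the stack ends up holding only the topmost symbol.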


scan

	scan the input sequence repeatedly, replacing sequences, until the input sequence is reduced to a single symbol ('program', 'sourcefile' etc.)
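
	a minimal sketch of the scanning approach, assuming the grammar program : a, a : a ";" N, a : N. rules are tried in the listed order and the scan restarts after every replacement; that ordering is an assumption of this sketch. the names RULES, find and scan_parse are hypothetical.

```python
RULES = [
    ("a", ["a", ";", "N"]),
    ("a", ["N"]),
    ("program", ["a"]),
]

def find(seq, rhs):
    """Index of the leftmost occurrence of rhs in seq, or -1."""
    n = len(rhs)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == rhs:
            return i
    return -1

def scan_parse(tokens):
    seq = list(tokens)
    changed = True
    while changed:                    # rescan until nothing can be replaced
        changed = False
        for lhs, rhs in RULES:
            i = find(seq, rhs)
            if i >= 0:
                seq[i:i + len(rhs)] = [lhs]   # replace the sequence
                changed = True
                break
    return seq

print(scan_parse(["N", ";", "N", ";", "N"]))  # ['program']
```

	if the result is anything other than the single topmost symbol, the input was not accepted under this scheme.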



5. context sensitivity

	issues
		terminal symbols appearing in multiple rules
		items valid at only some points of the input



6. recursion

repeated items are expressed with recursive rules


a : a ";" N			left recursion
  : N   



a : N ";" a			right recursion
  : N   
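
the two forms accept the same inputs but build differently shaped trees. the sketch below shows this with hand-written parsers; the function names are hypothetical. the left-recursive rule is written as a loop, because a top-down function that calls itself before consuming any input would never terminate.

```python
def parse_left(tokens):
    """a : a ";" N | N    builds ((n1 ; n2) ; n3)"""
    if not tokens:
        raise SyntaxError("empty input")
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        if tokens[i] != ";" or i + 1 >= len(tokens):
            raise SyntaxError("bad input at position %d" % i)
        tree = (tree, ";", tokens[i + 1])   # the tree so far becomes the left child
    return tree

def parse_right(tokens):
    """a : N ";" a | N    builds (n1 ; (n2 ; n3))"""
    if len(tokens) == 1:
        return tokens[0]
    if len(tokens) < 3 or tokens[1] != ";":
        raise SyntaxError("bad input")
    return (tokens[0], ";", parse_right(tokens[2:]))  # the rest becomes the right child

items = ["n1", ";", "n2", ";", "n3"]
print(parse_left(items))   # (('n1', ';', 'n2'), ';', 'n3')
print(parse_right(items))  # ('n1', ';', ('n2', ';', 'n3'))
```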

