6.1.1 The traditional input model

In traditional Earley parsers, the concept of location is very simple. Locations are numbered from 0 to n, where n is the length of the input. Every location has an Earley set, and vice versa. Location 0 is the start location. Every location after the start location has exactly one input token associated with it.

Some applications do not fit this traditional input model — natural language processing requires ambiguous tokens, for example. Libmarpa allows a wide variety of alternative input models.

In Libmarpa, a location is called an earleme. The number of an Earley set is the ID of the Earley set, or its ordinal. In the traditional model, the ordinal of an Earley set and its earleme are always exactly the same, but in Libmarpa’s advanced input models the ordinal of an Earley set can be different from its location (earleme).
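
The distinction between ordinal and earleme can be sketched with a small toy model. This is not the Libmarpa API: the type toy_sets and the functions toy_init, toy_add_set and toy_is_traditional are hypothetical names invented for illustration. The sketch assumes that Earley sets are created in order of increasing earleme and that, outside the traditional model, some earlemes may have no Earley set at all.

```c
#include <assert.h>

/* Hypothetical toy model for illustration only; not the Libmarpa API. */
enum { TOY_MAX_SETS = 64 };

typedef struct {
    int earleme_of_set[TOY_MAX_SETS]; /* index: ordinal, value: earleme */
    int set_count;                    /* number of Earley sets so far */
} toy_sets;

/* Earley set 0 always sits at earleme 0, the start location. */
static void toy_init(toy_sets *s)
{
    s->earleme_of_set[0] = 0;
    s->set_count = 1;
}

/* Record a new Earley set at `earleme`.
   Returns the ordinal of the new set, or -1 if the table is full
   or `earleme` is not after the earleme of the previous set. */
static int toy_add_set(toy_sets *s, int earleme)
{
    assert(s->set_count >= 1); /* toy_init() must have been called */
    if (s->set_count >= TOY_MAX_SETS)
        return -1;
    if (earleme <= s->earleme_of_set[s->set_count - 1])
        return -1;
    s->earleme_of_set[s->set_count] = earleme;
    return s->set_count++;
}

/* Returns 1 if every ordinal equals its earleme, as in the
   traditional input model; returns 0 otherwise. */
static int toy_is_traditional(const toy_sets *s)
{
    for (int i = 0; i < s->set_count; i++) {
        if (s->earleme_of_set[i] != i)
            return 0;
    }
    return 1;
}
```

In this sketch, adding sets at earlemes 1, 2, 3, and so on keeps ordinal and earleme identical. Adding a set at earleme 3 directly after set 0 gives that set ordinal 1, so the two numbers diverge.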

The important earleme values are the latest earleme, the current earleme, and the furthest earleme. Latest, current and furthest earleme, when they have specified values, obey a lexical order in this sense: The latest earleme is always at or before the current earleme, and the current earleme is always at or before the furthest earleme.
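
The ordering of the three values can be illustrated with another toy model. Again, nothing here is the Libmarpa API: toy_earlemes, toy_add_token and toy_complete_earleme are hypothetical names. The update rules are assumptions made for the sketch: a token starts at the current earleme and ends `length` earlemes later, the furthest earleme is the largest token end seen so far, and the latest earleme moves up only when the current earleme reaches a location where some token ends.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy model for illustration only; not the Libmarpa API. */
enum { TOY_MAX_EARLEME = 128 };

typedef struct {
    int latest;   /* assumed: last earleme at which a token ended */
    int current;  /* earleme at which new tokens start */
    int furthest; /* largest earleme reached by any token so far */
    unsigned char token_ends_at[TOY_MAX_EARLEME + 1];
} toy_earlemes;

/* All three earlemes start at location 0. */
static void toy_earlemes_init(toy_earlemes *e)
{
    memset(e, 0, sizeof *e);
}

/* The ordering stated in the text: latest <= current <= furthest. */
static int toy_ordering_holds(const toy_earlemes *e)
{
    return e->latest <= e->current && e->current <= e->furthest;
}

/* Add a token that starts at the current earleme and spans `length`
   earlemes. Returns 1 on success, 0 if the length is invalid or the
   token would run past the end of the toy table. */
static int toy_add_token(toy_earlemes *e, int length)
{
    if (length < 1 || length > TOY_MAX_EARLEME - e->current)
        return 0;
    int end = e->current + length;
    e->token_ends_at[end] = 1;
    if (end > e->furthest)
        e->furthest = end;
    assert(toy_ordering_holds(e));
    return 1;
}

/* Advance the current earleme by one. The toy refuses to move past
   the furthest earleme, which keeps the ordering intact.
   Returns 1 if the current earleme advanced, 0 otherwise. */
static int toy_complete_earleme(toy_earlemes *e)
{
    if (e->current >= e->furthest)
        return 0;
    e->current++;
    if (e->token_ends_at[e->current])
        e->latest = e->current;
    assert(toy_ordering_holds(e));
    return 1;
}
```

With a single token of length 3 added at earleme 0, the furthest earleme becomes 3 at once, while the current earleme needs three completion steps to reach it. The latest earleme stays at 0 until the current earleme arrives at 3, so the three values differ while the token is in flight and coincide again at its end.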