Question

Seeing that in the Chomsky Hierarchy Type 3 languages can be recognised by a state machine with no external memory (i.e., a finite automaton), Type 2 by a state machine with a single stack (i.e. a push-down automaton) and Type 0 by a state machine with two stacks (or, equivalently, a tape, as is the case for Turing Machines), how do Type 1 languages fit into this picture? And what advantages does it bring to determine that a language is not only Type 0 but Type 1?


Solution

The context-sensitive languages are exactly the languages that can be recognized by a Turing machine using linear space and non-determinism. You can simulate such a Turing machine using exponential time, so you can recognize any such language in exponential time. Do note that the problem of recognizing some context-sensitive languages is $PSPACE$-complete, which means we're pretty sure you can't do better than exponential time.
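To see where the exponential bound comes from, here is the standard counting argument (not tied to any particular construction): a machine restricted to $n$ tape cells, with state set $Q$ and tape alphabet $\Gamma$, has at most $|Q| \cdot n \cdot |\Gamma|^n$ distinct configurations (current state, head position, tape contents). A deterministic simulator that explores this configuration graph therefore runs in time $2^{O(n)}$.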

Comparing this to type 0 languages, this means you can at least say something about how long it takes to recognize the language. A type 0 language may not even be decidable: the language of (descriptions of) all Turing machines that halt is a type 0 language, but since recognizing this language is exactly the halting problem, it is not decidable.

Context-sensitive grammars are not very useful in practice. Context-free grammars are intuitive to work with, but, as the examples on Wikipedia show, context-sensitive grammars very quickly become rather messy. Programs using polynomial space are much more easily designed (and the $PSPACE$-completeness guarantees the existence of some equivalent CSG whose size is only polynomially larger than the space usage of your algorithm).
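For a taste of that messiness, here is a standard noncontracting grammar (noncontracting grammars generate exactly the context-sensitive languages) for $\{a^n b^n c^n : n \geq 1\}$, essentially the example from Wikipedia:

    S  → aSBC | aBC
    CB → BC
    aB → ab
    bB → bb
    bC → bc
    cC → cc

Even for this simple language, deriving a word means shuffling the $B$s and $C$s into place one swap at a time; nothing here is as readable as the one-line description of the language itself.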

The reason for their existence is that they form a very natural extension of context-free grammars (you allow context to determine which productions are valid). This is probably what inspired Chomsky to define them and to name them the type 1 languages. Remember that this definition was made before computers became as fast as they are today: they are of more interest to formal language theorists than to programmers.

Unrestricted grammars get even weirder: there is no longer a notion of 'expanding' a nonterminal and replacing it with a production, possibly depending on context; you are allowed to freely modify the context as well. This makes unrestricted grammars even less intuitive to work with: programs are equivalent in power and a lot more intuitive.

Other tips

Generally speaking, you usually want to know the smaller class to which a given language $L$ belongs. This is because smaller classes can be recognized/accepted/generated by simpler mechanisms (automata, grammars, regular expressions, etc.), which is desirable.

For example, the class of regular languages has good closure properties, and given a DFA $\mathcal{A}$ you can test in linear time whether a word belongs to $L(\mathcal{A})$. In contrast, with a Turing machine you need linear time just to read the input, which usually happens before it actually starts processing.
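As a concrete illustration, membership testing with a DFA is one table lookup per input symbol. The encoding below is just one hypothetical way to write a complete DFA down in Python, not a standard library:

    def dfa_accepts(delta, start, accepting, word):
        """delta maps (state, symbol) -> state; one O(1) lookup per
        symbol, so the whole test is linear in the word length."""
        state = start
        for symbol in word:
            state = delta[(state, symbol)]
        return state in accepting

    # Example: a two-state DFA accepting words with an even number of 'a's.
    delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
    print(dfa_accepts(delta, 0, {0}, "abba"))  # True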

In short, for smaller classes you need less computational power to solve the problem of deciding whether a word belongs to the language.

According to Wikipedia, Chomsky defined context-sensitive grammars (i.e. Type 1) to describe the syntax of natural languages. This is a bit different from the other classes of languages, which were introduced to describe families of strings used in mathematics (e.g. the syntax of arithmetic formulae) rather than natural languages (e.g. the syntax of a grammatically-correct sentence in English).

In context-free languages, at any point during the parsing of the input, the automaton is in a situation entirely described by its stack. Each production consumes the input in the same way regardless of where it is used.

This leads to the interesting property that the portion of input consumed by each production is nested within the portion consumed by the productions deeper in the stack. Thus, for each pair A and B of productions expanded and consumed on any particular input, one of three cases holds:

  • a: The input consumed by A is completely contained in the input consumed by B; or
  • b: The input consumed by A completely contains the input consumed by B; or
  • c: The input consumed by A is completely disjoint from the input consumed by B.

This implies that the following never happens:

  • d: The input consumed by A partially overlaps the input consumed by B.

In contrast, in context-sensitive languages the behaviour of each production depends on where it is used, so the input consumed by one production need not nest inside the input consumed by the productions deeper in the stack (in fact, processing such a language with a stack does not work in general). Here possibility d may happen.

In the real world, a case where a context-sensitive language would make sense is something like denoting <b>bold text</b>, <i>italic text</i> and <u>underlined text</u> with these HTML tags and letting them overlap, as in "This is a <u>text with <i>mixed</u> overlapping tags</i>." Observe that to parse that and check whether every opening tag matches a closing tag, a PDA won't do, because the language is not context-free, but an LBA will easily do.
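A minimal sketch of such a check in Python, assuming (for illustration) that only the three tags above occur and that a tag may also nest inside itself: each tag type needs its own balanced open/close count, which a single pass with three counters, comfortably within linear space, can verify. The interleaving of the three bracket pairs (a shuffle of bracket languages) is what pushes the language out of the context-free class:

    import re

    def overlapping_tags_balanced(text):
        """Check that <b>, <i> and <u> are each opened and closed in a
        balanced way, even when different tag types overlap."""
        depth = {'b': 0, 'i': 0, 'u': 0}
        for slash, name in re.findall(r'<(/?)([biu])>', text):
            depth[name] += -1 if slash else 1
            if depth[name] < 0:  # closing tag with no matching opener
                return False
        return all(d == 0 for d in depth.values())

    print(overlapping_tags_balanced(
        "This is a <u>text with <i>mixed</u> overlapping tags</i>."))  # True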

Closure Properties

Of all the language classes in the Chomsky hierarchy, only the regular and the context-sensitive languages are closed under complementation (for the context-sensitive case this is the Immerman–Szelepcsényi theorem). Hence this is a sort of unique feature of context-sensitive languages.

In contrast to context-free languages, context-sensitive languages are also closed under intersection and under the shuffle product.

Any language that is type 1 can be recognized by a Turing machine that only uses linear space (a so-called linear bounded automaton).

Type 1 languages can be decided by linear bounded automata, which are non-deterministic Turing machines that may only use a portion of the tape whose size is linear in the size of the input.
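To make the linear-space restriction concrete, here is a sketch in the spirit of an LBA for the classic context-sensitive language $\{a^n b^n c^n : n \geq 1\}$: it reuses the input cells themselves as working storage by marking symbols in place. The Python below is of course only an illustration of the idea, not a formal LBA:

    import re

    def accepts_anbncn(word):
        """Decide { a^n b^n c^n : n >= 1 } using no storage beyond a
        marked-up copy of the input, i.e. linear space."""
        if not re.fullmatch(r'a+b+c+', word):  # the ordering is a regular check
            return False
        tape = list(word)
        while 'a' in tape:                     # one round per remaining 'a'
            for target in 'abc':
                try:
                    tape[tape.index(target)] = 'X'  # mark one symbol in place
                except ValueError:
                    return False               # ran out of 'b' or 'c' too early
        return 'b' not in tape and 'c' not in tape  # all counts must agree

    print(accepts_anbncn("aabbcc"))  # True
    print(accepts_anbncn("aabbc"))   # False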

The Chomsky hierarchy classifies grammars more than languages. However, it was not designed around the number of stacks or tapes an automaton needs to recognize a language, as you suggest for Types 2 and 3, even if there is a kind of Turing machine that plays exactly that role for Type-1 grammars.

You should also note that the languages of Type-0 grammars are not all decidable by a Turing machine; they can only be recognized (equivalently, enumerated) by such a machine: Type-0 means recursively enumerable, and Turing machines decide only the recursive languages.

Modern programming languages use context-sensitive features all the time; these fall into a subset that can be decided efficiently.

Examples include name analysis, type analysis, and type inference.
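As a small illustration, take the rule that every identifier must be declared before it is used. This condition is a standard example of why programming languages are not context-free (it embeds the non-context-free pattern $\{wcw\}$), yet a symbol table decides it in a single linear pass. The toy statement format below is hypothetical, not any real compiler's IR:

    def declared_before_use(statements):
        """statements: a list of ('decl', name) or ('use', name) pairs."""
        symbols = set()
        for kind, name in statements:
            if kind == 'decl':
                symbols.add(name)
            elif name not in symbols:
                return False  # identifier used before its declaration
        return True

    print(declared_before_use([('decl', 'x'), ('use', 'x')]))  # True
    print(declared_before_use([('use', 'y'), ('decl', 'y')]))  # False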

Many others have mentioned that Type-1 languages are those that can be recognised by linear bounded automata. The halting problem is decidable for linear bounded automata, which in turn means many other properties that are computationally undecidable for languages recognised by Turing machines are decidable for Type-1 languages.

Admittedly, the proof that the halting problem is decidable for linear bounded automata relies on the fact that with a finite amount of tape they can only enter a finite number of configurations, so if they don't halt within that many steps you know they're looping and will never halt. This proof technically applies to all actual computers (which also have finite memory), but that is of no practical benefit in solving the halting problem for the programs that run on them.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange