문제

I'm trying to understand context-sensitive grammars.

I understand why languages like

  1. $\{ww \mid w \in A^*\}$
  2. $\{a^n b^n c^n \mid n\in\mathbb{N}\}$

are not context free, but what I'd like to know if a language similar to the untyped lambda calculus is context sensitive.

I'd like to see an example of a simple, but non-toy (I consider the above toy examples), example of a context-sensitive grammar that can, for some production rule, e.g., tell whether or not some string of symbols is in scope currently (e.g. when producing the body of a function).

Are context sensitive grammars powerful enough to make undefined/undeclared/unbound variables a syntactic (rather than semantic) error?

도움이 되었습니까?

해결책

Yes, I would believe this to be possible, but no I am not willing to explicitly construct that context-sensitive grammar. I will explain my answer, by splitting the question in two different parts.

(1) What would the non-toy example be? It should reflect declaration of variables. My proposal of such a language, abstracted from real programming would be something like this. The alphabet is $\{a,b,;,(,)\}$. $$\{ w_1;w_2;\dots;w_n(x_1;x_2;\dots;x_m) \mid w_i,x_j\in\{a,b\}^*, \mbox{ each }x_j\mbox{ is equal to some }w_i \} $$ That language is context-sensitive.

(2) To show it actually is context-sensitive I would use another formalism. That of a Turing machine with linear use of its tape: a linear bounded automaton LBA. I can program it to do the pattern matching/ I would consecutively consider each $x_j$ and try to match it with a proper $w_j$, letter by letter. LBA's are equivalent to context-sensitive grammars, but much easier to program.

다른 팁

My favourite example of a context-sensitive language (CSL) is SAT. The Landweber-Kuroda Theorem says that CSL = NSPACE$[n]$. Any SAT instance has a linear-size certificate, so SAT is a CSL. See my question Context-sensitive grammar for SAT? for references and discussion.

Many other NP-hard languages are also in CSL by the same reason, such as CLIQUE.

There are also fairly natural languages in CSL that are even harder.

However, I am not aware of any way to express an arbitrary CSL as a context-sensitive grammar (CSG), other than to use Landweber's construction in Theorem 3 of his paper. In this construction, the CSG describes the reverse of the operation of the linear-bounded automaton that recognizes the CSL. Productions of the CSG describe how a particular state of the machine results from one possible move. Such a CSG is straightforward translation of the automaton into a grammar, so it is not necessarily going to correspond to language features such as being able to declare variables, but will instead be bogged down by the details of the automaton.

If you insist on a CSG rather than a CSL, and if your actual question is specifically about wanting to see a CSG for a language that involves restricted variable scoping, then Hendrik Jan's answer seems to be a good start.

Yes, context-sensitive grammars (CSG) are powerful enough to make undefined/undeclared/unbound variables check, but unfortunately we don't know any efficient algorithm to parse strings of CSG.

A real example of a context-sensitive language is the C programming language. A feature like declare variables first and then use them later make C-language a context-sensitive language (CSL). (I don't know about untyped lambda calculus).

And because we don't know any linear parsing algorithm for CSL (or CSG). That is the reason in compiler design, we use CFG (and its parsing algoritm only) for syntax checking since we know efficient algorithms to parse CFG (if it's in restricted form). Compilers first parse a context free feature and then later handle context-sensitive features in a problematically way (for example, checks any used variable in the symbol table if it's defined. Otherwise, it generates an error).

Also context-sensitive grammar is used in natural-language processing (NLP). And most natural languages are examples of context-sensitive languages. (I am not sure for the Sanskrit language).

I will try to explain it with a silly but simple example (it's just an idea, you can refine it):

NOUN     -->  { BlueBomber, Grijesh, I, We}
TENSE    -->  { am, was, is, were}
VERB     -->  { going, eating, working}

SENTENCE --> <NOUN> <TENSE> <VERB>

Now, using this grammar, we can generate some correct statements, but some are wrong too. For example,

SENTENCE --> <NOUN>   <TENSE>   <VERB>
             Grijesh    is       working       [Correct statement]

But

             Grijesh    am       working       [wrong statement]

Reason: the value of <TENSE> depends on value <NOUN> (for example, I &lt;TENNSE> --> I am) and hence the grammar doesn't generate correct statements in the English language.

Actually we can't write a context-free grammar for complete English!

You might have noticed, any natural language translator or grammar checker doesn't works correctly (try with long statements). Because this problem comes under the context-sensitive parsing algorithm.


REFERENCE: You can watch Dr. Arun Kumar's lectures. In some lecture he explains exactly what you are interested in.

(extending comments into answer)

Not sure this is an example you would like, anyway.

Almost all real programming languages are context-sensitive (in the general sense, i.e conflating both Chomsky type-0 and type-I grammars under "context-sensitive", which is true of course since unrestricted grammars are even more context-sensitive than context-sensitive grammars).

For example, "variables need declared before used", "identifiers should be unique", all these require context (sometimes refered as semantic context, but that can be misleading, since it involves syntactic features anyway) see for example https://www.cs.purdue.edu/homes/hosking/502/notes/04-semantics.pdf

The sense that the above examples are context-sensitive (in the grammar/syntactic sense as well as semantic one) is because they talk about their context (what precedes or comes after).

A "variable already defined", is about context preceding, a variable use. A "unique identifier", is context both to what preceded or comes after an identifier declaration, and so on..

see also Is JavaScript a Context Free Language? on SO

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 cs.stackexchange
scroll top