python parsley - domain name syntax

https://stackoverflow.com/questions/22107347

18-10-2022

Question

I'm trying to get a limited syntax for domain names to work. The syntax is defined in RFC 1035, Section 2.3.1 (https://www.rfc-editor.org/rfc/rfc1035). The relevant subset is:

<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig-hyp> ::= <let-dig> | "-"
<let-dig> ::= <letter> | <digit>
<letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case
<digit> ::= any one of the ten digits 0 through 9
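
For reference, the productions above can be sketched as a single regular expression. This is only a minimal sketch for checking test strings, not part of the Parsley attempt; the names `LABEL_RE` and `is_label` are made up for illustration, and the 63-character label limit that the RFC also imposes is not checked.

```python
import re

# Direct translation of the <label> production:
#   <letter> [ [ <ldh-str> ] <let-dig> ]
# Assumption: ASCII only, and no length limit is enforced here.
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_label(text):
    """Return True if the whole string matches the <label> production."""
    return LABEL_RE.fullmatch(text) is not None


for candidate in ("a", "abcd1000", "a-b", "a-", "1abc", "-a"):
    print(candidate, is_label(candidate))
```

A label must start with a letter and, if it is longer than one character, end with a letter or digit; hyphens are only allowed in between.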

My attempt is below. I'm trying to match the label rule:

from parsley import makeGrammar
import ometa
domain = makeGrammar('''
letdighyp = (letterOrDigit|'-')
label = letterOrDigit letdighyp+ letterOrDigit
''', {})

tests = ('abcd1000',)
for t in tests:
    try:
        print domain(t).label()
    except ometa.runtime.ParseError as e:
        print 'parse failed for', t
        print e

Running that gives me:

parse failed for abcd1000

abcd1000
^
Parse error at line 2, column 0: expected EOF. trail: [digit letdig letdighyp]

What am I doing wrong?

P.S.

    label = letterOrDigit letdighyp+ letterOrDigit

is the line I'm not able to get working. It matches the string if I remove the trailing letterOrDigit.

Solution

Try:

from parsley import makeGrammar
import ometa
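def check_the_rest(s):
    # s is the list of characters matched by letdighyp*.
    # If it contains a hyphen, its last character must be a letter or digit.
    if '-' in s:
        return s[-1].isalnum()
    return True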

domain = makeGrammar('''
letdighyp = (letterOrDigit|'-')
# label = <letterOrDigit (letdighyp* letterOrDigit)*>
label = <letter the_rest>
the_rest = letdighyp*:r ?(check_the_rest(r)) -> r
''', dict(check_the_rest=check_the_rest))

tests = ('a', 'abcd1000', 'a-',)
for t in tests:
    try:
        print('✓', domain(t).label())
    except ometa.runtime.ParseError as e:
        print('parse failed for', t)
        print(e)

(The last test, 'a-', should fail.)

I'm using the py3 branch by vsajip.
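
A likely reason the original rule fails, assuming Parsley follows the usual PEG semantics (repetition is greedy and never gives characters back): `letdighyp+` consumes every remaining character, so nothing is left for the trailing `letterOrDigit`. The pure-Python sketch below mimics that behaviour by hand, without Parsley. The function names `greedy_label` and `grouped_label` are hypothetical, and the second function mirrors an alternative rule, `letter ('-'* letterOrDigit)*`, that is not taken from the answer above. It also assumes ASCII letters and digits only.

```python
import string

LETTERS = set(string.ascii_letters)
ALNUM = set(string.ascii_letters + string.digits)


def greedy_label(text):
    """Mimic `letterOrDigit letdighyp+ letterOrDigit` with greedy repetition."""
    pos = 0
    # First letterOrDigit.
    if pos >= len(text) or text[pos] not in ALNUM:
        return False
    pos += 1
    # letdighyp+ : take as many characters as possible, never backtrack.
    start = pos
    while pos < len(text) and (text[pos] in ALNUM or text[pos] == "-"):
        pos += 1
    if pos == start:
        return False
    # Trailing letterOrDigit: the loop above has already eaten any candidate.
    if pos >= len(text) or text[pos] not in ALNUM:
        return False
    pos += 1
    return pos == len(text)


def grouped_label(text):
    """Mimic `letter ('-'* letterOrDigit)*`: every iteration ends on a letter or digit."""
    if not text or text[0] not in LETTERS:
        return False
    pos = 1
    while True:
        probe = pos
        # '-'* : skip any run of hyphens.
        while probe < len(text) and text[probe] == "-":
            probe += 1
        if probe < len(text) and text[probe] in ALNUM:
            pos = probe + 1  # iteration succeeded, commit the new position
        else:
            break  # iteration failed, keep the last committed position
    # The whole input must have been consumed.
    return pos == len(text)


for candidate in ("a", "abcd1000", "a-b", "a-"):
    print(candidate, greedy_label(candidate), grouped_label(candidate))
```

Because each iteration of the grouped rule has to end on a letter or digit, a trailing hyphen can never be committed, which achieves the same effect as the `check_the_rest` predicate without a semantic check.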

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow