Parsing trouble with amotoen
Question
I'm trying to write a grammar to parse a simple language to describe drum loops, using Clojure and amotoen. The language looks like this:
# Test Loop
# this is a comment
BPM: 100
Samples:
BD: bd.wav
SD: sd.wav
CHH: chh.wav
CSH: csh.wav
Body:
BD: /---/---/---/---
SD: ---/--/--/-/--/-
CHH: --/---/---/---/-
CSH: /---------------
# this is another comment
I've defined the grammar as follows:
(def g {
:Whitespace '(| \space \newline \tab \, :Comment)
:_* '(* :Whitespace)
:_ [:Whitespace '(* :Whitespace)]
:Comment [\# '(* (% \newline)) \newline]
:BPM [\B \P \M \: :_* '(* :Number) \newline]
:Number '(* :Digit)
:Digit (a/lpegs '| "0123456789")
:Samples [\S \a \m \p \l \e \s \: \newline '(* :SampleDef)]
:SampleDef [:_* :Name \: :_* :File \newline]
:Name '(* (% \:))
:File '(* (% \newline))
:Body [\B \o \d \y \: \newline '(* :Pattern)]
:Pattern [:_* :Name \: :_* '(* (| \/ \-)) '(| \newline \$)]
:Document [:_* :BPM :_* :Samples :_* :Body :_* \$]
})
When I call pegasus on each part of a sample file individually, each part is parsed correctly. For example:
(pprint
(a/pegasus
:Body
g
(a/wrap-string
"Body:\n
BD: /---/---/---/---\n
SD: ---/--/--/-/--/-\n
CHH: --/---/---/---/-\n
CSH: /---------------\n")))
However, when I call (pprint (a/pegasus :Document g (a/wrap-string (slurp "sample.orc")))), all I get is nil. The same happens if I replace (a/wrap-string (slurp "sample.orc")) with a string containing the text of sample.orc.
So, my question is: can anyone spot what's wrong with my grammar? I'm all out of ideas, and I've been staring at it for a few days now. I'm sure it's something embarrassingly simple, but I just can't see it!
Thanks in advance.
Solution
There are three problems: the :Samples rule consumes the :Body, :File can be empty, and end-of-input should be marked by :$ rather than \$. Here's an amended grammar:
(def g
  {;; A bare \newline is no longer whitespace; :n* handles blank lines
   ;; and comments between sections instead.
   :Whitespace '(| \space \tab \, :Comment)
   :n*         '(* (| \newline :Comment))
   :_*         '(* :Whitespace)
   :_          [:Whitespace '(* :Whitespace)]
   :Comment    [\# '(* (% \newline)) \newline]
   :BPM        [\B \P \M \: :_* '(* :Number) \newline]
   :Number     '(* :Digit)
   :Digit      (a/lpegs '| "0123456789")
   :Samples    [\S \a \m \p \l \e \s \: \newline '(* :SampleDef)]
   :SampleDef  [:_* :Name \: :_* :File \newline]
   ;; :Name and :File must now match at least one character.
   :Name       '[(% \:) (* (% \:))]
   :File       '[(% \newline) (* (% \newline))]
   :Body       [\B \o \d \y \: \newline '(* :Pattern)]
   ;; End of input is the keyword :$, not the character \$.
   :Pattern    [:_* :Name \: :_* '(* (| \/ \-)) '(| \newline :$)]
   :Document   [:n* :BPM :n* :Samples :n* :Body :n* :$]})
;; sample.orc contains the example input from the question text
(pprint (a/pegasus :Document g (a/wrap-string (slurp "sample.orc"))))
;; output:
{:Document
[{:n*
({:Comment [\# (\space \T \e \s \t \space \L \o \o \p) \newline]}
{:Comment
[\#
(\space
\t
\h
\i
\s
\space
\i
\s
\space
\a
\space
\c
\o
\m
\m
\e
\n
\t)
\newline]}
\newline)}
{:BPM
[\B
\P
\M
\:
{:_* {:Whitespace \space}}
{:Number ({:Digit \1} {:Digit \0} {:Digit \0})}
\newline]}
{:n* \newline}
{:Samples
[\S
\a
\m
\p
\l
\e
\s
\:
\newline
({:SampleDef
[{:_* {:Whitespace \space}}
{:Name [\B \D]}
\:
{:_* {:Whitespace \space}}
{:File [\b (\d \. \w \a \v)]}
\newline]}
{:SampleDef
[{:_* {:Whitespace \space}}
{:Name [\S \D]}
\:
{:_* {:Whitespace \space}}
{:File [\s (\d \. \w \a \v)]}
\newline]}
{:SampleDef
[{:_* ()}
{:Name [\C (\H \H)]}
\:
{:_* {:Whitespace \space}}
{:File [\c (\h \h \. \w \a \v)]}
\newline]}
{:SampleDef
[{:_* ()}
{:Name [\C (\S \H)]}
\:
{:_* {:Whitespace \space}}
{:File [\c (\s \h \. \w \a \v)]}
\newline]})]}
{:n* \newline}
{:Body
[\B
\o
\d
\y
\:
\newline
({:Pattern
[{:_* {:Whitespace \space}}
{:Name [\B \D]}
\:
{:_* {:Whitespace \space}}
(\/ \- \- \- \/ \- \- \- \/ \- \- \- \/ \- \- \-)
\newline]}
{:Pattern
[{:_* {:Whitespace \space}}
{:Name [\S \D]}
\:
{:_* {:Whitespace \space}}
(\- \- \- \/ \- \- \/ \- \- \/ \- \/ \- \- \/ \-)
\newline]}
{:Pattern
[{:_* ()}
{:Name [\C (\H \H)]}
\:
{:_* {:Whitespace \space}}
(\- \- \/ \- \- \- \/ \- \- \- \/ \- \- \- \/ \-)
\newline]}
{:Pattern
[{:_* ()}
{:Name [\C (\S \H)]}
\:
{:_* {:Whitespace \space}}
(\/ \- \- \- \- \- \- \- \- \- \- \- \- \- \- \-)
\newline]})]}
{:n*
(\newline
{:Comment
[\#
(\space
\t
\h
\i
\s
\space
\i
\s
\space
\a
\n
\o
\t
\h
\e
\r
\space
\c
\o
\m
\m
\e
\n
\t)
\newline]})}
:$]}
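
The tree above is verbose because every character is its own element, and names longer than two characters come back nested (for example [\C (\H \H)]). If you want plain strings out of it, something along these lines might work. This is only a sketch in plain Clojure: chars->str, pattern->map and hit-indices are hypothetical helper names, not part of amotoen, and they assume a :Pattern node has exactly the shape shown in the output above.

```clojure
;; Hypothetical helpers (not part of amotoen). They assume a :Pattern node
;; looks like the ones in the pprint output above:
;;   {:Pattern [ws {:Name ...} \: ws (step chars) \newline]}

(defn chars->str
  "Concatenates every character found anywhere inside a parse-tree fragment,
  in document order, ignoring keywords and nesting."
  [node]
  (apply str (filter char? (tree-seq coll? seq node))))

(defn pattern->map
  "Turns one {:Pattern [...]} node into a map with the track name and its
  step string."
  [{[_ name-node _ _ steps _] :Pattern}]
  {:name  (chars->str name-node)
   :steps (chars->str steps)})

(defn hit-indices
  "Zero-based positions of the / characters in a step string."
  [steps]
  (vec (keep-indexed (fn [i c] (when (= c \/) i)) steps)))

;; A hand-written node in the same shape as the parser output.
(def example-pattern
  {:Pattern [{:_* ()}
             {:Name [\C '(\H \H)]}
             \:
             {:_* {:Whitespace \space}}
             '(\- \- \/ \- \- \- \/ \-)
             \newline]})

(pattern->map example-pattern)
;; => {:name "CHH", :steps "--/---/-"}

(hit-indices (:steps (pattern->map example-pattern)))
;; => [2 6]
```

Walking the fragment with tree-seq rather than indexing into it means the helper does not care whether a name arrives as [\B \D] or as [\C (\H \H)].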