Parsing trouble with amotoen

https://stackoverflow.com/questions/11970318

26-06-2021

Question

I'm trying to write a grammar for a simple language that describes drum loops, using Clojure and amotoen. The language looks like this:

# Test Loop
# this is a comment

BPM: 100

Samples:
 BD: bd.wav
 SD: sd.wav
CHH: chh.wav
CSH: csh.wav

Body:
 BD: /---/---/---/---
 SD: ---/--/--/-/--/-
CHH: --/---/---/---/-
CSH: /---------------

# this is another comment

I've defined the grammar as follows:

(def g {
        :Whitespace '(| \space \newline \tab \, :Comment)
        :_* '(* :Whitespace)
        :_ [:Whitespace '(* :Whitespace)]
        :Comment [\# '(* (% \newline)) \newline]
        :BPM [\B \P \M \: :_* '(* :Number) \newline]
        :Number '(* :Digit)
        :Digit (a/lpegs '| "0123456789")
        :Samples [\S \a \m \p \l \e \s \: \newline '(* :SampleDef)]
        :SampleDef [:_* :Name \: :_* :File \newline]
        :Name '(* (% \:))
        :File '(* (% \newline))
        :Body [\B \o \d \y \: \newline '(* :Pattern)]
        :Pattern [:_* :Name \: :_* '(* (| \/ \-)) '(| \newline \$)]
        :Document [:_* :BPM :_* :Samples :_* :Body :_* \$]
       })

When I call pegasus on each part of a sample file individually, each part is parsed correctly. For example:

(pprint
  (a/pegasus
    :Body
    g
    (a/wrap-string
      "Body:\n
        BD: /---/---/---/---\n
        SD: ---/--/--/-/--/-\n
       CHH: --/---/---/---/-\n
       CSH: /---------------\n")))

However, when I call (pprint (a/pegasus :Document g (a/wrap-string (slurp "sample.orc")))), all I get is nil. The same happens if I replace (a/wrap-string (slurp "sample.orc")) with a string literal holding the contents of sample.orc.

So, my question is: can anyone spot what's wrong with my grammar? I'm all out of ideas, and I've been staring at it for a few days now. I'm sure it's something embarrassingly simple, but I just can't see it!

Thanks in advance.

Solution

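There are three problems. First, the :Samples rule consumes the :Body section: :Whitespace includes \newline and :Name matches any run of non-colon characters, so '(* :SampleDef) keeps matching past the blank line and treats Body: and the patterns after it as further sample definitions. Second, :File (and :Name) can match the empty string. Third, end-of-input should be marked by :$, not \$, which is just the literal character $. Here's an amended grammar that takes \newline out of :Whitespace, adds an :n* rule for the newlines and comments between sections, and requires at least one character in :Name and :File: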

(def g
  {
   :Whitespace '(| \space \tab \, :Comment)
   :n* '(* (| \newline :Comment))
   :_* '(* :Whitespace)
   :_ [:Whitespace '(* :Whitespace)]
   :Comment [\# '(* (% \newline)) \newline]
   :BPM [\B \P \M \: :_* '(* :Number) \newline]
   :Number '(* :Digit)
   :Digit (a/lpegs '| "0123456789")
   :Samples [\S \a \m \p \l \e \s \: \newline '(* :SampleDef)]
   :SampleDef [:_* :Name \: :_* :File \newline]
   :Name '[(% \:) (* (% \:))]
   :File '[(% \newline) (* (% \newline))]
   :Body [\B \o \d \y \: \newline '(* :Pattern)]
   :Pattern [:_* :Name \: :_* '(* (| \/ \-)) '(| \newline :$)]
   :Document [:n* :BPM :n* :Samples :n* :Body :n* :$]
   })

;; sample.orc contains the example input from the question text
(pprint (a/pegasus :Document g (a/wrap-string (slurp "sample.orc"))))

;; output:
{:Document
 [{:n*
   ({:Comment [\# (\space \T \e \s \t \space \L \o \o \p) \newline]}
    {:Comment
     [\#
      (\space
       \t
       \h
       \i
       \s
       \space
       \i
       \s
       \space
       \a
       \space
       \c
       \o
       \m
       \m
       \e
       \n
       \t)
      \newline]}
    \newline)}
  {:BPM
   [\B
    \P
    \M
    \:
    {:_* {:Whitespace \space}}
    {:Number ({:Digit \1} {:Digit \0} {:Digit \0})}
    \newline]}
  {:n* \newline}
  {:Samples
   [\S
    \a
    \m
    \p
    \l
    \e
    \s
    \:
    \newline
    ({:SampleDef
      [{:_* {:Whitespace \space}}
       {:Name [\B \D]}
       \:
       {:_* {:Whitespace \space}}
       {:File [\b (\d \. \w \a \v)]}
       \newline]}
     {:SampleDef
      [{:_* {:Whitespace \space}}
       {:Name [\S \D]}
       \:
       {:_* {:Whitespace \space}}
       {:File [\s (\d \. \w \a \v)]}
       \newline]}
     {:SampleDef
      [{:_* ()}
       {:Name [\C (\H \H)]}
       \:
       {:_* {:Whitespace \space}}
       {:File [\c (\h \h \. \w \a \v)]}
       \newline]}
     {:SampleDef
      [{:_* ()}
       {:Name [\C (\S \H)]}
       \:
       {:_* {:Whitespace \space}}
       {:File [\c (\s \h \. \w \a \v)]}
       \newline]})]}
  {:n* \newline}
  {:Body
   [\B
    \o
    \d
    \y
    \:
    \newline
    ({:Pattern
      [{:_* {:Whitespace \space}}
       {:Name [\B \D]}
       \:
       {:_* {:Whitespace \space}}
       (\/ \- \- \- \/ \- \- \- \/ \- \- \- \/ \- \- \-)
       \newline]}
     {:Pattern
      [{:_* {:Whitespace \space}}
       {:Name [\S \D]}
       \:
       {:_* {:Whitespace \space}}
       (\- \- \- \/ \- \- \/ \- \- \/ \- \/ \- \- \/ \-)
       \newline]}
     {:Pattern
      [{:_* ()}
       {:Name [\C (\H \H)]}
       \:
       {:_* {:Whitespace \space}}
       (\- \- \/ \- \- \- \/ \- \- \- \/ \- \- \- \/ \-)
       \newline]}
     {:Pattern
      [{:_* ()}
       {:Name [\C (\S \H)]}
       \:
       {:_* {:Whitespace \space}}
       (\/ \- \- \- \- \- \- \- \- \- \- \- \- \- \- \-)
       \newline]})]}
  {:n*
   (\newline
    {:Comment
     [\#
      (\space
       \t
       \h
       \i
       \s
       \space
       \i
       \s
       \space
       \a
       \n
       \o
       \t
       \h
       \e
       \r
       \space
       \c
       \o
       \m
       \m
       \e
       \n
       \t)
      \newline]})}
  :$]}
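
To see why the original :SampleDef swallows the Body: header, it can help to model the rule with plain regular expressions. The sketch below is only an approximation and does not use amotoen at all: the two regexes are hand-written stand-ins for the original and amended :SampleDef rules (comments inside whitespace are ignored), and the names original-sample-def, amended-sample-def, after-samples and name-and-file are made up for this illustration.

```clojure
;; Regex stand-ins for the PEG rules; these are NOT amotoen calls.
;; Original :SampleDef, where newline counts as whitespace and
;; :Name / :File may be empty.
(def original-sample-def #"^[ \n\t,]*([^:]*):[ \n\t,]*([^\n]*)\n")

;; Amended :SampleDef: newline is no longer whitespace, and
;; :Name and :File each need at least one character.
(def amended-sample-def #"^[ \t,]*([^:]+):[ \t,]*([^\n]+)\n")

;; The text that follows the last sample definition in the input.
(def after-samples "\nBody:\n BD: /---/---/---/---\n")

(defn name-and-file
  "Returns the captured name and file if re matches at the start of s, else nil."
  [re s]
  (when-let [[_ nm file] (re-find re s)]
    {:name nm :file file}))

(println (name-and-file original-sample-def after-samples))
(println (name-and-file amended-sample-def after-samples))
(println (name-and-file amended-sample-def " BD: bd.wav\n"))
```

The first call returns a match with "Body" as the name and the whole BD pattern line as the file, which is how the Body section ends up inside :Samples. The second call returns nil, because after "Body:" the next character is a newline and the amended :File needs at least one non-newline character. The third call shows that a real sample line still matches.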

Licensed under: CC-BY-SA with attribution
