Question

I have the following specs

  it "parses a document with only an expression" do
    puts parser.document.should parse("[b]Hello World[/b]")
  end
  it "parses a document with only text" do
    puts parser.document.should parse(" Hello World")
  end
  it "parses a document with both an expression and text" do
    puts parser.document.should parse("[b]Hello World[/b] Yes hello")
  end

For the following Parslet Parser

class Parser < Parslet::Parser

rule(:open_tag) do
  parslet = str('[')
  parslet = parslet >> (str(']').absent? >> match("[a-zA-Z]")).repeat(1).as(:open_tag_name)
  parslet = parslet >> str(']')
  parslet
end

rule(:close_tag) do
  parslet = str('[/')
  parslet = parslet >> (str(']').absent? >> match("[a-zA-Z]")).repeat(1).as(:close_tag_name)
  parslet = parslet >> str(']')
  parslet
end

rule(:text) { any.repeat(1).as(:text) }

rule(:expression) do
  # [b]Hello World[/b]
  # open tag, any text up until closing tag, closing tag
  open_tag.present?
  close_tag.present?
  parslet = open_tag >> match("[a-zA-Z\s?]").repeat(1).as(:enclosed_text) >> close_tag
  parslet
end

rule(:document) do
  expression | text
end

end

The first two tests pass just fine, and I can see by printing the results to the command line that the atoms are of the correct type. However, when I try to parse a document with both an expression and plain text, it fails to parse the plain text with the following error:

Parslet::UnconsumedInput: Don't know what to do with " Yes hello" at line 1 char 19.

I think I'm missing something in how I define the :document rule. What I want is something that will consume any number of expressions and runs of plain text in sequence. The rule I have consumes each kind of atom individually, but using both in the same string causes a failure.


Solution 2

For your document rule you want to use repeat:

rule(:document) do
  (expression | text).repeat
end

You’ll also need to change your text rule; currently if it starts matching it will consume everything including any [ that should start a new expression. Something like this should work:

rule(:text) { match['^\['].repeat(1).as(:text) }

OTHER TIPS

What you were looking for is something like this...

require 'parslet'

class ExampleParser < Parslet::Parser
  rule(:open_tag) do
    str('[') >> 
      match["a-zA-Z"].repeat(1).as(:open_tag_name) >>
    str(']')
  end

The open_tag rule doesn't need to exclude the ']' character as the match only allows letters.

  rule(:close_tag) do
    str('[/') >> 
      match["a-zA-Z"].repeat(1).as(:close_tag_name) >>
    str(']')
  end

The same applies here.

  rule(:text) do 
    (open_tag.absent? >> 
      close_tag.absent? >> 
        any).repeat(1).as(:text) 
  end

If you exclude the open and close tags here, you know you are only dealing with text. Note: I like this technique of using "any" once you have excluded the things you don't want, but bear it in mind if you refactor later, as your exclusion list may need to grow. Note 2: You could simplify this further, as below:

  rule(:text) do 
    (str('[').absent? >> any).repeat(1).as(:text) 
  end

...if you don't want any opening square brackets in your text at all.

  rule(:expression) do
    # [b]Hello World[/b]
    open_tag >> text.as(:enclosed_text) >> close_tag
  end

This becomes much simpler as the text can't include a close_tag

  rule(:document) do
    (expression | text).repeat
  end

I've added in the repeat you missed (as pointed out by matt)

end

require 'rspec'
require 'parslet/rig/rspec'

describe 'example' do
  let(:parser) { ExampleParser.new }
  context 'document' do
    it "parses a document with only an expression" do
      parser.document.should parse("[b]Hello World[/b]")
    end
    it "parses a document with only text" do
      parser.document.should parse(" Hello World")
    end
    it "parses a document with both an expression and text" do
      parser.document.should parse("[b]Hello World[/b] Yes hello")
    end
  end
end


RSpec::Core::Runner.run([])

Hope this gives you some tips on using Parslet. :)

Licensed under: CC-BY-SA with attribution