How to PARSE a sequence of items where items not in blocks are handled like single element blocks?

StackOverflow https://stackoverflow.com/questions/21644187

  •  08-10-2022

Question

I've got a situation where I want an equivalence, such that:

[
    {Foo}
    http://example.com/some/stuff.html
    separator
]

...is handled just as if you had written:

[
    [{Foo}]
    [http://example.com/some/stuff.html]
    [separator]
]

Adding a little to the complexity is that if you put the item in a block, then it can have arguments:

[
    [{Foo} /some-refinement]
    [http://example.com/some/stuff.html {stuff caption} 3]
    [separator dashed-line]
]

I'd like a PARSE-based engine that can run the same handler for {Foo}, [{Foo}], and [{Foo} /some-refinement] (let's call it STRING-HANDLER), and have it merely invoked with the right number of parameters.

To write this without PARSE is easy... a single element is wrapped in a temporary block (in the case it's not a block). Then the first item is tested in a CASE statement. But I'd like to convert this to be PARSE-based, where one branch uses INTO while another does not, without repeating code.
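For reference, the non-PARSE version described above might look roughly like the sketch below. PROCESS-ITEM and the bodies of the handlers are hypothetical, made up purely for illustration, and nesting is not handled:

```rebol
; Hypothetical sketch: only the name STRING-HANDLER comes from the question.
string-handler: func [str [string!] args [block!]] [
    print ["string:" mold str "args:" mold args]
]

process-item: func [item /local first-item] [
    ; wrap a lone item in a temporary block so both forms look alike
    if not block? item [item: reduce [item]]
    first-item: first item
    ; dispatch on the type of the first item; the rest are its arguments
    case [
        string? first-item [string-handler first-item next item]
        url? first-item [print ["url:" first-item "args:" mold next item]]
        word? first-item [print ["word:" first-item "args:" mold next item]]
    ]
]

foreach item [
    {Foo}
    [{Foo} /some-refinement]
    http://example.com/some/stuff.html
    [separator dashed-line]
] [process-item item]
```

The handler receives the same kind of arguments whether the item was bare or wrapped; a bare item simply arrives with an empty argument block.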

It will need to support nesting, so you might wind up processing something like:

[http://example.com/some/stuff.html [{Foo} /some-refinement] 3]

Solution

I hope the following can be the basis for your solution.

The following behaves exactly the same in both R2 and R3. PARSE's INTO operation is VERY different between the two, so I added a simple guard, [.here.: block! :.here.], which works around different bug situations on both platforms.
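To show the guard in isolation, here is a minimal sketch (the names rule, here and value, and the sample data, are made up for this example). The set-word marks the current position, block! checks the type of the next item, and the get-word moves the input back to the mark, so nothing has been consumed yet when the paren runs:

```rebol
rule: [
    some [
        here: block! :here      ; look ahead: is the next item a block?
        (print ["block of length" length? first here])
        skip                    ; only now actually consume the block
        | set value skip (print ["single item:" mold value])
    ]
]

parse [{Foo} [a b] separator] rule
```

Because the input is rewound before anything is consumed, whatever follows the paren is free to either enter the block or step over it.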

I used hook functions, which cleanly separate the data browsing from the data evaluation. If you look closely, you will notice that the =enter-block?= rule is global and that the code which switches its meaning is set up BEFORE running the emit-value function, so in some cases you might actually want to use emit-value to set up a different rule.
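The rule-switching idea can be shown on its own with a small sketch. The names =next= and rule and the sample data are assumptions for illustration only. A paren reassigns a global rule word, and PARSE only looks the word up when it reaches it, so the paren decides what has to come next:

```rebol
=next=: none

rule: [
    some [
        set value integer!
        ; choose the rule for the following item based on what was just read
        (=next=: either even? value [[word!]] [[string!]])
        =next=
    ]
]

print parse [2 a 1 "b"] rule
```

A hook function called from the paren could make the same assignment, which is what allows the evaluation side to steer the traversal.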

Note that I'm not assuming any kind of known structure, as your explanation seems to be meant for unstructured datasets.

Also note that test B is set up as a string, so we can use the wrapper directly on string input data:

rebol [
    author: "Maxim Olivier-Adlhoch"
    date: 2014-02-08
    license: "public domain"
]


A: [
    [{Foo}]
    [http://example.com/some/stuff.html]
    [separator]
]


B: {[
    {Foo}
    http://example.com/some/stuff.html
    separator
]}


C: [
    [{Foo} /some-refinement]
    [http://example.com/some/stuff.html {stuff caption} 3]
    [separator dashed-line]
]


D: [http://example.com/some/stuff.html [{Foo} /some-refinement] 3]


depth: ""
enter-block: func [][
    prin depth 
    print "[" 
    append depth "^-"
]

quit-block: func [][
    remove depth 
    prin depth 
    print "]"
]

emit-value: func [value][
    prin depth 
    probe value
]

=enter-block?=: none

=block=: [
    (
        =enter-block?=: [into =block=] ; we enter blocks by default
        enter-block
    )
    some [
        .here.: block! :.here. ; only enter blocks (R3/R2 compatible)
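        (
            .value.: first .here.
            either 1 = length? .value. [
                =enter-block?=: [skip]           ; single-element block: step over it and emit its item
                emit-value first .value.
            ][
                =enter-block?=: [into =block=]   ; reset, so an earlier [skip] does not stick
            ]
        )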
        =enter-block?=
        | set .value. skip ( emit-value .value. )
    ]
    (quit-block)
]

STRING-HANDLER: func [data][
    if string? data [
        data: load data
    ]

    parse data =block=
]

STRING-HANDLER A
STRING-HANDLER B
STRING-HANDLER C
STRING-HANDLER D

ask "press enter to quit ..."
Licensed under: CC-BY-SA with attribution