Preserving the failure mode of a parser in Parsec

Question

I believe I found a solution. It's not especially nice, but seems to works. At least something to start with:

{-# LANGUAGE FlexibleContexts #-}
import Control.Applicative hiding (many, (<|>))
import Control.Monad (void)
import Control.Monad.Trans (lift)
import Control.Monad.Trans.Maybe
import Text.Parsec hiding (optional)
import Text.Parsec.Char
import Text.Parsec.String

rcomb ::  (Stream s m t) => ParsecT s u m a -> ParsecT s u m b -> ParsecT s u m b
rcomb p q = ((test $ opt p *> opt q) <|> pure (Just ()))
                >>= maybe empty (\_ -> p *> q)
  where
    -- | Converts failure to @MaybeT Nothing@:
    opt = MaybeT . optional -- optional from Control.Applicative!
    -- | Tests running a parser, returns Nothing if parsers failed consuming no
    -- input, Just () otherwise.
    test = lookAhead . try . runMaybeT . void

This is the r combinator you're asking for. The idea is that we first execute the parsers in a "test" run (using lookAhead . try) and if any of them fails without consuming input, we record it as Nothing inside MaybeT. This is accomplished by opt, it converts a failure to Nothing and wraps it into MaybeT. Thanks to MaybeT, if opt p returns Nothing, opt q is skipped.

If both p and q succeed, the test .. part returns Just (). And if one of them consumes input, the whole test .. fails. This way, we distinguish the 3 possibilities:

Failure with some input consumed by p or q.
Failure such that the failing part doesn't consume input.
Success.

After <|> pure (Just ()) both 1. and 3. result in Just (), while 2. results in Nothing. Finally, the maybe part converts Nothing into a non-consuming failure, and Just () into running the parsers again, now without any protection. This means that 1. fails again with consuming some input, and 3. succeeds.

Testing:

samples =
    [ "xyz" -- (accept)
    , "xyzP" -- (accept; P remains unparsed)
    , "xyzPz" -- (accept; Pz remains unparsed)
    , "xyzPx" -- (accept; Px remains unparsed; q failed but did not consume any input)
    , "xyzPxy" -- (reject; parser q consumed xy but failed)
    , "xyzPxyz" -- (accept)
    ]

main = do
    -- Runs a parser and then accept anything, which shows what's left in the
    -- input buffer:
    let run p x = runP ((,) <$> p <*> many anyChar) () x x

    let p, q :: Parser String
        p = string "P"
        q = (++) <$> try (string "xy") <*> string "z"

    let parser = show <$> ((:) <$> q <*> many (rcomb p q))
    mapM_ (print . run parser) samples

Preserving the failure mode of a parser in Parsec

Problem statement

Rationale

Addendum: Example