Advice on implementing a simple regular expression (for BBCode / GeSHi parsing)

https://stackoverflow.com/questions/4207058

25-09-2019

Question

I built a personal note-taking application in PHP so I could store and organize my notes, and I wanted a nice, simple format to write them in.

I had done it in Markdown, but found it a bit confusing and there was no simple syntax highlighting, so I turned to BBCode, which I had used before, and decided to implement that.

Now, GeSHi (the syntax highlighter), which I really want to integrate, requires only the simplest code, like this:

$geshi = new GeSHi($sourcecode, $language);
$geshi->parse_code();

That is the easy part, but what I want to do is allow my BBCode to call it.

My current regular expression to match a made-up [syntax=cpp][/syntax] BBCode is the following:

preg_replace('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi(\\2,\\1)????', $text);

As you can see, I capture the language and the content, but how on earth would I connect it to the GeSHi code?

preg_replace seems to be able to replace the match only with a string, not an 'expression', and I am not sure how to use those two lines of GeSHi code there with the captured data.

I am really excited about this project and want to get past this.

Solution

I wrote this class a while back; the reason for it was to allow easy customization and parsing. Maybe a little overkill, but it works well and I needed the overkill for my application. Usage is pretty simple:

$geshiH = new Geshi_Helper();
$text = $geshiH->geshi($text); // this assumes that the text should be parsed (ie inline syntaxes)

---- OR ----

$geshiH = new Geshi_Helper();
$text = $geshiH->geshi($text, $lang);  // assumes that you have the language, good for a snippets deal

I had to chop out some other custom items I had, but barring syntax errors from the chopping, it should work. Feel free to use it.

<?php

require_once 'Geshi/geshi.php';

class Geshi_Helper 
{
    /**
     * @var array Array of matches from the code block.
     */
    private $_codeMatches = array();

    private $_token = "";

    private $_count = 1;

    public function __construct()
    {
        /* Generate a unique hash token for replacement */
        $this->_token = md5(time() . rand(9999,9999999));
    }

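    /**
     * Performs syntax highlights using geshi library to the content.
     *
     * @param string $content - The content to parse
     * @param string|null $lang - Language of $content when it is a single code block; null to parse inline [syntax] tags
     * @return string Syntax Highlighted content
     */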
    public function geshi($content, $lang=null)
    {
        if (!is_null($lang)) {
            /* Given the returned results 0 is not set, adding the "" should make this compatible */
            $content = $this->_highlightSyntax(array("", strtolower($lang), $content));
        }else {
            /* Need to replace this prior to the code replace for nobbc */
            /* preg_replace_callback is used because the /e modifier was removed in PHP 7 */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                return '[nobbc]' . strtr($m[1], array('[' => '&#91;', ']' => '&#93;', ':' => '&#58;', '@' => '&#64;')) . '[/nobbc]';
            }, $content);

            /* For multiple content we have to handle the br's, hence the replacement filters */
            $content = $this->_preFilter($content);

            /* Reverse the nobbc markup */
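            /* The keys are &amp;-escaped as in the original code, presumably for an HTML-escaping filter that ran in between */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                return strtr($m[1], array('&amp;#91;' => '[', '&amp;#93;' => ']', '&amp;#58;' => ':', '&amp;#64;' => '@'));
            }, $content);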

            $content = $this->_postFilter($content);
        }

        return $content;
    }

    /**
     * Highlights a single code block using the geshi library.
     * Also used as the preg_replace_callback handler for _postFilter().
     *
     * @param array $contentArray - Either array("", language, code), or a
     *                              placeholder match array(full match, index)
     * @return string Syntax Highlighted content
     * @todo Add any extra / customization styling here.
     */
    private function _highlightSyntax($contentArray)
    {
        /* If the count is 2 we are working with the filter */
        if (count($contentArray) == 2) {
            $contentArray = $this->_codeMatches[$contentArray[1]];
        }

        /* for default [syntax] */
        if ($contentArray[1] == "")
            $contentArray[1] = "php";

        /* Grab the language */
        $language = (isset($contentArray[1]))?$contentArray[1]:'text';

        /* Remove leading spaces to avoid problems */
        $content = ltrim($contentArray[2]);

        /* Parse the code to be highlighted */
        $geshi = new GeSHi($content, strtolower($language));
        return $geshi->parse_code();
    }

    /**
     * Substitute the code blocks for formatting to be done without
     * messing up the code.
     *
     * @param array $match - Match array passed in by preg_replace_callback
     * @return string Placeholder that stands in for the code block
     */
    private function _substitute($match)
    {
        $index = sprintf("%02d", $this->_count++);
        $this->_codeMatches[$index] = $match;
        return "----" . $this->_token . $index . "----";
    }

    /**
     * Removes the code from the rest of the content to apply other filters.
     *
     * @param string $content - The content to filter out the code lines
     * @return string Content with code removed.
     */
    private function _preFilter($content)
    {
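        /* No U modifier here: combined with .*? it would make the groups greedy and merge adjacent blocks */
        return preg_replace_callback("#\s*\[syntax=(.*?)\](.*?)\[/syntax\]\s*#si", array($this, "_substitute"), $content);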
    }

    /**
     * Replaces the code after the filters have run.
     *
     * @param string $content - The content to replace the code lines
     * @return string Content with code re-applied.
     */
    private function _postFilter($content)
    {
        /* using dashes to prevent the old filtered tag being escaped */
        /* {2,} rather than {2} so that more than 99 code blocks still match */
        return preg_replace_callback("/----\s*" . $this->_token . "(\d{2,})\s*----/si", array($this, "_highlightSyntax"), $content);
    }
}
?>
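
The core idea of the class (pull the code blocks out, leave placeholders, run the other filters, then put the highlighted code back) can be tried without GeSHi installed. The following is only a minimal sketch under assumptions: fake_highlight() is a hypothetical stand-in for GeSHi, render_note() is a made-up name, and the [b] tag stands in for whatever other BBCode filters you run.

```php
<?php
// Hypothetical stand-in for GeSHi so the sketch runs without the library.
function fake_highlight($code, $lang)
{
    return '<pre class="' . htmlspecialchars($lang) . '">' . htmlspecialchars($code) . '</pre>';
}

// Hypothetical helper: placeholder substitution around a toy BBCode filter.
function render_note($text)
{
    $blocks = array();
    $token  = md5(uniqid('', true));

    // 1. Pull each [syntax=...] block out and leave a placeholder behind.
    $text = preg_replace_callback(
        '#\[syntax=(.*?)\](.*?)\[/syntax\]#si',
        function ($m) use (&$blocks, $token) {
            $blocks[] = $m;
            return '----' . $token . (count($blocks) - 1) . '----';
        },
        $text
    );

    // 2. Other filters can now run without touching the code (here: a toy [b] tag).
    $text = preg_replace('#\[b\](.*?)\[/b\]#si', '<b>$1</b>', $text);

    // 3. Swap each placeholder for its highlighted block.
    return preg_replace_callback(
        '/----' . $token . '(\d+)----/',
        function ($m) use ($blocks) {
            $block = $blocks[(int) $m[1]];
            $lang  = $block[1] === '' ? 'php' : strtolower($block[1]);
            return fake_highlight(ltrim($block[2]), $lang);
        },
        $text
    );
}

echo render_note('[b]Note[/b] [syntax=PHP]echo "[b]hi[/b]";[/syntax]'), "\n";
```

The [b] inside the code block survives untouched because it was hidden behind the placeholder while the other filter ran.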

Other suggestions

It seems to me you already have the regex right. Your problem is the invocation, so I suggest writing a wrapper function:

function geshi($src, $l) {
    $geshi = new GeSHi($src, $l);
    // parse_code() returns the highlighted HTML as a string
    return $geshi->parse_code();
}

Now, this would normally suffice, but the source code is likely to contain single or double quotes itself. Therefore you cannot write preg_replace(".../e", "geshi('$2','$1')", ...) as you would need to. (Note that '$1' and '$2' need quotes because preg_replace just substitutes the $1, $2 placeholders as text, and the result has to be valid inline PHP code.) The /e modifier was also deprecated in PHP 5.5 and removed in PHP 7.0.

That is why you need to use preg_replace_callback: it avoids the escaping problems of executing code in the replacement. For example:

preg_replace_callback('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi_replace', $text);

I would write a second wrapper for the callback, but you can also combine it with the original code:

function geshi_replace($uu) {
    return geshi($uu[2], $uu[1]);
}
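
To see that quotes in the captured source pass through the callback unharmed, here is a small self-contained sketch. It assumes nothing about GeSHi: highlight_stub() and syntax_callback() are hypothetical names, and the stub only wraps the code in a tag where the real geshi() wrapper would highlight it.

```php
<?php
// Hypothetical stand-in for the geshi() wrapper, so this runs without GeSHi.
function highlight_stub($src, $lang)
{
    return '<code data-lang="' . htmlspecialchars($lang) . '">'
        . htmlspecialchars($src, ENT_NOQUOTES) . '</code>';
}

// The callback receives the match array: $m[1] is the language, $m[2] the raw source.
function syntax_callback(array $m)
{
    return highlight_stub($m[2], $m[1]);
}

$text = "Try [syntax=php]echo 'a' . \"b\";[/syntax] now.";
$html = preg_replace_callback('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', 'syntax_callback', $text);

echo $html, "\n";
// prints: Try <code data-lang="php">echo 'a' . "b";</code> now.
```

Because the captures arrive as ordinary PHP strings, no quoting or escaping of the matched source is needed before handing it to the highlighter.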

Use preg_match:

// preg_match returns 1 or 0; the captured groups are written to the third argument
preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $match);
$geshi = new GeSHi($match[2], $match[1]);
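
Note that preg_match stops at the first block. If a note can contain several [syntax] blocks, preg_match_all with PREG_SET_ORDER collects all of them. This is just a sketch with a made-up sample string; it prints the captures where the GeSHi call would go.

```php
<?php
// Sample input with two code blocks (made up for illustration).
$text = '[syntax=cpp]int x;[/syntax] and [syntax=php]$y = 1;[/syntax]';

// PREG_SET_ORDER yields one array per block: [full match, language, source].
$count = preg_match_all('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // With GeSHi loaded, new GeSHi($m[2], $m[1]) would go here.
    printf("%s: %s\n", $m[1], $m[2]);
}
```

This only extracts the blocks; to substitute the highlighted result back into the text, preg_replace_callback remains the more direct tool.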

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow