Textarea input - how to deal with paragraphs and headings?

https://stackoverflow.com/questions/3941908

30-09-2019

Question

I currently use the following expression to wrap textarea input in paragraph tags before storing it in a MySQL database.

$inputText = str_replace('<p></p>', '', '<p>' . preg_replace('#([\r\n]\s*?[\r\n]){2,}#', '</p>$0<p>', $inputText) . '</p>');

This works well, except when I want to use header tags, which then end up surrounded by unwanted paragraph tags:

<p><h3>Test Header</h3></p>

While this displays as expected, it does not validate, because a p element cannot contain a heading.

Can anyone suggest an improved expression or method that catches header tags and applies the paragraph tags only to actual paragraphs? Alternatively, is there an expression I can apply to my input before the one I'm currently using to produce the same effect?

As a side note, I would like to be able to enter stand-alone hyperlink 'a' tags and still have them surrounded with paragraph tags as before.

I have considered that it may just be easier to manually edit the details after they are entered into the database to remove the unwanted paragraph tags.

Solution

I use this function from WordPress. It wraps paragraphs in p tags and converts line breaks while preserving existing HTML:

function wpautop($pee, $br = 1) {
    $pee = $pee . "\n"; // just to make things a little easier, pad the end
    $pee = preg_replace('|<br />\s*<br />|', "\n\n", $pee);
    // Space things out a little
    $allblocks = '(?:table|thead|tfoot|caption|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|select|form|map|area|blockquote|address|math|style|input|p|h[1-6]|hr)';
    $pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee);
    $pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
    $pee = str_replace(array("\r\n", "\r"), "\n", $pee); // cross-platform newlines
    $pee = preg_replace("/\n\n+/", "\n\n", $pee); // take care of duplicates
    $pee = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $pee); // make paragraphs, including one at the end
    $pee = preg_replace('|<p>\s*?</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
    $pee = preg_replace('!<p>([^<]+)\s*?(</(?:div|address|form)[^>]*>)!', "<p>$1</p>$2", $pee);
    $pee = preg_replace( '|<p>|', "$1<p>", $pee ); // effectively a no-op: the pattern has no capture group, so $1 is empty
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee); // don't pee all over a tag
    $pee = preg_replace("|<p>(<li.+?)</p>|", "$1", $pee); // problem with nested lists
    $pee = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $pee);
    $pee = str_replace('</blockquote></p>', '</p></blockquote>', $pee);
    $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', "$1", $pee);
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee);
    if ($br) {
        $pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee); // optionally make line breaks
    }
    $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*<br />!', "$1", $pee);
    $pee = preg_replace('!<br />(\s*</?(?:p|li|div|dl|dd|dt|th|pre|td|ul|ol)[^>]*>)!', '$1', $pee);
    $pee = preg_replace( "|\n</p>$|", '</p>', $pee );
    return $pee;
}
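
If the full function is more than you need, the core idea can be sketched in a few lines: split the input on blank lines and wrap a chunk only when it does not already start with a block-level tag. This is a minimal sketch, not the WordPress implementation; the name wrapParagraphs and the short tag list are assumptions made for illustration, and nested or multi-element chunks are not handled.

```php
<?php
// Hypothetical helper (not part of WordPress): wrap blank-line-separated
// chunks in <p> tags unless the chunk already starts with a block-level tag.
function wrapParagraphs(string $text): string
{
    // Normalise Windows and old Mac line endings to "\n".
    $text = str_replace(["\r\n", "\r"], "\n", trim($text));

    // Split on blank lines; a line holding only whitespace counts as blank.
    $chunks = preg_split('/\n\s*\n/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $out = [];
    foreach ($chunks as $chunk) {
        $chunk = trim($chunk);
        if ($chunk === '') {
            continue;
        }
        // Assumed list of block-level tags to leave alone; extend as needed.
        // Inline tags such as <a> are not listed, so they still get wrapped.
        if (preg_match('/^<(?:h[1-6]|ul|ol|pre|blockquote|table|div|p)\b/i', $chunk)) {
            $out[] = $chunk;
        } else {
            $out[] = '<p>' . $chunk . '</p>';
        }
    }

    return implode("\n", $out);
}

echo wrapParagraphs("<h3>Test Header</h3>\n\nFirst paragraph.\n\n<a href=\"#\">A link</a>");
```

With this input the heading is left untouched, while the plain paragraph and the stand-alone link are each wrapped in p tags, which matches the behaviour asked for in the question.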

OTHER TIPS

You can use the strip_tags function like this:

<?php
$text = '<p><h3>Test Header</h3></p>';
echo strip_tags($text);
echo "\n";

// Allow <p> and <h3>
echo strip_tags($text, '<p><h3>');
?>

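The first call prints the bare text, Test Header. The second call allows both p and h3 tags, so it prints the input unchanged; to drop the wrapping paragraph, pass only '<h3>' as the allowed tags. Note that this also removes every other p tag in the text.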
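
Because strip_tags removes a tag everywhere in the string, a narrower option is a replacement that only unwraps paragraphs containing nothing but a single heading. The sketch below could be run after the expression from the question; the function name unwrapHeadings is hypothetical, and it assumes the paragraph tags have no attributes.

```php
<?php
// Hypothetical helper: remove <p>...</p> when it wraps only one heading.
function unwrapHeadings(string $html): string
{
    // \2 makes the closing tag match the same heading level as the opening tag.
    return preg_replace(
        '#<p>\s*(<h([1-6])\b[^>]*>.*?</h\2>)\s*</p>#is',
        '$1',
        $html
    );
}

echo unwrapHeadings('<p><h3>Test Header</h3></p><p>Body text.</p>');
```

Ordinary paragraphs, including ones that contain only a link, do not match the pattern and keep their p tags.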

Licensed under: CC-BY-SA with attribution
