Question

I have the following line of code. It is meant to break a long sentence into pieces of at most a specified character length, breaking only on word boundaries:

print "$_\n" for grep substr($_, 0, 80), /(.{1,80}\b)/g;

When it is applied to the following line:

So-called smartguns have been in the works for years but haven’t gotten much traction.

It does not print the '.' after the word "traction". What can I do to fix it? Thanks.

Solution

The goal for the regex is to match 80 or fewer characters and end on a non-space character. The match must be followed by either whitespace or the end of the string, and any following whitespace is then consumed so that the next line starts on a non-space character. A web-provided description can be found here.

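use strict;
use warnings;

while (<DATA>) {
    chomp(my $line = $_);

    # Capture up to 80 characters ending on a non-space, followed by
    # whitespace or end of string; the trailing \s* swallows the gap.
    print "$_\n" for $line =~ /(.{1,80})(?<=\S)(?=\s|$)\s*/g;
}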

__DATA__
So-called smartguns have been in the works for years but haven't gotten much traction.

You can also just rely on the Text::Wrap module from CPAN.
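
For reference, a minimal sketch of the Text::Wrap route. It assumes the same 80-character limit as the question; per the module's documentation, output lines are at most $Text::Wrap::columns - 1 characters long, hence the value 81. The variable names are only illustrative.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Lines come out at most ($columns - 1) characters long, so 81 allows 80.
$Text::Wrap::columns = 81;

my $line = "So-called smartguns have been in the works for years but haven't gotten much traction.";

# The first two arguments are the indent for the first line and for later lines.
my $wrapped = wrap('', '', $line);

print "$wrapped\n";
```

This avoids hand-rolling the regex, and the trailing period stays attached to the last word.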

OTHER TIPS

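The '.' at the end of the line can never be matched by your regular expression, so /(.{1,80}\b)/g does not return it. \b requires a word boundary at the end of each match, and there is no word boundary between the trailing '.' and the end of the string.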
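
A small sketch that makes the problem visible, using the sentence from the question (with an ASCII apostrophe). The brackets in the output are only there to show where each chunk starts and ends.

```perl
use strict;
use warnings;

my $line = "So-called smartguns have been in the works for years but haven't gotten much traction.";

# Original pattern: every chunk has to end at a word boundary (\b).
my @chunks = $line =~ /(.{1,80}\b)/g;

# Brackets make leading and trailing whitespace visible.
print "[$_]\n" for @chunks;
```

The last chunk stops at "traction", because the only word boundary near the end of the string sits between the "n" and the ".". The first chunk also keeps its trailing space.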

It seems to me that what you're really trying to do is break lines on spaces. With your existing regex you could conceivably split a contraction (haven\n't, for example), along with any number of other corner cases you haven't considered.

Maybe

/(.{1,80}(?:\s|$))/g

would suit your needs better.
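
A sketch of that pattern applied to the sentence from the question. Note that each chunk includes the whitespace character it broke on, so a chunk can be up to 81 characters long before trimming; stripping trailing whitespace is an extra step added here, not part of the suggestion above.

```perl
use strict;
use warnings;

my $line = "So-called smartguns have been in the works for years but haven't gotten much traction.";

# Each chunk is up to 80 characters followed by one whitespace character
# or the end of the string.
my @chunks = $line =~ /(.{1,80}(?:\s|$))/g;

# Drop the whitespace that each chunk broke on.
s/\s+$// for @chunks;

print "$_\n" for @chunks;
```

With this pattern the final chunk is "traction." with the period intact.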

This should do it:

(.{1,80}\b[^a-zA-Z0-9])

example: http://regex101.com/r/oQ3hX1

code:

print "$_\n" for grep substr($_, 0, 80), /(.{1,80}\b[^a-zA-Z0-9])/g;
Licensed under: CC-BY-SA with attribution