Truncate a multibyte String to n chars
Question
I am trying to get this method in a String Filter working:
public function truncate($string, $chars = 50, $terminator = ' …');
I'd expect this
$in = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWYXZ1234567890";
$out = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUV …";
and also this
$in = "âãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝ";
$out = "âãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđ …";
That is, $chars minus the number of characters in the $terminator string.
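That budget is easy to get wrong, because the default terminator itself contains a multibyte character. A minimal check of the arithmetic, assuming the source file is saved as UTF-8 and the mbstring extension is loaded; the variable names are only illustrative:

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$chars = 50;
$terminator = ' …';   // a space plus U+2026 HORIZONTAL ELLIPSIS

// strlen() counts bytes: 1 for the space + 3 for the ellipsis in UTF-8.
$byteLength = strlen($terminator);              // 4

// mb_strlen() counts characters when told the encoding.
$charLength = mb_strlen($terminator, 'UTF-8');  // 2

// Characters of the input that may be kept (hypothetical name).
$budget = $chars - $charLength;                 // 48
```

With strlen() the budget would come out as 46 instead of 48, which is one way the str* functions produce results that look off by a few characters.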
In addition, the filter is supposed to cut at the last word boundary below the $chars limit, e.g.
$in = "Answer to the Ultimate Question of Life, the Universe, and Everything.";
$out = "Answer to the Ultimate Question of Life, the …";
I am pretty certain this should work with these steps
- subtract the number of chars in the terminator from the maximum chars
- validate that string is longer than the calculated limit or return it unaltered
- find the last space character in string below calculated limit to get word boundary
- cut string at last space or calculated limit if no last space is found
- append terminator to string
- return string
However, I have tried various combinations of str* and mb_* functions now, but all yielded wrong results. This can't be so difficult, so I am obviously missing something. Would someone share a working implementation for this or point me to a resource where I can finally understand how to do it?
Thanks
P.S. Yes, I have checked https://stackoverflow.com/search?q=truncate+string+php before :)
Solution
Try this:
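function truncate($string, $chars = 50, $terminator = ' …') {
    $cutPos = $chars - mb_strlen($terminator);
    // Strings within the calculated limit are returned unaltered.
    if (mb_strlen($string) <= $cutPos) {
        return $string;
    }
    // Last space at or before the cut position, if any.
    $boundaryPos = mb_strrpos(mb_substr($string, 0, $cutPos + 1), ' ');
    return mb_substr($string, 0, $boundaryPos === false ? $cutPos : $boundaryPos) . $terminator;
}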
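But you need to make sure that your internal encoding is properly set, e.g. with mb_internal_encoding('UTF-8'), or pass the encoding explicitly as the last argument of each mb_* call.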
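To see why the encoding matters, here is a small sketch comparing the byte-based and the multibyte functions on the same input. It assumes UTF-8 input and the mbstring extension; the variable names are made up for the illustration:

```php
<?php
// Assumes UTF-8 input and the mbstring extension.
mb_internal_encoding('UTF-8');   // default encoding for later mb_* calls

$in = 'âãäåæçèéêë';              // 10 characters, 20 bytes in UTF-8

$byteCount = strlen($in);        // counts bytes
$charCount = mb_strlen($in);     // counts characters, using the internal encoding

$byteCut = substr($in, 0, 5);    // 5 bytes: stops in the middle of 'ä'
$charCut = mb_substr($in, 0, 5); // 5 characters: 'âãäåæ'

// mb_check_encoding() reports whether a string is valid in the given encoding.
$byteCutIsValid = mb_check_encoding($byteCut, 'UTF-8');
$charCutIsValid = mb_check_encoding($charCut, 'UTF-8');
```

The byte-based cut leaves a broken trailing sequence, which usually shows up as a replacement character in the output, while the mb_* cut stays valid.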
OTHER TIPS
Just found out PHP already has a multibyte truncate: mb_strimwidth ("Get truncated string with specified width"). It doesn't obey word boundaries though. But handy nonetheless!
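A short usage sketch, assuming UTF-8 input and the mbstring extension. Note that the width argument covers the whole result including the trim marker, and that mb_strimwidth measures display width rather than character count, so East Asian full-width characters count as 2. An ASCII marker is used here to keep the widths unambiguous:

```php
<?php
// Assumes UTF-8 input and the mbstring extension.
// Arguments: string, start offset, total width, trim marker, encoding.
// The total width includes the trim marker.
$ascii = mb_strimwidth('Hello World', 0, 10, '...', 'UTF-8');

// Accented Latin letters have width 1, so 5 of them fit next to '...'.
$accented = mb_strimwidth('âãäåæçèéêë', 0, 8, '...', 'UTF-8');

// A string that already fits is returned without the trim marker.
$short = mb_strimwidth('Hello', 0, 10, '...', 'UTF-8');
```

Here $ascii is 'Hello W...', $accented is 'âãäåæ...' and $short is 'Hello'. A word-boundary search would still have to be applied on top of this.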
I don't usually like to just code an entire answer to a question like this. But also I just woke up, and I thought maybe your question would get me in a good mood to go program for the rest of the day.
I didn't try to run this, but it should work or at least get you 90% of the way there.
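function truncate( $string, $chars = 50, $terminate = ' ...' )
{
    $chars -= mb_strlen($terminate);
    if ( $chars <= 0 )
        return $terminate;
    // Strings within the limit are returned unaltered.
    if ( mb_strlen($string) <= $chars )
        return $string;
    $string = mb_substr($string, 0, $chars);
    $space = mb_strrpos($string, ' ');
    // No space, or only one in the first half: cut hard at the limit
    // instead of dropping more than half of the text.
    if ( $space === false || $space < mb_strlen($string) / 2 )
        return $string . $terminate;
    else
        return mb_substr($string, 0, $space) . $terminate;
}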