Smarty

Spuerhund · Smarty Rookie Joined: 20 Jan 2005 Posts: 16

This is the complete code of the modifier.truncate.php function (copied from the latest Smarty version 2.6.14):

[php:1:6218fff286]function smarty_modifier_truncate($string, $length = 80, $etc = '...',
                                  $break_words = false, $middle = false)
{
    if ($length == 0)
        return '';

    if (strlen($string) > $length) {
        $length -= strlen($etc);
        if (!$break_words && !$middle) {
            $string = preg_replace('/\s+?(\S+)?$/', '', substr($string, 0, $length+1));
        }
        if(!$middle) {
            return substr($string, 0, $length).$etc;
        } else {
            return substr($string, 0, $length/2) . $etc . substr($string, -$length/2);
        }
    } else {
        return $string;
    }
}[/php:1:6218fff286]

There are several things to improve; some people (including me) would rather call them "bugs":

1) It should be mentioned somewhere that this plugin does NOT work with Unicode text! The manual at http://smarty.php.net/manual/en/language.modifier.truncate.php gives no such hint. If you want this code to work correctly with Unicode (UTF-8, for example), you have to change "strlen" to "mb_strlen" and "substr" to "mb_substr". Finally, you should add the "u" modifier to the preg_replace regexp.
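The changes described in point 1 might look roughly like the sketch below. This is not an official Smarty plugin: the function name `mb_truncate_sketch` and the extra `$charset` parameter are assumptions made for illustration, and it needs the mbstring extension. It also casts `$length / 2` to an integer, which is a small departure from the posted code.

```php
<?php
// Hypothetical multibyte-aware variant of the posted truncate modifier.
// The name and the $charset parameter are illustrative assumptions,
// not part of Smarty. Requires the mbstring extension.
function mb_truncate_sketch($string, $length = 80, $etc = '...',
                            $break_words = false, $middle = false,
                            $charset = 'UTF-8')
{
    if ($length == 0) {
        return '';
    }

    // Count characters, not bytes.
    if (mb_strlen($string, $charset) > $length) {
        $length -= mb_strlen($etc, $charset);
        if (!$break_words && !$middle) {
            // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
            $string = preg_replace('/\s+?(\S+)?$/u', '',
                                   mb_substr($string, 0, $length + 1, $charset));
        }
        if (!$middle) {
            return mb_substr($string, 0, $length, $charset) . $etc;
        }
        // Integer half-length, so no fractional offsets reach mb_substr.
        $half = (int) ($length / 2);
        return mb_substr($string, 0, $half, $charset) . $etc
             . mb_substr($string, -$half, $half, $charset);
    }

    return $string;
}
```

Called as `mb_truncate_sketch('äöü äöü äöü', 8)`, this cuts at the word boundary and appends the default `...` without splitting any multibyte character.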

mb_* will of course only work if the PHP mbstring extension is available; that can easily be checked by

boots

Hi and thanks for taking the time to write.

bugmenot · Smarty n00b Joined: 31 May 2006 Posts: 1

"\s+?" does the same as "\s" (at least in this case, because it's not limited at the left).

? does not only mean {0,1}; when it follows a quantifier such as + or *, it switches that quantifier from greedy to lazy.
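A quick way to check how the variants behave is to run them side by side. The sample strings below are made up for illustration. With a single space before the cut-off word all three patterns remove the same tail. With a run of spaces, the lazy `\s+?` still matches the same text as a greedy `\s+`, because the match has to reach `$` either way, while a bare `\s` leaves the earlier spaces in place.

```php
<?php
// Hypothetical sample input: three spaces before the last, cut-off word.
$subject = "one two   thr";

$lazy   = preg_replace('/\s+?(\S+)?$/', '', $subject); // pattern used in truncate
$greedy = preg_replace('/\s+(\S+)?$/',  '', $subject);
$plain  = preg_replace('/\s(\S+)?$/',   '', $subject);
// $lazy and $greedy are both "one two"; $plain keeps two trailing spaces.

// With a single space all three variants agree.
$single_lazy  = preg_replace('/\s+?(\S+)?$/', '', "one two thr");
$single_plain = preg_replace('/\s(\S+)?$/',   '', "one two thr");

// The greediness switch on its own, without an anchor forcing the match:
preg_match('/a+/',  'aaa', $g); // greedy: $g[0] is "aaa"
preg_match('/a+?/', 'aaa', $l); // lazy:   $l[0] is "a"
```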

And why do you need Unicode support in PHP if you can just write the numeric entity? That could be used even in ASCII. I think it actually should be written as an entity to keep it (almost) independent of the charset in use.

boots · Posted: Wed May 31, 2006 3:46 pm

Okay, I'll concede the \s+? issue. It really doesn't seem pertinent enough to make a change, though.

As for using an entity, I think that is not the way to go. Now I'll refer to Wikipedia:

White Tiger · Smarty n00b Joined: 11 Sep 2008 Posts: 2

I ran into the same truncate with UNICODE problem. I'm aware that this topic is 2 years old, and I've found at least 3 others discussing this question. I would like to respond to your 'when PHP supports UNICODE out-of-the-box' remark.

- I write my PHP in UTF-8 (the editor saves all .php files as UNICODE). Works fine.
- I use MySQL for data storage, completely in UTF-8. Works perfectly.
- I display my HTML in UTF-8 (using <head><meta http-equiv="content-type" content="text/html; charset=utf-8"> </head>). Works superbly.

Now I do not have to take care of string formats anymore: I read something from MySQL into a PHP variable and send it to the HTML. I think this is quite a common configuration nowadays.

You are right that for UTF-8 string manipulation in PHP I have to use the mb_ functions, but then what is truncate for? The very basis of Smarty's philosophy is to take over ALL of the formatting problems. If I have to do it in PHP, then the whole idea is defeated: I cannot get the formatting problems out of my PHP code.
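To make the problem concrete, here is a small sketch of what byte-based functions do to a UTF-8 string. The sample string is made up, and the snippet assumes the mbstring extension is loaded and the source file is saved as UTF-8.

```php
<?php
// Hypothetical sample; "é" occupies two bytes in UTF-8.
$text = 'héllo';

$bytes = strlen($text);              // counts bytes: 6
$chars = mb_strlen($text, 'UTF-8');  // counts characters: 5

// Cutting after two *bytes* splits "é" in the middle.
$cut_bytes = substr($text, 0, 2);
$cut_chars = mb_substr($text, 0, 2, 'UTF-8'); // "hé"

// The byte-based cut is no longer valid UTF-8.
$still_valid = mb_check_encoding($cut_bytes, 'UTF-8'); // false
```

This is why the stock truncate modifier can end a string with a broken character when the cut point falls inside a multibyte sequence.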

I am not demanding anything or blaming anyone, and I am very thankful to Smarty for all the help it has been in my development. But this problem is just disastrous for my multilingual project in the long run. I would very much appreciate official mb_ versions of the string manipulation plugins. Until then I will use one of the patches offered here in the forum.

And you really _have to_ make a remark in the documentation that your string manipulation is not UNICODE-ready. That is nothing to be ashamed of, just useful information for all developers.

Notromda · Smarty Rookie Joined: 30 Aug 2004 Posts: 13

If the Smarty plugins don't support UTF-8 because PHP doesn't, then let's be consistent and offer official _mb_ functions that do the equivalent, just as PHP does.

I found this to be extremely useful, an mb_truncate filter:

http://www.guyrutenberg.com/2007/12/04/multibyte-string-truncate-modifier-for-smarty-mb_truncate/

While I'd rather have Smarty know what encoding is in use and automatically do the right thing, I can live with alternate function names. But the option at least needs to be there for everyone to use.

nothinghood · Smarty Rookie Joined: 20 Aug 2009 Posts: 5

The