boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 6:18 pm Post subject: modifier capitalize doesn't work properly for quoted text |
|
|
Given:
Code: | {assign var=test value='"i am some text"'}
{$test|capitalize} |
Produces:
Code: | "i Am Some Text" |
Expected:
Code: | "I Am Some Text" |
The following works as expected:
Code: | {assign var=test value="i am some text"}
{$test|capitalize} |
Discussion:
capitalize is based solely on ucwords(). There are suggestions on the ucwords() manual page for more sophisticated uppercase handling. |
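For readers who want to reproduce the report outside a template, here is a minimal sketch in plain PHP. It assumes only that the modifier passes its input straight to ucwords(), as the discussion states; the variable names are illustrative.

```php
<?php
// ucwords() treats only whitespace as a word delimiter by default,
// so a leading quote counts as the first character of the first word.
$quoted   = ucwords('"i am some text"');
$unquoted = ucwords('i am some text');

echo $quoted, "\n";   // "i Am Some Text"
echo $unquoted, "\n"; // I Am Some Text
```

The first word stays lowercase in the quoted case because its first character is the quote, not the letter.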
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 6:34 pm Post subject: |
|
|
fixed in CVS, please test. |
|
boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 6:53 pm Post subject: |
|
|
That's a nice solution, Monte
I have "words" like abc123 (and abc123def) that I do not want capitalized. The following change to your regex gives the desired result.
!\b[a-z]*\b! |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 6:58 pm Post subject: |
|
|
So only words with no punctuation would get capitalized? What about words with hyphens? Also a "+" would be more appropriate than "*" in this case, and I would need to capture only the first character, or apply ucfirst() on the result. |
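A small sketch of the "+" versus "*" point, using preg_match_all() on a sample string. The sample text and variable names are assumptions for illustration, not part of the committed code.

```php
<?php
$text = 'abc123 abc123def non-disclosure';

// With "*" the pattern can match the empty string at a word boundary.
preg_match_all('!\b[a-z]*\b!', $text, $star);

// With "+" at least one lowercase letter is required.
preg_match_all('!\b[a-z]+\b!', $text, $plus);

var_dump(in_array('', $star[0], true)); // bool(true)
print_r($plus[0]);                      // non, disclosure
```

With "+", abc123 and abc123def are skipped entirely because there is no word boundary between a letter and a digit, while the hyphen splits non-disclosure into two matches.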
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 7:09 pm Post subject: |
|
|
I guess I need to know the exact expected behavior, example:
non-disclosure
How should that be handled?
non-disclosure
Non-disclosure
Non-Disclosure |
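To make the three candidates concrete, this sketch shows one way each result could be produced in plain PHP. The variable names are hypothetical, and the third variant is only one possible reading of "each segment capitalized".

```php
<?php
$word = 'non-disclosure';

// Option 1: leave the word alone.
$unchanged = $word;

// Option 2: uppercase only the first character of the whole string.
$first_only = ucfirst($word);

// Option 3: uppercase the first letter after every word boundary,
// which treats the hyphen as a separator.
$each_segment = preg_replace_callback(
    '/\b[a-z]/',
    function (array $m): string {
        return strtoupper($m[0]);
    },
    $word
);

echo $unchanged, "\n";    // non-disclosure
echo $first_only, "\n";   // Non-disclosure
echo $each_segment, "\n"; // Non-Disclosure
```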
|
boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 7:16 pm Post subject: |
|
|
Good points and my simple fix was borked anyhow. Here's my solution:
Code: | return preg_replace_callback('/\b[a-z]+([.!?]|\b)/', create_function('$_x', 'return strtoupper($_x[0][0]).substr($_x[0],1);'), $string); |
Here's my simple test case:
Code: | {'"I 1am the best test123abc non-disclosure."'|capitalize} |
My expected results:
Quote: | "I 1am The Best test123abc Non-Disclosure." |
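The test case can also be run outside Smarty. The sketch below keeps the pattern and callback body from the post but wraps them in a hypothetical function, capitalize_sketch, and uses an anonymous function in place of create_function(), which is an assumption about the PHP version in use (create_function() is not available in current PHP releases).

```php
<?php
// Hypothetical wrapper name; the pattern and callback logic come from the post.
function capitalize_sketch(string $string): string
{
    return preg_replace_callback(
        '/\b[a-z]+([.!?]|\b)/',
        function (array $m): string {
            // Uppercase the first byte of the match, keep the rest as is.
            return strtoupper($m[0][0]) . substr($m[0], 1);
        },
        $string
    );
}

$result = capitalize_sketch('"I 1am the best test123abc non-disclosure."');
echo $result, "\n"; // "I 1am The Best test123abc Non-Disclosure."
```

Words that start with or contain a digit are never matched, because the pattern requires a run of lowercase letters bounded by a word boundary or sentence punctuation.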
|
|
messju Administrator
Joined: 16 Apr 2003 Posts: 3336 Location: Oldenburg, Germany
|
Posted: Mon Aug 23, 2004 8:05 pm Post subject: |
|
|
just to throw in my 2c:
'\b[a-z]' is not the proper match for the first character of a word, IMHO. you are missing all accented characters for western languages, for example.
I'd match for '\b\w' instead. preg_match's \w is documented to match word characters according to the current locale and strtoupper() is documented to uppercase characters according to the current locale. so this should work consistently.
there is a little performance drawback, though, since \b\w also matches digits and already uppercase characters. but i think this is acceptable for the sake of i18n.
regarding performance: i suggest storing the replace function name in a static var and creating the function only once, not on every call to capitalize.
and at last:
i don't think it's wise to generally ignore words containing digits for capitalize:
- if the sentence is "his login is messju72" i want it to be capitalized to "His Login Is Messju72"
- it makes capitalize complicated (IMHO unnecessarily, but boots' mileage seems to vary on that).
greetings
messju |
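A rough sketch of the '\b\w' approach with the callback created only once. The function name capitalize_any_word is hypothetical, and the example sticks to ASCII input; how accented characters behave depends on the locale, the string encoding and the PHP version, which this sketch does not try to cover.

```php
<?php
// Hypothetical helper name; not the Smarty plugin itself.
function capitalize_any_word(string $string): string
{
    // Build the callback once and reuse it on later calls.
    static $callback = null;
    if ($callback === null) {
        $callback = function (array $m): string {
            return strtoupper($m[0]);
        };
    }

    // \b\w matches the first word character of each word, digits included.
    return preg_replace_callback('/\b\w/', $callback, $string);
}

echo capitalize_any_word('his login is messju72'), "\n"; // His Login Is Messju72
echo capitalize_any_word('1am the best'), "\n";          // 1am The Best
```

Digits and letters that are already uppercase are still passed to the callback, which is the small performance cost mentioned in the post.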
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 8:20 pm Post subject: |
|
|
So is this any closer to what we want, or is there still a problem with ucfirst() and special chars?
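Code: | function smarty_modifier_capitalize($string)
{
    // Run every word (a run of \w characters) through the callback.
    return preg_replace_callback('!\b\w+\b!', 'smarty_modifier_capitalize_ucfirst', $string);
}

function smarty_modifier_capitalize_ucfirst($string)
{
    // $string is the match array from preg_replace_callback; $string[0] is the matched word.
    if (!preg_match('!\d!', $string[0])) {
        return ucfirst($string[0]);
    } else {
        // Leave words that contain a digit untouched.
        return $string[0];
    }
} |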
|
boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 8:23 pm Post subject: |
|
|
hmm, always good points, messju
I suppose that if we don't ignore numbers in text then \b\w makes sense. More to that point, given messju's example: "his login is messju72" => "His Login Is Messju72", I disagree with this interpretation, since the login most assuredly is not 'Messju72' but rather the pedantic 'messju72'. Not that I mind much, since I can roll my own capitalize easily enough if that is indeed the expected behaviour.
I agree that a real function is indeed better but when I read Monte's solution I thought it looked so elegant as a lambda that I didn't give it a second thought. |
|
boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 8:26 pm Post subject: |
|
|
@monte: works great for me |
|
messju Administrator
Joined: 16 Apr 2003 Posts: 3336 Location: Oldenburg, Germany
|
Posted: Mon Aug 23, 2004 8:30 pm Post subject: |
|
|
boots wrote: | given messju's example: "his login is messju72" => "His Login Is Messju72", I disagree with this interpretation, since the login most assuredly is not 'Messju72' but rather the pedantic 'messju72'. |
okay, then take "he is using a t3-connection". |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 8:46 pm Post subject: |
|
|
It is not possible to determine the context of every word, so a generalization must be made.
I vote that we skip upper-casing words with digits by default, but allow a modifier argument that can change this behavior.
{$name|capitalize} {* skips words with numbers *}
{$name|capitalize:true} {* don't skip words with numbers *}
In either case punctuation is considered as a word boundary, so hyphenated words get each segment capitalized separately (?) |
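One way the proposed argument could look in plain PHP is sketched below. The function name capitalize_with_flag and the parameter name $uc_digits are hypothetical, and the closure with `use` assumes a PHP version that supports anonymous functions; it is not the code that was committed.

```php
<?php
// Hypothetical function and parameter names, sketching the proposed argument.
function capitalize_with_flag(string $string, bool $uc_digits = false): string
{
    return preg_replace_callback(
        '/\b\w+\b/',
        function (array $m) use ($uc_digits): string {
            // Skip words containing a digit unless the flag is set.
            if (!$uc_digits && preg_match('/\d/', $m[0])) {
                return $m[0];
            }
            return ucfirst($m[0]);
        },
        $string
    );
}

echo capitalize_with_flag('he is using a t3-connection'), "\n";       // He Is Using A t3-Connection
echo capitalize_with_flag('he is using a t3-connection', true), "\n"; // He Is Using A T3-Connection
```

Because the hyphen is a word boundary, "t3" and "connection" are handled as separate words in both modes.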
|
messju Administrator
Joined: 16 Apr 2003 Posts: 3336 Location: Oldenburg, Germany
|
Posted: Mon Aug 23, 2004 8:51 pm Post subject: |
|
|
mohrt wrote: | I vote that we skip upper-casing words with digits by default, but allow a modifier argument that can change this behavior.
{$name|capitalize} {* skips words with numbers *}
{$name|capitalize:true} {* don't skip words with numbers *}
|
sounds good to me.
Quote: | In either case punctuation is considered as a word boundary, so hyphenated words get each segment capitalized separately (?) |
also sounds good to me. |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Mon Aug 23, 2004 9:09 pm Post subject: |
|
|
ok committed, comments encouraged. I wasn't sure of the best way to handle the parameter into the callback function, so I used a constant. |
|
boots Administrator
Joined: 16 Apr 2003 Posts: 5611 Location: Toronto, Canada
|
Posted: Mon Aug 23, 2004 9:23 pm Post subject: |
|
|
I don't think that will work, since you cannot redefine constants.
You may need a global in this case, though that is also nasty. Perhaps the best thing is to be less tricky and use preg_match_all instead of preg_replace_callback.
Off-topic: here is one more case where it is unfortunate that &$smarty isn't sent to modifiers. I propose a new plugin type, 'context-modifiers', which are always passed $smarty as the first param. |
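The preg_match_all suggestion might look roughly like the sketch below, where the flag is an ordinary parameter and no constant, global or callback is needed. The function name and the $uc_digits parameter are hypothetical, and the in-place byte replacement assumes single-byte characters.

```php
<?php
// Hypothetical sketch of a callback-free variant; names are illustrative.
function capitalize_without_callback(string $string, bool $uc_digits = false): string
{
    // Collect every word together with its byte offset in the string.
    preg_match_all('/\b\w+\b/', $string, $matches, PREG_OFFSET_CAPTURE);

    foreach ($matches[0] as [$word, $offset]) {
        // Skip words containing a digit unless the flag is set.
        if ($uc_digits || !preg_match('/\d/', $word)) {
            // Replacing one byte with one byte keeps later offsets valid.
            $string[$offset] = strtoupper($word[0]);
        }
    }

    return $string;
}

echo capitalize_without_callback('his login is messju72'), "\n";       // His Login Is messju72
echo capitalize_without_callback('his login is messju72', true), "\n"; // His Login Is Messju72
```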
|
|