|
Smarty
WARNING: All discussion is moving to https://reddit.com/r/smarty, please go there! This forum will be closing soon. |
|
View previous topic :: View next topic |
Author |
Message |
lcoffin Smarty Rookie
Joined: 05 Mar 2008 Posts: 5
|
Posted: Wed Mar 05, 2008 2:01 am Post subject: Regex bug in trimwhitespace |
|
|
There's a bug in the trimwhitespace pattern matching that only appears to cause a problem when the pattern matches a large block of text. And it may also only appear when PHP is compiled with newer versions of the PCRE library. I'm pretty sure it is invalid regular expression syntax in the trimwhitespace code so I think it qualifies as a Smarty bug, and not a PCRE bug.
Details of the error:
test_template.htm
Code: | <textarea rows="30">{$data}</textarea> |
guts of test.php
Code: | $smarty->load_filter('output', 'trimwhitespace');
$data = str_repeat('x', 100000);
$smarty->assign('data', $data);
$smarty->display('test_template.htm'); |
output
Code: | (blank -- zero bytes returned via Apache) |
Environment
Code: | PHP: 5.2.0
PCRE: 6.7 04-Jul-2006 |
I traced the problem down to the line in trimwhitespace:
Code: | $source = preg_replace("!<textarea[^>]+>.*?</textarea>!is",
'@@@SMARTY:TRIM:TEXTAREA@@@', $source); |
Before this line $source is not empty, after this line it is.
If $data is much shorter (10,000), everything works fine. The cutoff for me is anything that makes the matched text longer than 100,059 bytes.
Running this on an older version of PHP (with an older version of the PCRE) works fine too (PHP v5.1.6 w/ PCRE 6.6 06-Feb-2006).
The problem is with the '.*?' in the center of the RE. I'm pretty sure this in invalid regex syntax because you are applying two modifiers in a row without sub-pattern grouping -- the '.' says match any character, the '*' says match zero or more occurrences of the '.', and the '?' .... well, if it was '.?' it would mean match zero or one occurrences of the '.'.
I think '.*?' is supposed to semantically mean '(.*)?' which would mean zero or one occurrences of the sub-pattern '.*'. But the '?' in this case is actually redundant since all it is really saying is 'match zero or one occurrence of zero or more characters'. It's really the same as simply saying 'match any occurrence of zero or more characters' -- or simply '.*'.
So the correct code is:
Code: | $source = preg_replace("!<textarea[^>]+>.*</textarea>!is",
'@@@SMARTY:TRIM:TEXTAREA@@@', $source); |
With this change, the above example PHP/template files run correctly and output the <textarea> with the long data inside.
There are a few other places this same regex is used. A complete patch follows.
---Lawrence
Code: |
*** outputfilter.trimwhitespace.php.orig Tue Mar 4 20:08:40 2008
--- outputfilter.trimwhitespace.php.good Tue Mar 4 20:13:10 2008
***************
*** 28,48 ****
function smarty_outputfilter_trimwhitespace($source, &$smarty)
{
// Pull out the script blocks
! preg_match_all("!<script[^>]+>.*?</script>!is", $source, $match);
$_script_blocks = $match[0];
! $source = preg_replace("!<script[^>]+>.*?</script>!is",
'@@@SMARTY:TRIM:SCRIPT@@@', $source);
// Pull out the pre blocks
! preg_match_all("!<pre>.*?</pre>!is", $source, $match);
$_pre_blocks = $match[0];
! $source = preg_replace("!<pre>.*?</pre>!is",
'@@@SMARTY:TRIM:PRE@@@', $source);
// Pull out the textarea blocks
! preg_match_all("!<textarea[^>]+>.*?</textarea>!is", $source, $match);
$_textarea_blocks = $match[0];
! $source = preg_replace("!<textarea[^>]+>.*?</textarea>!is",
'@@@SMARTY:TRIM:TEXTAREA@@@', $source);
// remove all leading spaces, tabs and carriage returns NOT
--- 28,48 ----
function smarty_outputfilter_trimwhitespace($source, &$smarty)
{
// Pull out the script blocks
! preg_match_all("!<script[^>]+>.*</script>!is", $source, $match);
$_script_blocks = $match[0];
! $source = preg_replace("!<script[^>]+>.*</script>!is",
'@@@SMARTY:TRIM:SCRIPT@@@', $source);
// Pull out the pre blocks
! preg_match_all("!<pre>.*</pre>!is", $source, $match);
$_pre_blocks = $match[0];
! $source = preg_replace("!<pre>.*</pre>!is",
'@@@SMARTY:TRIM:PRE@@@', $source);
// Pull out the textarea blocks
! preg_match_all("!<textarea[^>]+>.*</textarea>!is", $source, $match);
$_textarea_blocks = $match[0];
! $source = preg_replace("!<textarea[^>]+>.*</textarea>!is",
'@@@SMARTY:TRIM:TEXTAREA@@@', $source);
// remove all leading spaces, tabs and carriage returns NOT
|
|
|
Back to top |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Wed Mar 05, 2008 3:09 am Post subject: |
|
|
fixed in SVN, thanks! |
|
Back to top |
|
messju Administrator
Joined: 16 Apr 2003 Posts: 3336 Location: Oldenburg, Germany
|
Posted: Wed Mar 05, 2008 8:43 am Post subject: |
|
|
wtf? .*? is not two modifiers.
It's an .* with greediness inverted (makes the * non-greedy in this case).
*NOW* trimwhitespace is broken. For example "<pre>.*</pre>" will now match the two outermost <pre>-tags and not a set of <pre>-tag-pairs. |
|
Back to top |
|
Celeb Administrator
Joined: 17 Apr 2007 Posts: 1025 Location: Vienna
|
Posted: Wed Mar 05, 2008 9:27 am Post subject: |
|
|
messju is right. .*? is nothing but a non-greedy .*
Non-greediness is quite important here though.
However, there's still the problem of obviously preg_replace reaching its limits. In my case the text matched by the .*? part of the pattern may not exceed 99.997 characters. If it does preg_replace returns an empty string without throwing any error message (error_reporting = E_ALL, nothing in the logs). _________________ Darn computers always do what I tell them to instead of what I want them to do. |
|
Back to top |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Wed Mar 05, 2008 1:47 pm Post subject: |
|
|
This change was reverted in SVN. |
|
Back to top |
|
lcoffin Smarty Rookie
Joined: 05 Mar 2008 Posts: 5
|
Posted: Fri Mar 21, 2008 2:18 pm Post subject: |
|
|
Ooops! My apologies! Forgot about the non-greedy modifiers...
Another simple, although not very elegant solution is to copy the original $source at the start of trimwhitespace, and if $source is empty at the end -- indicating a failed preg_replace -- simply return the unmodified copy. (Or just disabling trimwhitespace for pages you think *might* go over the limit.)
Any other thoughts?
---Lawrence |
|
Back to top |
|
Lemon Juice Smarty Pro
Joined: 24 May 2006 Posts: 109
|
Posted: Tue Apr 01, 2008 5:04 pm Post subject: |
|
|
There is a default limit imposed on preg functions since php 5.2.0, read this:
http://bugs.php.net/bug.php?id=42649
I have come across this limitation once and I didn't know what the problem was because the php docs don't mention this change. |
|
Back to top |
|
m0n5t3r Smarty n00b
Joined: 19 Nov 2008 Posts: 1
|
Posted: Wed Nov 19, 2008 7:19 pm Post subject: |
|
|
I ran into this issue today and solved it by replacing regex calls with a simple function that uses stripos; works for me just fine (it may have problems with bad HTML, but then so would the regex approach):
Code: |
Index: plugins/outputfilter.trimwhitespace.php
===================================================================
--- plugins/outputfilter.trimwhitespace.php (revision 8847)
+++ plugins/outputfilter.trimwhitespace.php (working copy)
@@ -25,25 +25,40 @@
* @param string
* @param Smarty
*/
+function smarty_outputfilter_trimwhitespace_findblocks($source, $tag, $repl) {
+ $repl_blocks = array();
+ $t_source = '';
+ $epos = 0;
+
+ while(($spos = stripos($source, "<$tag", $epos)) !== false) {
+ $old_epos = $epos;
+ $epos = stripos($source, "</$tag>", $spos);
+ if($epos !== false) {
+ $epos = $epos + strlen($tag) + 3;
+ $repl_blocks[] = substr($source, $spos, $epos - $spos);
+ $t_source .= substr($source, $old_epos, $spos - $old_epos)
+ . $repl;
+ }
+ }
+
+ $t_source .= substr($source, $epos);
+
+ return array($repl_blocks, $t_source);
+}
+
function smarty_outputfilter_trimwhitespace($source, &$smarty)
{
// Pull out the script blocks
- preg_match_all("!<script[^>]+>.*?</script>!is", $source, $match);
- $_script_blocks = $match[0];
- $source = preg_replace("!<script[^>]+>.*?</script>!is",
- '@@@SMARTY:TRIM:SCRIPT@@@', $source);
+ list($_script_blocks, $source) = smarty_outputfilter_trimwhitespace_findblocks($source,
+ 'script', '@@@SMARTY:TRIM:SCRIPT@@@');
// Pull out the pre blocks
- preg_match_all("!<pre>.*?</pre>!is", $source, $match);
- $_pre_blocks = $match[0];
- $source = preg_replace("!<pre>.*?</pre>!is",
- '@@@SMARTY:TRIM:PRE@@@', $source);
+ list($_pre_blocks, $source) = smarty_outputfilter_trimwhitespace_findblocks($source,
+ 'pre', '@@@SMARTY:TRIM:PRE@@@');
// Pull out the textarea blocks
- preg_match_all("!<textarea[^>]+>.*?</textarea>!is", $source, $match);
- $_textarea_blocks = $match[0];
- $source = preg_replace("!<textarea[^>]+>.*?</textarea>!is",
- '@@@SMARTY:TRIM:TEXTAREA@@@', $source);
+ list($_textarea_blocks, $source) = smarty_outputfilter_trimwhitespace_findblocks($source,
+ 'textarea', '@@@SMARTY:TRIM:TEXTAREA@@@');
// remove all leading spaces, tabs and carriage returns NOT
// preceeded by a php close tag.
|
|
|
Back to top |
|
dabilou Smarty n00b
Joined: 23 Jul 2011 Posts: 3
|
Posted: Sat Jul 23, 2011 9:49 am Post subject: |
|
|
Hi
I have same problem due to php 5.2 limitation and no way to change it.
What about disabling this plugin ? Is there some bad effect ? |
|
Back to top |
|
U.Tews Administrator
Joined: 22 Nov 2006 Posts: 5068 Location: Hamburg / Germany
|
Posted: Sun Jul 24, 2011 3:32 am Post subject: |
|
|
Have you tried to increase the backtrack limit?
ini_set('pcre.backtrack_limit', 20000000); |
|
Back to top |
|
dabilou Smarty n00b
Joined: 23 Jul 2011 Posts: 3
|
Posted: Sun Jul 24, 2011 8:59 am Post subject: |
|
|
I cannot change the php.ini due to web hosting service limitation (it is shared hosting).
For the moment I turned off this filter.
Does the only purpose of this plugin is to save bandewith ? |
|
Back to top |
|
mohrt Administrator
Joined: 16 Apr 2003 Posts: 7368 Location: Lincoln Nebraska, USA
|
Posted: Sun Jul 24, 2011 2:00 pm Post subject: |
|
|
dabilou wrote: | I cannot change the php.ini due to web hosting service limitation (it is shared hosting).
For the moment I turned off this filter.
Does the only purpose of this plugin is to save bandewith ? |
It is to save bandwidth, pretty much. You can set the limit with set_ini(), you don't have to edit the php.ini file itself. |
|
Back to top |
|
dabilou Smarty n00b
Joined: 23 Jul 2011 Posts: 3
|
Posted: Sun Jul 24, 2011 2:23 pm Post subject: |
|
|
Its ok, thanks a lot ! |
|
Back to top |
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|
|