XML (sitemap) bug [RESOLVED - not Smarty problem]
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sat Sep 23, 2017 3:34 pm    Post subject: XML (sitemap) bug [RESOLVED - not Smarty problem]

NOT A SMARTY PROBLEM, SEE BELOW

I have this code:

Code:
{!if $IsIndex !}<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{!for $foo=1 to $MapCount!}
   <sitemap>
      <loc>http://www.mysite.com/sitemap{!$foo!}.xml</loc>
   </sitemap>
{!/for!}
</sitemapindex>
{!else!}blabla...


It results in this:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<head/><sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.mysite.com/sitemap1.xml</loc>
   </sitemap>
[etc...]
</sitemapindex>


Note the strange <head/> tag, which obviously breaks the XML. Google gives me an error: "Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead." (though I imagine it probably extracts the URIs anyway).

Reading around it seems it might be related to caching, and that for example various Wordpress caching plugins also cause this problem (I am not using any but I am using Smarty's native caching - anyway, the problem appears even with Smarty caching turned off). But what's the solution? I even tried a dirty fix like:

Code:
$sitemap = $smarty->fetch('sitemap.tpl', 'sitemap.xml');
echo preg_replace('/<head\/>/', '', $sitemap);


And that didn't seem to fix it either, which is particularly confusing, though I am still trying to work out whether I screwed something up there.

Any clues?

P.S. Yeah, I should probably just use native PHP functions to create XML, but I just like the ease of using Smarty, plus I can cache it for a while that way - the search engines really hammer my sitemaps, and they each have a thousand pages so there is a bit of overhead saved there.


Last edited by markowe on Sun Sep 24, 2017 3:45 pm; edited 1 time in total
bsmither
Smarty Elite


Joined: 20 Dec 2011
Posts: 322
Location: West Coast

Posted: Sat Sep 23, 2017 7:47 pm

Can you find the compiled version of this template?

The compiled template has a lot of actual PHP in it, and you may be able to determine if <head/> is appearing there, and where it may be getting sourced from.
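
One way to run that check, as a rough sketch: scan the compile directory for the stray markup. The helper name find_files_containing() is hypothetical, and so is whatever directory you pass to it; in Smarty 3 the real path can be read with $smarty->getCompileDir().

```php
<?php
// Hypothetical helper: list files under $dir whose contents include $needle.
function find_files_containing($dir, $needle)
{
    $hits = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $contents = file_get_contents($file->getPathname());
        // Skip unreadable files; record the path when the needle is present
        if ($contents !== false && strpos($contents, $needle) !== false) {
            $hits[] = $file->getPathname();
        }
    }
    sort($hits);
    return $hits;
}

// Possible usage (assumes an existing $smarty instance):
// print_r(find_files_containing($smarty->getCompileDir(), '<head'));
```

An empty result suggests the tag is being added somewhere after Smarty has produced its output.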
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sat Sep 23, 2017 8:21 pm

bsmither wrote:
Can you find the compiled version of this template?

The compiled template has a lot of actual PHP in it, and you may be able to determine if <head/> is appearing there, and where it may be getting sourced from.


Oh, yes - I looked at that, this is the offending part:

Code:
<?php if ($_valid && !is_callable('content_59c6bfc7bb12b1_36081863')) {function content_59c6bfc7bb12b1_36081863($_smarty_tpl) {?><?php if ($_smarty_tpl->tpl_vars['IsIndex']->value){?><?php echo '<?xml';?> version="1.0" encoding="UTF-8"<?php echo '?>';?>

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">


I wonder if that newline is causing the problem... I don't have that in my template, as you can see in my OP.

Also, weirdly, when I turn on PHP errors, suddenly I get this:

Code:
<head/><br/>
<b>Notice</b>:  The called constructor method for WP_Widget in storem_recent_widget is <strong>deprecated</strong> since version 4.3.0! Use <pre>__construct()</pre> instead. in <b>/opt/lampstack-5.6.31-0/apache2/htdocs/blabla/wp-includes/functions.php</b> on line <b>3894</b><br/>
<?xml version="1.0" encoding="UTF-8"?>


Not bothered about the warning, that's something I have to fix, but weirdly the <head/> tag is now BEFORE the php warning, which makes me think maybe Smarty is not the cause, maybe it's apache doing something weird... Turned off a few things like mod_deflate, but it's not that...
bsmither
Smarty Elite


Joined: 20 Dec 2011
Posts: 322
Location: West Coast

Posted: Sat Sep 23, 2017 10:12 pm

I am not seeing the <head/> in the compiled code block.

I am also not too concerned with any whitespace that gets added or subtracted from between the XML nodes. (My understanding is that whitespace is disregarded in XML syntax.)

I will assert (disclaimer: I have less than no knowledge about WP) that WP may have its own error_handler and exception_handler. That Notice certainly looks like it is coming from WP.

WP may also have its own syntax when assembling errors for logging/displaying. Or PHP's error_prepend_string INI value could be getting set with what is necessary for proper WP display.

But then, I have never seen a self-closing head tag (like <img /> or <br/>).

It just seems to me that something is setting up the display of errors such that they don't get hidden inside an HTML page's head section.
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sat Sep 23, 2017 11:23 pm

The XML standard allows any element to be written as a self-closing tag when it has no content.
Other than that, the problem is obviously not in Smarty, as the compiled version does not have any extra tags.
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sun Sep 24, 2017 10:46 am

Thanks for the thoughts, I will investigate further and report what I find.

I don't THINK it's Wordpress because I am outputting the contents of the template directly using a headers() directive and bypassing any Wordpress function/filters etc. I am suspecting something to do with the configuration of the Apache server, maybe some minification, caching or compression mod is causing this. Pretty weird, anyway.
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sun Sep 24, 2017 11:37 am

markowe wrote:
I am outputting the contents of the template directly using a headers() directive
That's EXACTLY how the <head/> tag is inserted.
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sun Sep 24, 2017 12:14 pm

AnrDaemon wrote:
markowe wrote:
I am outputting the contents of the template directly using a headers() directive
That's EXACTLY how the <head/> tag is inserted.


Adds a <head/> tag in the middle of my output..? Have you a source for this that I can read up on, I am not sure what you mean there? I thought headers() altered the raw http headers, not the content in any way.

I HAVE omitted a header("Content-type: text/xml"); - I see that now, can't try it now, but maybe that's part of the problem. I could swear this behaviour wasn't there before I used Smarty, but if I have unjustly accused, I am sorry, Smarty has served me well, I will attempt to get to the bottom of it.
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sun Sep 24, 2017 1:27 pm

If you mean http://php.net/header function, you can't output content using it.
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sun Sep 24, 2017 3:30 pm

AnrDaemon wrote:
If you mean http://php.net/header function, you can't output content using it.


I didn't express myself very well, what I meant was using header to set the headers and then echoing the content, something like:
Code:

header("HTTP/1.1 200 OK");
header("X-Robots-Tag: noindex");
//etc. etc. and then...
$smarty->display('sitemap.tpl', $matches[1]);
exit ();


OK, so I now added a header("Content-type: text/xml"); directive like I should've done in the first place and now, guess what, the problem went away. That was silly. But there it is if anyone else comes across it - not a Smarty problem, I will alter the OP accordingly if I can.
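
For anyone landing here later, the fix amounts to declaring the content type before any output is sent. A minimal sketch, assuming headers have not been sent yet; send_sitemap() and its $render callback are hypothetical names, and the callback stands in for the $smarty->display() call above:

```php
<?php
// Hypothetical wrapper: declare the response as XML, then emit the body.
function send_sitemap($render)
{
    if (!headers_sent()) {
        // Declare the body as XML rather than leaving the default text/html
        header('Content-Type: application/xml; charset=UTF-8');
        // Keep the sitemap file itself out of search results
        header('X-Robots-Tag: noindex');
    }
    call_user_func($render);
}

// Possible usage (assumes existing $smarty and $matches variables):
// send_sitemap(function () use ($smarty, $matches) {
//     $smarty->display('sitemap.tpl', $matches[1]);
// });
```

Why the missing header led to the extra tag is not established at this point in the thread; the sketch only shows the header order that made the symptom go away.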
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sun Sep 24, 2017 4:05 pm

Can you please try this:
Code:
while(ob_get_level() > 0)
{
  ob_end_clean();
}

$smarty->display('sitemap.tpl', $matches[1]);
die;

…with your original code?

And do NOT, NEVER EVER use
Quote:
header("HTTP/1.1 200 OK");

It has at least two major issues, that are blindingly obvious to the eye.

1. You force protocol version, which may affect clients not expecting it to change.
2. You handwave the protocol line, which is not supposed to be changed by a function. The first line is not a HEADER line, it's the request/response (status) line. And it is not necessarily formatted the way you wrote it. E.g. HTTP/2 does not define a message (reason phrase) part of the response at all; it carries only the status code.

If you want to set a standard status code, use http_response_code() (http://php.net/http_response_code).
If you want to send a non-standard code, try
Code:
header("Status: code message");
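
As a small sketch of the first suggestion (set_status() is a hypothetical helper, not part of PHP), the status can be set without writing the status line by hand. http_response_code() itself is available from PHP 5.4:

```php
<?php
// Hypothetical helper: set the status code and let PHP build the status
// line for whatever protocol version the server is actually speaking.
function set_status($code)
{
    if (headers_sent()) {
        return false; // too late, output has already started
    }
    http_response_code($code);
    return true;
}

// Possible usage, before any output:
// set_status(200);
```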
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sun Sep 24, 2017 4:50 pm


I updated my post, it was definitely not a Smarty problem, so there is no need to try the buffer clean thing you suggested, right?

Thanks for the pointers about header directives, I am actually not even sure where that came from now or why that is there, I might have copy-pasted that from somewhere without thinking. I will revisit that whole bit of code, but the <head/> tag problem was resolved in any case simply by explicitly defining the content-type, in fact I think header('Content-type: application/xml'); is the only directive I need there, other than the noindex one (that's there to stop sitemaps appearing in Google's index, which is a thing as it turns out - took MONTHS to flush them out). Anyway, we are straying away from Smarty, thanks for the pointers.
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sun Sep 24, 2017 5:46 pm

Quote:
I updated my post, it was definitely not a Smarty problem, so there is no need to try the buffer clean thing you suggested, right?

I want to get to the bottom of it. And no, it's not cleaning, it's closing all buffering.
I suspect WP installs a filtering buffer on the output.
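
To illustrate the suspicion, here is a self-contained sketch of how a filtering output buffer can alter a response after the template has rendered, and how closing the buffers first avoids it. The filter is entirely made up (hypothetical_html_filter() is not WordPress code), and the two render_* functions exist only to make the effect visible:

```php
<?php
// Stand-in for a filter that some other component might install.
// This is a made-up example, not WordPress code.
function hypothetical_html_filter($buffer)
{
    // Inject an empty head element right after the XML declaration
    return preg_replace('/\?>\s*/', "?>\n<head/>", $buffer, 1);
}

// Output passes through the filter on its way out.
function render_with_filter($xml)
{
    ob_start();                           // outer buffer, only to capture the result
    ob_start('hypothetical_html_filter'); // the filtering buffer
    echo $xml;
    ob_end_flush();                       // runs the filter, hands the result outward
    return ob_get_clean();
}

// Closing the filtering buffer first leaves the output untouched.
function render_without_filter($xml)
{
    ob_start();                           // outer buffer, only to capture the result
    $base = ob_get_level();
    ob_start('hypothetical_html_filter');
    while (ob_get_level() > $base) {
        ob_end_clean();                   // discard and close the filtering buffer
    }
    echo $xml;
    return ob_get_clean();
}
```

If something like this is in play, it would also explain why running preg_replace() on the result of $smarty->fetch() made no difference: the tag would be added after that point, when the buffer is flushed.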
markowe
Smarty Rookie


Joined: 16 Nov 2010
Posts: 16

Posted: Sun Sep 24, 2017 6:24 pm

AnrDaemon wrote:
Quote:
I updated my post, it was definitely not a Smarty problem, so there is no need to try the buffer clean thing you suggested, right?

I want to get to the bottom of it. And no, it's not cleaning, it's closing all buffering.
I suspect WP installs a filtering buffer on the output.


OK - I added the code you suggested and now the output is good:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.blabla.com/upc-search/?upc=9780688178239</loc>
      <changefreq>daily</changefreq>
   </url>
// etc.


Hope that helps.
AnrDaemon
Administrator


Joined: 03 Dec 2012
Posts: 1785

Posted: Sun Sep 24, 2017 10:07 pm

Seems like I was right. Thanks for the test.

All times are GMT
Page 1 of 2