[HOW TO] - Hide ALL email addresses from bots & harvesters
alarconcepts
Smarty Rookie


Joined: 28 May 2005
Posts: 18
Location: NY

Posted: Sun Aug 06, 2006 9:33 am    Post subject: [HOW TO] - Hide ALL email addresses from bots & harvesters

[HOW TO] - Hide all email addresses from spam-bots & harvesters!

Here's a nice way to make sure that any email addresses in your templates are protected from spam-bots and email-harvesters. This is a long post, which may lead you to believe that this is difficult; rest assured it is not. Read this post through before you add this to your application and the process will be painless.

This is an output filter that works in a couple of ways. It works on HTML email-links as well as any text-based email addresses that may end up in your content here and there. Here's the skinny...

The filter will change these three bot-vulnerable email addresses...
[php:1:f4d4a74501]someone@example.org

<a href="mailto:someone@example.org">Someone</a>

<a href="mailto:someone@example.org">someone@example.org</a>[/php:1:f4d4a74501]
...into these browser-friendly, bot-safe, ASCII strings, respectively:
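[php:1:f4d4a74501]<a href="mailto:&#115;&#111;&#109;&#101;&#111;&#110;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#111;&#114;&#103;" title="someone [at] example [dot] org">someone [at] example [dot] org</a>

<a href="mailto:&#115;&#111;&#109;&#101;&#111;&#110;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#111;&#114;&#103;" title="Someone">Someone</a>

<a href="mailto:&#115;&#111;&#109;&#101;&#111;&#110;&#101;&#64;&#101;&#120;&#97;&#109;&#112;&#108;&#101;&#46;&#111;&#114;&#103;" title="someone&#64;example.org">someone&#64;example.org</a>[/php:1:f4d4a74501]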
There is a really nice function for obfuscating email addresses posted elsewhere on this forum, but it relies on Javascript, which isn't always enabled. So anyway, I thought the community would get some good use out of this extra functionality I worked up. Here it is...enjoy!
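
To make the idea concrete, here is a minimal sketch of the core trick on its own, separate from the filter: each character of an address is written as a decimal HTML entity, which a browser decodes transparently. The helper name entity_encode_address is made up for this illustration and is not part of the filter.

```php
<?php
// Illustrative helper (hypothetical name, not part of the filter itself):
// turn every byte of an address into a decimal HTML entity such as &#64;.
function entity_encode_address($address)
{
    $encoded = '';
    $length  = strlen($address);
    for ($x = 0; $x < $length; $x++) {
        $encoded .= '&#' . ord($address[$x]) . ';';
    }
    return $encoded;
}

$encoded = entity_encode_address('a@b.org');
echo $encoded, "\n";

// A browser (or html_entity_decode) turns the entities back into the address.
echo html_entity_decode($encoded), "\n";
```

The page source then contains no literal "@" for a naive harvester to latch onto, while visitors still see and click a normal address.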

==================================================

First, here's the function...

The block below contains the function you need. Paste it into a new file named "outputfilter.encode_emails.php". Once you've got the file created, upload it to your "../Smarty/plugins/.." directory.

FILE:outputfilter.encode_emails.php
[php:1:f4d4a74501]<?php
// ---------------------------------------------------------------------------
// Custom Smarty Output Filter - Ascii Email Encoder
// ---------------------------------------------------------------------------
// By John Alarcon - Y2K6
// http://www.alarconcepts.com
// ---------------------------------------------------------------------------
// This is an output filter that changes HTML email-links and/or text-based
// email addresses into their ascii equivalents to help keep them hidden from
// spam-bots and email-harvesters. Additionally, text-only email addresses are
// converted into working links. All links have title attributes.
// ---------------------------------------------------------------------------
function smarty_outputfilter_encode_emails($source, &$smarty)
{
    #############################################################
    # CUSTOMIZE THESE FIRST FEW VALUES TO SUIT YOUR NEEDS!
    #############################################################

    // Customize the onscreen @ and . symbols...get creative!
    $custom_AT  = ' [at] ';
    $custom_DOT = ' [dot] ';

    // What to encode?
    // 'html' - Only HTML email-links are encoded.
    // 'text' - Only text-only addresses are encoded.
    // 'both' - Default; both types are encoded.
    $encode_these = 'both';

    #############################################################
    # EDIT BELOW HERE AT YOUR OWN RISK! =)
    #############################################################

    // Assign the regex to compare against for HTML email-links.
    $regex_html_email = '!<a\s([^>]*)href=["\']mailto:([^"\']+)["\']([^>]*)>(.*?)</a[^>]*>!is';

    // Assign the regex to compare against for text-only emails.
    // The dot before the TLD is escaped so it only matches a literal ".",
    // and dots in the local part, subdomains and longer TLDs are allowed.
    $regex_text_email = '![a-zA-Z0-9._\-]+@(?:[a-zA-Z0-9\-_]+\.)+[a-z]{2,}!is';

    // A switch to provide the proper array of $regexes...
    switch($encode_these)
    {
        // Regex for comparing html email-links.
        case 'html': $regexes = array('html' => $regex_html_email); break;

        // Regex for comparing text-only emails.
        case 'text': $regexes = array('text' => $regex_text_email); break;

        // Regex for comparing both html email-links and text-only emails.
        default: $regexes = array('html' => $regex_html_email,
                                  'text' => $regex_text_email); break;
    }

    // Loop through whichever $regexes we established above...
    foreach($regexes as $regex_type => $regex)
    {
        // ...and check the $regex pattern against the $source...
        preg_match_all($regex, $source, $matches);

        // ...and then check if there were no matches found...
        if(empty($matches[0]))
        {
            // ...if not, we can skip everything below and simply move on...
            continue;
        }

        // Match(es) found; replicate them in a variable for modifications.
        $modifications = $matches[0];

        // ...and then loop through the modifications needed...
        foreach($modifications as $key => $match)
        {
            // (re)Initializing entity-encoded address variable.
            $hex_address = '';

            // (re)Initializing display-text variable.
            $hex_display = '';

            // Check if current iteration is comparing for HTML email-links...
            if($regex_type === 'html')
            {
                // ...and if so, grab the email address...
                $address = $matches[2][$key];

                // ...and the display text from the $matches array.
                $display = $matches[4][$key];

                // GOTCHA!
                // If the display-text is an email address unto itself, the next iteration of the
                // $regexes foreach-loop would convert it to a link, which would effectively break the
                // original link. To fix this, any "AT" symbol will simply be translated so that the
                // next iteration doesn't match it as a pattern to replace.
                // (Compare against false so an "@" at position 0 is caught too.)
                if(strpos($display, '@') !== false)
                {
                    // ...and if an "AT" symbol exists, convert it to an html-entity.
                    $display = str_replace('@', '&#64;', $display);
                }

                // No need to convert the display text, simply transfer its contents to a new variable.
                $hex_display .= $display;

                // ...then get length of HTML email link for use in the following for-loop; saves processing.
                $length = strlen($address);

                // ...then loop through each character in the HTML email link...
                for($x = 0; $x < $length; $x++)
                {
                    // ...and encode each character as a decimal entity before adding it to the encoded address string.
                    $hex_address .= '&#' . ord(substr($address, $x, 1)) . ';';
                }
            }
            // ...or if the current iteration is comparing for text-based email addresses...
            elseif($regex_type === 'text')
            {
                // ...and if so, grab the email address only...will make our own display text.
                $address = $matches[0][$key];

                // ...then get length of text-email for use in the following for-loop; saves processing.
                $length = strlen($address);

                // ...then loop through each character in the email address...
                for($x = 0; $x < $length; $x++)
                {
                    // ...and entity-encode each character as we go...
                    $hex_value = '&#' . ord(substr($address, $x, 1)) . ';';

                    // ...then add the character to the encoded address string...
                    $hex_address .= $hex_value;

                    // ...but for the display text, we need to check the encoded value of the current character...
                    switch($hex_value)
                    {
                        // ...and if the value is that of an "@" symbol...
                        case '&#64;':
                            // ...add a custom text to the display-text string instead of the encoded value...
                            $hex_display .= $custom_AT;
                            // ...then break out of the switch and move on to the next letter.
                            break;

                        // ...or if the value is that of a "."...
                        case '&#46;':
                            // ...add a custom text to the display-text string instead of the encoded value...
                            $hex_display .= $custom_DOT;
                            // ...then break out of the switch and move on to the next letter.
                            break;

                        // ...or in any other case...
                        default:
                            // ...add the normal character to the string...
                            $hex_display .= substr($address, $x, 1);
                            // ...then break out of the switch and move on to the next letter.
                            break;
                    }
                }
            }

            // ...waaaay down here, over-write any changes into the $modifications array.
            $modifications[$key] = '<a href="mailto:' . $hex_address . '" title="' . $hex_display . '">' . $hex_display . '</a>';
        }

        // ...and finally, once every match has been handled, perform the
        // $modifications on the $source based on any $matches[0] found.
        $source = str_replace($matches[0], $modifications, $source);
    }

    // Return the altered (or unaltered) $source.
    return $source;
}
?>[/php:1:f4d4a74501]
==================================================

Now that you have the function...load it up!

Before this function will work in your system, it must be loaded. To do this, you have a few options. You can either load the filter when the native Smarty object is created...or you can use an object that extends the native Smarty object...or you can load it on a per-page basis. It's up to your needs, really. All three methods are simple; see details below. Note that you'll be using only one of these methods; not all three.

I'd say that for most applications it's going to be preferable to use Method 2, which extends the native Smarty object. This allows you to use the functionality without altering core Smarty files and you can still turn filtering on/off for the entire site in one location. This is how I personally do it, but let your own needs and preferences dictate your approach.

---------------------------------------------------------------------------

Method 1) Loading the filter in the native Smarty object:
Open ../Smarty/Smarty.class.php and find the following line... Take note that your line may be different, already having values set there, and if so, very carefully add the key/value pair from below to your own array.
[php:1:f4d4a74501]var $autoload_filters = array();[/php:1:f4d4a74501]
...add the function/filter as shown here, and you're done...
[php:1:f4d4a74501]var $autoload_filters = array('output' => array('encode_emails'));[/php:1:f4d4a74501]
---------------------------------------------------------------------------

Method 2) Loading the filter by extending the native Smarty object:
If you already have your own custom object extending the native Smarty object, you can simply customize your own object with the code below. If you don't, just create a file called "mySmarty.class.php" which contains all of the following code. Then, instead of including (or requiring, as it were) the 'Smarty.class.php' file in your application, include your newly created "mySmarty.class.php" class file in its place, and likewise instantiate new mySmarty() objects rather than "plain" Smarty() objects. Here's the code for the extending class; be sure to change the path where specified.

FILE: mySmarty.class.php
[php:1:f4d4a74501]<?php
if (!class_exists('Smarty')) {
    require_once('path/to/your/Smarty/Smarty.class.php');
}

class mySmarty extends Smarty
{
    function mySmarty()
    {
        // Run the native Smarty constructor first...
        $this->Smarty();

        // ...then load the output filter from the plugins directory.
        // load_filter() is a method call, not an assignment, and it loads
        // the plugin by name. register_outputfilter() expects the name of
        // an already-defined PHP function, so it is not needed here.
        $this->load_filter('output', 'encode_emails');
    }
}
?>[/php:1:f4d4a74501]
---------------------------------------------------------------------------

Method 3) Loading the filter on a page-by-page basis:
To use the filtering only on select pages, follow the very general outline below. I'm not entirely sure where this would be useful, but you never know when it might come up on Jeopardy, right!?
FILE: myFilename.php
[php:1:f4d4a74501]<?php
// Your usual Smarty object (one that does not already load the filter).
$mySmarty = new mySmarty();

// Add this line somewhere between the
// creation of the object and the display of the template.
$mySmarty->load_filter('output', 'encode_emails');

// Any other PHP code to get template variables.

$mySmarty->assign('what', $ever);
$mySmarty->display('myTemplate.htm');
?>[/php:1:f4d4a74501]
==================================================

And finally, verify your work...

To verify your work, simply load up any page that contains an email address after having followed the procedure detailed above. All email addresses will appear normal onscreen and the links themselves will act exactly the same as any normal email link. The magic is in the page source which, when examined, reveals that your email addresses are nicely encoded. The email addresses in your application will now have a degree of protection.
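
If you'd rather not eyeball the page source, a rough check along these lines can be run against the rendered HTML. It is only a sketch: find_raw_emails is a hypothetical helper, and its pattern is a simplified assumption rather than a complete address matcher.

```php
<?php
// Hypothetical helper for spot-checking rendered output: returns the
// plain-text addresses that still appear un-encoded in the markup.
function find_raw_emails($html)
{
    // Simplified pattern (an assumption, not a full RFC 5322 matcher).
    $pattern = '![a-zA-Z0-9._\-]+@[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,}!';
    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}

$before = '<p>Write to someone@example.org</p>';
$after  = '<p>Write to <a href="mailto:&#115;&#64;&#101;&#46;&#111;&#114;&#103;">s [at] e [dot] org</a></p>';

var_dump(find_raw_emails($before)); // the raw address is reported
var_dump(find_raw_emails($after));  // empty array: nothing literal left
```

An empty result only says that no literal address is left in the markup; a harvester that decodes HTML entities could still recover the address, which is why the protection is "to a degree".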

==================================================

Ok, if you got all the way down here...great! You should now have your email addresses protected to a degree, and more accessible too, thanks to the added title attributes. If you think of better ideas, don't hog them all to yourself ... add on!

Cheers,
- John

PS. For those who don't like fluffy, mega-commented code, see the next post.


Last edited by alarconcepts on Sun Aug 06, 2006 9:39 am; edited 1 time in total

alarconcepts
Smarty Rookie

Posted: Sun Aug 06, 2006 9:37 am    Post subject: outputfilter.encode_emails.php

FILE: outputfilter.encode_emails.php
[php:1:222c8674d3]<?php
// ---------------------------------------------------------------------------
// Custom Smarty Output Filter - Ascii Email Encoder
// ---------------------------------------------------------------------------
// By John Alarcon - Y2K6
// http://www.alarconcepts.com
// ---------------------------------------------------------------------------
// This is an output filter that changes HTML email-links and/or text-based
// email addresses into their ascii equivalents to help keep them hidden from
// spam-bots and email-harvesters. Additionally, text-only email addresses are
// converted into working links. All links have title attributes.
// ---------------------------------------------------------------------------
function smarty_outputfilter_encode_emails($source, &$smarty)
{
    $custom_AT    = ' [at] ';
    $custom_DOT   = ' [dot] ';
    $encode_these = 'both';

    $regex_html_email = '!<a\s([^>]*)href=["\']mailto:([^"\']+)["\']([^>]*)>(.*?)</a[^>]*>!is';
    // Dot before the TLD escaped; dots in the local part and subdomains allowed.
    $regex_text_email = '![a-zA-Z0-9._\-]+@(?:[a-zA-Z0-9\-_]+\.)+[a-z]{2,}!is';

    switch($encode_these) {
        case 'html': $regexes = array('html' => $regex_html_email); break;
        case 'text': $regexes = array('text' => $regex_text_email); break;
        default: $regexes = array('html' => $regex_html_email,
                                  'text' => $regex_text_email); break;
    }

    foreach($regexes as $regex_type => $regex) {
        preg_match_all($regex, $source, $matches);
        if(empty($matches[0])) {
            continue;
        }
        $modifications = $matches[0];
        foreach($modifications as $key => $match) {
            $hex_address = '';
            $hex_display = '';
            if($regex_type === 'html') {
                $address = $matches[2][$key];
                $display = $matches[4][$key];
                // Keep the text pass from re-matching an address used as link text.
                if(strpos($display, '@') !== false) {
                    $display = str_replace('@', '&#64;', $display);
                }
                $hex_display .= $display;
                $length = strlen($address);
                for($x = 0; $x < $length; $x++) {
                    $hex_address .= '&#' . ord(substr($address, $x, 1)) . ';';
                }
            } elseif($regex_type === 'text') {
                $address = $matches[0][$key];
                $length = strlen($address);
                for($x = 0; $x < $length; $x++) {
                    $hex_value = '&#' . ord(substr($address, $x, 1)) . ';';
                    $hex_address .= $hex_value;
                    switch($hex_value) {
                        case '&#64;': $hex_display .= $custom_AT; break;
                        case '&#46;': $hex_display .= $custom_DOT; break;
                        default: $hex_display .= substr($address, $x, 1); break;
                    }
                }
            }
            $modifications[$key] = '<a href="mailto:'.$hex_address.'" title="'.$hex_display.'">'.$hex_display.'</a>';
        }
        // Replace once per regex, after every match has been encoded.
        $source = str_replace($matches[0], $modifications, $source);
    }
    return $source;
}
?>[/php:1:222c8674d3]

mohrt
Administrator


Joined: 16 Apr 2003
Posts: 7368
Location: Lincoln Nebraska, USA

Posted: Sun Aug 06, 2006 5:19 pm

As an alternative to this approach, you can also use {mailto} to encode them individually in the template.

http://smarty.php.net/manual/en/language.function.mailto.php

alarconcepts
Smarty Rookie

Posted: Sun Aug 06, 2006 6:42 pm

Yes, that's a manual alternative...but it relies on template designers to encode each email address by hand.

Using the output filter described, all encoding is automatic, even if template designers happen to forget some addresses here and there, which seems to happen quite a bit when data falls outside of a loop.

mohrt
Administrator

Posted: Sun Aug 06, 2006 8:29 pm

Yep, two different approaches, each with their own strengths.

alarconcepts
Smarty Rookie

Posted: Fri Aug 11, 2006 7:10 am

That's what I love about programming ... there's a million ways to get things done.

TGKnIght
Smarty Junkie


Joined: 07 Sep 2005
Posts: 580
Location: Philadelphia, PA

Posted: Wed Sep 20, 2006 7:54 pm

This is a really nice filter! Thanks!
_________________
Smarty site with one index.php controller file
Working with MySQL and Smarty
SmartyColumnSort
Custom Smarty Javascript Debug Template

alarconcepts
Smarty Rookie

Posted: Wed Sep 20, 2006 8:02 pm

Hey, sure thing!

I don't leave home without it anymore...

Enjoy!

TGKnIght
Smarty Junkie

Posted: Wed Sep 20, 2006 8:42 pm

I personally use {mailto} whenever I'm making templates. My minions, however, do not always do so when they are working on projects, so this is a good backup. I might get a bit lazier myself now.

alarconcepts
Smarty Rookie

Posted: Wed Sep 20, 2006 9:25 pm

Ha! Not lazier by any means!

You're simply taking a more proactive approach...

...commendable really...

TGKnIght
Smarty Junkie

Posted: Fri Sep 22, 2006 4:08 pm

I added in a new variable to indicate whether or not you want to have text email addresses converted to links.

[php:1:5bff864ed4]
<?php
// ---------------------------------------------------------------------------
// Custom Smarty Output Filter - Ascii Email Encoder
// ---------------------------------------------------------------------------
// By John Alarcon - Y2K6
// http://www.alarconcepts.com
// ---------------------------------------------------------------------------
// This is an output filter that changes HTML email-links and/or text-based
// email addresses into their ascii equivalents to help keep them hidden from
// spam-bots and email-harvesters. Additionally, text-only email addresses are
// converted into working links. All links have title attributes.
// ---------------------------------------------------------------------------
function smarty_outputfilter_protect_email($source, &$smarty)
{
    // Entities rather than literal symbols, so that text left as plain text
    // (see $convert_text_to_link) still carries no raw "@" or ".".
    $custom_AT  = '&#64;';
    $custom_DOT = '&#46;';
    $encode_these = 'both';
    // true:  text-only addresses become mailto links.
    // false: they stay plain text, with "@" and "." swapped for the values above.
    $convert_text_to_link = false;

    $regex_html_email = '!<a\s([^>]*)href=["\']mailto:([^"\']+)["\']([^>]*)>(.*?)</a[^>]*>!is';
    // Dot before the TLD escaped; dots in the local part and subdomains allowed.
    $regex_text_email = '![a-zA-Z0-9._\-]+@(?:[a-zA-Z0-9\-_]+\.)+[a-z]{2,}!is';

    switch($encode_these) {
        case 'html': $regexes = array('html' => $regex_html_email); break;
        case 'text': $regexes = array('text' => $regex_text_email); break;
        default: $regexes = array('html' => $regex_html_email,
                                  'text' => $regex_text_email); break;
    }

    foreach($regexes as $regex_type => $regex) {
        preg_match_all($regex, $source, $matches);
        if(empty($matches[0])) {
            continue;
        }
        $modifications = $matches[0];
        foreach($modifications as $key => $match) {
            $hex_address = '';
            $hex_display = '';
            if($regex_type === 'html') {
                $address = $matches[2][$key];
                $display = $matches[4][$key];
                // Keep the text pass from re-matching an address used as link text.
                if(strpos($display, '@') !== false) {
                    $display = str_replace('@', '&#64;', $display);
                }
                $hex_display .= $display;
                $length = strlen($address);
                for($x = 0; $x < $length; $x++) {
                    $hex_address .= '&#' . ord(substr($address, $x, 1)) . ';';
                }
                $modifications[$key] = '<a href="mailto:'.$hex_address.'" title="'.$hex_display.'">'.$hex_display.'</a>';
            } elseif($regex_type === 'text') {
                $address = $matches[0][$key];
                $length = strlen($address);
                for($x = 0; $x < $length; $x++) {
                    $hex_value = '&#' . ord(substr($address, $x, 1)) . ';';
                    $hex_address .= $hex_value;
                    switch($hex_value) {
                        case '&#64;': $hex_display .= $custom_AT; break;
                        case '&#46;': $hex_display .= $custom_DOT; break;
                        default: $hex_display .= substr($address, $x, 1); break;
                    }
                }
                $modifications[$key] = ($convert_text_to_link)
                    ? '<a href="mailto:'.$hex_address.'" title="'.$hex_display.'">'.$hex_display.'</a>'
                    : $hex_display;
            }
        }
        // Replace once per regex, after every match has been encoded.
        $source = str_replace($matches[0], $modifications, $source);
    }
    return $source;
}
?>[/php:1:5bff864ed4]

alarconcepts
Smarty Rookie

Posted: Fri Sep 22, 2006 4:20 pm

Nice! I don't have a chance to use it right now, but I can definitely think of ways it might come in handy later.

Thanks for sharing!

owczi
Smarty Rookie


Joined: 26 Jul 2006
Posts: 7
Location: Poland, land o' the duck brothers.

Posted: Sat Sep 30, 2006 4:41 pm

I used a different, though similar, solution. I'm developing quite a big project with a custom MVC-pattern framework, which uses Smarty as the default HTML view plugin for its components. I also had to use some active anti-spam solution, so here's my idea:

On the server handling dynamic requests, smarty output filter does this:

- catch all the e-mail addresses in the output

- convert them to their md5sums or some other hash / unique id

- add each entry to a database, a table consisting of two columns: hash, e-mail_address

- replace the e-mail addresses in the output with <img src="http://static.domain.net/images/email/{$md5sum}.png"/>

The 'static' server does the rest:

- if the picture doesn't exist, check if there's an entry for it in the database

- if there is an entry in the db, delete that entry, generate the picture containing the e-mail address, store it in the filesystem for later use and pass it to the client

- if there's no entry in the db, do a regular 404 or a redirect

- the e-mail picture cache is maintained by a cron task that removes pictures which are no longer in use

This makes the e-mail addresses quite useless to spambots (unless they incorporate OCR technology), yet perfectly readable to humans.

The whole 'check if there's an entry' step might be seen as a potential DoS flaw that could slow down the DBMS when exploited, but if you know what you're doing, you can deal with it.

Note that there is no need to keep track of pictures in the database if it's not a web cluster of any kind, or if the cache directories are shared across the system (like a dedicated data server sharing files via NFS or another solution of this kind).
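
For illustration, the filter half of the steps described in this post might look roughly like the sketch below. It is written under stated assumptions and is not owczi's code: the function name is hypothetical, a plain PHP array stands in for the database table (hash, e-mail_address), the address pattern is simplified, and the closure needs PHP 5.3 or later. Image generation on the static server is left out.

```php
<?php
// Sketch only: $email_map stands in for the (hash, e-mail_address) table,
// and the image URL mirrors the example given in the post.
function replace_emails_with_images($source, array &$email_map)
{
    // Simplified address pattern (an assumption).
    $pattern = '![a-zA-Z0-9._\-]+@[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,}!';

    return preg_replace_callback($pattern, function ($match) use (&$email_map) {
        $hash = md5($match[0]);
        // Remember which address the hash stands for, so the static
        // server can render the picture on the first request.
        $email_map[$hash] = $match[0];
        return '<img src="http://static.domain.net/images/email/' . $hash . '.png"/>';
    }, $source);
}

$map  = array();
$html = replace_emails_with_images('Contact: someone@example.org', $map);
echo $html, "\n";
print_r($map);
```

In a real setup the array write would be an insert into the shared table, and the same function body could sit inside a Smarty output filter.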

alarconcepts
Smarty Rookie

Posted: Sat Sep 30, 2006 5:10 pm

Hi,

Yes, Captcha-like functionality is pretty nifty. Keep in mind that letters embedded into images pose accessibility issues that should be dealt with. For instance, a link to "pronounce" the embedded letters could work around this. Highly accessible sites are a spider's favorite food.

Is there any code to go along with this?

Do share!

- Alar

owczi
Smarty Rookie

Posted: Sat Sep 30, 2006 6:02 pm

alarconcepts wrote:
Keep in mind that letters embedded into images pose accessibility issues that should be dealt with. For instance, a link to "pronounce" the embedded letters could work around this.


Well, you got me, no doubt about it. I must admit I'm leaving the accessibility layer for a later phase, and I'll probably just leave it to someone competent in that matter (and yes, what a bad man am I), but any possible issue on this level is important mainly because using a filter is a decision forced by the controller tier. I write component classes that simply export variables for the view, so that the person making a template just gets a list of variables the template has to present. As to the code, I'll share as soon as I polish some things.

A little off-topic conclusion: on one hand, you might think you should separate the layers and make it all super elastic only when it's meant to be a big project. My answer is: no, always treat it like a big project; that pays off.

All times are GMT
Page 1 of 2