SiT! Bugs

View Issue Details
ID: 0001798
Project: SiT!
Category: inbound email
View Status: public
Date Submitted: 2012-08-31 11:47
Last Update: 2013-01-19 19:40
Assigned To:
Platform: Windows
OS: Windows Server
OS Version: 2000+Later
Product Version: 3.67 LTS
Target Version:
Fixed in Version:
Summary: 0001798: HTML tags stripped in incoming mail also strips other strings
Description: When an incoming email uses "<" or ">" as literal characters rather than as HTML tags, the tag stripping done by PHP's strip_tags() in line 277 of inboundemail.php also removes ordinary text:

$message = strip_tags(html_entity_decode($results['Data'],ENT_QUOTES, 'UTF-8'));

I am not sure if this is how the PHP function is supposed to work, but it definitely does not do what we need in the incident update.

For example, I sent an email with the following text:
Hi there,

I am doing a test with an open tag < and if you see the rest it should be ok.

On the other hand we may also test like this <to
See if a LF affects it

So here we go with some => arrows and some backwards arrows <=


THIS is what I ended up with in SiT!:
Hi there,
I am doing a test with an open tag < and if you see the rest it should be ok.
On the other hand we may also test like this arrows and some backwards arrows
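
The same loss can be reproduced outside SiT! with a few lines of PHP. This is only a minimal sketch: the sample text is a shortened, hypothetical version of the mail above, and it calls strip_tags() directly instead of going through inboundemail.php.

```php
<?php
// Hypothetical, shortened version of the test mail; not read from a real import.
$body = "open tag < and text\n"
      . "test like this <to\n"
      . "See if a LF affects it\n"
      . "some => arrows and some backwards arrows <=";

// Same call shape as line 277: decode entities, then strip tags.
$message = strip_tags(html_entity_decode($body, ENT_QUOTES, 'UTF-8'));

// A "<" followed by whitespace survives, but "<to ... =>" is treated as one
// tag and removed, and the trailing "<=" opens a tag that never closes.
echo $message, "\n";
```

The text between "<to" and the next ">" disappears, which matches what ended up in the incident update.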

Steps To Reproduce: Send an HTML email to SiT! with some combinations of "<" and ">" that are not HTML tags. Once the email is imported, text that is not an HTML tag has been stripped.

Some PHP community members suggest that this is simply how strip_tags() works, and propose writing a dedicated function that strips only real HTML tags instead of relying on strip_tags().
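
As an illustration of that suggestion, here is a minimal sketch of such a function. The name strip_known_tags() and the list of tag names are hypothetical (neither is part of SiT! or PHP), and the pattern is a heuristic: it only removes things that start like a listed HTML element, so text such as "<b and c>" would still be removed, and attributes containing ">" are not handled.

```php
<?php
// Hypothetical helper: remove only tags whose name is in a known list.
function strip_known_tags($text)
{
    // Assumed list of elements seen in HTML mail; extend as needed.
    $names = 'a|b|br|div|em|font|i|li|ol|p|span|strong|table|td|tr|u|ul';

    // Matches <name>, <name attrs>, </name> and <name/>, case-insensitively.
    // The name must be followed by whitespace, "/" or ">".
    $pattern = '~</?(?:' . $names . ')(?:\s[^<>]*)?/?>~i';

    return preg_replace($pattern, '', $text);
}

echo strip_known_tags("<p>like this <to\nsome => arrows <=</p>"), "\n";
```

Because "<to" and "<=" do not start with a listed element name, they are left in place.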
Additional Information: I do not know if it is PHP related, but I'm running:
Apache Version : 2.2.17
PHP Version : 5.3.5

I did not find any known bugs related to this.
Tags: No tags attached.
Attached Files:

Relationships

Notes
(0004516)
nicdev (developer)
2012-08-31 14:04

I think I have found the problem here (which means it has existed for some time):

In inboundemail.php line 277:

275 else
276 {
277 $message = strip_tags(html_entity_decode($results['Data'],ENT_QUOTES, 'UTF-8'));
278 }

We first "decode" and then "strip the tags",

when in fact we should first "strip the tags" and only then decode the HTML entities.
This way the "<" and ">" characters are not seen as tags by strip_tags(), because they are not decoded yet.

Let me know what everyone thinks. I tried this as a solution and even made the changes on our production SiT 3.62, and all imports seem to be going fine with no more issues:

275 else
276 {
277 $message = strip_tags($results['Data']); // Strip tags first
278 $message = html_entity_decode($message, ENT_QUOTES, 'UTF-8'); // Decode entities
279 }
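
To make the difference between the two orders concrete, here is a small standalone sketch. The sample body is hypothetical, and it assumes the "<" and ">" characters arrive as the entities &lt; and &gt;, as they normally would in the HTML part of a mail.

```php
<?php
// Hypothetical HTML body: real tags around entity-encoded "<" and ">".
$html = '<p>Use a &lt;b and c&gt; d here</p>';

// Current order (line 277): decode first, then strip.
// The decoded "<b and c>" looks like a tag and is removed.
$decodeFirst = strip_tags(html_entity_decode($html, ENT_QUOTES, 'UTF-8'));

// Proposed order: strip the real tags first, then decode the entities.
$stripFirst = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

echo $decodeFirst, "\n"; // Use a  d here
echo $stripFirst, "\n";  // Use a <b and c> d here
```

Note that this only helps while the characters are still entities. If the body already contains a raw "<to ... >", strip_tags() will remove it in either order.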
(0004517)
nicdev (developer)
2012-08-31 14:52

...and sorry, the version below works even better, especially if you have emails sent with encoding "[Encoding] => windows-1252" (yes, it is from Windows, of course):

278 $message = html_entity_decode($message, ENT_QUOTES, strtoupper($results['Encoding']));
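
One thing to watch with this variant: html_entity_decode() only accepts a fixed set of charsets, so an unexpected or empty value in $results['Encoding'] may not be usable as-is. A possible guard is sketched below. The helper name entity_charset() is hypothetical, the list is only a subset of the charsets documented for html_entity_decode(), and the $results array is a stand-in for what the mail parser returns.

```php
<?php
// Hypothetical helper: pick a charset that html_entity_decode() accepts.
function entity_charset($encoding, $fallback = 'UTF-8')
{
    // Assumed subset of the documented charsets; extend as needed.
    $supported = array(
        'UTF-8', 'ISO-8859-1', 'ISO-8859-15',
        'WINDOWS-1252', 'CP1252', 'CP1251', 'KOI8-R',
    );
    $candidate = strtoupper(trim((string) $encoding));

    return in_array($candidate, $supported, true) ? $candidate : $fallback;
}

// Stand-in for the parsed mail; not real SiT! data.
$results = array('Encoding' => 'windows-1252', 'Data' => 'caf&eacute; &lt;ok&gt;');

$message = strip_tags($results['Data']); // Strip tags first
$message = html_entity_decode($message, ENT_QUOTES, entity_charset($results['Encoding']));
```

The decoded string is in the mail's own charset (here a single 0xE9 byte for "é"), so it may still need converting before it is stored as UTF-8.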
(0004655)
paulh (administrator)
2013-01-19 19:40

Hi Nico,

I've tried to reproduce this with the latest SVN and am unable to; my email was "1 is < 2 and 3 is >than 0.5"
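
A possible reason this particular text does not trigger the problem: in the original report, the "<" that was followed by a space survived, and only "<to" started the stripping. The sketch below checks that behaviour with strip_tags() alone. The second string is a hypothetical variant of the test mail with the space after "<" removed.

```php
<?php
// The text from the reproduction attempt: "<" is followed by a space.
$spaced = "1 is < 2 and 3 is >than 0.5";

// Hypothetical variant: the same text without the space after "<".
$unspaced = "1 is <2 and 3 is >than 0.5";

echo strip_tags($spaced), "\n";   // unchanged
echo strip_tags($unspaced), "\n"; // "<2 and 3 is >" is removed as a tag
```

If that is what happened, a test mail containing something like "<to" or "<2" should reproduce the issue.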

Issue History
Date Modified | Username | Field | Change
2012-08-31 11:47 | nicdev | New Issue |
2012-08-31 14:04 | nicdev | Note Added: 0004516 |
2012-08-31 14:52 | nicdev | Note Added: 0004517 |
2013-01-19 19:40 | paulh | Note Added: 0004655 |
2013-01-19 19:40 | paulh | Status | new => feedback
