SiT! Bugs

View Issue Details
ID: 0001798
Project: SiT!
Category: inbound email
View Status: public
Date Submitted: 2012-08-31 11:47
Last Update: 2013-01-19 19:40
Reporter: nicdev
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: feedback
Resolution: open
Platform: Windows
OS: Windows Server
OS Version: 2000+Later
Product Version: 3.67 LTS
Target Version:
Fixed in Version:
Summary: 0001798: HTML tags stripped in incoming mail also strips other strings
Description: When an email contains "<" or ">" used as literal characters rather than as HTML tags, the tag stripping done by PHP's strip_tags() in line 277 of inboundemail.php also removes the text around them:

$message = strip_tags(html_entity_decode($results['Data'],ENT_QUOTES, 'UTF-8'));

I am not sure if this is how the PHP function is supposed to work, but it definitely does not do what we need in the incident update.

For example, I sent an email with the following text:
-------------------------------------------------------------------------------
Hi there,

I am doing a test with an open tag < and if you see the rest it should be ok.

On the other hand we may also test like this <to
See if a LF affects it

So here we go with some => arrows and some backwards arrows <=

END
-------------------------------------------------------------------------------

THIS is what I ended up with in SiT!:
-------------------------------------------------------------------------------
Hi there,
 
I am doing a test with an open tag < and if you see the rest it should be ok.
 
On the other hand we may also test like this arrows and some backwards arrows

-------------------------------------------------------------------------------
Steps To Reproduce: Send an HTML email to SiT! with some combinations of "<" and ">" that are not HTML tags. Once the email is imported, text that is not part of any tag has been stripped.

Some PHP community members seem to suggest that this is the way strip_tags() works, and propose to rather create specific functions to strip only the HTML tags, instead of strip_tags().
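
A dedicated function of the kind those suggestions describe could look like the sketch below. It is only an illustration: the helper name strip_known_html_tags() and the short list of tag names are assumptions, not part of SiT! or PHP.

```php
<?php
// Hypothetical helper: remove only tags whose name is in a known HTML list,
// leaving stray "<" and ">" characters alone.
function strip_known_html_tags($text)
{
    // Illustrative, deliberately short list of tag names; not exhaustive.
    $names = 'p|br|div|span|b|i|u|a|em|strong|ul|ol|li|table|tr|td|th|html|head|body';

    // Matches <name ...>, </name> and <name/>, case-insensitively.
    return preg_replace('~</?(?:' . $names . ')\b[^<>]*>~i', '', $text);
}

// Sample text mixing stray angle brackets with real markup.
$text = "test like this <to\nSee if => arrows and <= back <b>bold</b><br/>";
$clean = strip_known_html_tags($text);
echo $clean, "\n";
```

A name-based approach like this still misfires when a literal "<" is directly followed by a listed tag name and a later ">" (for example "a <b and c> d"), so it narrows the problem rather than removing it.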
Additional Information: I do not know if it is PHP related, but I'm running:
Apache Version : 2.2.17
PHP Version : 5.3.5

I did not find any known bugs related to this.
Tags: No tags attached.
Attached Files

- Relationships

-  Notes
(0004516)
nicdev (developer)
2012-08-31 14:04

I think I have found the problem here (which means it has existed for some time):

In inboundemail.php line 277:

275 else
276 {
277 $message = strip_tags(html_entity_decode($results['Data'],ENT_QUOTES, 'UTF-8'));
278 }

We first "decode" and then "strip the tags",

when in fact we should first "strip the tags" and only then decode the HTML entities.
This way the "<" and ">" characters are not seen as tags by strip_tags(), because they are not decoded yet.

Let me know what everyone thinks. I tried this as a solution and even made the changes on our production SiT 3.62, and all imports seem to be going fine with no more issues:

275 else
276 {
277 $message = strip_tags($results['Data']); // Strip tags first
278 $message = html_entity_decode($message, ENT_QUOTES, 'UTF-8'); // Decode entities
279 }
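
The effect of the two orderings can be checked in isolation. The following is a minimal sketch, not the SiT! code itself: the sample string $html is an assumption standing in for $results['Data'], with literal "<" and ">" entity-encoded as an HTML mail body would usually carry them.

```php
<?php
// Assumed sample body: real markup as tags, literal "<" and ">" as entities.
$html = '<p>1 &lt;to 2 and some =&gt; arrows</p>';

// Order used at line 277: decode first, then strip.
// After decoding, "<to ... =>" looks like a tag to strip_tags().
$decodeFirst = strip_tags(html_entity_decode($html, ENT_QUOTES, 'UTF-8'));

// Proposed order: strip the real tags first, then decode the entities.
$stripFirst = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

var_dump($decodeFirst);
var_dump($stripFirst);
```

With the decode-first order the text between "<to" and "=>" should be missing from the result, while the strip-first order keeps it.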
(0004517)
nicdev (developer)
2012-08-31 14:52

... and sorry, the line below works even better, especially if you have emails sent with encoding "[Encoding] => windows-1252" (yes, it is from Windows of course):

278 $message = html_entity_decode($message, ENT_QUOTES, strtoupper($results['Encoding']));
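
If the Encoding value comes straight from the mail, it may be empty or name a charset that html_entity_decode() does not support. A minimal sketch of a guard follows; the helper name pick_entity_charset() and the short charset list are assumptions for illustration, not part of SiT!, and the sample $results array is made up.

```php
<?php
// Hypothetical helper: return a charset name for html_entity_decode(),
// falling back to a default when the mail's encoding is missing or unknown.
function pick_entity_charset($encoding, $default = 'UTF-8')
{
    // Short illustrative subset of the charsets listed in the PHP manual;
    // upper-case spellings are assumed to be accepted.
    $supported = array('UTF-8', 'ISO-8859-1', 'ISO-8859-15', 'WINDOWS-1252', 'WINDOWS-1251', 'KOI8-R');

    $candidate = strtoupper(trim((string) $encoding));
    return in_array($candidate, $supported, true) ? $candidate : $default;
}

// Assumed sample data in the shape used by inboundemail.php.
$results = array('Data' => '1 &lt; 2 &amp; 3 &gt; 0', 'Encoding' => 'windows-1252');

$message = strip_tags($results['Data']); // Strip tags first
$message = html_entity_decode($message, ENT_QUOTES, pick_entity_charset($results['Encoding']));
echo $message, "\n";
```

The fallback keeps the decode step from depending on whatever the sending client put in the header.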
(0004655)
paulh (administrator)
2013-01-19 19:40

Hi Nico,

I've tried to reproduce this with the latest SVN and am unable to. My email was "1 is < 2 and 3 is >than 0.5".

- Issue History
Date Modified     Username  Field                Change
2012-08-31 11:47  nicdev    New Issue
2012-08-31 14:04  nicdev    Note Added: 0004516
2012-08-31 14:52  nicdev    Note Added: 0004517
2013-01-19 19:40  paulh     Note Added: 0004655
2013-01-19 19:40  paulh     Status               new => feedback

