Question

We have multiple systems interacting with each other. There is a public-facing website where customers enter text. There is a webservice that enters the text into a CRM database. As a good practice the text is being HTML encoded before forwarding it to the webservice. We have two applications reading this text. One is a web application where we have code for HTML decoding. The other is a third-party CRM which does not have the decoding code, so it displays the HTML-encoded characters to the user.

I am trying to find a solution to prevent users from seeing the HTML-encoded characters. The CRM is a third-party application and is in production, so everyone is reluctant to make any changes to it.

Is there any other solution? One proposal is to decode the text before entering it into the CRM database. Is this a good solution, or are there any others?

Solution

In most environments, as the number of distinct systems increases, you are increasingly less likely to have a single integration point. As a result, you will have to build gateways specific to each of the integrated systems, each handling the peculiarities of its external system. So, build a gateway for your CRM system which decodes the HTML from your system of record as it sends the data to the CRM (and handles any other peculiarities your CRM requires).
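A minimal sketch of what the decoding step in such a gateway could look like, in Python. The function name to_crm_payload and the record layout are hypothetical, and the actual call that hands the data to the CRM is left out because it depends on your integration.

```python
import html


def to_crm_payload(record: dict) -> dict:
    """Return a copy of the record with HTML entities decoded in string fields.

    Hypothetical helper: the field names and record shape are assumptions.
    """
    return {
        key: html.unescape(value) if isinstance(value, str) else value
        for key, value in record.items()
    }


# Example record as it might come from the HTML-encoded system of record.
stored = {"id": 42, "comment": "Tom &amp; Jerry &lt;3"}
print(to_crm_payload(stored))  # {'id': 42, 'comment': 'Tom & Jerry <3'}
```

The web application keeps reading the encoded data unchanged; only the copy sent to the CRM is decoded.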

OTHER TIPS

As a good practice the text is being HTML encoded before forwarding it to the webservice.

This is not really good practice. It should only be done if the user is actually entering HTML via a control on your site, such as a rich text editor. OWASP guidelines recommend encoding only on output, in a context-specific way (e.g. HTML requires a different set of encoding rules than JavaScript).
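To illustrate why the encoding depends on the output context, here is a small Python sketch that encodes the same plain text for two contexts. It uses a URL component as the second context rather than JavaScript, purely because the standard library covers it directly; the sample string is made up.

```python
import html
from urllib.parse import quote

plain = "Tom & Jerry"

# Encoding for an HTML body context.
as_html = html.escape(plain)

# Encoding for a URL path or query component (a different context, different rules).
as_url = quote(plain, safe="")

print(as_html)  # Tom &amp; Jerry
print(as_url)   # Tom%20%26%20Jerry
```

Because the two results differ, encoding once at input time cannot be right for every place the text is later displayed.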

One is a web application where we have code for HTML decoding

Isn't the web site simply displaying the encoded text, with the browser doing the decoding? It sounds to me like your CRM system is HTML encoding the already HTML encoded data (as it should), which is why you are seeing encoded characters.

e.g. The user enters & which you store as &amp;. Your website outputs this as &amp; which is decoded by the browser as &. The CRM system outputs this as &amp;amp; which is decoded by the browser as &amp;.
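The double-encoding chain can be reproduced with a short Python sketch. It assumes the CRM encodes on output as described, and it uses html.unescape as a stand-in for the single decoding pass a browser performs when rendering.

```python
import html

user_input = "Fish & Chips"

# What gets stored today: the text is encoded before it reaches the database.
stored = html.escape(user_input)

# The CRM (assumed to encode on output) encodes the stored value again.
crm_markup = html.escape(stored)

# The browser decodes entities exactly once when rendering.
crm_visible = html.unescape(crm_markup)

print(stored)       # Fish &amp; Chips
print(crm_markup)   # Fish &amp;amp; Chips
print(crm_visible)  # Fish &amp; Chips
```

The user of the CRM ends up looking at the stored, encoded form instead of the text that was originally typed.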

What you should be doing

Do not HTML encode your data when storing it. It should be stored as plain text. If your system is already live, you will need to run a batch process to convert the existing HTML encoded data into plain text.
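A rough sketch of such a batch conversion, using Python with an in-memory SQLite database as a stand-in for the real store. The table customer_notes and its column body are hypothetical names; your CRM database will have its own schema and access method.

```python
import html
import sqlite3


def decode_stored_text(conn: sqlite3.Connection) -> int:
    """Decode HTML entities in every stored note and return how many rows changed.

    Assumes a hypothetical table customer_notes(body TEXT).
    """
    rows = conn.execute("SELECT rowid, body FROM customer_notes").fetchall()
    changed = 0
    for rowid, text in rows:
        if text is None:
            continue
        plain = html.unescape(text)
        if plain != text:
            conn.execute(
                "UPDATE customer_notes SET body = ? WHERE rowid = ?",
                (plain, rowid),
            )
            changed += 1
    conn.commit()
    return changed


# Demo against an in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_notes (body TEXT)")
conn.executemany(
    "INSERT INTO customer_notes (body) VALUES (?)",
    [("Tom &amp; Jerry",), ("plain text",)],
)
print(decode_stored_text(conn))  # 1
```

A conversion like this should run exactly once, after the encoding on input has been switched off. Running it a second time would also decode text in which a user literally typed something like &amp;.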

Fix your site to HTML encode the data it receives from the web service when it is output to users. So a stored & will become &amp; in the page output. This will mean your site and the CRM system are both correctly encoding the plain text data, and since they are doing it in exactly the same way there will be no display issues.
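As a sketch of encoding at output time, assuming a Python site that builds markup by hand (a real site would more likely rely on the auto-escaping of its template engine), render_comment below is a hypothetical helper:

```python
import html


def render_comment(plain_text: str) -> str:
    """Wrap plain stored text in a paragraph, HTML encoding it at output time."""
    return f"<p>{html.escape(plain_text)}</p>"


# The database holds plain text; encoding happens only when building the page.
print(render_comment("Tom & Jerry <script>"))
# <p>Tom &amp; Jerry &lt;script&gt;</p>
```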

Licensed under: CC-BY-SA with attribution