ActionView::Helpers::SanitizeHelper fails with multiple 'wbr' tags
When HTML is sanitized with the Rails ActionView::Helpers::SanitizeHelper, the output is truncated if the input contains many 'wbr' tags.
Text pasted from other applications can contain HTML with literally hundreds of 'wbr' elements.
When the combined 'depth' of the 'wbr' elements and the outer elements in which they appear reaches 255, all further text in the document appears to be removed.
This can mean that important information is lost.
As an example, if you run sanitize on the fragment below:
<div>
Some text here
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr>
More important text here
</div>
The result does not contain the second piece of text, i.e. the result is:
<div>
Some text here
</div>
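For anyone who wants to reproduce this without pasting the fragment, here is a minimal sketch that builds it with a configurable number of 'wbr' tags, which also makes it easy to probe where the cut-off sits (the fragment above has 255: 25 rows of 10 plus a final row of 5). The helper name `wbr_fragment` is made up for illustration, and the sanitizer call assumes the rails-html-sanitizer gem is installed; it is skipped otherwise.

```ruby
# Build the fragment from the question with `count` <wbr> tags, ten per row,
# between the two pieces of text. `wbr_fragment` is a hypothetical helper name.
def wbr_fragment(count)
  rows = Array.new(count, '<wbr>').each_slice(10).map(&:join)
  ['<div>', 'Some text here', *rows, 'More important text here', '</div>'].join("\n")
end

html = wbr_fragment(255)

begin
  require 'rails-html-sanitizer'
  # Assumption: older releases of the gem name this class WhiteListSanitizer.
  klass = if defined?(Rails::Html::SafeListSanitizer)
            Rails::Html::SafeListSanitizer
          else
            Rails::Html::WhiteListSanitizer
          end
  result = klass.new.sanitize(html)
  puts "second text kept? #{result.include?('More important text here')}"
rescue LoadError
  puts 'rails-html-sanitizer is not installed; skipping the sanitize call'
end
```

Calling `wbr_fragment` with smaller or larger counts should show whether the loss really starts at a fixed depth in a given environment.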
I am looking for a way to safely sanitize HTML pasted from other applications without losing any of the content.
I could obviously replace all 'wbr' elements with spaces using gsub prior to sanitizing, but I would like to know that there are no other scenarios that will cause data loss in the same way.
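A minimal sketch of that gsub workaround, for reference. The helper name `strip_wbr` and the regular expression are illustrative assumptions; the replacement defaults to a space as described above, though passing an empty string avoids splitting words at what was only a line-break opportunity.

```ruby
# Matches <wbr>, <wbr/>, <WBR class="x"> and stray </wbr> closing tags.
WBR_TAG = %r{</?wbr\b[^>]*>}i

# Replace every wbr tag before the string reaches the sanitizer.
# `strip_wbr` is a hypothetical helper name.
def strip_wbr(html, replacement = ' ')
  html.gsub(WBR_TAG, replacement)
end

puts strip_wbr('long<wbr>word and <WBR/>more')
puts strip_wbr('long<wbr>word', '')
```

This only sidesteps the one element I know about; it says nothing about other elements that might nest the same way.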
Note that the Rails::Html::TargetScrubber has a similar issue: if you try to remove 'wbr' elements from the example fragment, it removes the last text as well.
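To at least detect other inputs that lose data the same way, whichever sanitizer or scrubber is used, one option is a coarse before/after comparison of the visible words. This is only a sketch: the helper names are hypothetical, the tag stripping is a crude regex rather than a real parser, it can flag false positives when the sanitizer legitimately rewrites entities or drops content, and it misses a lost word that also appears elsewhere in the output.

```ruby
# Reduce markup to its visible words using a crude regex tag strip.
# Good enough for a before/after comparison, not a real HTML parser.
def visible_words(html)
  html.gsub(/<[^>]*>/, ' ').split
end

# Words present in the input but absent from the sanitized output.
# `missing_words` is a hypothetical helper name.
def missing_words(original, sanitized)
  visible_words(original) - visible_words(sanitized)
end

original  = "<div>\nSome text here\n<wbr><wbr>\nMore important text here\n</div>"
truncated = "<div>\nSome text here\n</div>"

lost = missing_words(original, truncated)
warn "sanitizer dropped: #{lost.join(' ')}" unless lost.empty?
```

In practice `truncated` would be the actual sanitizer output, and a non-empty result could be logged or used to reject the paste.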
ruby-on-rails sanitize
I think you should raise an issue in Rails issue tracker. Seems like a bug
– rubyprince
Nov 13 '18 at 6:47
@rubyprince. Opened an issue here
– giorgio
Nov 14 '18 at 1:12
Hmm, looks like the bug is with Nokogiri (xml/html parsing gem) which Rails uses internally. The rabbit hole goes very deep.
– rubyprince
Nov 14 '18 at 19:41
asked Nov 13 '18 at 3:49
giorgio