Strip anything not in a tag #4

Open
opened 2021-03-07 05:41:22 +00:00 by alexgleason · 2 comments
Member

I have some data like this: blah blah blah <iframe src="..." />

I can't seem to figure out how to strip the blah blah blah part. Is that possible?

Owner

You're probably thinking about this the wrong way. At least in SGML, XML, and HTML5+, data outside of nodes is just plain text, and you're not supposed to filter it out.

Member

If you want to strip all plain text in tags, the following will work:

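defmodule FastSanitize.Sanitizer.StripPlaintext do
  # Plain text reaches the scrubber as a binary; returning nil strips it.
  def scrub(x) when is_binary(x), do: nil

  # Anything that is not a binary is passed through unchanged.
  def scrub(x), do: x
end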

If you want to strip only plaintext at the root, that is not possible without modifying the code so that sanitizers receive parent tags as context.
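As a rough illustration of the root-only case, here is a minimal sketch that works on an already-parsed fragment instead of going through a sanitizer. It assumes the fragment is a list whose elements are either binaries (text) or `{tag, attrs, children}` tuples; `RootTextStripper` and `strip_root_text/1` are hypothetical names and not part of fast_sanitize, and parsing and re-serializing the HTML are left out.

```elixir
defmodule RootTextStripper do
  # Hypothetical helper, not part of fast_sanitize.
  # Assumes `nodes` is a parsed fragment: a list of binaries (text)
  # and {tag, attrs, children} tuples (elements).

  # Drop binaries that sit directly at the root of the fragment.
  # Element tuples are kept as-is, including any text nested inside them.
  def strip_root_text(nodes) when is_list(nodes) do
    Enum.reject(nodes, &is_binary/1)
  end
end

tree = [
  "blah blah blah ",
  {"iframe", [{"src", "..."}], []},
  {"p", [], ["text inside a tag"]}
]

# The root-level string is removed; the text inside <p> is untouched.
IO.inspect(RootTextStripper.strip_root_text(tree))
```

Because only the top-level list is filtered, no parent context is needed here; the trade-off is that this has to run as a separate pass outside the sanitizer.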

Reference
pleroma-elixir-libraries/fast_sanitize#4