Verified Commit 3623504e authored by Haelwenn's avatar Haelwenn

[Pleroma.Formatter]: Add support for non-HTTP schemes in URIs

The call to the regex in add_links is there just to be sure it’s a legal URI, it can be removed if you want to get more performance.

The URI Schemes list is sorted, but with http(s) at the start (in case it might make it faster for common links).

Closes: pleroma/pleroma#127
parent ba72c51a
......@@ -165,8 +165,29 @@ defmodule Pleroma.Formatter do
@emoji
end
@link_regex ~r/https?:\/\/[\w\.\/?=\-#\+%&@~'\(\):]+[\w\/]/u
@link_regex ~r/[0-9a-z+\-\.]+:[0-9a-z$-_.+!*'(),]+/ui
# IANA got a list https://www.iana.org/assignments/uri-schemes/ but
# Stuff like ipfs isn’t in it
# There is very niche stuff
@uri_schemes [
"https://",
"http://",
"dat://",
"dweb://",
"gopher://",
"ipfs://",
"ipns://",
"irc:",
"ircs:",
"magnet:",
"mailto:",
"mumble:",
"ssb://",
"xmpp:"
]
# TODO: make it use something other than @link_regex
def html_escape(text) do
Regex.split(@link_regex, text, include_captures: true)
|> Enum.map_every(2, fn chunk ->
......@@ -176,11 +197,14 @@ defmodule Pleroma.Formatter do
|> Enum.join("")
end
@doc "changes http:... links to html links"
@doc "changes scheme:... urls to html links"
def add_links({subs, text}) do
links =
Regex.scan(@link_regex, text)
|> Enum.map(fn [url] -> {Ecto.UUID.generate(), url} end)
text
|> String.split([" ", "\t", "<br>"])
|> Enum.filter(fn word -> String.starts_with?(word, @uri_schemes) end)
|> Enum.filter(fn word -> Regex.match?(@link_regex, word) end)
|> Enum.map(fn url -> {Ecto.UUID.generate(), url} end)
|> Enum.sort_by(fn {_, url} -> -String.length(url) end)
uuid_text =
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment