Enhance parens stripping logic
I didn't feel like creating a ton of functions at the moment, so I'm not relying on matching custom return codes; instead I'm hacking it onto true/false and inverting the logic where necessary to get what I want.
It's bad, but it works, and it will be easy to clean up incrementally.
Now it's split into functions so it's both clean AND confusing!
Rules:
- During checks, always use buffer with leading `(` stripped as it can't be part of a valid URL anyway
- Short circuit to only strip leading if no trailing exists
- If valid email address when trailing `)` stripped, we can strip trailing `)` and return
- If valid URL when trailing `)` stripped, continue checks; else just return
- If query parameters detected, strip trailing `)` as last character in query params should have been encoded as `%29` anyway [1]
- If there is a `/` in the valid URL the trailing `)` could be part of the URL, so continue checks; else, strip both
- If there is at least one `(` in the URI.path, continue checks; else assume `)` is not part of the URL and strip both. [2]
- If we have an equal count of `(` and `)` chars with the leading `(` already stripped, we should be confident they are intentional so we only strip leading; else strip both as a last resort [3]

[1] `https://foo.com/bar/baz?q=ran0mch@r$)` feels quite improbable

[2] `https://blog.soykaf.com/post/encryption/)` is extremely unlikely

[3] `https://en.wikipedia.org/(fake_path)/wiki/Frame_(networking)` high confidence, balanced parens
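
For anyone who wants to play with the rules outside the codebase, here is a rough Python sketch of the decision chain. It is not the code in this MR. The names `strip_parens`, `parse_url`, and `EMAIL_RE` are made up for illustration, the URL and email checks are deliberately naive stand-ins for the real validators, and "just return" is assumed to mean "return the buffer unchanged".

```python
import re
from urllib.parse import urlsplit

# Naive stand-in for a real email validator (assumption, not the MR's check).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def parse_url(text):
    """Return the split URL if it looks like an http(s) URL, else None."""
    try:
        parts = urlsplit(text)
    except ValueError:
        return None
    if parts.scheme in ("http", "https") and parts.netloc:
        return parts
    return None


def strip_parens(buffer):
    # Always check against the buffer with one leading "(" stripped.
    trimmed = buffer[1:] if buffer.startswith("(") else buffer

    # Short circuit: no trailing ")" means only the leading one is stripped.
    if not trimmed.endswith(")"):
        return trimmed

    both = trimmed[:-1]  # leading and trailing stripped

    # Valid email once the trailing ")" is gone: strip it and return.
    if EMAIL_RE.fullmatch(both):
        return both

    # Not a valid URL without the trailing ")": assumed to be a no-op.
    parts = parse_url(both)
    if parts is None:
        return buffer

    # A literal ")" ending a query string should have been %29.
    if parts.query:
        return both

    # Without a "/" in the path, the ")" cannot belong to the URL.
    if "/" not in parts.path:
        return both

    # Without a "(" in the path, assume the ")" is not part of the URL.
    if "(" not in parts.path:
        return both

    # Balanced parens are treated as intentional: strip leading only.
    if trimmed.count("(") == trimmed.count(")"):
        return trimmed

    # Last resort: strip both.
    return both
```

Running the three footnote URLs through it: [1] and [2] lose their trailing `)`, while [3] comes back untouched because its parens are balanced.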
Edited by feld