Remote accounts are exposed to search engine indexing regardless of their "disable search engine indexing" preference on the host instance
When Pleroma fetches remote account metadata, it stores a local copy and exposes that copy to search engines. This wouldn't be a problem if it only happened for accounts that want to be indexed, but it happens for every remote account.
According to https://marf.space/objects/71bcb26a-b934-45f6-807f-69bb417d8509, crawlers can start at the federated timeline and end up at our local copy of the profile.
https://marf.space/objects/7d46a7bd-1053-4712-9aca-35529b0e98c0 explains that Mastodon tells search engines not to index a profile by adding a meta tag to the profile page:
<meta content="noindex, noarchive" name="robots">
I've looked at my Pleroma alt instance's copy of my Mastodon main's profile, and this meta tag is not there. Could we respect the remote account's preference and add the same meta tag to our local copies of remote accounts?
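To make the request concrete, here is a minimal Python sketch of the check I have in mind. It is illustrative only and is not Pleroma code. It assumes the instance has the HTML of the remote profile page available, which may not match how Pleroma actually fetches accounts. The names `RobotsMetaParser`, `remote_profile_opts_out` and `robots_meta_for_local_copy` are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def remote_profile_opts_out(html: str) -> bool:
    """Return True if the remote profile page asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def robots_meta_for_local_copy(remote_html: str) -> str:
    """Hypothetical helper: the tag to emit on our local copy, or ''."""
    if remote_profile_opts_out(remote_html):
        # Same tag Mastodon emits, as quoted above.
        return '<meta content="noindex, noarchive" name="robots">'
    return ""
```

With something like this, a local copy would carry the tag only when the remote profile page carries it, and accounts that have not opted out would be rendered as they are today.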
#1206 (closed) is also relevant here, and looks like it would also solve the problem of exposing remote users to search engines in a different way.