Identify spam users #1

Open
opened 2021-02-12 16:12:02 +00:00 by feld · 3 comments
Owner
  • active
  • frequently has last_sign_in_at: null and some value in website_url
  • may also have a spammy bio field, but tactics vary
  • usually they'll match via without_projects in the user search query, but some legitimate accounts match too. In the past @mfc didn't own projects but was owner of the mfc project namespace, so the account was accidentally nuked in a mass spam account cleanup. Also, some users open issues but don't actually own projects. Those legitimate users should not be affected.
  • a lot of spam accounts have the same first/last name, so the name field would look like "name": "dtytlbrneirf dtytlbrneirf"
  • would be interesting if we could assign a score to users based on these heuristics
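
A minimal sketch of what such a score could look like, assuming the user records have already been fetched into plain objects. The fields `last_sign_in_at`, `website_url`, `bio` and `name` come from the list above; `owns_projects`, `owns_namespaces`, `has_opened_issues`, the "bio contains a link" test and all of the weights are hypothetical and would need tuning against real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    state: str = "active"
    last_sign_in_at: Optional[str] = None
    website_url: str = ""
    bio: str = ""
    # Hypothetical activity flags; how they get populated is out of scope here.
    owns_projects: bool = False
    owns_namespaces: bool = False
    has_opened_issues: bool = False


def has_repeated_name(name: str) -> bool:
    """True for names like "dtytlbrneirf dtytlbrneirf" (first == last)."""
    parts = name.lower().split()
    return len(parts) == 2 and parts[0] == parts[1]


def spam_score(user: User) -> int:
    """Sum of heuristic weights; higher means more likely spam."""
    score = 0
    # Active account that never signed in but advertises a website.
    if user.state == "active" and user.last_sign_in_at is None and user.website_url:
        score += 2
    # Crude stand-in for a "spammy bio": the bio contains a link.
    if "http" in user.bio.lower():
        score += 1
    # Count "without projects" only when there is no other sign of activity,
    # so namespace owners and issue reporters are not penalised.
    if not (user.owns_projects or user.owns_namespaces or user.has_opened_issues):
        score += 1
    if has_repeated_name(user.name):
        score += 2
    return score
```

A score like this is probably better used to rank accounts for manual review than to delete anything automatically, given the @mfc incident above.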
Member

Should we take GeoIP data into account? I remember there was a lot of Vietnamese spam.

Also, maybe we should flag users who use non-Latin characters.
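
A rough sketch of how such a flag could be computed, assuming it is applied to free-text fields such as the name or bio. The function name and the idea of using a ratio rather than a yes/no answer are assumptions for illustration. Note that Vietnamese is written in a Latin-based script, so this check on its own would not catch that kind of spam.

```python
import unicodedata


def non_latin_ratio(text: str) -> float:
    """Fraction of alphabetic characters whose Unicode name is not LATIN.

    Digits, punctuation and whitespace are ignored. Returns 0.0 when the
    text contains no alphabetic characters at all.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    non_latin = [
        ch for ch in letters
        if not unicodedata.name(ch, "").startswith("LATIN")
    ]
    return len(non_latin) / len(letters)
```

Something like `non_latin_ratio(user_bio) > 0.5` could then feed into a score as one weak signal among others, since plenty of legitimate users write in non-Latin scripts.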

Author
Owner

We were doing GeoIP blocking at ingress, in front of the GitLab server, but we ran into too many problems, so I think it's disabled. I'll see what I can find out.

Author
Owner

I sourced this from @framasky@framapiaf.org: two scripts he has been using to try to tame spammers. We can probably leverage some of the tactics here.

https://git.pleroma.social/-/snippets/5280

2 participants
Reference
pleroma/janitor#1