WIP: [#477] User search improvements

Fixes #477 (closed)

  • tsquery search with field weights
  • friends & followers boosting

Decided to step away from pg_trgm search since:

  • it doesn't seem to allow assigning weights to fields

  • it measures whole-string similarity, whilst in this use case fragment similarity is more important. E.g. if we search for lain@ple and the candidates (nickname + name) are lain@pleroma.soykaf.com lain and nick1 lain, pg_trgm prefers the latter (with a totally irrelevant nickname, nick1) simply because it is shorter: "soykaf.com" penalizes the first candidate's score (the longer the string, the larger the penalty), which seems definitely wrong for this use case.
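
The length penalty described above can be reproduced with a rough Python approximation of trigram similarity. This is only a sketch: it assumes pg_trgm lowercases the input, splits it on non-alphanumeric characters, pads each word with two leading spaces and one trailing space, and scores by shared trigrams over total distinct trigrams. The function names are made up for illustration, and the real extension may differ in details.

```python
import re


def trigrams(text):
    """Approximate pg_trgm trigram extraction (an assumption, not the real code)."""
    grams = set()
    # Lowercase and split on anything that is not a letter or digit.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


query = "lain@ple"
long_candidate = "lain@pleroma.soykaf.com lain"   # nickname + name
short_candidate = "nick1 lain"                    # nickname + name

long_score = similarity(query, long_candidate)
short_score = similarity(query, short_candidate)
print(f"{long_candidate!r}: {long_score:.3f}")
print(f"{short_candidate!r}: {short_score:.3f}")
```

Under this approximation the short candidate scores slightly higher, even though the long one matches both fragments of the query: every extra trigram from "soykaf.com" enlarges the union without adding to the intersection.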

The current implementation uses the standard tsvector / tsquery approach, adds weights to fields (A to nickname, B to name), and boosts the ranks of followers and friends.

It transforms lain@pleroma.soykaf.com to lain pleroma soykaf com (to prevent getting an "email" type token in tsvector), turns a lain@ple request into lain:* | ple:*, and uses the ts_rank_cd function (which considers the density of matched tokens). Then friends & followers boosting comes into play: ranks are multiplied by coefficients, so a 0 rank stays 0 and positive ranks are multiplied.
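
The string transformations and the multiplicative boosting can be sketched in Python as follows. This is an illustration, not the code from this merge request: the function names and the coefficient values (1.5 and 1.2) are hypothetical, the ranking itself (ts_rank_cd) happens in PostgreSQL and is not reproduced here, and splitting on `\w+` is an assumption about how the fragments are separated.

```python
import re


def normalize_for_tsvector(text):
    """Split on non-word characters so the address is not parsed as one email token."""
    return " ".join(re.findall(r"\w+", text))


def build_prefix_tsquery(query):
    """Turn each fragment into a prefix term and OR the terms together."""
    terms = re.findall(r"\w+", query)
    return " | ".join(f"{term}:*" for term in terms)


def boost_rank(rank, is_friend, is_follower,
               friend_coeff=1.5, follower_coeff=1.2):
    """Multiply the rank by hypothetical coefficients; a rank of 0 stays 0."""
    if is_friend:
        rank *= friend_coeff
    if is_follower:
        rank *= follower_coeff
    return rank


print(normalize_for_tsvector("lain@pleroma.soykaf.com"))  # lain pleroma soykaf com
print(build_prefix_tsquery("lain@ple"))                   # lain:* | ple:*
```

Because the boost is a multiplication rather than an addition, a friend or follower who does not match the query at all (rank 0) is never pulled into the results.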

Edited by lain