WIP: [#477] User search improvements
Fixes #477 (closed)
- tsquery search with field weights
- friends & followers boosting
Decided to step away from pg_trgm search since:
- it doesn't seem to allow adding weights to fields
- it measures whole-string similarity, whilst in this use case fragment similarity is more important. E.g. if we search for `lain@ple` and have `lain@pleroma.soykaf.com lain` and `nick1 lain` as candidates (nickname + name), pg_trgm prefers the latter (with a totally irrelevant nickname, `nick1`) simply because it is shorter: "soykaf.com" penalizes the first candidate's score (the longer the string, the bigger the penalty), which seems definitely wrong for this use case.
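
The length penalty can be reproduced outside the database. The snippet below is a rough Python approximation of trigram similarity, not pg_trgm's actual code: it assumes words are lowercased, split on non-alphanumeric characters, padded with two leading spaces and one trailing space, and compared as shared trigrams divided by all distinct trigrams. The helper names `trigrams` and `similarity` are made up for this illustration.

```python
import re


def trigrams(text):
    """Approximate pg_trgm trigram extraction (assumed padding rules)."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


query = "lain@ple"
long_candidate = "lain@pleroma.soykaf.com lain"  # relevant nickname + name
short_candidate = "nick1 lain"                   # irrelevant nickname + name

print(similarity(query, long_candidate))
print(similarity(query, short_candidate))
```

Under these assumptions the relevant candidate shares 8 of 25 distinct trigrams with the query, while the irrelevant one shares 5 of 15, so the shorter string scores slightly higher even though it matches less of the query.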
The current implementation uses the standard tsvector / tsquery approach, adds weights to fields (A to nickname, B to name), and boosts the ranks of followers and friends.
It transforms `lain@pleroma.soykaf.com` into `lain pleroma soykaf com` (to prevent getting an "email" type field in tsvector), turns a `lain@ple` request into `lain:* | ple:*`, and uses the `ts_rank_cd` function (which considers the density of matched tokens). Then friends & followers boosting comes into play (rank coefficients are used, so a 0 rank stays 0 and positive ranks are multiplied).
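
The transformations described above can be sketched in Python as follows. This is only an illustration of the idea, not the code from this merge request: the function names, the coefficient values, and the choice to apply both coefficients when a user is both a friend and a follower are assumptions.

```python
import re


def to_document_text(value):
    """Split on non-word characters so the value is not parsed as an email token."""
    return " ".join(re.findall(r"\w+", value))


def to_prefix_tsquery(query):
    """Turn each fragment of the search string into a prefix-match term, OR-ed together."""
    return " | ".join(f"{term}:*" for term in re.findall(r"\w+", query))


def boosted_rank(rank, is_friend=False, is_follower=False,
                 friend_coeff=1.5, follower_coeff=1.3):
    """Multiply the rank by coefficients (hypothetical values), so a 0 rank stays 0."""
    if is_friend:
        rank *= friend_coeff
    if is_follower:
        rank *= follower_coeff
    return rank


print(to_document_text("lain@pleroma.soykaf.com"))
print(to_prefix_tsquery("lain@ple"))
print(boosted_rank(0.0, is_friend=True))
```

Because the boost is multiplicative rather than additive, a friend or follower who does not match the query at all cannot appear in the results on the strength of the relationship alone.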