objects_in_reply_to_index migration fails and possible mitigations, possible rum index for performance
Recently I ran into a database migration creating a unique index (objects_in_reply_to_index) on objects.data->>'inReplyTo' that I was finally able to track down fully.
On my [longer running] Pleroma instance the btree index failed to be created and I had to adjust the index creation for it to succeed. However, after testing the change in production for awhile I was having Postgresql timeouts when clients retrieved notification and it seems the objects_in_reply_to_index
index was the culprit. I was able to re-work the index as a RUM index and things seem to be more stable.
Below are my notes for consideration/review.
It may be good to have the index setup as a RUM index long-term but I'm not sure if the work around I found is a viable alternative for anyone given this has been in place for months and probably a non-issue for most users.
I'm currently running Pleroma develop on arm64.
20190603162018_add_object_in_reply_to_index.exs
https://git.pleroma.social/pleroma/pleroma/blob/develop/priv/repo/migrations/20190603162018_add_object_in_reply_to_index.exs
CREATE INDEX objects_in_reply_to_index ON public.objects USING btree (((data ->> 'inReplyTo'::text)));
ERROR: index row size 3328 exceeds maximum 2712 for index "objects_in_reply_to_index"
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
Fix for migration (high CPU usage on notification lookups on arm64 Scaleway VPS)
CREATE INDEX objects_in_reply_to_index ON public.objects USING btree (((data ->> md5('inReplyTo'::text))));
turn on rum indexing
https://git.pleroma.social/pleroma/pleroma/blob/develop/docs/config.md
Search for RUM
20190510135645_add_fts_index_to_objects_two.exs
Example: https://git.pleroma.social/pleroma/pleroma/blob/develop/priv/repo/optional_migrations/rum_indexing/20190510135645_add_fts_index_to_objects_two.exs
alter table objects add column fts_content_inreplyto tsvector;
CREATE FUNCTION objects_fts_update_inreplyto() RETURNS trigger AS $$
begin
new.fts_content_inreplyto := to_tsvector('english', new.data->>'inReplyTo');
return new;
end
$$ LANGUAGE plpgsql;
create index if not exists objects_in_reply_to_index
on objects using RUM (fts_content_inreplyto rum_tsvector_addon_ops, inserted_at) with (attach = 'inserted_at', to = 'fts_content_inreplyto');
CREATE TRIGGER tsvectorupdate_inreplyto BEFORE INSERT OR UPDATE ON objects
FOR EACH ROW EXECUTE PROCEDURE objects_fts_update_inreplyto();
UPDATE objects SET updated_at = NOW();