Fix reply context fixing (Pleroma replies to Misskey threads) and removal of context objects

Hélène requested to merge helene/pleroma:fix/federation-context-issues into develop

Incoming Pleroma replies to a Misskey thread were rejected due to a broken context fix, which caused them to not be visible until a non-Pleroma user interacted with the replies.

This fix now also applies the corrected object context to the object's parent Create activity whenever the context was changed.

The context field of objects and activities can now be generated from the object's or activity's inReplyTo field or its ActivityPub ID. This is a fallback for cases where context fields are missing from incoming activities and objects, and it should help reduce thread context disagreements between Pleroma instances on Misskey threads, and maybe more. This more deterministic context generation for remote posts lacking a context should reduce thread breakage and better support followers-only threads from Misskey (and others), which may be missing remote posts, something that tends to leave orphaned threads with replies. It should also fix other edge-case bugs related to Misskey threads (Pleroma replies being separated from the OP and Misskey replies, etc.).
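As a rough illustration of the fallback described above, here is a minimal Python sketch (Pleroma itself is written in Elixir, so this is not the merged code). The function name `fallback_context` and the lookup order (existing context, then inReplyTo, then the object's own ID) are assumptions made for illustration; the actual derivation in this merge request may differ.

```python
from typing import Any, Mapping, Optional


def fallback_context(obj: Mapping[str, Any]) -> Optional[str]:
    """Pick a context for an incoming object or activity (hypothetical helper).

    Assumed order: keep an existing context, else use inReplyTo,
    else fall back to the object's own ActivityPub ID.
    """
    context = obj.get("context")
    if isinstance(context, str) and context:
        return context

    in_reply_to = obj.get("inReplyTo")
    # inReplyTo may be a plain URI or an embedded object carrying an "id".
    if isinstance(in_reply_to, Mapping):
        in_reply_to = in_reply_to.get("id")
    if isinstance(in_reply_to, str) and in_reply_to:
        return in_reply_to

    own_id = obj.get("id")
    return own_id if isinstance(own_id, str) and own_id else None


# Hypothetical example objects; the URLs are made up.
reply = {"id": "https://a.example/notes/2", "inReplyTo": "https://b.example/notes/1"}
root = {"id": "https://b.example/notes/1"}
print(fallback_context(reply))
print(fallback_context(root))
```

Because the result depends only on fields carried by the object itself, two instances receiving the same post would derive the same value under these assumptions, which is the property the description relies on.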

However, much more was changed here. Here are the technical explanations.

Removal of the context_id field and context objects

30 to 70% of the objects in the objects table are simple JSON objects containing a single field, id, which holds the context's ID. The creation of one object per context seems to be a relic of the StatusNet era, and nowadays has only been used as a helper for threads in Pleroma-FE via the pleroma.conversation_id field in status views. An object was created for each context, and its numerical ID (table column) was stored as context_id in the object and activity along with the full context URI/string.

This field has been removed and creation of objects for each context has been stopped, which will also allow incoming activities to use activity IDs as contexts, something which was not possible before, or would have been very broken under most circumstances.

Purge of context-only objects (objects with only an id field and no type)

These objects represent 30 to 70% of the rows in the objects table, based on numbers from a few live instances (IHBA, SPC, FSE, shmibs' single-user instance, my single-user instance, ...).

These pseudo-objects prevent real objects with the same IDs from being created, which could have happened if an object used another object's ID as its context. Deleting them is therefore the better solution.

Removing those objects and then running a vacuum + cluster or pg_repack is reported to noticeably reduce the size of the database and its indexes, which should also improve query times.

The deletion is handled as a background task, but can also be done manually with the query DELETE FROM objects WHERE (data->>'type') IS NULL;

New pleroma.context field

This field replaces the now deprecated conversation_id field and exposes the ActivityPub object context directly via the MastoAPI, instead of relying on StatusNet-era data concepts. All clients and frontends that need this information should use this field instead.
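A client-side migration could look roughly like the following Python sketch, which prefers pleroma.context and only falls back to the deprecated pleroma.conversation_id when the new field is absent (for example, on an older server). The helper name `thread_key` and the sample statuses are hypothetical; only the two field names come from this description.

```python
from typing import Any, Mapping, Optional, Tuple, Union


def thread_key(status: Mapping[str, Any]) -> Optional[Tuple[str, Union[str, int]]]:
    """Return a key for grouping statuses into threads (hypothetical helper).

    The key is tagged with the field it came from so that context strings
    and numeric conversation IDs are never compared with each other.
    """
    pleroma = status.get("pleroma") or {}

    context = pleroma.get("context")
    if context:
        return ("context", context)

    conversation_id = pleroma.get("conversation_id")
    if conversation_id is not None:
        return ("conversation_id", conversation_id)

    return None


# Made-up status payloads for illustration.
new_style = {"id": "1", "pleroma": {"context": "https://a.example/contexts/abc", "conversation_id": 123}}
old_style = {"id": "2", "pleroma": {"conversation_id": 123}}
print(thread_key(new_style))
print(thread_key(old_style))
```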

The pleroma.conversation_id field has been reimplemented in a backwards-compatible way: it is now calculated as a CRC32 of the full context URI/string in the object, instead of relying on the row ID of the created context object. The most significant bit of that CRC32 is cleared to remain compatible with clients that only support signed 32-bit integers (e.g. old versions of Husky, which crashed otherwise, and most likely other Java/Kotlin applications, maybe others).
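The calculation described above can be sketched in Python as follows. This assumes the standard CRC-32 as implemented by zlib and a UTF-8 encoding of the context string; the function name `conversation_id` is made up for the example and this is not the Elixir code from the merge request.

```python
import zlib


def conversation_id(context: str) -> int:
    """Derive a numeric conversation ID from a context URI/string.

    Assumption: standard CRC-32 (zlib) over the UTF-8 bytes of the context.
    """
    crc = zlib.crc32(context.encode("utf-8"))  # unsigned 32-bit value
    # Clear the most significant bit so the result always fits in a
    # signed 32-bit integer (range 0 .. 2**31 - 1).
    return crc & 0x7FFFFFFF


# CRC-32 of "123456789" is the well-known check value 0xCBF43926;
# with the top bit cleared it becomes 0x4BF43926.
print(hex(conversation_id("123456789")))  # 0x4bf43926
```

Clearing one bit halves the value space to 31 bits, so collisions between unrelated contexts become somewhat more likely. That is a reasonable trade-off for a compatibility field, since pleroma.context remains the unambiguous identifier.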

The pleroma.conversation_id field should be removed in a future version of Pleroma. Note that Pleroma-FE currently still depends on this field.

Edited by Hélène
