Fix flaky tests with DB connections; Allow higher amount of restarts for Pleroma.Repo during testing (!3696)

Ilja requested to merge ilja/pleroma:fix_flaky_tests_where_we_sometimes_loose_db_connections into develop Jul 14, 2022

This change was originally made by @FloatingGhost as part of a larger commit in Akkoma. (For this reason I also changed the author of this commit to floatingghost.) See https://akkoma.dev/AkkomaGang/akkoma/src/commit/37ae047e1652c4089934434ec79f393c4c839122/lib/pleroma/application.ex#L83.

As explained in https://ihatebeinga.live/objects/860d23e1-dc64-4b07-8b4d-020b9c56cff6

there are so many caches that clearing them all can nuke the supervisor, which by default will become an hero if it gets more than 3 restarts in <5 seconds

And further down the thread:

essentially we've got like 11 caches (https://akkoma.dev/AkkomaGang/akkoma/src/commit/37ae047e1652c4089934434ec79f393c4c839122/lib/pleroma/application.ex#L165), then in test we fetch them all (https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/test/support/data_case.ex#L50) and call clear on them, so if this clear fails on any 3 of them, the pleroma supervisor itself will die
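The idea of raising the restart budget only under test can be sketched roughly as follows. This is a minimal illustration, not the code from the linked commit: the module name `RestartBudgetSketch`, the child id `:cache_a`, and the value `100` are assumptions made for the example. Only the `max_restarts` and `max_seconds` options and their defaults (3 restarts within 5 seconds) come from Elixir's `Supervisor`.

```elixir
defmodule RestartBudgetSketch do
  # Hypothetical helper: pick the restart budget per environment.
  # 3 is the Supervisor default; 100 is an illustrative value for tests.
  def max_restarts(:test), do: 100
  def max_restarts(_env), do: 3

  # Starts a supervisor with one stand-in child. In the real application
  # the children would be the caches, the repo, and so on.
  def start(env) do
    children = [
      %{id: :cache_a, start: {Agent, :start_link, [fn -> %{} end]}}
    ]

    Supervisor.start_link(children,
      strategy: :one_for_one,
      # The supervisor shuts itself down if more than max_restarts
      # child restarts happen within max_seconds.
      max_restarts: max_restarts(env),
      max_seconds: 5
    )
  end
end
```

With a budget like this, several children crashing in quick succession during a test reset would be restarted individually instead of taking the whole supervisor down.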

How does it fail?

idk, maybe cachex dies, maybe :ets does a weird thing. it doesn't log anything, it just consistently dies during cache clearing, so i figured it had to be that

honestly my best bet is locksmith and queuing (https://github.com/whitfin/cachex/blob/master/lib/cachex/actions/clear.ex#L26): clear is thrown into a locksmith transaction

locksmith says

If the process is already in a transactional context, the provided function will be executed immediately. Otherwise the required keys will be locked until the provided function has finished executing.

so if we get 2 clears too close together, maybe it locks, then doesn't like the next clear?

Edited Jul 14, 2022 by Ilja
