Default job queue limits are too high
Our default limits for concurrent job execution do very little to actually protect instances from overload. Some examples:
- `federator_incoming` queue, which is responsible for processing incoming activities and inserting them, has a 50 job limit. The problem is that our default db pool size is 10 connections. When there are actually 50 `federator_incoming` jobs running at the same time, DBConnection pool checkout timeouts basically spam the error log. Hell, my instance has a 30 connection pool size and the pool was still saturated when federator queues were fully loaded. I imagine it will get even worse when websocket federation becomes stable, since it has a separate ingestion queue that also has a 50 job limit.
- `federator_outgoing` queue, which is responsible for pushing the activity to instances over HTTP, has a 50 job limit. The problem is that our connection pool for federation is also limited to 50 connections. It's very easy to get an overload since the `federator_outgoing` queue is far from being the only place that uses the federation connection pool.
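
To put rough numbers on the mismatch, here is a back-of-the-envelope sketch in Python. It is not how the job queue or DBConnection actually schedules work: it assumes every running job wants exactly one pooled connection at the same moment, and `blocked_checkouts` and `other_pool_users` are hypothetical names made up for illustration. Only the limits (50 jobs, pool sizes of 10, 30 and 50) come from the description above.

```python
def blocked_checkouts(concurrent_jobs: int, pool_size: int) -> int:
    """Jobs left waiting for a connection when every job asks for one at once.

    Simplifying assumption: each job holds at most one connection at a time.
    """
    return max(0, concurrent_jobs - pool_size)


if __name__ == "__main__":
    # federator_incoming at its 50 job limit against the default db pool of 10
    print("default db pool:", blocked_checkouts(50, 10))

    # the same load against a 30 connection db pool
    print("30 connection db pool:", blocked_checkouts(50, 30))

    # incoming queue plus a second 50 job ingestion queue, default db pool
    print("two ingestion queues:", blocked_checkouts(50 + 50, 10))

    # federator_outgoing at its limit plus some other users of the
    # federation HTTP pool (the count of 5 is a made-up illustration value)
    other_pool_users = 5
    print("federation pool:", blocked_checkouts(50 + other_pool_users, 50))
```

Under these assumptions, 40 of the 50 incoming jobs are stuck waiting on the default pool, and 20 still wait on a 30 connection pool. On the outgoing side, any extra user of the federation pool pushes it past its 50 connection limit as soon as the queue is full.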