It is my opinion that right now Oban is overused in Pleroma. Oban is not an in-memory queue: every job creation is a query to the database, so it should be used only when it's actually needed.
Things that do not need Oban but use it:
- Prefetching preview cards, media proxy prefetch and preload. These tasks just warm the cache; we don't care if they fail or if they get preserved when the node goes down.
- Pleroma.Workers.Cron.ClearOauthTokenWorker, Pleroma.Workers.Cron.StatsWorker, Pleroma.Workers.Cron.PurgeExpiredActivitiesWorker (i.e. everything that uses Oban cron except digest emails). Same argument as the last one. Do we care if StatsWorker runs exactly at the start of each hour and that its scheduled jobs get preserved between restarts? No, it regenerates stats on reboot anyway; Process.send_after(self(), :refresh, 3600000) will do.
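The send_after approach mentioned above could look roughly like the sketch below. The module name, the `:interval` and `:compute` options, and the placeholder stats function are all made up for illustration; this is not Pleroma's actual StatsWorker, only a minimal GenServer that recomputes on start and then once per interval.

```elixir
defmodule StatsRefresher do
  # Hypothetical module; shows a periodic refresh without Oban cron.
  use GenServer

  # One hour in milliseconds (3_600_000).
  @interval :timer.hours(1)

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: Keyword.get(opts, :name, __MODULE__))
  end

  # Returns the most recently computed stats.
  def get(server \\ __MODULE__), do: GenServer.call(server, :get)

  @impl true
  def init(opts) do
    interval = Keyword.get(opts, :interval, @interval)
    compute = Keyword.get(opts, :compute, &default_stats/0)
    schedule(interval)
    # Stats are computed on every start, so nothing has to survive a restart.
    {:ok, %{interval: interval, compute: compute, stats: compute.()}}
  end

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state.stats, state}

  @impl true
  def handle_info(:refresh, state) do
    # Re-arm the timer first, then recompute.
    schedule(state.interval)
    {:noreply, %{state | stats: state.compute.()}}
  end

  defp schedule(interval), do: Process.send_after(self(), :refresh, interval)

  # Placeholder for the real stats query (assumption).
  defp default_stats, do: %{generated_at: System.system_time(:second)}
end
```

Placed under the application's supervision tree, a process like this restarts and recomputes on crash or reboot, at the cost of not running exactly on the hour.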
I thought about this a bit, and I think this is not worth reworking to something Oban-less, at least not for the examples you mentioned above.
For the background tasks, we don't use the queue because we care whether they fail; we use it so that we know at most n jobs (5 in our default config) run at the same time. Before we used a queue, we had situations where too many jobs started at once and the system overloaded. Of course, we could write a separate in-memory queue for this, but that would add complexity to the codebase, so the question is whether the features we get from Oban are worth the price we pay (database traffic).
For cron-like tasks, we don't care about the number of parallel jobs because they run so infrequently anyway. Here, it's really just a question of whether the database overhead we get with Oban is worth it or not.
So what is the database overhead? I checked the numbers on lain.com, which is running on a 2 vCPU Hetzner VPS with 4 GB of RAM, not a beast but also not very slow.
| action | average time in ms | standard deviation |
|--------|--------------------|--------------------|
| insert | 0.2                | 0.09               |
| update | 0.02               | 0.05               |
If we say that a job is inserted once and updated 10 times (looking at the data I have, this seems to be roughly true), the database overhead for each job is 0.2 + 10 × 0.02 = 0.4 milliseconds, so a bit less than half a millisecond.
So removing a job that runs once every hour from Oban cron will save us around 10 ms of database action a day. Over a whole year, we'd save around 3.5 seconds.
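The estimate can be reproduced with a few lines of arithmetic. The figures are the measured averages quoted above; the module and function names are only for illustration.

```elixir
defmodule ObanOverhead do
  # Hypothetical helper; the inputs are the averages measured on lain.com.
  @insert_ms 0.2
  @update_ms 0.02
  @updates_per_job 10

  # One insert plus ten updates per job.
  def per_job_ms, do: @insert_ms + @updates_per_job * @update_ms

  # An hourly cron job runs 24 times a day.
  def per_day_ms, do: per_job_ms() * 24

  # Yearly total, converted from milliseconds to seconds.
  def per_year_s, do: per_day_ms() * 365 / 1000
end

IO.puts("per job:  #{Float.round(ObanOverhead.per_job_ms(), 2)} ms")
IO.puts("per day:  #{Float.round(ObanOverhead.per_day_ms(), 1)} ms")
IO.puts("per year: #{Float.round(ObanOverhead.per_year_s(), 2)} s")
```

This covers query time only; it says nothing about the disk writes each insert and update causes.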
Overall, that's why I doubt that there is much to gain by changing / rewriting these parts to use a different queue or the supervisor, and the clarity of having one unified mechanism for any kind of job outweighs the seemingly minuscule performance gains.
Sidenote: The oban heartbeat insertion (which will go away in the next release, I heard) has a higher database impact than job insertion or updating in total time, so we should see some nice reductions of db writes once this is changed.
> For the background tasks, we don't use the queue because we care whether they fail; we use it so that we know at most n jobs (5 in our default config) run at the same time. Before we used a queue, we had situations where too many jobs started at once and the system overloaded. Of course, we could write a separate in-memory queue for this, but that would add complexity to the codebase, so the question is whether the features we get from Oban are worth the price we pay (database traffic).

Ah I see, makes sense. But we will need to pull in concurrent_limiter for gun pooling anyway, so how about switching them to that?
> So removing a job that runs once every hour from Oban cron will save us around 10 ms of database action a day. Over a whole year, we'd save around 3.5 seconds.

Performance is not really my concern. I am more worried about the disk I/O it creates.
> Overall, that's why I doubt that there is much to gain by changing / rewriting these parts to use a different queue or the supervisor

Well, in the case of the cron uses, the code is literally simpler without Oban, and it's already supervised anyway.
Sidenote: I changed my mind about Pleroma.Workers.Cron.PurgeExpiredActivitiesWorker and Pleroma.Workers.Cron.ClearOauthTokenWorker. They should be using Oban, but not in the way they use it now. Right now they use Oban cron to run every minute, check for expired tokens/activities and delete them. Which is kinda dumb, considering Oban has built-in scheduled jobs.
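As a rough sketch of the scheduled-job alternative: instead of polling every minute, one job is enqueued per activity (or token) and scheduled for the moment it expires. This assumes Oban 2.x, where `use Oban.Worker` defines `new/2` and accepts a `:scheduled_at` option. The worker name, queue name, and `build/2` helper are hypothetical, and the actual deletion is left out.

```elixir
# Assumption: Oban 2.x. Installed here only so the sketch runs as a standalone script.
Mix.install([{:oban, "~> 2.0"}])

defmodule PurgeExpiredActivityJob do
  # Hypothetical worker and queue names; not Pleroma's actual module.
  use Oban.Worker, queue: :background

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"activity_id" => _activity_id}}) do
    # Application-specific deletion of the expired activity would go here.
    :ok
  end

  # Builds a job changeset that becomes runnable at `expires_at`.
  def build(activity_id, %DateTime{} = expires_at) do
    new(%{activity_id: activity_id}, scheduled_at: expires_at)
  end
end

# In the application, when an expiring activity is created:
#   PurgeExpiredActivityJob.build(activity.id, expires_at) |> Oban.insert()
```

With this shape the database sees one insert per expiring item rather than a cron job row every minute, and the job survives restarts, which is the part where persistence actually matters.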