Better Oban job result handling
I've noticed there can be quite a few errors in Oban jobs that are not handled properly, e.g., re-queueing of jobs that should be completely discarded.
In Oban, all errors are treated as retryable unless you explicitly discard or cancel them (discard is soft-deprecated).
For example, the federator incoming queue:
    @impl Oban.Worker
    def perform(%Job{args: %{"op" => "incoming_ap_doc", "params" => params}}) do
      with {:ok, res} <- Federator.perform(:incoming_ap_doc, params) do
        {:ok, res}
      else
        {:error, :origin_containment_failed} -> {:cancel, :origin_containment_failed}
        {:error, {:reject, reason}} -> {:cancel, reason}
        e -> e
      end
    end
Any error other than :origin_containment_failed or {:reject, reason} gets retried, and I've noticed errors in jobs like:
{:error, {:error, {:validate, {:error, #Ecto.Changeset<action: :insert, changes: %{actor: \\\"https://genau.qwertqwefsday.eu/users/8oxbqesrd1\\\", id: \\\"https://genau.qwertqwefsday.eu/ff2efda8-fab9-4bc8-9206-426ce7080856\\\", object: \\\"https://genau.qwertqwefsday.eu/notes/97jbiakhuo\\\", type: \\\"Delete\\\"}, errors: [object: {\\\"can't find object\\\", []}], data: #Pleroma.Web.ActivityPub.ObjectValidators.DeleteValidator<>, valid?: false>}}}}\", \"attempt\": 1}","{\"at\": \"2022-11-13T19:33:48.110564Z\", \"error\": \"** (Oban.PerformError) Pleroma.Workers.ReceiverWorker failed with {:error, {:error, {:validate, {:error, #Ecto.Changeset<action: :insert, changes: %{actor: \\\"https://genau.qwertqwefsday.eu/users/8oxbqesrd1\\\", id: \\\"https://genau.qwertqwefsday.eu/ff2efda8-fab9-4bc8-9206-426ce7080856\\\", object: \\\"https://genau.qwertqwefsday.eu/notes/97jbiakhuo\\\", type: \\\"Delete\\\"}, errors: [object: {\\\"can't find object\\\", []}], data: #Pleroma.Web.ActivityPub.ObjectValidators.DeleteValidator<>, valid?: false>}}}}\", \"attempt\": 2}","{\"at\": \"2022-11-13T19:34:08.249125Z\", \"error\": \"** (Oban.PerformError) Pleroma.Workers.ReceiverWorker failed with {:error, {:error, {:validate, {:error, #Ecto.Changeset<action: :insert, changes: %{actor: \\\"https://genau.qwertqwefsday.eu/users/8oxbqesrd1\\\", id: \\\"https://genau.qwertqwefsday.eu/ff2efda8-fab9-4bc8-9206-426ce7080856\\\", object: \\\"https://genau.qwertqwefsday.eu/notes/97jbiakhuo\\\", type: \\\"Delete\\\"}, errors: [object: {\\\"can't find object\\\", []}], data: #Pleroma.Web.ActivityPub.ObjectValidators.DeleteValidator<>, valid?: false>}}}}\", \"attempt\": 3}","{\"at\": \"2022-11-13T22:22:27.594277Z\", \"error\": \"** (Oban.PerformError) Pleroma.Workers.ReceiverWorker failed with {:error, {:error, {:validate, {:error, #Ecto.Changeset<action: :insert, changes: %{actor: \\\"https://genau.qwertqwefsday.eu/users/8oxbqesrd1\\\", id: 
\\\"https://genau.qwertqwefsday.eu/ff2efda8-fab9-4bc8-9206-426ce7080856\\\", object: \\\"https://genau.qwertqwefsday.eu/notes/97jbiakhuo\\\", type: \\\"Delete\\\"}, errors: [object: {\\\"can't find object\\\", []}], data: #Pleroma.Web.ActivityPub.ObjectValidators.DeleteValidator<>, valid?: false>}}}}\", \"attempt\": 4}"}
I suspect we should catch this one and probably others and handle them appropriately to keep the queues from being filled with tasks that will never complete.
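As a rough sketch of what catching this could look like, the result mapping could be pulled into a small pure function that also cancels the validation failure seen in the log. The module name `ReceiverResultSketch` and the function `to_oban_result/1` are hypothetical, the nested tuple shape is copied from the logged error and may not match every code path, and treating every `:validate` failure as permanent is an assumption that would need checking.

```elixir
defmodule ReceiverResultSketch do
  @moduledoc false
  # Hypothetical helper, not part of Pleroma or Oban.
  # Maps a Federator result to a value Oban understands:
  # {:ok, _} succeeds, {:cancel, _} stops retries, {:error, _} is retried.

  def to_oban_result({:ok, res}), do: {:ok, res}

  # Already cancelled in the current worker.
  def to_oban_result({:error, :origin_containment_failed}),
    do: {:cancel, :origin_containment_failed}

  def to_oban_result({:error, {:reject, reason}}), do: {:cancel, reason}

  # Assumption: shape taken from the logged error above. A changeset that
  # fails validation (e.g. "can't find object") will not pass on a retry.
  def to_oban_result({:error, {:error, {:validate, {:error, changeset}}}}),
    do: {:cancel, {:validate, changeset}}

  # Anything else passes through unchanged, as in the existing `e -> e` clause.
  def to_oban_result(other), do: other
end
```

In the worker, perform/1 could then pipe Federator.perform(:incoming_ap_doc, params) into ReceiverResultSketch.to_oban_result/1 instead of using the with/else block. Keeping the mapping in one function makes it easy to add further permanent failures as they turn up.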
Edited by feld