
Pleroma.Upload.Filter.Dedupe: sharding directory structure

Merged: feld requested to merge dedupe-sharding into develop

As requested by some of our heaviest media users

#872 (closed)

Edited by feld

Activity

  • feld changed the description

  • feld (Author, Maintainer)

    If you choose to move your files, you will need a strategy to keep them accessible. A Phoenix plug can work long term, or as a stopgap while you move your files to their new locations. Here is a working example:

    diff --git a/lib/pleroma/web/endpoint.ex b/lib/pleroma/web/endpoint.ex
    index 309ca34b0..796f9a301 100644
    --- a/lib/pleroma/web/endpoint.ex
    +++ b/lib/pleroma/web/endpoint.ex
    @@ -43,6 +43,7 @@ defmodule Pleroma.Web.Endpoint do
       plug(Pleroma.Web.Plugs.SetLocalePlug)
       plug(CORSPlug)
       plug(Pleroma.Web.Plugs.HTTPSecurityPlug)
    +  plug(Pleroma.Web.Plugs.DedupeRedirect)
       plug(Pleroma.Web.Plugs.UploadedMedia)
    
       @static_cache_control "public, max-age=1209600"
    diff --git a/lib/pleroma/web/plugs/dedupe_redirect.ex b/lib/pleroma/web/plugs/dedupe_redirect.ex
    new file mode 100644
    index 000000000..b39f74a59
    --- /dev/null
    +++ b/lib/pleroma/web/plugs/dedupe_redirect.ex
    @@ -0,0 +1,50 @@
    +# Pleroma: A lightweight social networking server
    +# Copyright © 2017-2022 Pleroma Authors <https://pleroma.social/>
    +# SPDX-License-Identifier: AGPL-3.0-only
    +
    +defmodule Pleroma.Web.Plugs.DedupeRedirect do
    +  @moduledoc """
    +  Redirects media requests from legacy flat paths to their sharded locations.
    +  """
    +
    +  import Phoenix.Controller, only: [redirect: 2]
    +  import Plug.Conn
    +
    +  alias Pleroma.Upload.Filter.Dedupe
    +
    +  @behaviour Plug
    +  @media_path "media"
    +
    +  def init(opts \\ []) do
    +    static_plug_opts =
    +      opts
    +      |> Keyword.put(:from, "__unconfigured_plug")
    +      |> Keyword.put(:at, "/__unconfigured_plug")
    +      |> Plug.Static.init()
    +
    +    %{static_plug_opts: static_plug_opts}
    +  end
    +
    +  def call(%{request_path: <<"/", @media_path, "/", file::binary>>} = conn, _opts) do
    +    upload_dir = Pleroma.Config.get!([Pleroma.Uploaders.Local, :uploads]) |> Path.absname()
    +
    +    cond do
    +      File.exists?(Path.join([upload_dir, file])) ->
    +        conn
    +
    +      File.exists?(Path.join([upload_dir, Dedupe.shard_path(file)])) ->
    +        redirect_url =
    +          Path.join([Pleroma.Web.Endpoint.url(), @media_path, Dedupe.shard_path(file)])
    +
    +        conn
    +        |> put_status(301)
    +        |> redirect(external: redirect_url)
    +        |> halt()
    +
    +      true ->
    +        conn
    +    end
    +  end
    +
    +  def call(conn, _opts), do: conn
    +end

    and the results:

    > curl --head "https://friedcheese.us/media/4f0d7de057f5d1a0d27b1d4a323728504345202317d002a52e38fcad8623bd71.jpeg"
    HTTP/2 301
    [trimmed]
    location: https://friedcheese.us/media/4f/0d/7d/4f0d7de057f5d1a0d27b1d4a323728504345202317d002a52e38fcad8623bd71.jpeg
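
    For reference, the sharded path is derived from the filename alone: the first three two-character pairs become nested directories, and the full name is kept as the leaf. A minimal sketch of the scheme (ShardPathSketch is illustrative only; the real logic lives in Pleroma.Upload.Filter.Dedupe.shard_path/1):

        # Illustrative sketch of the sharding scheme, not the actual
        # Pleroma.Upload.Filter.Dedupe implementation: the first three
        # two-character pairs of the name become nested directories.
        defmodule ShardPathSketch do
          def shard_path(<<a::binary-size(2), b::binary-size(2), c::binary-size(2), _::binary>> = name) do
            Path.join([a, b, c, name])
          end

          # Names too short to shard are left untouched
          def shard_path(name), do: name
        end

        # ShardPathSketch.shard_path("4f0d7de057f5...bd71.jpeg")
        # => "4f/0d/7d/4f0d7de057f5...bd71.jpeg"
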
    Edited by feld
  • Author Maintainer

    Long term, this may be easier to manage with an Nginx rewrite.

    A working rewrite rule looks like this:

        location /media/ {
            rewrite "^/media/([A-Za-z0-9]{2})([A-Za-z0-9]{2})([A-Za-z0-9]{2})(.*)$" /media/$1/$2/$3/$1$2$3$4 last;
            alias /var/lib/pleroma/uploads/;  # <-- make sure this is correct for your deploy
            allow all;
        }
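
    If you would rather move the files once and skip the rewrite entirely, the same shard_path/1 can drive a one-off migration. A hypothetical sketch (not part of this MR; back up your uploads and adjust for your deploy first):

        # Hypothetical one-off migration sketch (not part of this MR): move
        # existing flat uploads into the sharded layout so the plug or the
        # Nginx rewrite above finds them at their new paths.
        alias Pleroma.Upload.Filter.Dedupe

        upload_dir =
          Pleroma.Config.get!([Pleroma.Uploaders.Local, :uploads])
          |> Path.absname()

        upload_dir
        |> File.ls!()
        |> Enum.each(fn name ->
          src = Path.join(upload_dir, name)

          # Move regular files only; shard directories that already exist are skipped
          if File.regular?(src) do
            dest = Path.join(upload_dir, Dedupe.shard_path(name))
            File.mkdir_p!(Path.dirname(dest))
            File.rename!(src, dest)
          end
        end)
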
    Edited by feld
  • lain added 1 commit

    • ebea518c - DedupeTest: Add explicit test for the sharding structure

  • lain mentioned in commit f7bf9a8c

  • merged

    • lamp @lamp commented on commit d2de251c

      Does it really make a difference?

    • Someone tested it and found the flat structure is better.

      [image]

      Maybe it varies by filesystem though. I'm using ZFS.

      I don't like this because it's potentially going to create 4x the inodes for each file... up to 16 million folders...

      I would like this to be configurable at least.

    • feld (Author, Maintainer)

      > Does it really make a difference?

      Yes. This is the generally accepted solution to a common storage scaling problem on *nix systems. Pleroma instances with millions of uploaded files have broken because of the old flat model.

      > Someone tested it and found the flat structure is better.

      Then their testing is faulty. Benchmarking is incredibly hard to do well. The original blog post was updated after the author noticed kernel errors he had initially missed, and his methodology is shaky: he should have written all of the files and then rebooted before attempting his read test, and he should have tested on real hardware rather than a Vultr VPS, since the numbers cannot be trusted when the storage device isn't truly dedicated. Even his updated conclusion was only: don't nest too deeply.

      > I don't like this because it's potentially going to create 4x the inodes for each file... up to 16 million folders...

      Inode starvation should not be a concern; a full directory index is the more likely problem.

      When you have millions of files in a single directory, reading even a single file requires far more IO, even when you know its exact filename and path. There are more technical details in the original issue #1513 (closed)

      It's worth noting that this is also how Git stores data. Look inside .git/objects of any repository: it shards for the same performance reasons we do. Git doesn't shard a second level because that would make it harder to detect when it needs to repack.
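
      For illustration, the top of .git/objects in any repository shows the same two-hex-character sharding, with the rest of each object id as the filename (the exact directories vary per repo):

          > ls .git/objects
          0d/  4f/  7d/  info/  pack/
          > ls .git/objects/4f
          0d7de0...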

      Edited by feld