Using subdirectories for media storage by splitting the initial bytes of the hash/UUID
The uploads directory will be much more manageable if the first two bytes of the SHA or UUID are used to name subdirectories, e.g., 9f1ed88e-0233-4a05-90f8-654fc60a8956/apple.mp4 becomes 9f/1e/d88e-0233-4a05-90f8-654fc60a8956/apple.mp4. (Everything that follows is rationale; feel free to disregard the rest if the reasoning is obvious.)
FSE has 102,960 media files; the uploads directory itself takes up 2,323 4kB blocks just for the list of entries, about 9.1MB. (That is, for the directory itself, not the files inside it. The contents are ~80GB, for the curious.)
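For anyone who wants to reproduce that measurement, here is a minimal Python sketch. It assumes a filesystem (such as the ext family) where the `st_size` of a directory reflects the space used by its entry list; other filesystems report something different, so treat the result as indicative only. The function names are made up for illustration.

```python
import math
import os


def dir_entry_list_size(path, block_size=4096):
    """Return (bytes, blocks) reported for the directory itself.

    Assumption: st_size of a directory is the size of its entry list,
    which holds on ext2/3/4 but not on every filesystem.
    """
    size = os.stat(path).st_size
    return size, math.ceil(size / block_size)


def blocks_to_mib(blocks, block_size=4096):
    """Convert a block count to MiB."""
    return blocks * block_size / 2**20


if __name__ == "__main__":
    # 2,323 blocks of 4kB is the figure quoted above (about 9.1MB).
    print(f"{blocks_to_mib(2323):.1f}")
```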
The list of directory entries gets excessively long after a while. Files are already stored with either a hash or a UUID in the path, both of which are evenly distributed (or at least evenly enough at the numbers where this starts to matter). Using the initial bytes of the hash as subdirectory names would make these directories significantly smaller, easier to manage, and faster to read, especially since the list of entries in a directory tends to fragment more often than not:
$ time ls backup/fse/site/uploads >/dev/null
real 0m8.406s
user 0m0.113s
sys 0m0.070s
(The numbers are similar in production, though on the live server the FS cache is much more likely to be warm.)
There isn't a way to store that many files in the FS without doing some pointer-chasing, but with subdirectories named after the first two or three bytes, the typical case would require just a couple of seeks and much less data to read. For example, for a SHA peeling off the first three bytes, instead of ed57d5d30c5f8dc9fcb3eb6f51ab7fb19bfd870f417b114775cb16affa531c0f.png, the upload mangler would name the file ed/57/d5/d30c5f8dc9fcb3eb6f51ab7fb19bfd870f417b114775cb16affa531c0f.png, or for a UUID peeling off the first two bytes, instead of 9f1ed88e-0233-4a05-90f8-654fc60a8956/apple.mp4, it would use 9f/1e/d88e-0233-4a05-90f8-654fc60a8956/apple.mp4.
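The renaming described above could look roughly like the following Python sketch. It is not the existing upload mangler; `sharded_path` and its parameters are hypothetical names, and it assumes the first path component always starts with lowercase or uppercase hex digits, as both the SHA and UUID examples do.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdefABCDEF")


def sharded_path(relative_path, levels=2, width=2):
    """Peel `levels` groups of `width` hex chars off the first path component.

    Each group (one byte when width=2) becomes its own subdirectory.
    """
    parts = PurePosixPath(relative_path).parts
    if not parts:
        raise ValueError("empty path")
    head, rest = parts[0], parts[1:]
    prefix_len = levels * width
    prefix = head[:prefix_len]
    if len(head) <= prefix_len or not set(prefix) <= HEX_DIGITS:
        raise ValueError(f"cannot shard {relative_path!r}")
    shards = [prefix[i:i + width] for i in range(0, prefix_len, width)]
    return str(PurePosixPath(*shards, head[prefix_len:], *rest))


if __name__ == "__main__":
    print(sharded_path("9f1ed88e-0233-4a05-90f8-654fc60a8956/apple.mp4"))
    # prints 9f/1e/d88e-0233-4a05-90f8-654fc60a8956/apple.mp4
```

Passing `levels=3` gives the three-byte layout from the SHA example. Existing files would need a one-time migration to the new layout, which this sketch does not cover.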
This would automatically balance given an even distribution (this seems to be the case for the UUIDs as well), and each of the top few levels would have at most 256 entries.
A quick check of our data: with 65,536 second-level directories (i.e., ed/57), we would expect 102960/65536 = 1.5710 entries on average in each, and a quick check of the actual contents gives 1.5663. None of these directories would have more than 9 entries, fitting comfortably inside a single block, though probably wasting a lot of space on block padding. Splitting one level shallower, into 256 directories, yields an average of 400.97 entries per directory, close to the expected 102960/256 = 402.19, with a range of 355-455, which is probably a better fit for a typical FS with a 4 or 8kB block size.
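The expected averages are simply the file count divided by the number of possible prefixes (256 per byte of prefix). A rough Python sketch of how such a check could be run against a list of stored names follows; `expected_entries` and `shard_stats` are illustrative names, and the code assumes every name begins with hex digits in a consistent case.

```python
from collections import Counter


def expected_entries(total_files, prefix_bytes):
    """Average entries per directory if names are uniformly distributed."""
    return total_files / 256 ** prefix_bytes


def shard_stats(names, prefix_bytes):
    """Count how many names fall under each hex prefix of the given length."""
    hex_chars = 2 * prefix_bytes
    counts = Counter(name[:hex_chars] for name in names)
    buckets = 256 ** prefix_bytes
    return {
        "expected_mean": len(names) / buckets,
        "non_empty": len(counts),
        # Any prefix that never occurs is a directory with zero entries.
        "min": min(counts.values()) if len(counts) == buckets else 0,
        "max": max(counts.values()),
    }


if __name__ == "__main__":
    print(expected_entries(102960, 2))  # about 1.571
    print(expected_entries(102960, 1))  # 402.1875
```

Feeding `shard_stats` the output of a directory listing would give the observed minimum and maximum to compare against the expected mean.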