I’m working on setting up a cluster on local hardware that I can access from outside my home network. I want to store all the files on an extra hard drive that is mounted on the machine running WebODM. In the 3-machine setup diagrammed below, are there files from uploads/processing that would persist somewhere other than the local storage location set via the --media-dir parameter? Do I need to worry about storage on the NodeODM machines beyond what is required for processing the split groups for a single processing task?
As far as NodeODM goes, do you need to worry about storage? Maybe. NodeODM keeps tasks in storage for 2 days by default (to allow for restarts). So if you process a lot of tasks within a 2-day window, you might end up with a lot of files.
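If you want to see how much those retained tasks are actually holding on a node, here is a rough Python sketch that lists each subdirectory of a data directory with its age and size, largest first. It assumes each task lives in its own subdirectory; the function names and the path you pass in are my own placeholders, not anything NodeODM provides, so point it at wherever your node actually stores tasks.

```python
import os
import time


def dir_size_bytes(path):
    """Total size of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


def task_usage(data_dir, now=None):
    """Return (name, age_hours, size_bytes) per subdirectory, largest first.

    Assumption: one subdirectory per task. Age is based on the
    subdirectory's last-modified time, which is only an approximation
    of when the task finished.
    """
    now = time.time() if now is None else now
    rows = []
    for entry in os.scandir(data_dir):
        if entry.is_dir(follow_symlinks=False):
            age_hours = (now - entry.stat().st_mtime) / 3600
            rows.append((entry.name, age_hours, dir_size_bytes(entry.path)))
    return sorted(rows, key=lambda r: r[2], reverse=True)
```

Calling `task_usage("/path/to/node/data")` (a hypothetical path) and printing the rows should make it obvious whether tasks that are a day or two old are what is eating the disk.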
If disk space is a concern and you don’t need the ability to restart tasks from the middle of the pipeline, also check the --optimize-disk-space option when processing tasks. It will cut overall disk usage by a lot while processing.
Since Dan is among the very few who handle massive datasets, this is a consideration for sure. I have run out of space due to caching in ClusterODM, but because of how I have storage allocated, it’s much more common for me to run out on individual nodes.