Large Data Sets Uploads + Distributed Processing

Hi All,

Greetings from the Kenya Red Cross.

We have a new project coming up for which we need to process huge batches of drone images. (We estimate 7 missions of 15,000 images each.)

So far we have set up the infrastructure as follows, running on Proxmox VE with Ubuntu 16.04 VMs:

WebODM ==> ClusterODM ==> NodeODM (ten servers)

NodeODM is running in Docker, while WebODM and ClusterODM are running natively.
We managed to send a batch to ClusterODM for processing, but the dashboard shows only one node in use. We processed 43 images in 35 minutes as a test batch and are happy with the results.

We’ve tried options such as the --sm-cluster URL option, --min-num-features, --split-overlap and --split with different values, but they made no difference.

When we use the split option (we set it to 20 and also tried other integers), we get an error and the task doesn’t process (error code 1).

The second challenge is uploading from an end-user computer (macOS client): not all files get selected, and the browser attempts a partial upload, which eventually fails after a few GB. Is there a recommended process for bulk upload of files for processing?

Looking forward to the community supporting us on this. Thanks in advance.

Kind regards,

Taariq

3 Likes

--split 400 --split-overlap 120 are conservative values and should be a good starting point for processing settings.

--sm-cluster does not need to be set explicitly as ClusterODM should take care of setting it. When a task starts, you should be able to see in the first lines of the log:

[INFO]    sm_cluster: True

If sm_cluster is set to “None”, then you might try setting sm-cluster to the address of ClusterODM (e.g. http://:3000), although that should not be necessary.
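To make the log check above repeatable, here is a minimal Python sketch that pulls the sm_cluster value out of a saved task log. The function name `read_sm_cluster` is made up for illustration, and the line format is assumed to match the `[INFO]    sm_cluster: True` example quoted above; adjust the pattern if your log looks different.

```python
import re

# Matches lines such as "[INFO]    sm_cluster: True" (case-insensitive).
SM_CLUSTER_LINE = re.compile(
    r"^\s*\[info\]\s+sm_cluster:\s*(\S+)", re.IGNORECASE | re.MULTILINE
)


def read_sm_cluster(log_text):
    """Return the logged sm_cluster value as a string, or None if absent."""
    match = SM_CLUSTER_LINE.search(log_text)
    return match.group(1) if match else None


sample_log = "[INFO]    sm_cluster: True\n"
print(read_sm_cluster(sample_log))
```

If the function returns the string "None" (or nothing at all), that would be the case where setting sm-cluster by hand might be worth a try.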

ClusterODM should have a UI that shows the connected nodes. Are you sure all your ODM nodes are registered on the cluster?

What browser are you using for the upload?

3 Likes

Processing without the --sm-cluster URL shows the following in the log, as expected:

[info] sm_cluster: True


ClusterODM shows all 10 nodes as online

When I select all files in the open dialog box (both Safari and Chrome on macOS), only 7882 of the 9969 images in the folder end up on the upload screen. With smaller batches (43 images) I don’t have this problem.

1 Like

I recommend you gradually increase photo batch size (for example: 100, 200, 500, 1000, 2000, 3000, 4000, etc) and see where things stop working.
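One way to prepare those growing test batches is a small Python helper like the sketch below. The names `list_images` and `growing_batches` and the list of file extensions are assumptions for illustration; the sizes are the ones suggested above. It also gives you an exact image count to compare against what the upload screen reports.

```python
from pathlib import Path

# Extensions treated as images; extend this set if your camera writes others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".tif", ".tiff", ".png"}


def list_images(folder):
    """Return the image files directly inside folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def growing_batches(images, sizes=(100, 200, 500, 1000, 2000, 3000, 4000)):
    """Yield (size, first `size` images) for each size the folder can fill."""
    for size in sizes:
        if size > len(images):
            break
        yield size, images[:size]
```

For example, `len(list_images("/path/to/mission"))` gives the true count on disk, and looping over `growing_batches(...)` gives you the file lists to copy into separate test folders before each upload attempt.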

1 Like

Yup, and maybe start at 7000 since it seems to be successfully uploading 7800.

Honestly, browsers are barely meant for this. @pierotofy might be able to answer what the cap is on number of files browsers can handle, though your experimentation per Corey’s recommendations above would be informative.

@taariq are you open to running this on the command line?

2 Likes

Yes, I’d be open to doing it this way. I was hoping for an FTP solution where I could dump all the images into a folder on WebODM and then start the process there.
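For the "dump the images in a folder, then start processing" workflow, one possible route is a short script using PyODM, the Python client for the NodeODM API, run on a machine that already holds the images (copied there by whatever transfer method you prefer). This is only a sketch under assumptions: `collect_images` and `submit_folder` are illustrative names, the host and port defaults are placeholders, and it assumes ClusterODM accepts NodeODM API calls on the address you give it. A task created this way goes straight to that address rather than through the WebODM interface.

```python
from pathlib import Path

# Extensions treated as images; an assumption, extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".tif", ".tiff"}


def collect_images(folder):
    """Return sorted image paths (as strings) found directly inside folder."""
    return sorted(
        str(p) for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def submit_folder(folder, host="localhost", port=3000, options=None):
    """Upload every image in folder as one task and return the PyODM task."""
    images = collect_images(folder)
    if not images:
        raise ValueError(f"no images found in {folder}")

    # Third-party dependency, imported lazily: pip install pyodm
    from pyodm import Node

    node = Node(host, port)
    # Default options are the starting values suggested earlier in the thread.
    return node.create_task(images, options or {"split": 400, "split-overlap": 120})
```

After `task = submit_folder("/path/to/mission")`, PyODM's `task.wait_for_completion()` and `task.download_assets("results")` can be used to wait for the task and fetch the outputs.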


PS: I managed to get the cluster running. After interrogating the logs, I realized the images captured from the Wingtra drone don’t have GPS information (probably stored separately, or not compatible with WebODM). I’ll dig a bit deeper and post the feedback here.
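A quick way to spot images without GPS information before uploading is to look for the EXIF GPS block. The sketch below assumes a reasonably recent Pillow is installed; `has_gps` and `find_untagged` are illustrative names, and it only checks that the GPS block is non-empty, not that the coordinates are valid.

```python
GPS_IFD_TAG = 0x8825  # EXIF tag that points to the GPSInfo block


def has_gps(path):
    """Return True if the image at path carries a non-empty EXIF GPS block."""
    # Third-party dependency, imported lazily: pip install Pillow
    from PIL import Image

    with Image.open(path) as img:
        return bool(img.getexif().get_ifd(GPS_IFD_TAG))


def find_untagged(paths, checker=has_gps):
    """Return the paths for which checker reports no GPS data."""
    return [p for p in paths if not checker(p)]
```

Running `find_untagged` over a mission folder and getting back every file would point to the same situation described here: the geotags were never written into the images.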

But I’ve uploaded 407 images (from a DJI Mavic Pro) with --split 50, and the batch was split into 9 submodels and 9 nodes picked up the processing. I didn’t have to specify any other option.
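The 9 submodels line up with simple arithmetic, sketched below in Python. This assumes the split value behaves roughly as a target number of images per submodel; the actual partition depends on how the images cluster spatially, so treat the result as an estimate only. `estimate_submodels` is an illustrative name.

```python
import math


def estimate_submodels(image_count, split):
    """Rough submodel count: images divided by the split value, rounded up."""
    return math.ceil(image_count / split)


print(estimate_submodels(407, 50))     # 9, matching the run described above
print(estimate_submodels(15000, 400))  # 38 for one 15,000-image mission
```

With ten nodes, an estimate of 38 submodels would mean each node handles several submodels in turn.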

Next steps…

  1. Testing the results file (it’s currently at 2:39:00 and still going; will update once done).
  2. Figure out how to get WebODM to process the Wingtra images
  3. Work on the image upload process (from the field via LTE / satellite)

2 Likes

Will definitely try this out, including on a Windows PC.

PS: Sorry, I am a new user and limited to only one image per post, so can’t edit the above to update.

UPDATE:
Processing batch from DJI Mavic 2 Pro, 407 images, processed in 03:06:22 (split: 50)

Next:

  1. Figure out how to correctly process Wingtra images on ODM

UPDATE: Wingtra images are now processing. The image set we were using had not been geotagged after the flight, which is why the geodata was missing. Running a small batch of 165 images as a test now.

  2. Work on the image upload process (from the field via LTE / satellite)

UPDATE: Even with the above 165 images, on both Chrome and Safari on macOS, the dialog wouldn’t select all the images. I had to scroll up and down the entire list of files to get it to select them all.

1 Like

Great. I agree with @smathermather that browser upload on any platform is likely to be unreliable at best with this many photos. If you can look into command-line operations, that should be more stable. I understand you have several users and this may not be easily done, but it’s probably the best approach.

Good catch on the Wingtra images… hopefully someone here has some experience with that platform and can suggest something.

3 Likes

WebODM features chunked uploads and is capable of handling a very large number of images (more than 15,000), at least on the web client side. I’m not sure why the failures occur; perhaps network errors on the server side, or browser limitations. Try Firefox.

2 Likes

I forgot about chunked uploads!

1 Like