Split-merge errors with WebODM/ClusterODM/NodeODM

I have been trying to process a fairly large dataset, and I’m still figuring out the split-merge feature. We have 5 processing nodes (AMD Epyc, 256GB RAM), which, together, should be able to process this much data (7.5k images, about 80GB).
It seems like the processing of submodels is finishing, but some step near the end is failing.

I think I may be missing something, and I am not sure where to find the error messages that are mentioned in the output.

Any help or advice would be appreciated!

[INFO]    Finished odm_postprocess stage
[INFO]    No more stages to run
100 - done.
[INFO]    LRE: submodel_0009 finished successfully
[INFO]    LRE: No remote tasks left to cleanup
Traceback (most recent call last):
File "/code/run.py", line 59, in <module>
retcode = app.execute()
File "/code/stages/odm_app.py", line 130, in execute
raise e
File "/code/stages/odm_app.py", line 94, in execute
File "/code/opendm/types.py", line 347, in run
File "/code/opendm/types.py", line 328, in run
self.process(self.args, outputs)
File "/code/stages/splitmerge.py", line 169, in process
File "/code/opendm/remote.py", line 58, in run_toolchain
File "/code/opendm/remote.py", line 252, in run
raise nonloc.error
pyodm.exceptions.TaskFailedError: (2f29353e-e55c-407f-8b87-7b05cfb028fd) failed with task output: File "/code/stages/mvstex.py", line 117, in process
system.run('"{bin}" "{nvm_file}" "{model}" "{out_dir}" '
File "/code/opendm/system.py", line 106, in run
raise SubprocessException("Child returned {}".format(retcode), retcode)
opendm.system.SubprocessException: Child returned 134

===== Done, human-readable information to follow... =====

[ERROR]   Uh oh! Processing stopped because of strange values in the reconstruction. This is often a sign that the input data has some issues or the software cannot deal with it. Have you followed best practices for data acquisition? See https://docs.opendronemap.org/flying/
100 - done.
Full log saved at /var/www/data/6c14f12f-8f39-4d81-bed4-de240ae00378/submodels/submodel_0003/error.log
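
For anyone else hunting for the error messages mentioned in the output: the last line above suggests each submodel keeps its own error.log under the task's submodels/ directory. Below is a minimal Python sketch for collecting the tail of every such log in one pass. It assumes the directory layout shown in that line, and that you run it on the machine that actually holds the task data (with split-merge that may be a remote node, not the WebODM host). The function name and the line count are hypothetical.

```python
from pathlib import Path


def tail_error_logs(task_dir, lines=20):
    """Return {submodel_name: last `lines` lines of its error.log}.

    Assumes the layout <task_dir>/submodels/<submodel_name>/error.log,
    as suggested by the "Full log saved at" line in the output above.
    """
    results = {}
    for log in sorted(Path(task_dir).glob("submodels/*/error.log")):
        text = log.read_text(errors="replace").splitlines()
        results[log.parent.name] = text[-lines:]
    return results


if __name__ == "__main__":
    # The task directory is passed on the command line.
    import sys

    if len(sys.argv) > 1:
        for name, tail in tail_error_logs(sys.argv[1]).items():
            print(f"--- {name} ---")
            print("\n".join(tail))
```

Submodels that finished cleanly may not have an error.log at all, in which case they are simply missing from the result.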

If it helps, here are the params I used (I may be getting something wrong if I’m misunderstanding the split-merge features):

dsm: true, dtm: true, feature-quality: high, mesh-octree-depth: 11, mesh-size: 300000, min-num-features: 24000, optimize-disk-space: true, pc-quality: high, skip-3dmodel: true, sm-cluster:, split: 600, split-overlap: 120


Do you have the time/bandwidth to try the dataset again with all defaults (with the exception of your cluster and split settings)?

Are you fully current/updated on WebODM/ClusterODM/NodeODM?


Yes, I can do that; it will take about 36 hours before I have a result from it though.
In the meantime, my versions are WebODM==1.9.15, ClusterODM==1.5.3, and NodeODM==2.2.0, so I am a few minor versions behind.


If you can, I’d highly recommend getting everything current as well.


I’ve made it past the point I was at last time, though it is still running (big set, 95 hours and counting). I think the problem was that I had not included http:// at the beginning of the sm-cluster address. I’ll update if I run into further problems. Thanks for your help!
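
For reference, a missing scheme like this is easy to catch before starting a multi-day run. Here is a small Python sketch that checks an sm-cluster value and prepends http:// when no scheme is present. The function name and the example addresses are hypothetical, and whether your cluster should use http or https depends on your own setup.

```python
from urllib.parse import urlparse


def normalize_cluster_url(address, default_scheme="http"):
    """Return `address` with an explicit http(s) scheme.

    If the address already starts with http:// or https:// it is returned
    unchanged; otherwise `default_scheme` is prepended.
    """
    address = address.strip()
    parsed = urlparse(address)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return address
    return f"{default_scheme}://{address}"


# Hypothetical addresses, for illustration only.
print(normalize_cluster_url("192.168.1.10:3000"))
print(normalize_cluster_url("http://192.168.1.10:3000"))
```

The first call prints the address with http:// added; the second returns its input untouched.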



Hope you can share a screenshot or two once you’re done :sunglasses:


Hi, I have something similar:
2023-03-18 00:29:52,767 DEBUG: Matching DJI_0190.JPG and DJI_0188.JPG. Matcher: FLANN (symmetric) T-desc: 0.804 T-robust: 0.006 T-total: 0.811 Matches: 901 Robust: 833 Success: True
terminate called without an active exception
/code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12: 264 Aborted "$PYTHON" "$DIR"/opensfm_main.py "$@"

===== Dumping Info for Geeks (developers need this to fix bugs) =====
Child returned 134
Traceback (most recent call last):
File "/code/stages/odm_app.py", line 81, in execute
File "/code/opendm/types.py", line 398, in run
File "/code/opendm/types.py", line 398, in run
File "/code/opendm/types.py", line 398, in run
File "/code/opendm/types.py", line 377, in run
self.process(self.args, outputs)
File "/code/stages/run_opensfm.py", line 35, in process
File "/code/opendm/osfm.py", line 411, in feature_matching
File "/code/opendm/osfm.py", line 416, in match_features
File "/code/opendm/osfm.py", line 34, in run
system.run('"%s" %s "%s"' %
File "/code/opendm/system.py", line 110, in run
raise SubprocessException("Child returned {}".format(retcode), retcode)
opendm.system.SubprocessException: Child returned 134
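
A note on "Child returned 134", which appears in both logs in this thread: by shell convention, an exit status above 128 means the process was killed by signal (status - 128). 134 is therefore signal 6, SIGABRT, which matches the "Aborted" and "terminate called without an active exception" lines above. The sketch below decodes such a status in Python. The function name is hypothetical, and the signal names it prints are those of the platform it runs on (Linux is assumed here).

```python
import signal


def describe_exit_code(code):
    """Describe a shell-style exit status.

    Statuses above 128 conventionally mean "killed by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with status {code}"


print(describe_exit_code(134))
```

This only identifies how the child process died (an abort), not why, so the per-stage log is still where the actual cause has to be found.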

I am still learning here, so bear with me. I believe this issue may be caused by the reported error, i.e. “something is wrong with the photo data”.
Today I used DroneDeploy for the first time, and I changed the coordinate system in the DroneDeploy app to UTM Zone 16, EPSG:26916. I thought that would match WebODM processing, as WebODM now uses UTM.
I believe the Phantom 4 Pro uses NAD or something similar, and I am wondering whether this is the problem.
However, when I check the photo EXIF data, it looks normal.
My plan is to try again with NAD83(2011), EPSG:6455 (that should be my area), and see what happens.
I also have a question: if I have bad EXIF data, can WebODM still stitch the photos together for an ortho, just out of bounds?

Whoops, looks like I put this in the wrong thread.
