Consistent Exit code 1

After a few successful jobs, I am now consistently getting exit code 1 errors from WebODM when doing a 3D Model. I am using a data set that was processed successfully several times last week.

Docker is configured with 16 GB of RAM and 250 GB of disk space, which should be a lot more than needed.
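A quick way to double-check what the Docker VM actually sees, as opposed to what the host machine has, is a sketch like the following, assuming the Docker CLI is on your PATH:

docker info --format 'Total memory: {{.MemTotal}} bytes, CPUs: {{.NCPU}}'

If that number is lower than expected, the resource settings in Docker Desktop did not take effect.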

I have tried an 'update' of WebODM, but the result is still exit code 1.

How do I proceed?

This is the log data containing the error:

2020-11-20 15:09:11,550 DEBUG: Computing sift with threshold 0.06666666666666667
2020-11-20 15:09:11,625 DEBUG: Found 16 points in 21.634562969207764s
2020-11-20 15:09:11,643 DEBUG: reducing threshold
2020-11-20 15:09:11,644 DEBUG: Computing sift with threshold 0.06666666666666667
2020-11-20 15:09:11,789 DEBUG: Found 724 points in 21.803841590881348s
2020-11-20 15:09:11,805 DEBUG: reducing threshold
2020-11-20 15:09:11,805 DEBUG: Computing sift with threshold 0.06666666666666667
Traceback (most recent call last):
  File "/code/SuperBuild/src/opensfm/bin/opensfm_main.py", line 8, in <module>
    commands.command_runner(commands.opensfm_commands)
  File "/code/SuperBuild/src/opensfm/opensfm/commands/command_runner.py", line 27, in command_runner
    command.run(args)
  File "/code/SuperBuild/src/opensfm/opensfm/commands/command.py", line 12, in run
    self.run_impl(data, args)
  File "/code/SuperBuild/src/opensfm/opensfm/commands/detect_features.py", line 10, in run_impl
    detect_features.run_dataset(dataset)
  File "/code/SuperBuild/src/opensfm/opensfm/actions/detect_features.py", line 21, in run_dataset
    parallel_map(detect, arguments, processes, 1)
  File "/code/SuperBuild/src/opensfm/opensfm/context.py", line 66, in parallel_map
    res = Parallel(batch_size=batch_size)(delayed(func)(arg) for arg in args)
  File "/usr/local/lib/python3.8/dist-packages/joblib/parallel.py", line 1061, in __call__
    self.retrieve()
  File "/usr/local/lib/python3.8/dist-packages/joblib/parallel.py", line 940, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/usr/local/lib/python3.8/dist-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGKILL(-9)}
Traceback (most recent call last):
  File "/code/run.py", line 69, in <module>
    app.execute()
  File "/code/stages/odm_app.py", line 86, in execute
    self.first_stage.run()
  File "/code/opendm/types.py", line 361, in run
    self.next_stage.run(outputs)
  File "/code/opendm/types.py", line 361, in run
    self.next_stage.run(outputs)
  File "/code/opendm/types.py", line 361, in run
    self.next_stage.run(outputs)
  File "/code/opendm/types.py", line 342, in run
    self.process(self.args, outputs)
  File "/code/stages/run_opensfm.py", line 30, in process
    octx.feature_matching(self.rerun())
  File "/code/opendm/osfm.py", line 273, in feature_matching
    self.run('detect_features')
  File "/code/opendm/osfm.py", line 22, in run
    system.run('%s/bin/opensfm %s "%s"' %
  File "/code/opendm/system.py", line 79, in run
    raise Exception("Child returned {}".format(retcode))
Exception: Child returned 1
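For what it's worth, the TerminatedWorkerError with SIGKILL(-9) above is the classic signature of the kernel running the container out of memory and killing the feature-extraction worker. One way to watch for this while a task runs, assuming you can run the Docker CLI on the host (container names will vary with your install):

docker stats
# shows live memory use per container; if the processing-node container climbs
# to the configured limit right before the stage dies, it was an out-of-memory kill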


How many photos of what resolution, and what are your processing parameters?

207 photos at 7680 × 4320, which was a mistake on my part, as the sensor on my Autel Evo 2 8K is really 4000 × 3000.

I plan to shoot all future missions at 4000 × 3000.
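For the photos already shot at 7680 × 4320 you don't necessarily have to re-fly; a batch downscale before upload keeps memory use closer to a 12 MP data set. A minimal sketch with ImageMagick, assuming it is installed and you work on copies, since mogrify resizes in place:

# fit every JPEG inside 4000 × 4000 px, keeping aspect ratio and the EXIF/GPS tags
mogrify -resize 4000x4000 *.JPG

ODM also has a --resize-to option that limits the image size used for feature extraction, which may be enough on its own.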

Trying again with 24 GB of RAM:

I was wondering a few days ago why my maxed-out MacBook Pro was so slow, until I found that Docker was auto-starting and grabbing 48 GB of RAM at every reboot :slight_smile: - so I adjusted Docker down to 16 GB and disabled autostart.


I have 30 GB allocated to my WSL2 instance with about 171 12 MP images, and I get exit code 1 regularly with the settings below:

./run.sh --crop 0 --debug --dem-resolution 1.0 --dsm --feature-quality ultra --gps-accuracy 5 --mesh-octree-depth 12 --mesh-size 1000000 --min-num-features 160000 --opensfm-depthmap-method BRUTE_FORCE --orthophoto-resolution 1.0 --pc-classify --pc-quality ultra --pc-rectify --time --use-3dmesh --use-hybrid-bundle-adjustment --verbose

So, depending on your settings, you might have to really dial back with that many images at 33 MP.
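For comparison, a much lighter command line, just a sketch with guessed values rather than something tested on this data set, would drop the ultra quality settings and the very high feature count:

./run.sh --dsm --feature-quality high --pc-quality medium --min-num-features 10000 --mesh-size 300000 --mesh-octree-depth 11 --use-3dmesh --time --verbose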

Looks like it is proceeding now with 24 GB. It would be really nice to have some kind of 'out of memory' monitor to suggest the reason for an exit - though I did have a suspicion, as it happened after reducing RAM, but also after upgrading macOS.


24 GB made the job complete, but the result is not quite what I expected - this is a flat field with a building and some trees.


Been a while since I’ve seen a reconstruction that broken.

What are your processing parameters?

Maybe try increasing min-num-features and using BRUTE_FORCE instead of default.

Have you edited the images prior to putting them through the pipeline at all?
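If it helps, those two suggestions translate to flags roughly like these; the values are only a starting point and not tested on your images:

./run.sh --min-num-features 12000 --opensfm-depthmap-method BRUTE_FORCE --use-3dmesh --mesh-size 300000 --mesh-octree-depth 11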

I was using the default '3D Model' preset. Options: depthmap-resolution: 1000, use-3dmesh: true, mesh-size: 300000, mesh-octree-depth: 11

This job was shot at a 65 degree gimbal tilt - which gives very good quality vertical surfaces - but perhaps makes stitching the images harder?

Previously I did 70 degrees.

Will give your suggestions a try next run - right now I am trying depthmap-resolution = 500 to see the difference.

New run - just changed depthmap-resolution to 500: nice result. I will re-run with 1000, but this result is OK for what I need - giving visitors a good idea of the surroundings.


Naib, memory appears to be the key, as you suggested. When I increased depthmap-resolution from 500 back to 1000, I got another exit code 1.

Increased RAM from 24 to 32 GB, re-ran the process, and it completed.

A memory monitor on the task pane would be great.

