Constant dead process code 1 issue


#1

Keep having issues, anywhere from running out of memory to outright code crashes.
I've upgraded my VM from 8 gig all the way up to 64 gig. On jobs that die with code 1, it only consumes about 14-16 gig of memory…

I keep trying all types of settings: default, building, 3D, custom…

Created on: 2/3/2019, 12:38:28 AM

Processing Node: node-odm-1:3000 (auto)

Options: fast-orthophoto: true, mesh-octree-depth: 10, resize-to: 1024, texturing-nadir-weight: 28, mesh-size: 100000, dem-terrain-type: FlatNonForest, orthophoto-resolution: 3, max-concurrency: 2, depthmap-resolution: 1000, rerun-from: opensfm

Here is output from the last crash:

[DEBUG] running PYTHONPATH=/code/SuperBuild/install/lib/python2.7/dist-packages /code/SuperBuild/src/opensfm/bin/opensfm export_geocoords /var/www/data/1ae81594-efaa-479b-99e4-a4a73da634c1/opensfm --transformation --proj '+units=m +no_defs=True +datum=WGS84 +proj=utm +zone=14 '
[INFO] Running ODM OpenSfM Cell - Finished
[INFO] Running ODM Meshing Cell
[DEBUG] Writing ODM 2.5D Mesh file in: /var/www/data/1ae81594-efaa-479b-99e4-a4a73da634c1/odm_meshing/odm_25dmesh.ply
[DEBUG] ODM 2.5D DSM resolution: 0.03
[INFO] Created temporary directory: /var/www/data/1ae81594-efaa-479b-99e4-a4a73da634c1/odm_meshing/tmp
[INFO] Creating DSM for 2.5D mesh
[INFO] Creating …/var/www/data/1ae81594-efaa-479b-99e4-a4a73da634c1/odm_meshing/tmp/mesh_dsm_r0.0848528137424 [max] from 1 files
[DEBUG] running pdal pipeline -i /tmp/tmpfFR3dQ.json > /dev/null 2>&1
Traceback (most recent call last):
File "/code/run.py", line 47, in <module>
plasm.execute(niter=1)
File "/code/scripts/odm_meshing.py", line 108, in process
method='poisson' if args.fast_orthophoto else 'gridded')
File "/code/opendm/mesh.py", line 36, in create_25dmesh
max_workers=get_max_concurrency_for_dem(available_cores, inPointCloud)
File "/code/opendm/dem/commands.py", line 38, in create_dems
fouts = list(e.map(create_dem_for_radius, radius))
File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 794, in _chain_from_iterable_of_lists
for element in iterable:
File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 589, in result_iterator
yield future.result()
File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 433, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 381, in __get_result
raise self._exception
Exception: Child returned 1

This was caused directly by
"""
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 418, in _process_worker
r = call_item()
File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 272, in call
return self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 337, in _process_chunk
return [fn(*args) for args in chunk]
File "/code/opendm/dem/commands.py", line 92, in create_dem
pdal.run_pipeline(json, verbose=verbose)
File "/code/opendm/dem/pdal.py", line 232, in run_pipeline
out = system.run(' '.join(cmd) + ' > /dev/null 2>&1')
File "/code/opendm/system.py", line 34, in run
raise Exception("Child returned {}".format(retcode))
Exception: Child returned 1
"""
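For context on what `Exception: Child returned 1` means: the bottom frame shows ODM's `system.run` wrapper turning a non-zero exit code from the `pdal pipeline` subprocess into an exception. A minimal sketch of that pattern (the idiom only, not ODM's actual source):

```python
import subprocess

def run(cmd):
    """Run a shell command and raise on a non-zero exit code,
    mirroring the 'Child returned {code}' error in the trace above."""
    retcode = subprocess.call(cmd, shell=True)
    if retcode != 0:
        raise Exception("Child returned {}".format(retcode))
    return retcode

# A child process that fails (e.g. pdal getting OOM-killed exits
# non-zero) surfaces as this exception in the parent.
```

So the trace itself only says "something made pdal exit non-zero"; the actual cause (here, most likely memory pressure) has to be found from the system side, e.g. `dmesg` output from the OOM killer.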


#2

Mm, the pdal process is crashing, but I can’t think of why. Have you been able to process a smaller subset of your images? Would you be able to share your images?


#3

I was trying to find a way to extract the ones I uploaded, but I'll find them tonight, get them up to Google Drive, and send you the link.

I’ll try to run the task a few different ways and capture the various outputs to help you help me. :slight_smile:


#4

PM sent with link


#5

I also cut the task down to half, only 95 images, and have the same results…


#6

Could you try to process it with --use-opensfm-dense? I’m fairly certain that this is a memory issue (related to the number of points in the point cloud).


#7

It ran with --use-opensfm-dense, however the results were unusable. The ortho was very distorted and the 3D was just black dots.


#8

Also, for <200 images I figured 32 gig would be plenty. It even happens on the 64 gig workstation.


#9

Try to lower depthmap-resolution.
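As a rough back-of-envelope (my heuristic, not an ODM-documented formula): depthmap memory and CPU cost scale roughly with the number of depthmap pixels, i.e. with the square of depthmap-resolution, so even a modest reduction cuts the footprint substantially:

```python
def relative_depthmap_cost(res_new, res_old):
    """Rough relative memory/CPU cost of depthmap computation,
    assuming cost scales with the pixel count (resolution squared).
    A heuristic estimate only, not ODM's actual memory model."""
    return (res_new / float(res_old)) ** 2

# e.g. lowering depthmap-resolution from 1000 to 640
print(relative_depthmap_cost(640, 1000))  # ~0.41 of the original cost
```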


#10

I did that, and the 190-image task would still fail 2 out of 3 times. Cutting the image count down to try a smaller sample succeeded, however the output was not usable at all.

I am currently testing with a 128 gig swap file, and right now (about 2+ hours in) I have 3 pdal processes sitting at 53 gig each, just chewing up CPU/memory time. I am going to let it run through the morning (it's 7 PM CST now) to see if it ever finishes.

I don't mind long delays, as turnaround time isn't as important as quality. However, I do require a usable product.

I am enjoying troubleshooting all of this, so this isn’t an issue.

I am running on an ESXi host; each workstation has 8 dedicated cores, one VM has 32 gig allocated and the other 64 gig. The 32 gig VM is the one I am currently testing with the 128 gig swap file.

Both were clean Ubuntu 16.04 installs, running WebODM via Docker.
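For anyone trying to reproduce this setup, it can help to sanity-check RAM and swap before launching a large task. A small convenience sketch (not part of ODM) that parses `/proc/meminfo`-style output:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values.
    Useful to confirm MemTotal/SwapTotal before launching a large
    WebODM task; a convenience sketch, not part of ODM itself."""
    info = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, rest = line.partition(":")
            parts = rest.split()
            if parts and parts[0].isdigit():
                info[key.strip()] = int(parts[0])  # value in kB
    return info

# Hypothetical values resembling the 32 gig VM with a 128 gig swap file
sample = """MemTotal:       32781240 kB
SwapTotal:     134217728 kB"""
mem = parse_meminfo(sample)
print(mem["MemTotal"] // (1024 * 1024), "GiB RAM")    # 31 GiB
print(mem["SwapTotal"] // (1024 * 1024), "GiB swap")  # 128 GiB
```

On the real machine you would read `open("/proc/meminfo").read()` instead of the sample string.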


#11

The current task with 95 images is running these settings:

Processing Node: node-odm-1:3000 (auto)

Options: min-num-features: 16000, resize-to: -1, use-3dmesh: true, build-overviews: true, verbose: true, dem-terrain-type: FlatForest, orthophoto-resolution: 2, dsm: true, opensfm-depthmap-method: BRUTE_FORCE


#12

So adding a 128 gig swap file worked on the 95-image task.

I also ran it against the 190-image dataset and changed max-concurrency to 4 (down from the default 8).

It took over 6 hours, however like I stated before, time is not an issue.

Next would be to clean up the results… The points for the 3D seem OK; I believe my angled shots are creating issues when building the imagery. Next time I get some drier weather, I will reshoot the same mission but at a 90 degree angle to see what I can come up with. I'll open a new thread if I run into issues, and I will keep testing different datasets to see if I can reproduce any more issues.


#13

Better to shoot at 75 or 80 degrees than 90 degrees (straight down) if you have the luxury of setting your camera angle.


#14

Also, another approach (if running natively) is to let it run until the PDAL step and rerun with max-concurrency set to 1 or 2.

But honestly, I think we need to decrease our memory requirements for PDAL, which requires breaking the point cloud into overlapping chunks, running on those smaller pieces, and then reassembling at the end.
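The chunking idea above can be sketched as computing overlapping tile bounds over the point cloud's extent; each tile would then be run through PDAL independently and the buffered overlap trimmed when merging. A sketch of the tiling math only, with hypothetical bounds, not ODM's implementation:

```python
def overlapping_tiles(min_x, min_y, max_x, max_y, tile_size, buffer):
    """Split a bounding box into tile_size x tile_size tiles, each
    expanded by `buffer` so neighbouring tiles overlap. Each tile
    could then be processed separately and reassembled at the end."""
    tiles = []
    y = min_y
    while y < max_y:
        x = min_x
        while x < max_x:
            tiles.append((x - buffer, y - buffer,
                          min(x + tile_size, max_x) + buffer,
                          min(y + tile_size, max_y) + buffer))
            x += tile_size
        y += tile_size
    return tiles

# A 200m x 100m extent in 100m tiles with 10m of overlap -> 2 tiles
print(overlapping_tiles(0, 0, 200, 100, 100, 10))
```

The overlap is what lets the per-tile DEM/mesh results blend without seams at tile borders.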


#15

I’ll try the angle change and compare it against the 90 degree shots. Some of the missions I fly just require a flat ortho look; I just need good quality, sharp imagery.

But I don't mind setting up another plan to fly at a different angle just because I enjoy it :slight_smile:


#16

I have found that setting concurrency to 4 seems to work for me right now. Processing time isn't as important to me right now as the task completing and the output being usable.


#17

That small of a change in angle won’t really affect the ortho look, but will significantly improve the accuracy.