A really big problem

This is not really what I want to see 296 hours into a task! :fearful:

The task is on the 4th submodel of 5, which it has been adding images to for the past 4 or 5 days. It didn't run out of memory.

Apparently the problem is "too large", but images continued to be added afterwards… should I abort, or hope for the best that the "FAILURE" (twice) isn't a terminal condition that could take days more processing to reveal?

2022-06-19 11:59:51,342 INFO: DJI_0477_15.JPG resection inliers: 7415 / 9025
2022-06-19 11:59:51,737 INFO: Adding DJI_0477_15.JPG to the reconstruction
2022-06-19 11:59:52,309 INFO: Re-triangulating
2022-06-19 11:59:52,750 INFO: Shots and/or GCPs are well-conditioned. Using naive 3D-3D alignment.
CHOLMOD error: problem too large. file: D:\ODM\vcpkg\buildtrees\suitesparse\src\dd8ca029e2-bdd475c274.clean\SuiteSparse\CHOLMOD\Include…/Supernodal/cholmod_super_symbolic.c line: 683
2022-06-19 13:03:06,273 DEBUG: Ceres Solver Report: Iterations: 1, Initial cost: 1.726604e+06, Final cost: 1.726604e+06, Termination: FAILURE
CHOLMOD error: problem too large. file: D:\ODM\vcpkg\buildtrees\suitesparse\src\dd8ca029e2-bdd475c274.clean\SuiteSparse\CHOLMOD\Include…/Supernodal/cholmod_super_symbolic.c line: 683
2022-06-19 15:06:05,697 DEBUG: Ceres Solver Report: Iterations: 1, Initial cost: 3.823265e+06, Final cost: 3.823265e+06, Termination: FAILURE
2022-06-19 15:12:04,440 INFO: Removed outliers: 4438812
2022-06-19 15:12:25,988 INFO: -------------------------------------------------------
2022-06-19 15:12:32,222 INFO: DJI_0479_15.JPG resection inliers: 6809 / 7335
2022-06-19 15:12:32,738 INFO: Adding DJI_0479_15.JPG to the reconstruction
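For context, CHOLMOD raises "problem too large" when the supernodal symbolic factorization's index arithmetic would overflow a 32-bit integer (the non-long routines use `int` indices). A back-of-envelope sketch, assuming 9 parameters per camera and a worst-case fully dense reduced camera system (both are assumptions, not figures from this log), shows how a submodel of roughly 8250 images can plausibly reach that limit:

```python
# Rough estimate of the reduced camera system that bundle adjustment
# hands to the sparse Cholesky solver. All sizes are assumptions.
cameras = 8250            # roughly the size of the largest submodel
params_per_camera = 9     # rotation + translation + intrinsics (assumed)
n = cameras * params_per_camera

# Worst case: the Cholesky factor of an n x n system fills in
# completely, i.e. one full triangle of the matrix.
worst_case_nnz = n * (n + 1) // 2

int32_max = 2**31 - 1
print(f"n = {n}, worst-case factor nnz = {worst_case_nnz:,}")
print("exceeds 32-bit index range:", worst_case_nnz > int32_max)
```

The real factor is far sparser than this worst case, but heavy fill-in on a large, well-connected submodel pushes in the same direction, and Ceres reporting `Termination: FAILURE` after a single iteration is consistent with the linear solve failing rather than the reconstruction data itself being bad.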


I think 90% of your struggles come from being on the bleeding edge of project size while using graphics cards. This is an Nvidia utility that probably wasn't designed for data this size.

I’m curious, is GPU saving you time, or are you using GPU because you have it?


I wasn't aware that GPU was in use during the adding-images stage. GPU hasn't been used for feature extraction for quite a while; I've only seen it used for dense reconstruction in the past two builds, and the task is not yet at that stage.


Well, if the "problem too large" issue wasn't terminal, then:

ModuleNotFoundError: No module named 'osgeo'

was. It crashed my task after a gruelling 342 hours :sob:

The settings:
22598 images, 342:08:23, Cannot process dataset

Created on: 07/06/2022, 10:03:05
Processing Node: node-odm-1 (auto)
Options: auto-boundary: true, dem-resolution: 10, dsm: true, dtm: true, feature-quality: medium, gps-accuracy: 8, optimize-disk-space: true, pc-quality: low, split: 5000, use-3dmesh: true

This was build 61, so GPU wasn’t involved in feature extraction.

Smaller subsections of this set have worked on high settings, but it seems medium/low just don't work with split/merge on the full set. I have still not had a single split/merge task succeed.

[INFO] Smoothing iteration 1
[INFO] Completed smoothing to create D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.tif in 0:05:45.205775
[INFO] Completed dsm.tif in 0:16:01.554840
[INFO] Cropping D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.tif
[INFO] running gdalwarp -cutline "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_georeferencing\odm_georeferenced_model.bounds.gpkg" -crop_to_cutline -co TILED=YES -co COMPRESS=DEFLATE -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co BIGTIFF=IF_SAFER -co NUM_THREADS=16 "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.original.tif" "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.tif" --config GDAL_CACHEMAX 43.7%
Creating output file that is 24896P x 29754L.
Processing D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.original.tif [1/1] : 0Using internal nodata values (e.g. -9999) for image D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.original.tif.
Copying nodata values from source D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.original.tif to destination D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.tif.
...10...20...30...40...50...60...70...80...90...100 - done.
[INFO] Cropping D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.tif
[INFO] running gdalwarp -cutline "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_georeferencing\odm_georeferenced_model.bounds.gpkg" -crop_to_cutline -co TILED=YES -co COMPRESS=DEFLATE -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co BIGTIFF=IF_SAFER -co NUM_THREADS=16 "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.original.tif" "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.tif" --config GDAL_CACHEMAX 43.8%
Creating output file that is 24896P x 29754L.
Processing D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.original.tif [1/1] : 0Using internal nodata values (e.g. -9999) for image D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.original.tif.
Copying nodata values from source D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.original.tif to destination D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.tif.
...10...20...30...40...50...60...70...80...90...100 - done.
[INFO] Computing euclidean distance: D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.euclideand.tif
[INFO] running gdal_proximity.py "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.unfilled.tif" "D:\WebODM\resources\app\apps\NodeODM\data\45743fa8-b31f-42ce-bbfb-a8d00ea30b39\submodels\submodel_0000\odm_dem\dsm.euclideand.tif" -values -9999.0
Traceback (most recent call last):
File "D:\WebODM\resources\app\apps\ODM\venv\Scripts\gdal_proximity.py", line 5, in <module>
from osgeo.utils.gdal_proximity import *  # noqa
ModuleNotFoundError: No module named 'osgeo'

===== Dumping Info for Geeks (developers need this to fix bugs) =====
Child returned 1
Traceback (most recent call last):
File "D:\WebODM\resources\app\apps\ODM\stages\odm_app.py", line 94, in execute
self.first_stage.run()
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 346, in run
self.next_stage.run(outputs)
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 346, in run
self.next_stage.run(outputs)
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 346, in run
self.next_stage.run(outputs)
[Previous line repeated 6 more times]
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 327, in run
self.process(self.args, outputs)
File "D:\WebODM\resources\app\apps\ODM\stages\odm_dem.py", line 125, in process
commands.compute_euclidean_map(unfilled_dem_path,
File "D:\WebODM\resources\app\apps\ODM\opendm\dem\commands.py", line 293, in compute_euclidean_map
run('gdal_proximity.py "%s" "%s" -values %s' % (geotiff_path, output_path, nodata))
File "D:\WebODM\resources\app\apps\ODM\opendm\system.py", line 106, in run
raise SubprocessException("Child returned {}".format(retcode), retcode)
opendm.system.SubprocessException: Child returned 1

===== Done, human-readable information to follow… =====

[ERROR] Uh oh!

===== Dumping Info for Geeks (developers need this to fix bugs) =====
Child returned 1
Traceback (most recent call last):
File "D:\WebODM\resources\app\apps\ODM\stages\odm_app.py", line 94, in execute
self.first_stage.run()
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 346, in run
self.next_stage.run(outputs)
File "D:\WebODM\resources\app\apps\ODM\opendm\types.py", line 327, in run
self.process(self.args, outputs)
File "D:\WebODM\resources\app\apps\ODM\stages\splitmerge.py", line 164, in process
system.run(" ".join(map(double_quote, map(str, argv))), env_vars=os.environ.copy())
File "D:\WebODM\resources\app\apps\ODM\opendm\system.py", line 106, in run
raise SubprocessException("Child returned {}".format(retcode), retcode)
opendm.system.SubprocessException: Child returned 1

===== Done, human-readable information to follow… =====

[ERROR] Uh oh!
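As for the `osgeo` crash: that traceback just means the Python interpreter that launched `gdal_proximity.py` can't see the GDAL bindings. A quick, generic way to check which interpreter you're on and whether it can import a given module (nothing here is ODM-specific; `osgeo` is simply the module in question):

```python
import importlib.util
import sys

def module_available(name: str) -> bool:
    """Return True if `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None

# Run this with the same interpreter as the failing script, e.g.
#   D:\WebODM\resources\app\apps\ODM\venv\Scripts\python.exe check.py
print(sys.executable)
print("osgeo importable:", module_available("osgeo"))
```

If that prints `False` inside the ODM venv, the GDAL bindings are missing or broken in that environment, which would make this a packaging problem in the build rather than anything to do with your dataset.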

Despite the 5000 split, the 4th submodel had over 8250 images in it! Not that that appears to have been the issue, since it was working on the 1st submodel (submodel_0000) when it crashed.
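On the submodel size: `--split` is a rough clustering target, not a hard cap, and if I recall the defaults correctly, `--split-overlap` (treat the 150 m default as an assumption) then adds every neighbouring image within that radius to each cluster, so submodels routinely come out well above the nominal size. The nominal count before overlap is just a ceiling division:

```python
import math

images = 22598
split = 5000   # --split target size per submodel

# Ceiling division gives the nominal number of submodels before
# --split-overlap duplicates boundary images into neighbouring clusters.
print(math.ceil(images / split))
```

So 5 submodels is expected, and 8250+ images in one of them is the overlap plus uneven geographic clustering, not a bug in the split count.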


@Gordon
I'm curious to find out what CPU, RAM, GPU and storage resources you've got for processing this volume of images.
Are you using GCPs and, if so, how many?


CPU i7 10700K base speed 3.8GHz
96GB RAM, 352GB including Virtual Memory on the SSD
GPU NVIDIA GeForce GTX 1650 SUPER (not that it has had much use lately, fixed with latest build)
Running WebODM (nothing else on the drive) on a 2TB SSD
Not using GCPs

The full set of images was processed at the equivalent of WebODM's high feature quality by my client with Agisoft Metashape, on a computer with 512GB RAM.


And what are the images, maybe 20MPx from a P4P?


M2P, which I think is the same 20MP Hasselblad camera as the P4P.
Resized to 3644 px wide in this case.


Do you get a good quality orthophoto with the M2P?
I'm wondering, given its lens distortion and rolling shutter, how accurate and usable the products from this drone are.


Yes, I've been getting pretty good orthophotos; relative accuracy is generally quite good, and absolute accuracy is not too bad either, according to the QR.
