I ran some tests using “fast ortho” settings with a set of 4830 images (20MPx).
These images were grouped into separate folders according to the flight on which they were taken, e.g. nadir north-south, nadir east-west, oblique north-south, oblique east-west.
The first two tests I ran both failed after around 1 hour with the error “Processing stopped because of strange values in the reconstruction”.
For the third test I tried adding the images in a different order (the ob-ew folder, then ob-ns, then na-ew, then na-ns, instead of na-ns, na-ew, ob-ns, ob-ew). Processing completed successfully (it took about 23 hours).
Can anyone explain why the order of image selection would have such a significant impact?
This is odd… Are you certain that all images were uploaded to the node each time? Same image count?
In theory, the order they’re passed during upload should not affect anything…
Yes same image count each time. It’s perplexing…
Hmmm…
Could you please give me a bit more information about your system?
For instance:
- Operating System and Version
e.g. Windows 11, macOS 15.1, Ubuntu Linux 20.04 LTS, etc.
- Hardware Specifications
e.g. 32GB RAM, i7-6700k, NVIDIA GTX 1050 Ti OC, 1TB SSD, etc.
- WebODM Install Method
e.g. Native installer, Docker, Snap, GitHub download, compiled from source
- WebODM/ODM Version
e.g. WebODM v1.9.12 Build 55, ODM v2.8.0, etc.
- Processing Node
e.g. Automatic, Lightning, local (node-odm-1), etc.
- Screenshots demonstrating the issue/behavior/error messages
Are you using the WebODM WebUI Dashboard (Drone & UAV Mapping Software | WebODM) to create and upload your Tasks?
Sure thing Saijin - yes I’m using the WebUI Dashboard.
System info:
Ubuntu 20.04.4 LTS
512GB RAM
Dual Xeon 20-core CPUs
NVIDIA Tesla K80
Magnetic HDDs (3 RAID volumes: 2TB volume for OS, 2TB volume for Docker, 2TB volume for swap)
1.5TB swap file
Docker version 20.10.14, build a224086 (not using Docker Desktop)
WebODM 1.9.12
nodeodm:gpu
And for this job:
WebODM settings:
auto-boundary: true, debug: true, fast-orthophoto: true, gps-accuracy: 1, orthophoto-resolution: 1, rerun-from: dataset, resize-to: -1, verbose: true
4830 images at 20Mpx taken with a P4P RTK
North-south nadir
East-west nadir
North-south oblique (45 degree gimbal pitch)
East-west oblique (45 degree gimbal pitch)
23 GCPs with 145 image cross-reference entries in GCP file
When the processing succeeded, here are the summary parameters:
Average GSD: 0.99
Area: 158,535 sq m
Reconstructed points: 4,109,255
Hmm… Are you willing/able to update to current and try again?
What browser(s) have you tried? Is there a difference in behavior between them? I’d say try a Chromium-based browser (Chrome/Edge) and a Gecko-based one (Firefox) to compare.
I only have Firefox on this machine. Once my current job finishes I’ll update and re-test.
OK, I updated WebODM to the latest version and retested. The same issue occurred: if I selected the images in one particular order, processing completed successfully, but a different order failed after about 50 minutes with the “Processing stopped because of strange values in the reconstruction” error.
Are you willing/able to make an issue with all the details we discussed here on our GitHub for WebODM so we can track this and maybe ping you to get further details about it?
Yes I’d like to help - I’m not a software dev so hopefully that isn’t a prerequisite though…
Issue #1186 raised now.
Not at all! Just system details and your STR (Steps to Reproduce) so we can try and make it happen and debug what exactly is happening.
Thanks for raising the issue!
I’ve been looking at the log files, and the issue seems to have something to do with OpenSfM generating bad zip files. Here are some log excerpts:
Example a)
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "/usr/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
terminate called without an active exception
/code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12: 175150 Aborted (core dumped)
Example b)
    _zip = zipfile_factory(fid)
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "/usr/lib/python3.8/zipfile.py", line 1269, in __init__
2022-05-25 09:15:20,626 DEBUG: Matching EW2_DJI_0569.JPG and NS2_DJI_0259.JPG. Matcher: FLANN (symmetric) T-desc: 2.837 T-robust: 0.016 T-total: 2.860 Matches: 168 Robust: 139 Success: True
    self._RealGetContents()
  File "/usr/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
terminate called without an active exception
/code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12: 4804 Aborted (core dumped)
Example c)
    _zip = zipfile_factory(fid)
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "/usr/lib/python3.8/zipfile.py", line 1269, in __init__
2022-06-08 00:44:58,418 DEBUG: Matching OB2_DJI_0023.JPG and OB1_DJI_0265.JPG. Matcher: FLANN (symmetric) T-desc: 2.551 T-robust: 0.028 T-total: 2.579 Matches: 37 Robust: 10 Success: False
    self._RealGetContents()
  File "/usr/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
terminate called without an active exception
terminate called recursively
/code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12: 481253 Aborted (core dumped)
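For what it’s worth, the traceback itself is generic: numpy routes .npz files through zipfile, and any truncated or partially written .npz produces exactly this error. A minimal sketch that reproduces it (the filename is just an illustration):

import zipfile
import numpy as np

# Write a valid .npz, then truncate it to simulate a partial/interrupted write.
np.savez("features_example.npz", points=np.zeros((10, 2)))
with open("features_example.npz", "r+b") as f:
    f.truncate(64)  # the zip central directory at the end of the archive is now gone

try:
    # np.load() goes through numpy.lib.npyio.zipfile_factory -> zipfile.ZipFile
    np.load("features_example.npz")
except zipfile.BadZipFile as exc:
    print(exc)  # "File is not a zip file", the same message as in the OpenSfM logs

So the logs point at a file on disk that was not a complete archive when OpenSfM tried to read it, rather than at the reader itself.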
I’ve updated this info in issue #1186 as well.
I’ve been testing with different loading sequences and varying numbers of images, and I have now replicated the same error with a different file sequence and quantity.
I think what’s happening is that there is an underlying issue that causes corrupt zip files to be created during feature matching in the SfM processing stage. It is exacerbated by higher-resolution images and larger numbers of images, and the order in which the images are loaded also seems to have an impact.
This is the common error that is displayed in the logs:
  File "/usr/lib/python3.8/zipfile.py", line 1336, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
terminate called without an active exception
terminate called recursively
/code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12
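For anyone trying to reproduce or triage this, something like the following rough scan should narrow down which files are affected: it walks a task’s opensfm folder and tries to open every .npz it finds (the path is just a placeholder; point it at wherever your task data lives, e.g. inside the Docker volume):

import pathlib
import zipfile
import numpy as np

# Placeholder path - point this at the failed task's opensfm directory.
opensfm_dir = pathlib.Path("./opensfm")

for npz_path in sorted(opensfm_dir.rglob("*.npz")):
    try:
        with np.load(npz_path) as data:
            _ = data.files  # forces the zip central directory to be read
    except (zipfile.BadZipFile, OSError, ValueError) as exc:
        print(f"corrupt: {npz_path} ({exc})")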
I’ve added this info to issue #1186.
Johnny5, are you at any time swapping/paging? I’ve been getting corrupted .npz/features when I swap heavily for extended periods of time (with seemingly higher likelihood the longer I swap/page).
I didn’t measure that, so I’ll run another test and let you know. If it’s deterministic, it should only take about thirty minutes to trigger the error.
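For the record, swap usage during the run can be logged with something as simple as the sketch below (the one-minute polling interval is arbitrary; it just reads SwapTotal/SwapFree from /proc/meminfo):

import time

def swap_used_gib():
    # Values in /proc/meminfo are reported in kB.
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key.strip()] = int(value.split()[0])
    return (info["SwapTotal"] - info["SwapFree"]) / (1024 * 1024)

while True:
    print(time.strftime("%Y-%m-%d %H:%M:%S"), f"swap used: {swap_used_gib():.2f} GiB")
    time.sleep(60)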
Test completed: the run did not use any swap space and still encountered the “bad zip file” error after 25 minutes.
Awesome, thank you.
Other sanity checks:
Local internal volume, right? Not a NAS/network share, not an external USB drive, etc, correct?
Yes local volume on an internal RAID.
~$ df -h
Filesystem                       Size  Used  Avail  Use%  Mounted on
udev                             252G     0   252G    0%  /dev
tmpfs                             51G  2.7M    51G    1%  /run
/dev/sda5                        412G   36G   355G   10%  /
tmpfs                            252G     0   252G    0%  /dev/shm
tmpfs                            5.0M     0   5.0M    0%  /run/lock
tmpfs                            252G     0   252G    0%  /sys/fs/cgroup
/dev/loop0                        56M   56M      0  100%  /snap/core18/2344
/dev/loop1                       128K  128K      0  100%  /snap/bare/5
/dev/loop3                        62M   62M      0  100%  /snap/core20/1518
/dev/loop5                       255M  255M      0  100%  /snap/gnome-3-38-2004/106
/dev/loop6                       249M  249M      0  100%  /snap/gnome-3-38-2004/99
/dev/loop2                        56M   56M      0  100%  /snap/core18/2409
/dev/loop4                       219M  219M      0  100%  /snap/gnome-3-34-1804/77
/dev/loop7                        51M   51M      0  100%  /snap/snap-store/547
/dev/loop8                       219M  219M      0  100%  /snap/gnome-3-34-1804/72
/dev/loop9                        62M   62M      0  100%  /snap/core20/1494
/dev/loop10                       55M   55M      0  100%  /snap/snap-store/558
/dev/loop11                       66M   66M      0  100%  /snap/gtk-common-themes/1519
/dev/loop12                       82M   82M      0  100%  /snap/gtk-common-themes/1534
/dev/loop13                       45M   45M      0  100%  /snap/snapd/15904
/dev/sdb                         1.8T  567G   1.2T   33%  /mnt/2TBRAID
/dev/sda1                        511M  4.0K   511M    1%  /boot/efi
/dev/sdc                         1.8T  1.6T   204G   89%  /mnt/2TBSWAP
//192.168.2.1/Computer_Backups    11T  6.3T   4.3T   60%  /media/jgnas
tmpfs                             51G  8.0K    51G    1%  /run/user/128
/dev/loop15                       47M   47M      0  100%  /snap/snapd/16010
tmpfs                             51G   28K    51G    1%  /run/user/1000
I repeated the test 3 times. It failed the first two times (with the bad zip file error) and completed successfully the third time.
That’s fairly good reproduction… Maybe a timing issue?
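If it is a timing issue, a classic way to end up with a partial or corrupt archive is a reader (or a crash) catching the file mid-write; writing to a temporary file and renaming it into place avoids that. Purely as an illustration of the pattern, not OpenSfM’s actual code:

import os
import numpy as np

def save_npz_atomically(path, **arrays):
    # Write the archive to a temp file first, then atomically rename it into place,
    # so a reader can never observe a half-written .npz.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        np.savez(f, **arrays)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic within a single filesystem

save_npz_atomically("features_example.npz", points=np.zeros((10, 2)))
print(np.load("features_example.npz")["points"].shape)  # (10, 2)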