Can't process Large Dataset

Hi guys,
I have obtained a large dataset of 40,000 images of a dry area in Africa with buildings.
It is split into areas of approximately 7,000 images each, but I am getting errors every time I try to process them.
I am processing on a VM with:
RAM: 125G
Cores: 32
Hard Disk: 1TB (120G used)
I am running using the following command:
docker run -ti --rm -v /odm/datasets:/datasets opendronemap/odm --project-path /datasets CampFull --ignore-gsd --dsm --pc-quality high --dem-resolution 2.0 --orthophoto-resolution 2.0 --min-num-features 15000 --matcher-neighbors 20
If there is something I should change here, please do let me know.

Log.json includes:
{
"name": "opensfm",
"startTime": "2022-01-04T14:26:28.297518",
"messages": [
{
"message": "Running opensfm stage",
"type": "info"
},
{
"message": "Writing exif overrides",
"type": "info"
},
{
"message": "Maximum photo dimensions: 4056px",
"type": "info"
},
{
"message": "Photo dimensions for feature extraction: 2028px",
"type": "info"
},
{
"message": "Altitude data detected, enabling it for GPS alignment",
"type": "info"
},
{
"message": [
"use_exif_size: no",
"flann_algorithm: KDTREE",
"feature_process_size: 2028",
"feature_min_frames: 15000",
"processes: 32",
"matching_gps_neighbors: 20",
"matching_gps_distance: 0",
"optimize_camera_parameters: yes",
"undistorted_image_format: tif",
"bundle_outlier_filtering_type: AUTO",
"sift_peak_threshold: 0.066",
"align_orientation_prior: vertical",
"triangulation_type: ROBUST",
"retriangulation_ratio: 2",
"feature_type: SIFT",
"use_altitude_tag: yes",
"align_method: auto",
"local_bundle_radius: 0"
],
"type": "info"
},
{
"message": "Wrote reference_lla.json",
"type": "info"
},
{
"message": "running /code/SuperBuild/install/bin/opensfm/bin/opensfm extract_metadata \"/datasets/DukraFull/opensfm\"",
"type": "info"
},
{
"message": "running /code/SuperBuild/install/bin/opensfm/bin/opensfm detect_features \"/datasets/DukraFull/opensfm\"",
"type": "info"
},
{
"message": "running /code/SuperBuild/install/bin/opensfm/bin/opensfm match_features \"/datasets/DukraFull/opensfm\"",
"type": "info"
},
{
"message": "running /code/SuperBuild/install/bin/opensfm/bin/opensfm create_tracks \"/datasets/DukraFull/opensfm\"",
"type": "info"
},
{
"message": "running /code/SuperBuild/install/bin/opensfm/bin/opensfm reconstruct \"/datasets/DukraFull/opensfm\"",
"type": "info"
},
{
"message": "Uh oh! Processing stopped because of strange values in the reconstruction. This is often a sign that the input data has some issues or the software cannot deal with it. Have you followed best practices for data acquisition? See Flying Tips — OpenDroneMap 2.7.0 documentation",
"type": "error"
}
],
"endTime": "2022-01-08T16:21:23.335320",
"totalTime": 352495.04
}
],
"processes": [
{
"command": "/code/SuperBuild/install/bin/opensfm/bin/opensfm extract_metadata \"/datasets/DukraFull/opensfm\"",
"exitCode": 0,
"output": [
"2022-01-04 14:37:15,436 INFO: Extracting EXIF for Area1_Route8_2065.JPG",
"2022-01-04 14:37:15,518 INFO: Extracting EXIF for Area1_Route8_2067.JPG",
"2022-01-04 14:37:15,603 INFO: Extracting EXIF for Area1_Route8_2069.JPG",
"2022-01-04 14:37:15,688 INFO: Extracting EXIF for Area1_Route8_2071.JPG",
"2022-01-04 14:37:15,771 INFO: Extracting EXIF for Area1_Route8_2073.JPG",
"2022-01-04 14:37:15,854 INFO: Extracting EXIF for Area1_Route8_2075.JPG",
"2022-01-04 14:37:15,936 INFO: Extracting EXIF for Area1_Route8_2077.JPG",
"2022-01-04 14:37:16,019 INFO: Extracting EXIF for Area1_Route8_2079.JPG",
"2022-01-04 14:37:16,102 INFO: Extracting EXIF for Area1_Route8_2081.JPG",
"2022-01-04 14:37:16,186 INFO: Extracting EXIF for Area1_Route8_2083.JPG"
]
},
{
"command": "/code/SuperBuild/install/bin/opensfm/bin/opensfm detect_features \"/datasets/DukraFull/opensfm\"",
"exitCode": 0,
"output": [
"2022-01-04 14:59:45,705 DEBUG: No segmentation for Area1_Route3_0037.JPG, no features masked.",
"2022-01-04 14:59:45,823 DEBUG: No segmentation for Area1_Route7_0135.JPG, no features masked.",
"2022-01-04 14:59:45,897 DEBUG: No segmentation for Area1_Route3_0039.JPG, no features masked.",
"2022-01-04 14:59:45,988 DEBUG: No segmentation for Area1_Route3_0057.JPG, no features masked.",
"2022-01-04 14:59:46,122 DEBUG: No segmentation for Area1_Route7_0137.JPG, no features masked.",
"2022-01-04 14:59:46,188 DEBUG: No segmentation for Area1_Route3_0043.JPG, no features masked.",
"2022-01-04 14:59:46,411 DEBUG: No segmentation for Area1_Route3_0045.JPG, no features masked.",
"2022-01-04 14:59:46,462 DEBUG: No segmentation for Area1_Route3_0047.JPG, no features masked.",
"2022-01-04 14:59:46,497 DEBUG: No segmentation for Area1_Route3_0041.JPG, no features masked.",
"2022-01-04 14:59:46,671 DEBUG: No segmentation for Area1_Route3_0049.JPG, no features masked."
]
},
{
"command": "/code/SuperBuild/install/bin/opensfm/bin/opensfm match_features \"/datasets/DukraFull/opensfm\"",
"exitCode": 0,
"output": [
"2022-01-04 16:42:15,256 DEBUG: Matching Area1_Route3_0407.JPG and Area1_Route4_1255.JPG. Matcher: FLANN (symmetric) T-desc: 2.301 T-robust: 0.001 T-total: 2.303 Matches: 109 Robust: 79 Success: True",
"2022-01-04 16:42:15,276 DEBUG: Matching Area1_Route8_0253.JPG and Area1_Route7_1539.JPG. Matcher: FLANN (symmetric) T-desc: 2.586 T-robust: 0.001 T-total: 2.588 Matches: 169 Robust: 131 Success: True",
"2022-01-04 16:42:15,387 DEBUG: Matching Area1_Route8_1147.JPG and Area1_Route3_0745.JPG. Matcher: FLANN (symmetric) T-desc: 2.094 T-robust: 0.001 T-total: 2.096 Matches: 701 Robust: 663 Success: True",
"2022-01-04 16:42:15,428 DEBUG: Matching Area1_Route5_0229.JPG and Area1_Route8_0793.JPG. Matcher: FLANN (symmetric) T-desc: 2.516 T-robust: 0.014 T-total: 2.530 Matches: 36 Robust: 9 Success: False",
"2022-01-04 16:42:15,430 DEBUG: Matching Area1_Route2_0377.JPG and Area1_Route6_0281.JPG. Matcher: FLANN (symmetric) T-desc: 1.925 T-robust: 0.002 T-total: 1.927 Matches: 680 Robust: 643 Success: True",
"2022-01-04 16:42:15,471 DEBUG: Matching Area1_Route2_1411.JPG and Area1_Route7_1477.JPG. Matcher: FLANN (symmetric) T-desc: 2.588 T-robust: 0.001 T-total: 2.591 Matches: 616 Robust: 580 Success: True",
"2022-01-04 16:42:15,472 DEBUG: Matching Area1_Route8_1175.JPG and Area1_Route1_0409.JPG. Matcher: FLANN (symmetric) T-desc: 1.936 T-robust: 0.014 T-total: 1.951 Matches: 26 Robust: 9 Success: False",
"2022-01-04 16:42:15,587 DEBUG: Matching Area1_Route1_1913.JPG and Area1_Route3_1887.JPG. Matcher: FLANN (symmetric) T-desc: 1.950 T-robust: 0.014 T-total: 1.964 Matches: 41 Robust: 10 Success: False",
"2022-01-04 16:42:15,641 DEBUG: Matching Area1_Route5_1963.JPG and Area1_Route7_1941.JPG. Matcher: FLANN (symmetric) T-desc: 2.262 T-robust: 0.013 T-total: 2.276 Matches: 40 Robust: 10 Success: False",
"2022-01-04 16:42:15,782 INFO: Matched 87289 pairs (brown-brown: 87289) in 6143.291716268286 seconds (0.07037876157043144 seconds/pair)."
]
},
{
"command": "/code/SuperBuild/install/bin/opensfm/bin/opensfm create_tracks \"/datasets/DukraFull/opensfm\"",
"exitCode": 0,
"output": [
"2022-01-04 17:05:29,572 INFO: reading features",
"2022-01-04 17:15:54,116 DEBUG: Merging features onto tracks",
"2022-01-04 17:30:55,901 DEBUG: Good tracks: 21116815"
]
},
{
"command": "/code/SuperBuild/install/bin/opensfm/bin/opensfm reconstruct \"/datasets/DukraFull/opensfm\"",
"exitCode": 1,
"output": [
"reconstruct.run_dataset(dataset)",
"File \"/code/SuperBuild/install/bin/opensfm/opensfm/actions/reconstruct.py\", line 9, in run_dataset",
"report, reconstructions = reconstruction.incremental_reconstruction(",
"File \"/code/SuperBuild/install/bin/opensfm/opensfm/reconstruction.py\", line 1348, in incremental_reconstruction",
"reconstruction, rec_report[\"grow\"] = grow_reconstruction(",
"File \"/code/SuperBuild/install/bin/opensfm/opensfm/reconstruction.py\", line 1284, in grow_reconstruction",
"brep = bundle(",
"File \"/code/SuperBuild/install/bin/opensfm/opensfm/reconstruction.py\", line 90, in bundle",
"report = pysfm.BAHelpers.bundle(",
"MemoryError: std::bad_alloc"
]
}
],
"success": false,
"error": {
"code": 1,
"message": "Child returned 1"
},
"stackTrace": [
"Traceback (most recent call last):",
"File \"/code/stages/odm_app.py\", line 94, in execute",
"self.first_stage.run()",
"File \"/code/opendm/types.py\", line 347, in run",
"self.next_stage.run(outputs)",
"File \"/code/opendm/types.py\", line 347, in run",
"self.next_stage.run(outputs)",
"File \"/code/opendm/types.py\", line 347, in run",
"self.next_stage.run(outputs)",
"File \"/code/opendm/types.py\", line 328, in run",
"self.process(self.args, outputs)",
"File \"/code/stages/run_opensfm.py\", line 37, in process",
"octx.reconstruct(self.rerun())",
"File \"/code/opendm/osfm.py\", line 53, in reconstruct",
"self.run('reconstruct')",
"File \"/code/opendm/osfm.py\", line 34, in run",
"system.run('%s %s \"%s\"' %",
"File \"/code/opendm/system.py\", line 106, in run",
"raise SubprocessException(\"Child returned {}\".format(retcode), retcode)",
"opendm.system.SubprocessException: Child returned 1",
""
],
"endTime": "2022-01-08T16:21:23.335320",
"totalTime": 352540.44
}

I’m not an expert, but --ignore-gsd can cause problems. Try without it.

Yep. I have seen this before… I will remove that.
I have mostly used WebODM in the past, so I am still getting used to the command line.
So I am assuming the issue is not a memory one; it is my settings that are causing the problem.

They are going to remove that option from WebODM, I’ve heard.

We will at least hide it a bit. :slight_smile:

It could be insufficient RAM.
Your log has the error line "MemoryError: std::bad_alloc", which basically means: not enough memory.
Could you try with a smaller batch of images?

I have restarted it without --ignore-gsd.
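For reference, the rerun is the exact same command as before with only that flag dropped (nothing else changed):

```shell
# Original command minus --ignore-gsd
docker run -ti --rm -v /odm/datasets:/datasets opendronemap/odm \
  --project-path /datasets CampFull \
  --dsm --pc-quality high \
  --dem-resolution 2.0 --orthophoto-resolution 2.0 \
  --min-num-features 15000 --matcher-neighbors 20
```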
It could take a day or two before I get an outcome.
It was running on a server with other processes and the technician said there was no issue with RAM at the time it finished.
He seemed to think it must be an application issue. (but he would :wink: )

Agreed. This and removing ignore-gsd will get you close, but you will have a higher probability of success if you reduce the batch size to 5000 images. I often return to these settings in ClusterODM to remember the approximate RAM use limitations.

    "imageSizeMapping": [
        {"maxImages": 40, "slug": "s-2vcpu-2gb"},
        {"maxImages": 250, "slug": "s-4vcpu-8gb"},
        {"maxImages": 500, "slug": "s-6vcpu-16gb"},
        {"maxImages": 1500, "slug": "s-8vcpu-32gb"},
        {"maxImages": 2500, "slug": "s-16vcpu-64gb"},
        {"maxImages": 3500, "slug": "s-20vcpu-96gb"},
        {"maxImages": 5000, "slug": "s-24vcpu-128gb"}
    ]

These numbers assume you are using as much swap as you have RAM, so it would be good to allocate that if you haven’t. I like DO’s documentation on this:

I have 2GB swap and 125GB RAM. Would 16GB swap improve things? Or should I be asking for 125GB?

Ideally, 1x your RAM or more for really heavy work should help.

Think of it like this: if you could add another stick of (really slow) RAM for free, would you add just 2GB? Or would you add another 128GB or MORE?

Realistically, how much can you request from your IT Team?

The alternative is to reduce your batch size to 2500, which is still quite large and should minimize your edge effects.
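If you go the smaller-batch route, ODM can also do the chunking for you with its split-merge options rather than you splitting the dataset by hand. A sketch based on the command from the first post (the group size and overlap values here are just examples; per the ODM docs, --split-overlap is in meters):

```shell
# Hypothetical sketch: let ODM split the dataset into ~2500-image submodels
# and merge the results, instead of one 40k-image reconstruction.
docker run -ti --rm -v /odm/datasets:/datasets opendronemap/odm \
  --project-path /datasets CampFull \
  --dsm --pc-quality high \
  --dem-resolution 2.0 --orthophoto-resolution 2.0 \
  --min-num-features 15000 --matcher-neighbors 20 \
  --split 2500 --split-overlap 150
```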

My current test is on 8k images.

Do you think the fact that we have 125GB RAM and 2GB swap could be the cause of the problem?
Our technician has suggested that this request is very unusual and doesn’t understand why. If he allocates e.g. 64GB swap and the server uses all of this, then the server will crawl…
I am assuming ODM uses swap in a way most applications do not?

Actually, it’s the OS that manages the RAM and swap.

The technician must be used to different problem spaces. This is not a DBMS or another system with linear, predictable load, this is an application that can allocate massive amounts of address space in large spikes. You either need an absurd amount of RAM (Ask Stephen about his 768GB :wink: ), or a large amount of RAM and a lot of SWAP to back that up when you’re working at the scale you are.

If you don’t have the total allocation space size, you will Out-Of-Memory and crash. All you can do at that point is make smaller split/merge groups as Stephen suggested and/or scale back your quality parameters and resize your images (which can present its own challenges to reconstruction) (and/or optionally setup a Cluster to process on multiple machines in parallel).

If your SWAP is backed by a reasonably fast storage volume, it should not crawl, as in all likelihood, you won’t be saturating the SWAP the entire time (if at all).
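If it helps the conversation with your IT team, the current RAM and swap totals are easy to check from a shell (standard Linux tools, nothing ODM-specific; the numbers will be whatever your system reports):

```shell
# Human-readable totals for RAM and swap
free -h

# The same figures straight from the kernel
grep -E 'MemTotal|SwapTotal' /proc/meminfo
```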

What is the most your IT Team feels comfortable allocating at this time?

This article provides a bit of an overview that might be helpful (or not, since various major Distros all disagree on how much SWAP is enough :rofl:):

This is all super info. I am learning loads and hopefully this will lead to more informed questions in future :wink:
We have agreed to add 64GB SWAP for the moment. Hopefully this will be installed tomorrow.
The area was split into 6 sections of approx 8k images. Drone imagery was taken from 8 angles. So there is a natural split in the imagery.
Unfortunately the sun was shining and there are a few moving objects like trucks and people but we will see how well these are managed.

Excellent! Please let us know how you get on.

The trucks/people will most likely get filtered out of the ortho with the default texturing setting.

Highly specular surfaces in the sun, however, might not reconstruct well. Was that a large portion of what you needed to reconstruct?

We didn’t really have a choice, so certain areas were photographed before noon on consecutive days and other areas in the afternoon. The sun was high, so I think we should be OK. I have generated a 3D model before with shadows and it came out very well, so let’s hope for the same here.

One other question… I am processing using WebODM on another PC (fewer images) and it has not started processing. The status is just ‘Queued’. Do I need to restart the processing node?
