Windows Native WebODM hung matching images

Native WebODM 1.9.14 Build 61 on Windows 10 Pro. 64GB memory. 2TB SSD

I have about 400 images, all taken handheld with a GoPro 5 in a 30x30’ area (the GPS data is useless). There is sky in many of the images (it can’t be avoided). I previously processed something similar with ~120 images successfully.

Options: auto-boundary: true, debug: true, mesh-size: 300000, min-num-features: 30000, verbose: true

After approximately 9 hours, Task Output stopped updating. That was almost 4 hours ago. Downloading the log file shows the same thing. Here are the last few lines:

```
2022-05-24 19:38:25,470 DEBUG: Matching G0017061.JPG and G0017193.JPG. Matcher: FLANN (symmetric) T-desc: 274.920 T-robust: 0.022 T-total: 274.947 Matches: 66 Robust: 11 Success: False
2022-05-24 19:38:40,150 DEBUG: Matching G0017103.JPG and G0017053.JPG. Matcher: FLANN (symmetric) T-desc: 263.768 T-robust: 0.026 T-total: 263.827 Matches: 53 Robust: 11 Success: False
2022-05-24 19:39:33,604 DEBUG: Matching G0016991.JPG and G0017036.JPG. Matcher: FLANN (symmetric) T-desc: 279.608 T-robust: 0.020 T-total: 279.637 Matches: 53 Robust: 10 Success: False
2022-05-24 19:39:54,833 DEBUG: Matching G0016990.JPG and G0016940.JPG. Matcher: FLANN (symmetric) T-desc: 268.316 T-robust: 0.049 T-total: 268.380 Matches: 51 Robust: 10 Success: False
```
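For anyone decoding those lines: the "(symmetric)" matcher keeps a feature correspondence only if it is mutual between the two images, then geometrically verifies the survivors (the "Robust" count), and a pair is marked `Success: False` when too few robust matches remain. Here is a minimal pure-Python sketch of that logic, an illustration rather than OpenSfM's actual code; the threshold of 20 is an assumption on my part, not something the log above confirms.

```python
# Sketch of the symmetric-match filter and the success test reflected
# in the "Matches / Robust / Success" fields of the log lines above.
# NOT OpenSfM's implementation; min_robust=20 is an assumed default.

def symmetric_matches(a_to_b, b_to_a):
    """Keep feature pairs (i, j) where A->B and B->A agree (mutual matches)."""
    reverse = {(j, i) for i, j in b_to_a}
    return [(i, j) for i, j in a_to_b if (i, j) in reverse]

def pair_success(robust_count, min_robust=20):
    """A pair counts as matched only with enough geometrically robust matches."""
    return robust_count >= min_robust
```

Under that assumed threshold, the log's "Robust: 11" and "Robust: 10" pairs would plausibly fail, hence `Success: False` on all four lines.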

(How do I insert text that will scroll left/right instead of wrapping?)

The system shows python.exe using a little CPU (7% on an 8-core/16-thread system), a little disk activity from WebODM and a couple of postgres tasks, and a little network activity from WebODM, a postgres process, and a couple of python tasks.

Any idea what may have happened?

For now, I’m going to cancel the task, reboot my computer, restart WebODM and try again. I hate to lose the 12+ hours of processing :slight_smile:

2 Likes

That happens quite often in the matching and adding-images stages for me: nothing happens in the console log for hours, then a dump of activity fills in the times when nothing obvious was happening. Check Task Manager for changes in memory working set and commit size. If they are changing for the python processes, then things are still going on and it’s best to let it keep running. I guess at some point you have to pull the plug and try different settings, though.

2 Likes

Thanks, Gordon. I killed everything and started processing a different set of similar images. About 10 hours in, it also stopped writing to the log, yet I’m still seeing activity similar to the previous set. 7% CPU is about one thread. Methinks there’s a problem with the log writer, and python is busy working away on matching.

1 Like

I restarted the same set of images as before, and it appears it was probably running fine earlier; there was just a long gap in log updates. Most of the time the log file is within a few minutes of the current time, but it does sometimes lag quite a bit (13 minutes at the moment).

Now the problem I have is that this has been running for 55 hours. I know my images were not ideal. There was a lot of sky in many of them. I have reshot the subject and tried to avoid sky as much as possible. I hate to throw away 55 hours of processing, but I think that might be wisest.

Two things come to mind: it would be nice if the log file were written to regularly, with no long periods without updates. And wouldn’t it be nice if image matching were multi-threaded? Instead of 55 hours on a single thread, it could be more like 3.5 hours across all 16 threads.

3 Likes

Currently 5 hours into matching 5880 image pairs from 526 images, and it appears that it is multi-threaded, although with other tasks I’ve seen it sitting on under 10% CPU total utilisation for extended periods.

2 Likes

I have run/am running two different sets of images of different subjects. I thought these would be similar to each other, but obviously not: notice the first is matching 2,966 image pairs, whereas the second is matching 66,430. No wonder the second isn’t completing in 50+ hours; the first completed in under 10.

```
[INFO] Loading 294 images
2022-05-24 23:42:09,976 INFO: Matching 2966 image pairs
2022-05-24 23:42:09,976 INFO: Computing pair matching with 16 processes

[INFO] Loading 365 images
2022-05-27 21:39:07,215 INFO: Matching 66430 image pairs
2022-05-27 21:39:07,215 INFO: Computing pair matching with 14 processes
```
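That 66,430 figure is not arbitrary: it is exactly "365 choose 2", the number of possible unordered pairs among 365 images. In other words, the second run matched every image against every other, presumably because no usable GPS or neighbor information was available to prune candidate pairs (that cause is my guess). The first run's 2,966 pairs is far below its exhaustive count, so pre-selection clearly pruned there. A quick sanity check:

```python
from math import comb

# 66,430 is exactly "365 choose 2": every possible image pair was matched.
exhaustive_365 = comb(365, 2)   # 66430

# The 294-image run was pruned far below its exhaustive pair count.
exhaustive_294 = comb(294, 2)   # 43071, versus the 2966 pairs actually matched
```

So the second run's workload was ~22x the first not because of the image count (365 vs 294) but because pair pre-selection apparently fell back to exhaustive matching.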

Both say they are using multiple processes. The second is using 14 processes instead of 16 because I limited it so I could do a little work on my computer while it ran. Other than max-concurrency, the options for both are the same:

auto-boundary: true, debug: true, max-concurrency: 14, mesh-size: 350000, min-num-features: 30000, resize-to: -1, verbose: true

If I look at system resources, not much memory is being used and disk access is pretty low, with most or all of the disk activity coming from processes unrelated to WebODM. CPU usage is also relatively low, running at about 17% with regular spikes into the mid-20s. This screen capture is sorted by descending CPU usage:

The top process is consistently python.exe, using 7%, occasionally dipping to 6% or spiking to 9%. There are 7 python.exe processes, but only this one is getting any CPU time.

One postgres.exe process is showing quite a bit of network activity, along with a couple of python.exe processes, but I haven’t determined what they are talking to.

I would expect image matching to be CPU-intensive (but what do I know?). Since I’m only seeing 7% CPU on one python task, and overall my system is not what you’d call busy, I surmise that only one match is actually running at a time.
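One busy python.exe at ~7% (one logical processor out of 16) is consistent with the per-pair work being serialized even though a pool of workers was created. For comparison, here is a hedged sketch of how pair matching can be fanned out across processes with Python's `multiprocessing`, the approach the "Computing pair matching with 16 processes" log message suggests; `match_pair` is a stand-in for the real matcher, not OpenSfM's code.

```python
from itertools import combinations
from multiprocessing import Pool

def match_pair(pair):
    # Stand-in for the expensive per-pair work (FLANN descriptor
    # matching plus robust verification in the real pipeline).
    a, b = pair
    return (a, b, (len(a) * len(b)) % 97)  # fake "match count"

def match_all_pairs(images, processes=4):
    """Fan candidate image pairs out across worker processes."""
    pairs = list(combinations(images, 2))
    with Pool(processes) as pool:
        return pool.map(match_pair, pairs)
```

With genuinely parallel matching like this, Task Manager should show several python.exe workers each saturating a core; one busy worker and six idle ones suggests the others are blocked or never receiving work.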

2 Likes

That is an extraordinary number of pairs for 365 images! Are they all covering a small area with >90% overlap?

1 Like

I’m trying to figure out why there are so many pairs, and how to reduce the number. The subject is a complex bronze sculpture with lots of detail, probably about 8’ long, 4’ wide, and 7’ high. Reflections are a problem. I shot a lot of images with a GoPro 5, somewhat at random, trying to cover everything.

I tried earlier with ~120 images, and there were lots of missing areas and bad reconstructions, so I went back and shot more photos. Since 66K matches were going to take a very long time, I’ve culled a lot of images. In examining the image details, I’ve found the GoPro image quality isn’t really the greatest: a lot of images were not focused well, and depth of field is inadequate in many cases.

I am now re-running with about 170 images. There’s a lot of background that can’t be avoided, which I will remove later with Blender. I used auto-boundary: true on my latest completed run, and that trimmed away about half of the sculpture, so I’m running without that option now, and with additional images.

First step is to try to get something reasonable. Second step may be to go back and shoot it all again. I’m not where the sculpture is right now, so I can’t do that for a few weeks, and then I need to wait for another cloudy day for shooting. I may try my DSLR (I’ll have to take a ladder; I put the GoPro on a selfie stick to reach the top), or maybe my cell phone. Those are opposite ends of the spectrum, but I believe either could get sharper images with good depth of field. I did try my cell phone, but when shooting many images it started lagging, I suspect due to sensor heating.

2 Likes

For a difficult object like that, I’d try something like a curtain of dense black hessian around it. That should remove annoying background features and reflections.

2 Likes

That’s a great idea, but it’s in a public park. There are limits to what I can do. :smiley: But I’ll keep that in mind for future projects.

2 Likes

I’ve been attempting a large task here for the past 4 days, and it’s been 11 1/2 hours since the last matching-images update! I’m not concerned though :slight_smile:

Using Performance Monitor I can see the CPU has been working its way through the task, even though it is currently only at ~20% utilisation. The period with no updates started when the CPU activity increased.

Hmm, I’ve just noticed another example of incorrect timing! The task has been running for 4 days and 1.5 hours, i.e. 97.5 hours, yet the timer says only 87h03m.
I went to Diagnostics and back to the task, and now it says it has only been going 85 hours 6 minutes! Something is clearly wrong with the timer.

2 Likes

I also ran into another instance where the screen log and the log file did not update for more than 12 hours. As the task had been running for over 24 hours, and I projected it wouldn’t complete for another ~90 hours, I killed it. I’m working on a post about image matching: it is not running multiple threads, even though the log states that it is. I’m wondering whether this is a documentation problem or a software bug.
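A "~90 more hours" projection like that can be made from the log itself: divide the pairs matched so far by the elapsed time, then extrapolate linearly over the remaining pairs. The figures below are hypothetical round numbers chosen for illustration, not values taken from the actual run.

```python
def remaining_hours(total_pairs, pairs_done, hours_elapsed):
    """Linear extrapolation of matching time from progress so far."""
    rate = pairs_done / hours_elapsed            # pairs matched per hour
    return (total_pairs - pairs_done) / rate     # hours left at that rate

# Hypothetical example: 14,000 of 66,430 pairs matched in the first
# 24 hours projects to roughly 90 more hours of matching.
est = remaining_hours(66430, 14000, 24.0)
```

It's a crude estimate (per-pair cost isn't constant, and matching is only one stage of the pipeline), but it's enough to decide whether a run is worth keeping alive.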

2 Likes

I wasn’t watching at the time, but my example yesterday updated after over 14 hours and the timer clock jumped to the new correct time; it has now been over 19 hours since then, with no further updates in the console.

Things are still happening though, and all 16 logical processors appear to be active in the current matching stage, 5 days into the task.

1 Like

I’ve just passed 46 hours since the console updated! It’s in the middle of adding images to the reconstruction. CPU has averaged about 60% utilisation over the past couple of days, so it is clearly still churning away at the task.
268 hours into it now, although the console timer isn’t quite up to 227 hours.

2 Likes