Curious if anyone has spent any time profiling/optimizing/multithreading OpenSfM?
I just completed a map of roughly 1 km x 1 km from ~1,500 photos, and I’d like to go a bit bigger, but I’m worried I’m going to hit the limits of processing time/RAM pretty quickly.
I ran ODM on a machine with 32 cores and 256GB of RAM, but for much of the processing time I was only using a single core (and ~120GB of RAM). The reconstruction stage, in particular, took a huge amount of time – ~10 hours.
Before I dive in, set up a profiler, and start looking for areas to improve, has anyone done this already? If folks have already invested a lot of time tightly optimizing things, I don’t want to try to eke out a small improvement. But if there’s been only cursory attention paid to performance, maybe I can find some ways to shave down processing time.
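(By “set up a profiler” I don’t mean anything fancy; something like the sketch below is what I have in mind, where run_reconstruction() is just a placeholder for whichever stage I’d wrap. Purely illustrative, not ODM code.)

```python
import cProfile
import pstats

def run_reconstruction():
    # Placeholder for the stage under test (e.g. calling into the
    # reconstruction entry point); here it does nothing.
    pass

# Profile one stage and print the 20 hottest call sites by cumulative time.
cProfile.run("run_reconstruction()", "reconstruct.prof")
pstats.Stats("reconstruct.prof").sort_stats("cumulative").print_stats(20)
```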
I’m certain there are several areas where OpenSfM could be optimized.
One of the biggest challenges in the OpenSfM codebase is that many operations still happen on the Python side (the Mapillary folks have been moving a lot of code to C++, which has improved performance a lot), and Python’s GIL complicates optimizations that lean on multi-threading.
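A minimal sketch of why the GIL hurts here (nothing OpenSfM-specific): pure-Python work doesn’t speed up with threads, because only one thread executes Python bytecode at a time. The wins come from pushing the heavy lifting into compiled code that can release the GIL, or into separate processes.

```python
import threading
import time

def busy_python_loop(n=5_000_000):
    # Pure-Python arithmetic: the thread holds the GIL the whole time.
    total = 0
    for i in range(n):
        total += i * i
    return total

# Four threads, but the GIL means the bytecode runs one thread at a time,
# so this takes roughly as long as running the four loops sequentially.
start = time.perf_counter()
threads = [threading.Thread(target=busy_python_loop) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"4 threads of pure-Python work: {time.perf_counter() - start:.2f}s")
```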
So WebODM is based on OpenSfM?
Good to know, since it’s relevant to my other question about importing with the photogrammetry add-on in Blender.
Part of our pipeline is, yep!
I had (perhaps naively) assumed that it was built on the fork of OpenSfM in the OpenDroneMap GitHub (GitHub - OpenDroneMap/OpenSfM: Open source Structure from Motion pipeline), but the docs in the ODM repo point to the Mapillary OpenSfM.
Trying to parse through the snapcraft.yaml files and configure.sh, maybe my assumption was wrong?
(Also, TIL about the GIL in Python and became very sad.)
Edit: Nope, found it. It looks like my assumption was correct: ODM/External-OpenSfM.cmake at master · OpenDroneMap/ODM · GitHub
Yeah, you are correct. We maintain forks of some things in our pipeline where upstream doesn’t yet have the behaviors and changes we rely on.
Interesting. I had also assumed that dev was pretty stale, but it just looks like development happens in a series of side branches rather than in master?
For our stuff? Yes, I think Piero does work in branches so they can be grabbed by the various scripts to integrate everything into the final package(s) that become WebODM, etc.
Yeah, I noticed that the currently deployed branch is ~1,600 commits ahead of master and 8 commits behind. That explains the confusion.
Alright, got a dev environment set up, which was kind of a hassle on macOS (some of the dependencies conflicted, which sent me down a rabbit hole of dependency resolution). But I got it running and started profiling the various pieces. Found a few potential candidates for speedups: triangulation in the reconstruction stage might benefit from JIT compilation via numba, for instance, which would work around the GIL limitations if we ever did want to parallelize that process.
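To give a sense of what I mean by numba, here’s a toy DLT triangulation routine, jitted and parallelized. This is only a sketch under my own assumptions, not OpenSfM’s actual triangulation code or API:

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True, cache=True)
def triangulate_dlt(P1, P2, x1, x2):
    """Toy linear (DLT) triangulation of N correspondences.

    P1, P2: (3, 4) projection matrices.
    x1, x2: (N, 2) image points. Returns (N, 3) points.
    """
    n = x1.shape[0]
    out = np.empty((n, 3))
    for i in prange(n):  # numba's own threads; the GIL doesn't serialize them
        A = np.empty((4, 4))
        A[0, :] = x1[i, 0] * P1[2, :] - P1[0, :]
        A[1, :] = x1[i, 1] * P1[2, :] - P1[1, :]
        A[2, :] = x2[i, 0] * P2[2, :] - P2[0, :]
        A[3, :] = x2[i, 1] * P2[2, :] - P2[1, :]
        # Homogeneous solution = smallest right singular vector of A.
        _, _, vt = np.linalg.svd(A)
        X = vt[3, :]
        out[i, 0] = X[0] / X[3]
        out[i, 1] = X[1] / X[3]
        out[i, 2] = X[2] / X[3]
    return out
```

(Of course, this kind of thing only helps where the inputs are plain NumPy arrays rather than wrapped C++ objects.)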
A bunch of code just farms out work to OpenCV, which I assume is already about as fast as it’s going to be (e.g. the flann_index function is simply going to take the time it takes).
Going to experiment a bit over the next couple days and I’ll write up anything I find.
Wrote up a quick and dirty profiling pass on the various stages. In general, I think there’s not a ton of fat to trim (unfortunately/fortunately), but there are definitely a few places I want to poke around. It’s pretty surface level, but I think there’s interesting meat in there.
Particularly in the reconstruction, mesh, and create tracks stages.
Comments here (or in the doc) are very welcome. I’m new to this project, so I may have made some obvious mistakes or missed some observations, and my profiling experience is more in Java/C++/C#/JS land than Python land, so I may not be the best at that either.
But nonetheless I hope this is useful/interesting to y’all while I poke around a bit and see what I can do.
Looks great!
Beyond me, haha.
Thanks for taking the time to help us all out!
Alright, spent the day tinkering and exploring the reconstruction code in depth.
There are a lot of places where C++ calls are mixed in with Python in ways that make it hard to squeeze much more out of the code. Techniques like JIT compilation struggle when data has to cross back and forth between Python and C++.
I put about 4 hours into pulling it apart and putting it back together in different ways, and I didn’t find any low-hanging fruit. I did manage to get numba working in very, very small places, but the benefit was really quite minimal.
Someone braver than me might be able to take a look at the grow_reconstruction method and figure out how to parallelize it, but from the hours I spent today it looks like there’s a lot of state being updated serially as the process chews through data, and ensuring you could grow that reconstruction without a race condition would be a ton of work.
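To illustrate the shape of the problem, here’s a heavily simplified, hypothetical sketch of incremental growth, with made-up names; it is not the actual OpenSfM code:

```python
class Reconstruction:
    """Stand-in for the shared state that every step reads and mutates."""
    def __init__(self):
        self.shots = {}    # image name -> camera pose
        self.points = {}   # track id -> 3D point

def grow(reconstruction, candidates, resect, triangulate, bundle):
    # resect/triangulate/bundle are injected stand-ins for the real steps;
    # each one depends on the state produced by the previous iteration.
    while candidates:
        image = candidates.pop(0)
        pose = resect(image, reconstruction)        # needs current points
        if pose is None:
            continue
        reconstruction.shots[image] = pose          # mutates shared state
        triangulate(image, reconstruction)          # reads and writes points
        bundle(reconstruction)                      # re-optimizes everything
    return reconstruction
```

Running two of those iterations concurrently would mean two resections against a point cloud that’s changing underneath them, which is exactly the race I’d be worried about.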
I spent an hour with mesh and create tracks. I think it might be possible to speed up create tracks, but meshing also has C++ objects mixed in, making it harder to tease apart, so there wasn’t much I could do there. I did manage a few small optimizations, but when I profiled them they didn’t save a noticeable amount of time.