Differences between Triangulation and Incremental SfM

Is there a one-paragraph explanation of the difference between Triangulation and Incremental SfM? What I think I understand is that Triangulation depends mostly on the position and omega, phi, kappa angles in the geo.txt file to orient the images and generate the DEM, while Incremental uses the scene content to perform the same function. Help, please!



Classical structure from motion needs to solve for the full pose graph (position, omega, phi, kappa) iteratively. In rough terms: first feature extraction, then matching, then a whole lot of what is essentially fancy statistics to iteratively estimate the full pose graph as each camera gets added.

This is known as a global SfM approach (as opposed to local, in the case of video SLAM and similar techniques) and is an exponential nightmare: every time we add a camera to the existing pose graph, we need to recalculate everything based on the new information. Global approaches are the most accurate, and for a few hundred images they are doable. For a few thousand images or more, the compute becomes unsustainable rather quickly. And as with such exponential problems, the inflection point is annoying: maybe 550 images process reasonably on a given set of hardware, but suddenly 560 images are annoyingly slow. The early days of OpenDroneMap were like this.

By contrast, you may imagine that local approaches eventually have to deal with global corrections as well, but they often just do it at the end of the process. Imagine, for example, taking a video from a camera going in a loop around a building, matching and doing structure from motion all along the way (with the attendant incremental, additive errors, and thus decreased accuracy relative to global approaches), and then, when one gets back to the start of the loop, making some back-calculations to ensure the geometry represents a loop. So: lots of small mistakes and one global correction at the end, instead of a correction for each image. (Footnote: some local methods do “loop closure” as often as they are able, not just at the end.)

So, a number of years ago (2016–2017?) we worked with Mapillary to explore a new approach that combined the attributes of global SfM with local SfM. The principle was simple: use a loop-closure approach incrementally, or occasionally, throughout a quasi-global approach. In other words: don’t update and refine the pose graph with each photo, but every n photos (this can be tuned in OpenSfM, but we no longer expose this parameter in OpenDroneMap). Without checking the code, I think we do this every 100 images.
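The every-n-photos idea can be sketched roughly like this. A toy illustration only: `estimate_pose_locally` and `bundle_adjust` are hypothetical stand-ins, not OpenSfM functions, and the stubs just show where the cheap local work and the expensive global refinement happen.

```python
def estimate_pose_locally(image, pose_graph):
    # Stub: a real implementation would match the new image against its
    # neighbours and estimate position + omega/phi/kappa from the matches.
    return {"position": (0.0, 0.0, 0.0), "opk": (0.0, 0.0, 0.0)}

def bundle_adjust(pose_graph):
    # Stub: a real implementation jointly refines all poses, i.e. the
    # "global correction" / loop-closure step.
    pass

def incremental_sfm(images, n=100):
    pose_graph = {}
    for i, image in enumerate(images, start=1):
        # Cheap local step for each new camera.
        pose_graph[image] = estimate_pose_locally(image, pose_graph)
        # Expensive global refinement, but only every n images.
        if i % n == 0:
            bundle_adjust(pose_graph)
    bundle_adjust(pose_graph)  # one final global correction at the end
    return pose_graph
```

The point of the structure: the per-image cost stays roughly constant, and the expensive global solve runs `len(images) / n + 1` times instead of once per image.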

Thus, the incremental approach was born. And, to be honest, it was successful beyond any hope I had. It’s pretty performant, very accurate, and just works™.

Triangulation layers pre-populated pose graphs onto (I think?) the incremental approach (or maybe it is efficient enough to be a global approach; I haven’t looked at the code base since it was added, but Piero may be able to comment). We can imagine that if we have a sufficiently decent model initialization with respect to position, omega, phi, and kappa, we can much more quickly and accurately converge on a good SfM model.

Ok, that was a lot of words and not a lot of pictures, but hopefully that covers it. Corrections/changes welcome.


Thank you VERY MUCH. This was completely adequate. I’m consulting on a project which presently uses the Triangulation SfM. There will be a number of cases in which the ortho looks like a jigsaw puzzle where someone shook the pieces: the pieces are all lying on the table, but not everything lines up.

When I opened the community to read your response, I found a parallel article by Alan Terra in which he is finding similar results.

If Triangulation works as you suggest, preloading the omega, phi, and kappa and then proceeding with the SfM shouldn’t be a detriment at all. I suspect your triangulation implementation weights the pre-loaded omega, phi, and kappa values over the values inferred from the scene content.

I am also pushing my customer to switch from Triangulation to Incremental SfM. Mr. Terra’s work only reinforces my belief.


Ah, yes: if the values are misread (consistent support for the encoding within and between manufacturers is remarkably lacking), then incremental is the way to go. I’m not sure about the rest of the pose graph, but we’ve seen changes to how position is stored with some manufacturers between firmware updates. I wouldn’t be surprised if we see this with the angular encoding as well, which would make proper parsing of it a within-model, within-manufacturer, and between-manufacturer inconsistency.

In short, I would consider triangulation to have preliminary support. It would be a useful exercise to start cataloging successes and failures per model and manufacturer so we at least know which drones we expect it to work with, but I don’t think anyone has had the bandwidth for this yet.


Yeah, coordinate-system inconsistency between manufacturers, and even between firmware versions, is widespread.

What would really simplify this is a tool that can import a set of images and display the assumed drone orientation, with support for adjusting any inconsistency and submitting those adjustments back to the ODM codebase (similar to the RSCalibration web page).


What about a tool that performs incremental SfM on a dataset and uses the result to create a reference EXIF (and similar) tag interpretation per make, model, and firmware version?

I’d be willing to recommend the org host the processing for this. We’d have to think through licensing of the database results, whether those are free or open.
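As a thought experiment on what a database entry might store: a minimal sketch assuming a simple linear conversion keyed by (make, model, firmware). Everything here is a made-up placeholder, not an existing ODM schema, and the firmware string is invented; the scale/offset numbers match the (yaw − 0.32) × 72 formula discussed later in this thread.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AngleConversion:
    # Hypothetical database record: how to turn a raw EXIF yaw value
    # into a kappa angle in degrees for one (make, model, firmware).
    make: str
    model: str
    firmware: str
    scale: float   # kappa_degrees = scale * raw_yaw + offset
    offset: float

    def kappa_degrees(self, raw_yaw: float) -> float:
        return self.scale * raw_yaw + self.offset

# Placeholder entry; (yaw - 0.32) * 72 expands to 72 * yaw - 23.04.
entry = AngleConversion("Wingtra", "WingtraOne", "x.y.z", 72.0, -23.04)
```

Whether entries like this live in code (easy to license AGPL) or in a data file (trickier copyleft, as noted below) is exactly the licensing question.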


That would be cool too.


Do we have a catalog of existing formats and associated datasets?


Nope. Just some logic in ODM/photo.py at master · OpenDroneMap/ODM · GitHub


Logic is fine for a starting point. Thanks!


Indeed the formats for camera orientation are bewildering.

I’ve got some data from a Wingtra One that contains what appear to be yaw, pitch, and roll angle values, and I’ve manually inspected them. I have to use a ridiculous formula to parse them; for example, I can get a reasonable-seeming Kappa value in degrees with (yaw - 0.32) * 72. Why those values? Absolutely no idea. No obvious connection to degrees, radians, gradians, arcminutes, hexacontades, or zams.
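For concreteness, that ad-hoc conversion, with the result wrapped into [−180, 180) so downstream tools see a well-behaved heading. A sketch only: the 0.32 and 72 constants come from the manual inspection described above and may not generalize to other firmware versions.

```python
def wingtra_yaw_to_kappa(raw_yaw: float) -> float:
    # Empirically derived conversion from the mystery Wingtra yaw unit
    # to degrees; constants found by manual inspection, origin unknown.
    kappa = (raw_yaw - 0.32) * 72.0
    # Wrap into [-180, 180) so repeated full turns collapse to one heading.
    return ((kappa + 180.0) % 360.0) - 180.0
```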

For a moment I thought maybe they were using binary degrees, which might make sense to reduce storage demands, though this seems like overkill given the size of the actual photos; saving 3 or even 7 bytes per photo (depending on whether it’s a 32-bit or 64-bit system) doesn’t seem worth the hassle of a weird unit. But alas, no.

The obvious solution seems to be, as Steve suggests, extracting the Omega, Phi, and Kappa angles from the reconstruction alone on a subset of data from a given platform, and comparing those values to the orientation data provided by the platform to calculate a conversion formula.
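The comparison step could be as simple as an ordinary least-squares line fit. A sketch, assuming the conversion really is linear; angle wrap-around is ignored for brevity, and `fit_linear_conversion` is a made-up name, not an ODM function.

```python
def fit_linear_conversion(raw_yaws, recovered_kappas):
    # Fit kappa ~= scale * yaw + offset by ordinary least squares,
    # comparing raw EXIF values against SfM-recovered angles.
    n = len(raw_yaws)
    mean_x = sum(raw_yaws) / n
    mean_y = sum(recovered_kappas) / n
    sxx = sum((x - mean_x) ** 2 for x in raw_yaws)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(raw_yaws, recovered_kappas))
    scale = sxy / sxx
    offset = mean_y - scale * mean_x
    return scale, offset

# Synthetic check against the (yaw - 0.32) * 72 relationship:
yaws = [0.4, 0.7, 1.0, 1.3]
kappas = [(y - 0.32) * 72 for y in yaws]
```

On clean synthetic data this recovers scale ≈ 72 and offset ≈ −23.04; real reconstructions would need outlier rejection and wrap-aware residuals.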

However, if that’s done centrally, someone will have to keep track of it, and it’ll inevitably get out of date (particularly as manufacturers apparently change units between firmware updates).

Maybe the low-hanging fruit is a mini-utility for operators to use a subset of their own data to either verify or create a conversion formula (if a stock one seems within the margin of error, it should just be validated rather than replaced), and then use that “local” formula for the rest of their own dataset, which is presumably generated with similar hardware and firmware versions. They could then be encouraged to submit the formula to a database for use by others.
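The verify-or-create decision might look like this. A sketch: the function name and the 2-degree tolerance are invented, and the fallback least-squares fit is inlined so the snippet is self-contained.

```python
def verify_or_fit(stock_scale, stock_offset, raw_yaws, recovered_kappas,
                  tol_degrees=2.0):
    # Keep the stock formula if every prediction is within tol_degrees
    # of the trial reconstruction; otherwise fit a replacement.
    worst = max(abs(stock_scale * x + stock_offset - y)
                for x, y in zip(raw_yaws, recovered_kappas))
    if worst <= tol_degrees:
        return stock_scale, stock_offset, "validated"
    # Fallback: simple least-squares refit (wrap-around ignored).
    n = len(raw_yaws)
    mx = sum(raw_yaws) / n
    my = sum(recovered_kappas) / n
    scale = sum((x - mx) * (y - my)
                for x, y in zip(raw_yaws, recovered_kappas)) \
        / sum((x - mx) ** 2 for x in raw_yaws)
    return scale, my - scale * mx, "refit"
```

The "validated" path is what keeps a community database stable: a formula only gets replaced when the operator's own data contradicts it.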

So I might process one flight’s worth of my Wingtra data (a few hundred photos, easily done in an hour or two) without using the camera angle data, use the resulting scene graph to create a conversion formula for the weird Wingtra Yaw, Pitch, and Roll units, and use that formula to speed up the processing of the rest of my dataset (many thousands of photos, painfully slow due to the exponential issues Steve mentioned earlier). Then I could submit my working conversion formula to an ODM community database, which would help other Wingtra data users—at least until the manufacturer switches units!


I feel like a “why not both” GIF might be appropriate here. A Python tool that could suggest likely formulae could also be the basis for infrastructure. One could use it separately (maybe with a copy-data mode that rewrites tags), submit photos to a hosted service, and/or submit a formula to the service, coded with make, model, and firmware version. Perhaps I’m naive, but I’m not too worried about bit rot. We’ve got enough users who will find it and get frustrated when the firmware version changes things, and they’ll have a tool to fall back on if they cannot or will not share raw data.

I feel like we need to curate a list of potential infrastructure projects…

Edit: would we do this as code so we can license under AGPL, or data? If we do data, the licensing gets trickier to make copyleft in a way that doesn’t encumber products. Any thoughts on this side most welcome.


Shall we try a low-tech version first for Wingtra? You’ve got mounds of data to process.


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.