Ok my friends. I present to you my findings on combining ground level photos with aerial imagery in a cohesive and fantastic model. Now, there are some limitations to this model: one set of photos was taken on a sunny day at sunset and the other on a partially cloudy day a couple hours before sunset. In addition to lighting differences, we can expect challenges from differences in GPS between days (we cannot rely on lower relative error for short flights). But the challenges are an important part of understanding how to do this even under less-than-ideal conditions.
I have been trying to think through the problem space of combining aerial and ground photos in an effective and scalable way for a while. At first I thought it would be easy (circa February 2021), and now I know it is not as easy as it could be.
But let’s start with some results and the successful endpoint (cloud):
So, let’s talk about failures when this doesn’t work. This was the first result I got when I combined ground images and low flight images with higher orbits:
The ortho isn’t quite… I mean, I suppose it is orthorectified, but the position from which it’s been orthorectified is way off…
And this has everything to do with the fact that OpenSfM is now doing a more meaningful job integrating all possible data into the model (yay!) but not doing a good job initializing the model (awwww!).
To test this theory, I tried setting the GPS accuracy for the imagery to encourage the SfM to appropriately prioritize the good data when setting up the model. This didn’t work well:
See how we can see the sides of the buildings? It is better, but not good enough. What we need to do instead is initialize the model using the best available data so that the orientation is better fixed from the get-go.
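To make the idea behind per-image GPS accuracy concrete, here is a toy inverse-variance weighting in Python, where each position estimate counts in proportion to 1/sigma². This is only a sketch of the general principle, not OpenSfM's actual code; the function name and the numbers are made up for illustration.

```python
def fuse_positions(estimates):
    """Combine (position_m, sigma_m) pairs for one coordinate.

    Each estimate is weighted by 1 / sigma^2, so a small sigma
    (accurate GPS) dominates a large sigma (poor GPS).
    """
    weights = [1.0 / (sigma ** 2) for _, sigma in estimates]
    weighted_sum = sum(w * pos for w, (pos, _) in zip(weights, estimates))
    return weighted_sum / sum(weights)


# Hypothetical values: one accurate fix at 0 m (sigma 1 m) and
# one poor fix at 10 m (sigma 10 m).
fused = fuse_positions([(0.0, 1.0), (10.0, 10.0)])
print(f"fused position: {fused:.3f} m")
```

The fused value stays close to the accurate fix. Weighting like this only affects how estimates are combined, though, which fits the observation above: it does not help much if the model is initialized from the poor data in the first place.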
Barring changes to OpenSfM (which I will be proposing; I suspect it’s at worst a 10-line fix), I simply stripped all the GPS tags out of the less accurate data and reprocessed. This ensures that no orientation data is taken from the images with poor GPS. It means we have to use BOW matching, which usually isn’t as fast as the current incremental approach.
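For anyone wanting to try the GPS-stripping step, here is one possible way to do it in Python by calling exiftool. This is a sketch under assumptions, not necessarily how I did it: it assumes exiftool is installed and on the PATH, the folder name and `*.JPG` pattern are hypothetical, and `-gps:all=` only clears the EXIF GPS group, so any position data stored elsewhere (for example in XMP) would need separate handling.

```python
import shutil
import subprocess
from pathlib import Path


def build_strip_gps_command(image_paths):
    """Return the exiftool argument list that deletes the EXIF GPS group."""
    # "-gps:all=" clears every tag in the EXIF GPS group;
    # "-overwrite_original" skips exiftool's "_original" backup copies.
    return ["exiftool", "-gps:all=", "-overwrite_original", *map(str, image_paths)]


def strip_gps(image_dir, pattern="*.JPG"):
    """Strip GPS tags from every matching image in image_dir."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    images = sorted(Path(image_dir).glob(pattern))
    if images:
        subprocess.run(build_strip_gps_command(images), check=True)
    return images


if __name__ == "__main__":
    # Hypothetical folder holding a COPY of the less accurate image set.
    target = Path("less_accurate_copy")
    if target.is_dir():
        print(f"stripped GPS from {len(strip_gps(target))} images")
```

Because `-overwrite_original` modifies files in place, run this on a copy of the less accurate image set and keep the originals untouched.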