Feature request: Limit number of keypoints (SIFT) for potential speedup?

Recently, I processed some data with Pix4D and checked its processing logs. I noticed that the number of keypoints it produces is quite stable: for a 5472x3648 image, it extracts roughly 60000 keypoints at full scale and 5000 keypoints at 0.25 scale. Even for low-contrast images, I did not observe a significant drop in the number of keypoints.

The current ODM/OpenSfM setup limits the lower bound but not the upper bound. For images of the same size, I observe up to 400000 points for most images and as few as 10000 for some low-contrast ones. As I understand it, the number of keypoints is crucial to feature matching and reconstruction speed, so 400000 points per image should increase processing time considerably.

On the accuracy side, my intuition is that high-resolution images also give more accurate keypoint locations than low-resolution images, even when the detections land in the same places.

So I think we should also cap the maximum number of keypoints per image. The problem is how to keep the good features. OpenCV keypoints carry a response value representing their cornerness, which could serve as one criterion. One idea, from alicevision's implementation, is to divide the image into a grid and sample points evenly from the cells. Another idea is to sample points evenly across scale levels, since downscaling can sometimes help match hard examples.
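To make the grid idea concrete, here is a minimal sketch of grid-based keypoint subsampling (my own illustrative code, not alicevision's actual implementation; the function name and parameters are hypothetical). It partitions the image into cells and keeps the highest-response points per cell so the cap is spread evenly over the image:

```python
import numpy as np

def grid_sample_keypoints(pts, responses, image_shape, max_total=60000, grid=8):
    """Keep at most max_total keypoints, distributed evenly over a
    grid x grid partition of the image, preferring high-response points
    within each cell. pts is an (N, 2) array of (x, y); responses is (N,)."""
    h, w = image_shape
    per_cell = max(1, max_total // (grid * grid))
    # Assign each keypoint to a grid cell.
    cx = np.minimum((pts[:, 0] / w * grid).astype(int), grid - 1)
    cy = np.minimum((pts[:, 1] / h * grid).astype(int), grid - 1)
    cell = cy * grid + cx
    keep = []
    for c in np.unique(cell):
        idx = np.where(cell == c)[0]
        # Sort this cell's points by descending response and keep the strongest.
        order = idx[np.argsort(-responses[idx])]
        keep.extend(order[:per_cell].tolist())
    return np.array(sorted(keep))
```

The same skeleton could be extended to the second idea by adding the octave as an extra "grid" dimension, so each scale level keeps a share of the budget.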

I’m interested in doing research on this topic, since it could improve speed without compromising accuracy. I need some suggestions for metrics to measure matching quality so that I can run systematic comparisons. I also want to find some “hard examples” (images that fail to match at low resolution, or fail at high resolution) for testing how the cap affects matching quality.


I frequently see 400000+ features extracted per image, but if you use ORB feature extraction the limit is currently 15000, which greatly speeds up that stage of processing.

Forgot to mention: I used SIFT to process the data.
You are right; it seems feature types like ORB and HAHOG are set to a fixed number of keypoints based on min-num-features, while SIFT and SURF just use it as a lower bound. I'm not sure why it was decided this way, since OpenCV allows passing an nfeatures argument to these feature extractors.

I tried limiting the number of keypoints in SIFT to 60000 directly. It works in most cases, but fails to match a few low-contrast images. I think keeping some features from the low-resolution levels could be important, and directly filtering keypoints by OpenCV's response value would not serve that purpose.

Ran a comparison of the octave levels for different quality settings. The octave can be read as the image scale of the corresponding keypoint: -1 means 2.0, 0 means 1.0, 1 means 0.5, and so on. "Good" is the number of matched keypoints on that octave level.
You can observe that while octave -1 produces the most keypoints (87%), only 1% of them are good.
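For anyone reproducing this: OpenCV packs the octave and layer into a single integer on `cv2.KeyPoint.octave`. A sketch of the decoding (mirroring OpenCV's internal unpackOctave logic; the function name is my own), where octave -1 is the 2x upscaled level mentioned above:

```python
def unpack_sift_octave(packed):
    """Decode the octave, layer, and scale from the packed integer that
    OpenCV's SIFT stores on cv2.KeyPoint.octave."""
    octave = packed & 255
    layer = (packed >> 8) & 255
    if octave >= 128:
        octave -= 256  # the low byte is a signed value; -1 is the upscaled octave
    scale = 1.0 / (1 << octave) if octave >= 0 else float(1 << -octave)
    return octave, layer, scale
```

Grouping keypoints by the decoded octave is how the per-octave counts above can be tallied.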
