GPU based processes

Hi,

I have recently been testing WebODM with a GPU.

I was monitoring GPU usage continuously. During feature extraction, the GPU was being used. Now the matching process is running and there is no GPU usage as such. Feature matching and building the sparse point cloud are quite lengthy processes.

This led me to wonder whether all the steps involved use the GPU or not.

  1. In which steps is the GPU used?
  2. Has anyone compared the time taken by CPU-based processing and GPU-based processing?
  3. Is it worth using WebODM with a GPU in the cloud?
  1. Feature extraction and dense point cloud generation.
  2. ….
  3. Why not? If you need a dense point cloud as a product, it will probably be way faster.

Yes

However, that was only for that set of images. Results will no doubt vary depending on the content of the images and the number of images.

Thanks. That brings me to another question. So, I have been monitoring the GPU utilization.

The NVIDIA Tesla T4 has 16 GB of memory. But after observing for quite some time, I see that GPU memory usage is 0.5 GB. Even during feature extraction, GPU memory usage did not exceed 0.5 GB.

I started WebODM using the following command:
sudo ./webodm.sh --gpu start

  1. Is this the normal way WebODM works with a GPU?
  2. If not, how can I utilize the full memory of the GPU?
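For anyone who wants to log the GPU memory figures discussed here instead of watching them by eye, the following is a minimal sketch, assuming `nvidia-smi` is on the PATH of the host. The helper names (`parse_gpu_csv`, `sample_gpus`, `watch_peak`) are made up for illustration and are not part of WebODM.

```python
import subprocess
import time

# Fields requested from nvidia-smi; with "nounits" the values are MiB, MiB and percent.
QUERY = "memory.used,memory.total,utilization.gpu"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total, util = (int(field.strip()) for field in line.split(","))
        rows.append({"used_mib": used, "total_mib": total, "util_pct": util})
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)


def watch_peak(samples=60, interval_s=5.0):
    """Poll the first GPU and return the peak memory use seen, in MiB."""
    peak = 0
    for _ in range(samples):
        peak = max(peak, sample_gpus()[0]["used_mib"])
        time.sleep(interval_s)
    return peak
```

Calling `watch_peak()` in a separate terminal while a task is processing gives a single peak number that can be compared between runs with different settings. This sketch assumes every field comes back as an integer; a GPU that reports a field as not available would need extra handling.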

What size images?
20MP M2P images won't fit in 4GB of VRAM.

For now, I have tested with a GoPro Hero 10 dataset. Its images are 23 MP and the file size varies from roughly 10 MB to 14 MB.
But the last dataset I processed was from an M2P, and the file size again varied from 11 to 13 MB.

And while both datasets were being processed, GPU memory usage did not exceed 0.5 GB to 0.6 GB. Both projects had roughly 1000 images.

Assuming you were using ultra quality feature extraction and no resizing, that’s a bit strange. That’s similar to the VRAM use I see with M2P resized to 3644 wide, and then using medium quality feature extraction.

I did not use ultra feature quality. It was set to High, and the images are being resized to 2048 (which is the default setting).

I think high feature quality overrides the 2048 default if you don’t change it.
If you want to use more of your GPU's capability, you could use ultra quality feature extraction, set resize to -1, and set the initial resize option to ‘No’.

In my testing, processing 20MPx images with no resizing and feature-quality=Ultra used a maximum of 7.4GB of VRAM.
With feature-quality=High and the 20MPx images resized to 2048x1355px, that dropped to a maximum of 0.6GB of VRAM.
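To put those two measurements side by side, here is a small arithmetic sketch using only the figures quoted above. Note that the two runs differ in both image size and feature quality, so the ratios are only a rough comparison and not a model of how VRAM use scales.

```python
def megapixels(width_px, height_px):
    """Pixel count of an image in megapixels."""
    return width_px * height_px / 1e6


# Figures taken from the post above.
full_mp = 20.0                        # unresized images, feature-quality=Ultra
resized_mp = megapixels(2048, 1355)   # resized images, feature-quality=High
vram_ultra_gb = 7.4
vram_high_gb = 0.6

pixel_ratio = full_mp / resized_mp         # how many times fewer pixels after resizing
vram_ratio = vram_ultra_gb / vram_high_gb  # how many times less VRAM was observed

print(f"resized image: {resized_mp:.2f} MP")
print(f"pixel count ratio: {pixel_ratio:.1f}x")
print(f"peak VRAM ratio: {vram_ratio:.1f}x")
```

The drop in VRAM is larger than the drop in pixel count, which fits with the feature quality setting also having changed between the two runs.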

Right now, looking at the logs,
2022-09-04 03:54:04,076 INFO: ht_0243_102.JPG resection inliers: 1412 / 1661
2022-09-04 03:54:04,099 INFO: Adding ht_0243_102.JPG to the reconstruction
I am trying to understand which process these lines come from. Is it the sparse point cloud generation step?
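Log lines of this shape ("resection inliers", "Adding ... to the reconstruction") are written by OpenSfM as it adds images one at a time to the reconstruction. To follow how far a long run has progressed, a minimal sketch like the one below could count them in a saved task log. The function name and the idea of saving the log to a text file are assumptions for illustration; the regular expressions are based only on the two lines quoted above.

```python
import re

# Patterns modelled on the two log lines quoted in this post.
RESECTION = re.compile(r"INFO: (\S+) resection inliers: (\d+) / (\d+)")
ADDED = re.compile(r"INFO: Adding (\S+) to the reconstruction")


def reconstruction_progress(log_text):
    """Summarise resection results and the images added so far."""
    resections = [
        {
            "image": name,
            "inliers": int(inliers),
            "total": int(total),
            "ratio": int(inliers) / int(total),
        }
        for name, inliers, total in RESECTION.findall(log_text)
    ]
    added = ADDED.findall(log_text)
    return {"resections": resections, "added": added}


sample = (
    "2022-09-04 03:54:04,076 INFO: ht_0243_102.JPG resection inliers: 1412 / 1661\n"
    "2022-09-04 03:54:04,099 INFO: Adding ht_0243_102.JPG to the reconstruction\n"
)
progress = reconstruction_progress(sample)
print(len(progress["added"]), "image(s) added")
```

Comparing the number of images added with the total number of images in the project gives a rough idea of how much of this step remains.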

Using the htop command, I can see the running process as python3 /code/SuperBuild/install/bin/opensfm/bin/opensfm_main.py rs_correct /var/www/data/06e05a1c-9014-4be9-a0f0-d0b7223162aa/opensfm

I was searching and found my own ticket:

It seems this process does not use the GPU and, moreover, does not utilize all the CPU cores. Hence, it kind of defeats the purpose of using a GPU, as the GPU remains idle. Right now this remains the most time-consuming process.

Is it possible to speed up the Reconstruction process?

I think you'd better play with the parameters to save time.
I upgraded from 2.7.1, where I got an error message about the GPU (maybe because the images were bigger than the GPU can handle). Now, with 2.8.8, I can't see any error after “CUDA detected”, but I can only spot a few very short spikes in GPU activity, something like 4-5 spikes of 1 second each over more than 1 hour of processing.
I suppose this is similar to the AI stuff: you have all these free tools to build a project, but when it comes to processing something serious, you need to use their own services…