pc-tile is amazing, and it seems to be causing less seaming than when we first implemented it.
I’m testing to see if I want to enforce it by default.
Turn it on and give it another go
Not that great. The same dataset on the Win10, 5900X, 128GB RAM, 980 PRO NVMe, and 3090FE machine is going on 38 hrs, still in the matching phase. I gave it 256GB of pagefile. So far my dual Xeon is coming out on top!
Haha No doubt about that.
But even if I owned a dual Xeon system, I would not have the electricity for it, especially to leave it running overnight. Also, if I upgraded to a Ryzen 9 5900X, I would limit its TDP to run at 65 watts. That is my allowed energy budget here in the off-grid ecovillage.
We share a photovoltaic system with a few people.
@Saijin_Naib I saw that pc-tile switch, but totally forgot about it again. I'll give that a try to see if that can limit that memory spike.
Is this your 2,800-image dataset that took 48 hours on the dual Xeon? It's somewhat encouraging that it hasn't crashed with an out-of-memory error yet. Thanks for sharing. Everything I learn here is enlightening.
Thanks for the tip. I enabled it and restarted from meshing (I hope that’s the right place ;D) I’ll let you know what happens. If meshing is wrong please tell me where to restart. I’m never quite sure…
Should be from point cloud, but if you’ve passed that already you likely don’t need it, I don’t think. Be interested to see how it goes on your machine.
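For anyone who prefers scripting over the web UI, here is a minimal sketch of how pc-tile could be enabled when submitting a task through the pyodm client (the localhost:3000 NodeODM address and the image filenames are just placeholders; the same option can be set in the WebODM task settings):

```python
from pyodm import Node

# Placeholder NodeODM address and image list, for illustration only.
node = Node("localhost", 3000)
images = ["DJI_0001.JPG", "DJI_0002.JPG"]

# pc-tile splits the point cloud / depthmap fusion step into tiles,
# which caps the memory spike at the cost of some extra runtime.
task = node.create_task(images, {"pc-tile": True})
task.wait_for_completion()
print(task.info().status)
```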
Failed at 7.5 hrs
2023-03-19 06:05:35,563 INFO: Shots and/or GCPs are well-conditioned. Using naive 3D-3D alignment. /code/SuperBuild/install/bin/opensfm/bin/opensfm: line 12: 5766 Killed "$PYTHON" "$DIR"/opensfm_main.py "$@"
Exact same dump as before. No need to duplicate. I just don't think this machine has the power to handle it. It's very old; the i5-6600K was released in 2015. Passmark scores it at 6,343, while a 24-core Threadripper Pro 5965WX scores 66,371. That should put it into perspective.
I'm starting to think a super beast WRX80 Threadripper Pro with 512GB is the only machine. I had high hopes for an R9 with 128GB… just not thinking it's got the memory. If @69charger can run 2,800 images with his dual Xeon and 512GB, then a 24-core Threadripper Pro should handle things pretty well. Safe bet, I'd think.
Add to that, I'm seeing reviews that NVMe drives on PCIe v5 just aren't that impressive in real use. They can be quick for sequential, but there isn't really any difference from v4 for random read/write. So that's looking like it's going to be a non-issue for performance. Why pay the premium for v5 when it will be a minor improvement?
Good morning,
just ran a 300-image batch (300x100m in real-world size) from your dataset while the last images are still downloading.
First run on defaults with 5cm GSD and then a rerun with 0.5cm GSD.
It does not look like there is really any difference in the overall processing time. The GSD setting is only for the orthophoto, while 3D model quality and resolution is controlled by point cloud quality. Even a mediocre point cloud can still create an amazing orthophoto in my experience.
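As a small illustration of that split (option names as in the ODM docs, values only as examples), orthophoto resolution and point cloud quality are two independent settings:

```python
# Example ODM options: the orthophoto GSD only affects the rendered orthophoto,
# while pc-quality drives densification cost and therefore most of the runtime.
options = {
    "orthophoto-resolution": 0.5,  # cm/pixel, example value
    "pc-quality": "medium",        # ultra | high | medium | low | lowest
}
```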
Now to the difference between orthophoto resolutions:
The orthophoto @ 5cm GSD is 34 Mbyte in size, the higher resolution one comes in at 2 Gbyte. Resolution for the first is 1899x6375 and the 0.5cm GSD image has 18999x63751 pixels. Like exactly 10x as much, who could have known
Here a non-scaled 100% sample photo showing both orthophoto resolutions:
In Megapixel that is an increase from 12MP to 1200MP. But as I suspected the orthophoto resolution is not really computing intensive. The bulk of the work for ODM is the feature detection, image matching and creating a point cloud from that. To then stretch a skin (texture) over it, is (comparatively) not that much work.
I also ran the little test batch to see how the endless sea of this kind of photo:
would perform for matching images. But it does not seem to pose a problem, probably because of the really, really high resolution of your imagery.
And I am really impressed by the 45MP resolution of the Zenmuse P1. Not that I would want to fly anything more complicated than a Phantom 4 Pro, but the resolution and detail you catch from 40m (130 feet) height is spectacular
Though the actual test dataset is not really a stunner
At the moment there is certainly no real reason to move to PCIe 5.0 SSDs. Most PCIe 4.0 drives can hardly ever max out the available bandwidth. I am using different PCIe 4.0 drives (Samsung 980 Pro, WD Black SN770 amongst others), and in many situations I get transfer rates of ~1000 Mbyte/s or less, which even PCIe 3.0 is plenty fast enough for.
Using NVMe drives for swap does squeeze out a bit more performance, though that "squeeze" ages them. But then again, NVMe drives have become so cheap, and I have seen people try to "kill" their NVMe drives with 24/7 random reads/writes; many disks could take multiple times their official TBW and the tests were still running.
If you can afford that, it will totally be an amazing rig to do computing on
But I am running 2800+ image sets on an 8 core Ryzen 7 with 64Gbyte of memory, no problems.
Yet, if that Threadripper tickles you so much, go for it …
but I hope you would then be open to run some datasets to compare between our machines?
I actually just want to know and justify that I also need a Threadripper. That is why I started benchmarking. But if you have one and it proves to be necessary, I would finally have a good argument for a serious upgrade.
Yes I just stopped it at 59 hrs. Was still in the matching stage. So I’ll stick with processing on the server
Without question I would be happy to stay involved and run tests.
This was my fear and why I’m hesitant to go the desktop route. Thank you for taking the time and effort. I’m fairly convinced that this isn’t the way to go for my use case.
LOL. I think I said it was unimpressive. It’s much better than when it had snow all over it. I flew a few flights when it was snowy and it couldn’t match at all. Learned that photogrammetry and snow don’t mix.
Thanks for the in-depth look at GSD and processing times. I'm surprised at the result, but your explanation makes sense.
Can you post your parameters please? I’d like to give the dataset a shot on the parameters you’d use on your 64GB PC. Maybe some of my problem is poor parameter choice.
Appreciate everything everyone has done here. What a great group of folks. You’ve really helped me learn a lot. This feels like the beginning of a long journey.
120MP I think, 100X, not 1000.
I also thought that first, but just now calculated it again. The resolution of the image is 18999×63751 which multiplied equals 1211205249.
Putting thousands separators 1,211,205,249. So 1200MP?!
Oh, just checking again … actually you are right: the resolution of the smaller image is 1899x6375, so 12,106,125 pixels, which is 12MP and not 1.2MP.
So you are right, it's by a factor of 100x, not 1000x, but the lower number was incorrect.
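Spelled out with the dimensions from the posts above:

```python
large = 18999 * 63751  # 1,211,205,249 px -> ~1200 MP
small = 1899 * 6375    # 12,106,125 px    -> ~12 MP
print(large, small, large / small)  # the factor is ~100x
```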
Sounds a bit funny to be that long in the matching stage. I sometimes remove GPS tags to make ODM match by image content instead of GPS. That can really, really extend the process time, but it helps with imagery taken during multiple flights where the GPS went a bit off.
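If anyone wants to try the GPS-stripping approach, a minimal sketch using exiftool (the folder name is a placeholder; run it on a copy of the dataset, since -overwrite_original rewrites the files in place):

```python
import subprocess

# Remove all GPS tags so ODM has to match purely by image content.
subprocess.run(
    ["exiftool", "-gps:all=", "-overwrite_original", "copy_of_images/"],
    check=True,
)
```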
But, just read your post again and I suspect something must have gone funny.
Also thinking that your Ryzen 9 5900X shouldn’t really be slower than my Ryzen 7
Ran @fullwatts test dataset.
With an 8-core, 64 Gbyte machine with a CUDA graphics card on DEFAULT settings, the process took just under 17 hours and finished without problems. The dataset contains 1314 images at 45MP resolution, which adds up to 36 Gbyte of data (subjectively a lot of data)!
Peak memory usage (physical and virtual combined) was 122 Gbyte, and 350 Mbyte of graphics card memory was used. Surprisingly to me, the memory peak was not during point cloud creation but during image matching, and it lasted for an extended period of time (point cloud quality was left at the default).
If I were to run similar datasets more often, I would clearly recommend having 128 Gbyte of physical memory installed.
During the process, roughly 5 Terabyte of data was written to the swap file located on an NVMe drive, which subjectively is quite a bit. But that is why I would upgrade my memory if I ran such datasets more often.
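For reference, a minimal sketch of adding a swap file on an NVMe drive under Linux (path and size are placeholders, root privileges are needed, and as noted above heavy swap use does age the drive):

```python
import subprocess

# Create and enable a 256 GB swap file on an NVMe mount (placeholder path).
swapfile = "/mnt/nvme/swapfile"
for cmd in (
    ["fallocate", "-l", "256G", swapfile],  # reserve the space
    ["chmod", "600", swapfile],             # restrict permissions
    ["mkswap", swapfile],                   # format as swap
    ["swapon", swapfile],                   # enable it immediately
):
    subprocess.run(cmd, check=True)
```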
About the GSD and resolution of the orthophoto:
With default settings the orthophoto is 135 Mbyte in size and has a resolution of 4,210x8,713 pixels (36 Megapixel). When changing the orthophoto resolution to 0.5cm GSD and rerunning the dataset from the orthophoto stage, the file size increases to 10.1 Gbyte with a resolution of 48,342x100,054 pixels (4.8 Gigapixel). Rerunning the dataset for the higher resolution orthophoto takes 20 min, if the process data is still available.
Do not open a >10 Gbyte GeoTIFF in your image editor if you have less than 32 Gbyte of RAM, or you will run out of memory. Software like QGIS is clearly preferable and can handle GeoTIFFs (especially with pyramids/previews) very smoothly.
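If you do need to look at such a file, building overviews (pyramids) first makes panning and zooming much lighter; a small sketch using GDAL's gdaladdo (the filename is a placeholder):

```python
import subprocess

# Build internal overviews so QGIS only reads downsampled levels when zoomed out.
subprocess.run(
    ["gdaladdo", "-r", "average", "odm_orthophoto.tif", "2", "4", "8", "16", "32"],
    check=True,
)
```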
To mention it: during processing, 200 Gbyte of storage was used. Most of that will be freed after the process has completed, but that is the amount of free space needed during processing. In case that's tight, enable optimize-disk-space in the options.
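For example, as a task option (it deletes intermediate files as the pipeline progresses, which also means you cannot restart from earlier stages):

```python
options = {"optimize-disk-space": True}  # smaller peak disk footprint
```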
Relating this result to the topic question:
a desktop PC with up to 128Gbyte of memory can handle such datasets and compute them in a reasonable time.
And the 8 core computer I am using is even far from being a high-end desktop PC like a Ryzen 9 or Intel Core i9.
Here is a link if somebody wants to see the detailed hardware utilization during the process:
http://crearconamor.com/webodm-graph/index.php?dataset=datasets%2F1314pics_default-64G-gpu_45MP_east-blair-avenue_webodm-monitor.csv&load=1
I so appreciate this discussion. I just upgraded to an M2 Max and would like to be able to take full advantage of this hardware, here at ODM but also in my Python development. Right now everything executes on my CPUs. My limited understanding is that the only way to access/leverage the GPU is via NVIDIA hardware. Can someone here provide some context for this dilemma and how to work around it?
A bit of hijacking this thread
It’s because of CUDA which is a toolkit developed and provided by NVIDIA to allow computing to take place on a GPU.
Some of the software that ODM uses is written for CUDA. That's it. And it only runs on NVIDIA cards.
There are some translation layers out there to run CUDA code on AMD cards
But I don't know whether any of that is ripe for implementation yet.
So for now, it stays with NVIDIA cards only
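A quick way to check whether a CUDA-capable NVIDIA GPU is visible to the system at all (a small sketch; nvidia-smi ships with the NVIDIA driver):

```python
import shutil
import subprocess

# With the NVIDIA driver installed, nvidia-smi lists the visible GPUs;
# without one, ODM's GPU-accelerated steps fall back to the CPU.
if shutil.which("nvidia-smi"):
    subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv"],
        check=True,
    )
else:
    print("No NVIDIA driver/GPU found")
```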
I sold my Radeon for a GeForce because of ODM and because of NVENC.
Out of curiosity I ran @fullwatts dataset again on DEFAULT settings, but I lowered the resolution from 45MP to 20MP.
The GSD changed from 0.5cm to 0.71cm.
Yet the processing time dropped from almost 17 hours to 3.5 hours. Also memory usage was significantly lower (48Gb instead of 122Gb):
An indication for me that an 8-core, 64 Gbyte PC is suitable for 20MP images, while for anything higher in resolution, more cores and more memory will clearly be helpful/necessary.
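A back-of-the-envelope check of why the GSD moved from 0.5cm to roughly 0.7cm: GSD scales with the inverse of the linear image resolution, i.e. with the square root of the megapixel ratio.

```python
full_mp, reduced_mp = 45, 20
gsd_full = 0.5  # cm/pixel at the full 45MP resolution (from the earlier run)

# Linear resolution shrinks by sqrt(20/45), so the GSD grows by the inverse.
gsd_reduced = gsd_full * (full_mp / reduced_mp) ** 0.5
print(round(gsd_reduced, 2))  # ~0.75 cm/pixel, close to the 0.71 observed
```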
This is the difference in orthophoto quality:
I did not mark which side is 0.5cm GSD and which is the 0.71cm GSD, but I guess you can tell.
Sorry for not replying sooner. It's been a week. These are some very interesting results. Perhaps most telling is the second run when you compare it to the first: that's an 80% improvement in run time. I would not have guessed that just resizing the images would make such a difference.

Out of curiosity, I pulled up the mission and put in 0.71 GSD to see how many images it would have taken if flown at the altitude needed to give me 0.71 GSD. The controller says ~650, which is roughly half as many. I really wonder, if we ran the exact same dataset with 650 45MP images at a native GSD of 0.71, whether the run would take half the time of the 17-hour original run or come in closer to the 3.5 hours of the resized run.

I'd also love to see what difference a faster CPU with more cores would make. It's too bad andy and 69charger weren't able to run the set; I'd be really interested in their results. I get the sense that CPU cores/speed are a more important factor with the larger images, more so than sheer quantity of memory. Knowing which one, core count or speed, would also be good info.
I am super encouraged by what was revealed about memory. I'm starting to re-think some assumptions I'd made about how much memory is needed. It doesn't make sense to buy memory and have it sit idle; that money is better spent on the processor or maybe the GPU.

So this brings me to the next question. Given what we've seen, I'm genuinely curious how you'd rate CPUs. The R9 is an amazing processor. It tops out at 16 cores with a base clock of 4.5 and a boost clock of 5.7. The Threadripper Pro 5975WX has double the cores for a total of 32, but with a slower base of 3.6 and a boost of 4.5. What's your take on the advantages/disadvantages of the Threadripper between these two processors? Will WebODM be able to use all 32 cores of the TRP 5975? Given the nature of the images, being big 45MP images, would the 32-core be better suited than the 16-core to handle them?

I'm trying to decide between the two. I'm leaning toward the TRP 5975WX because it gives me more future-proofing. I could start with 256GB of memory but upgrade if I ever needed more. The R9 is maxed out at 128GB, and while I might not need all of it for 1300 or 3000 image datasets, this drone is more than capable of shooting 5K or 8K or more images in oblique, and I'm afraid that datasets of that size could be difficult if I were to process them at full resolution.

Last question is about AVX-512. How consequential is it? The R9 has it, the TRP doesn't. Do we have any data that quantifies its advantages? If it's a small improvement then it's one thing, but if it's substantial, especially for larger datasets, then it makes the R9 look more appealing. What stage of the pipeline is AVX-512 primarily important for? I know Saijin said that ODM looked for it, but besides that I know almost nothing about what it does for processing.
Thank you again for all your efforts. This has been very eye-opening for me. I really can’t thank you enough. Whatever I decide to get I will continue to provide you with data and help with runs from my new system. I intend to give back as thanks for all the help so generously given.
Hoi,
many questions, but since I am a hardware enthusiast, I feel compelled to respond.
However, I would like to mention that I have owned, upgraded, overclocked, worked with and traded computer hardware for a good 25 years. So for me, assembling any kind of computer / workstation is a lot of fun and is as intuitive as driving a car is for others.
Every generation of processors brings some advantage over the last. In the case of Ryzen CPUs compared to Threadripper CPUs, a Ryzen 9 5950X released in 2020 beats a high-end AMD Ryzen Threadripper 1950X released in 2017. Similarly, a Ryzen 9 7950X beats an AMD Ryzen Threadripper 3970X released a few years earlier. That shows that every 3-4 years, performance almost doubles.
Yet in certain applications the sheer amount of threads translates to higher performance.
Looking at ODM, it seems only partially so to me. I ran some tests deactivating half my cores, and the overall processing time did not double; it only increased by roughly 25%.
That said, it indicates that if you want Ryzen 9 7950X performance, you would need at least a Threadripper 3970X or 5975WX to match or exceed it. And looking at single-threaded performance, which you will feel in almost every daily task, a Ryzen 9 7950X outperforms any Threadripper.
So I do not see a real advantage in sheer processing power between the Ryzen 9 and the Threadripper. The Threadripper advantage is more likely to be found in memory capacity and thread count.
I only have 8 cores / 16 threads available, but looking at how ODM works, it can spawn as many threads as it pleases in most parts of the process. So I would say that ODM is able to take advantage of many threads, BUT you will have to increase physical memory with every additional thread. Every thread will need a certain amount of memory to run at its full capacity. You can find many indications and equations for that in the forum.
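If memory rather than core count becomes the limit, the number of worker threads can also be capped explicitly; a hedged sketch of the max-concurrency option (the value is just an example to tune against available RAM):

```python
# Each worker needs its own slice of RAM, so capping concurrency trades speed
# for a lower peak memory footprint on large-image datasets.
options = {"max-concurrency": 8}  # example value
```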
Future-proofing is an interesting idea. I always like to add the up-front investment into the equation. If you buy a Ryzen 9, 128Gb setup now, you will probably spend half to one-third of what a Threadripper setup would cost you. If you use the Ryzen 9 for 3 years and find it coming to its limit, you would then upgrade to whatever is the latest generation at that point. Sell your "old" Ryzen 9 setup and upgrade to the latest. By then the latest desktop Ryzen setup will probably beat the Threadripper 5975 setup, and you will also have PCIe 6.0 and DDR5. And buying the two Ryzen 9 setups will still cost you less than a single Threadripper setup.
So it might actually be "more" future-proof to go now with a setup that you know can handle your daily needs and upgrade again in three years, rather than spending all that money on a setup now and trying to use it for 5-6 years to pay off the investment.
Looking at the plain specs, and given that the Threadripper advantage lies in more threads and more memory only, a Threadripper PRO 3975WX may already give you all the advantages at a much better price point. Especially if you consider going second-hand. Both Threadripper PRO generations (3000 and 5000) use the same WRX80 platform. The performance difference between Zen 2 (3000 series) and Zen 3 (5000 series) is not as significant as between Zen 3 and Zen 4 (7000 series).
On eBay you can find registered DDR4 memory for very good prices. Not at the highest possible frequency, but using 2400MHz RAM instead of 3200MHz RAM will not significantly impact processing time. And you could have 512 Gbyte of registered DDR4 at a very good price point.
There I would pose the question: how often do you think you will do that (running 5000-8000 image datasets)?
If your answer is occasionally, I would say get the Ryzen 9 system. It will handle your daily needs up to 3000 images with relative ease, and it will also do the occasional bigger dataset, given enough time. Though as you saw, reducing resolution by a bit will reduce computing time drastically.
If you have a huge dataset, you could run it at reduced resolution to see if it computes. When you know it does, you can then confidently commit to let the machine run for a day or two to compute at full resolution.
Or you maybe just adjust what you offer to your customers: up to x acres of size you can offer 0.5cm GSD, for more acres the max GSD will be 0.7cm.
Since you live in the US and have fiber internet, what about renting an online machine for a couple of days if you really want to run huge datasets?
There are offers for renting >64-core machines with 512Gb of memory and NVIDIA Tesla cards for days or a month online. The best thing: your customer would pay for it instead of you investing and binding a good amount of capital in degrading, value-losing hardware.
Sorry if I overthink things here, but I have been self-employed forever and am always looking for the most bang for the buck.
From a quick internet search, it looks like ODM and the software used therein do not yet make use of AVX-512, but only AVX, which all modern CPUs offer.
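If you want to check which AVX variants a particular CPU exposes, the feature flags are listed in /proc/cpuinfo on Linux (a small sketch):

```python
# Collect the CPU feature flags and print the AVX-related ones.
flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
print(sorted(flag for flag in flags if flag.startswith("avx")))
```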
You can probably tell that I would go for the Ryzen 9 solution. Especially if you also use that machine for other daily tasks. But be aware that while running ODM processes, that computer will be occupied and may not serve you so well for office tasks during that time.
Another route would be to keep your current desktop (Core i5 6600K with 64 Gbyte). I am sure that most tasks still run plenty well on it, and with that amount of memory you will also be able to work with the ODM results (orthophotos, DEMs, point clouds etc.) in QGIS or whatever software you use.
Then put together a render rig (Threadripper PRO 3000 for best value, Threadripper PRO 5000 if you have the budget) and put it in the basement or garage. It can toil away on the rendering jobs while your daily-use computer stays available for office work.
I am actually using a similar setup: daily work from a quad-core laptop and a more “powerful” machine standing in the technical space, next to the photovoltaic system to do the heavy lifting (WebODM and 4k Video en-/decoding). The fans in the render machine are on maximum to keep temps low, but since it stands far away from where I sit, it can be as noisy as it likes to be and also run during nights.
Though, looking into the subject, reading tons about it in the forum and running all kinds of datasets on my 8-core machine … if I were going professional with a drone business right now, with a 45MP camera but no established customer base / regular income through it yet, I would go with:
a Ryzen 9 3950X or Ryzen 9 5950X with 128 Gbyte DDR4, an NVIDIA GeForce 3060 with 12 Gbyte VRAM, a 2-4 Tbyte PCIe 4.0 NVMe and a 6-12TB HDD for long-term storage.
Buying that second-hand is very, very affordable, and I think I could run 99% of tasks on my own machine; the occasional heavier task either just takes a few days or, if the customer pays for it, I rent an online machine to do the computing.
Comparing that to a Ryzen 9 7950X or even a Threadripper, I think the difference in computing time is not enough to justify the investment. And whether a process of 1500-3000 images takes 6 or 12 hours does not matter much to me. The render machine works day and night, and if I can start a process today and it is done by the next morning, all good.
The latest-generation Ryzen costs double what a Ryzen 5000 with DDR4 costs, and a Threadripper costs 4-10x more than the proposed Ryzen 9 5950X or Ryzen 9 3950X system. But the difference in actual computing time will rarely be 2x, let alone 10x.
As mentioned above, I have been working with hardware for a long time, and there is no substantial reason to say that DDR5, PCIe 5.0 or AVX-512 are needed or even usable within the next couple of years. Not with a focus on WebODM, anyway.
Samsung just released another PCIe 4.0 NVMe drive, the 990 PRO. It is now one of the fastest drives around, and it still only occasionally manages to max out PCIe 4.0 bandwidth. Many people thought that Samsung would release the 990 PRO with PCIe 5.0, but clearly Samsung does not yet fully utilize even PCIe 4.0, and it acknowledges that by releasing another PCIe 4.0 drive even though PCIe 5.0 motherboards are already entering the market.
So, from my knowledge of hardware, the Ryzen 3000 and Ryzen 5000 CPUs will still serve super well for another 3 years.
And if I make a lot of money in those 3 years and have a stable income and customer base, then I would be willing to go all in and assemble a Threadripper or whatever system, and use the Ryzen 9 for my office purposes.
That subjectively would give me most bang for the buck at the moment
And kind of sorry for all the text but I am just enthusiastic about the subject