My project just finished, so I now have three good runs showing the impact of adding system RAM to my T7920. Even swapping to NVMe appears to slow things down considerably: going from 192 GB to 256 GB of RAM cut the runtime by around 5.5 hours.
I do have one caution, though: my NVMe drive on the 192 GB run ran low on space even though I was also using a second NVMe drive as swap. Windows 11 ships with hibernation enabled by default, which keeps a hiberfil.sys roughly the size of RAM on the boot drive. I've since turned that feature off (powercfg /hibernate off from an elevated prompt).
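If anyone wants to check whether hibernation is still reserving space, here is a quick Python sketch (Windows only; hiberfil.sys is a hidden system file, so reading it may need an elevated prompt):

```python
import os
import shutil

# hiberfil.sys sits in the root of the boot drive and stays roughly
# the size of installed RAM while hibernation is enabled.
HIBERFILE = r"C:\hiberfil.sys"

total, used, free = shutil.disk_usage("C:\\")
print(f"C: free space: {free / 1e9:.1f} GB")

try:
    size = os.path.getsize(HIBERFILE)
    print(f"hiberfil.sys: {size / 1e9:.1f} GB")
except OSError:
    print("hiberfil.sys not readable or not present (hibernation likely off)")
```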
Thanks for sharing these results. Maybe it's too late now, but I'd be curious what Windows native vs. Windows Docker would look like on the same machine, with the same data and the same parameters. If I were to place a bet, I'd wager the Ubuntu run would be quicker than both Windows alternatives. How much, and why, is the million-dollar question, I suppose. Interesting results for sure.
I know I've mentioned this to Shiva, but I wonder if anyone has real-world experience like this comparing a bare-metal native install to a bare-metal hypervisor (Proxmox, ESXi). I like the simplicity of a VM for a lot of reasons, and I'd love to know whether there is a significant penalty when running in a VM.
In my testing, Windows native beat Docker/WSL2 by a wide margin, and Linux Docker on the same hardware is slightly faster than Windows native, but not significantly: somewhere around 5%.
This is consistent with the vast majority of our users’ experience.
We have noted a handful of machines that run faster under Docker/WSL2, but it is a very small number of machines, and we have not yet isolated the cause.
It feels like mixed-topology CPUs or possibly high core counts are involved.
After a few bumps in the road, I have my project running on my T7920 under Ubuntu 22.04.2 LTS.
So some quick observations:
Linux/Docker WebODM is version 2.0.1, versus 1.9.18.84 for the Windows build.
The Docker build's processing engine is 3.1.2, versus 3.0.2 for the Windows engine.
I see all the CPU cores being heavily used, which is a great sign; the Windows version doesn't seem to spread the workload out nearly as well. Now I just have to wait and see.
Okay, the results are in for Ubuntu 22.04.2 LTS with Docker versus the native Windows install, and they're very telling. Something is very wrong with the Windows build! Yes, the engine versions are slightly different, but I can't control that.
I’ve checked the parameters between all “High Res” runs and they match.
The GSD, area, and point counts across all three computers and runs are pretty close. The Docker run (with its different engine) has about 18 million more reconstructed points.
I could not have imagined such a difference, though there were some signs.
I hope @Cronix will read this. He also had a very slow run on his native Windows installation.
I must also say that my Linux installation with Docker and WebODM has always run super smoothly and predictably, but I've never used the Windows version.
On my 7740 laptop and T7500 desktop I had Avira antivirus software running, but I don't recall whether it was on the T7920. I'm going to swap the NVMe drives on the T7920 and check; if it's installed, I'll disable the virus scanning and rerun the dataset. I must be thorough to understand what the root cause is.
I just confirmed Avira was also running on the T7920. I've removed it and restarted WebODM.
Please bear in mind that we do not support any AV solution aside from Windows Defender, and even then, we recommend allow-listing the install directory to exclude it from Real-Time scans, which do add overhead.
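If you'd rather script that exclusion, here is a minimal sketch from Python (the install path below is a placeholder; adjust it to wherever WebODM actually lives, and run it from an elevated prompt):

```python
import subprocess

# Placeholder: point this at the actual WebODM install directory.
INSTALL_DIR = r"C:\WebODM"

# Add-MpPreference is the built-in Windows Defender cmdlet for
# managing exclusions; it must run from an elevated prompt.
subprocess.run(
    ["powershell", "-NoProfile", "-Command",
     f'Add-MpPreference -ExclusionPath "{INSTALL_DIR}"'],
    check=True,
)
```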
Depending on the model, most NVMe drives use pseudo-SLC caching.
If your NVMe drive is more than about 70% full, it very likely will not reach its advertised speed. When I tested some NVMe drives (a Samsung 980 Pro and a WD Black SN770, for example), write speeds dropped to a ridiculous 400-600 MB/s once the drives were full. Just something to keep in mind.
A short example:
If an NVMe drive is advertised with 100 GB of write cache, that means 100 GB in SLC mode. A reasonably recent NVMe drive will use MLC or, more often, TLC NAND. With TLC you need 300 GB of free space for the drive to have 100 GB available in SLC mode as write cache.
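To put numbers on that rule of thumb, a tiny sketch of the arithmetic (3 bits per cell for TLC, 2 for MLC):

```python
def free_space_needed_gb(slc_cache_gb: float, bits_per_cell: int) -> float:
    """Native capacity a drive gives up to expose a pseudo-SLC cache.

    In pSLC mode each cell holds 1 bit, so NAND that normally stores
    N bits per cell consumes N GB of free native capacity for every
    GB of SLC-mode write cache.
    """
    return slc_cache_gb * bits_per_cell

print(free_space_needed_gb(100, 3))  # TLC: 300.0 GB free for a 100 GB cache
print(free_space_needed_gb(100, 2))  # MLC: 200.0 GB
```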
The project has been running for over 19 hours on my T7920 (now without antivirus software) and the completion bar still has a very long way to go, so I'm back to my belief that there is something wrong with the native Windows WebODM/engine build.
This is not entirely encouraging. I'm setting up my big machine tonight and considering spending the $57 on the Windows native installer. I'm happy to support the project, but if there is an issue, perhaps it's worth waiting a beat and using the Docker version for the time being. Shiva & Saijin, what are your takes here? I'm sitting on a brand-new install of Windows 10 Workstation on dual 32-core EPYCs with 512 GB of ECC RAM. Should I go Docker or native?
If it's slower than Docker, you can request a refund. We are only seeing issues under a rare set of circumstances, and the cause hasn't yet been isolated for those rare cases.
That's the last of my concerns. I'll go ahead and purchase it tonight. I just wanted to make sure there wasn't a bugfix or new release on the near-term horizon. Thanks for the speedy replies.
The purchase covers you for software upgrades for life.
As for an upcoming update, yes, we should have one coming within the month, with the latest ODM engine, fixes for non-ASCII characters in the PATH, and some other nice things like the WebODM 2.x UI.
I'm also interested in this. Do you have any idea of the reason for this performance difference? Do you think it's about how WebODM is compiled on Windows to produce the installer? What about GPU support on Windows native? I'm trying to understand whether it's worth investing in the installer (which, besides being a donation to your great project, I think will become mandatory in the near future for a company working in agriculture…).
EDIT: I originally replied to the message stating a difference in favor of Windows native, but the question applies just the same to a difference in the opposite direction.
Also, as already asked here, it would be great to investigate support for optimized BLAS in WebODM. For software so heavily reliant on CPU computation, an optimized BLAS could decrease processing time more than investing in newer, more powerful machines, as broadly discussed in this topic.
I benchmarked this difference in R for data science (a completely different story, I know), since R ships with a very slow reference BLAS by default, and it's like night and day.
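For anyone curious about their own setup, here is a quick way to check in Python/NumPy (NumPy rather than WebODM itself; the matrix size is arbitrary):

```python
import time
import numpy as np

# Shows which BLAS NumPy is linked against (OpenBLAS, MKL, or the
# slow reference implementation).
np.show_config()

# Crude matmul benchmark: an optimized, multithreaded BLAS is often
# several times faster here than a reference build.
n = 4000
a = np.random.rand(n, n)
b = np.random.rand(n, n)
t0 = time.perf_counter()
_ = a @ b
print(f"{n}x{n} matmul took {time.perf_counter() - t0:.2f} s")
```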
OK, I've got a run of 1,146 20 MP images going on the new 64-core BEAST. I got it loaded up and went to bed, so I haven't been watching it closely, and I don't have anything set up to log yet (see the sketch at the end of this post). My initial observations are surprising. My old machine would peg all of its weak CPU at 100% and quickly use up all 64 GB of physical memory.
I'm 6.5 hours into the run today and seeing very low CPU utilization: about 3% overall. Of the 64 cores, 55 are basically idle and 9 are very lightly loaded. Memory is about 4% utilized. At 6.5 hours in, it's currently at the matching stage and crawling. Subjectively it's no faster than my old 4-core.
I don't know what to say here… I guess I'll try out Docker and maybe install Linux on an empty NVMe. I can't explain this. Any ideas what might be going on? I'm going to set up my pfSense today so I can give out VPN access.
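For the rerun, something like this minimal sketch could log per-core load to a file instead of me watching Task Manager (psutil is a third-party package; the file name, interval, and busy threshold are arbitrary):

```python
import time
import psutil  # third-party: pip install psutil

# Append a one-line load snapshot every 60 seconds for later review.
with open("odm_load.log", "a") as log:
    while True:
        per_core = psutil.cpu_percent(interval=60, percpu=True)
        mem = psutil.virtual_memory().percent
        busy = sum(1 for c in per_core if c > 25)  # arbitrary "busy" cutoff
        log.write(f"{time.ctime()} busy_cores={busy}/{len(per_core)} "
                  f"avg={sum(per_core) / len(per_core):.1f}% mem={mem:.1f}%\n")
        log.flush()
```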