In my testing, using the GPU can reduce processing time by about 40%. Processing 20MPx images with feature-quality=ultra required 7.3GB of GPU RAM, so you’ve probably got more than enough GPU RAM (depending on how many MPx your images have).
With default settings, processing 2000 images took about 4.5 hours and used about 50GB of RAM and 100GB of disk space on my system (dual 20-core Xeon CPUs, 512GB RAM, 12GB GPU).
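As a rough sanity check, the figures above can be turned into per-image numbers. This is only a back-of-envelope sketch: the function names are made up for illustration, and it assumes cost scales linearly with image count and megapixels, which I have not measured. RAM in particular is a peak figure, so the per-image value is a loose scaling guide at best.

```python
# Back-of-envelope figures derived from the numbers quoted above.
# ASSUMPTION (not measured): cost scales linearly with image count
# and with megapixels. Real runs will deviate from this.

IMAGES = 2000           # images in the test run
HOURS = 4.5             # wall-clock time with default settings
RAM_GB = 50             # approximate RAM used
DISK_GB = 100           # approximate disk space used
GPU_GB_AT_20MPX = 7.3   # GPU RAM seen with 20MPx images, feature-quality=ultra


def per_image_costs(images=IMAGES, hours=HOURS, ram_gb=RAM_GB, disk_gb=DISK_GB):
    """Return average seconds, RAM (MB) and disk (MB) per image."""
    return {
        "seconds": hours * 3600 / images,
        "ram_mb": ram_gb * 1000 / images,
        "disk_mb": disk_gb * 1000 / images,
    }


def gpu_gb_estimate(megapixels, ref_gb=GPU_GB_AT_20MPX, ref_mpx=20):
    """Hypothetical linear extrapolation of GPU RAM from the 20MPx data point."""
    return ref_gb * megapixels / ref_mpx


if __name__ == "__main__":
    print(per_image_costs())
    print(round(gpu_gb_estimate(12), 2))
```

On my numbers this works out to roughly 8 seconds, 25MB of RAM and 50MB of disk per image, so you can scale those to your own dataset as a first guess.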
Do you need the quality report?
If yes, you have to process the 2000 images without splitting.
If no, you can reduce overall processing time by setting up a cluster of nodes and splitting the dataset. I don’t know how much it will speed up processing, but I expect it would be a substantial improvement.