Everything I have is on my own tin apart from a few very exceptional things. The cost of running things in the cloud, measured either as a dollar amount or as a data sovereignty/protection risk, is astronomically high. It really bothers me how many companies have fallen for cloud marketing without doing all of the maths, or even part of the maths. In most cases they’ve pretty much swallowed the marketing in full as fact. I have customers come to me all the time saying “We want to move to the cloud”, and when I ask “Why?” the number one answer is “That’s where IT is going/what everybody else is doing”, followed closely by “We’re trying to reduce our costs”. I should be shocked at the sheer ignorance of the middle managers making these kinds of purchasing decisions, but I’m not. It has been this way for eternity; people being too lazy to do the maths when spending other people’s (the business’) money.
Let me explain what I’m talking about with some simple 2 minute maths. Here’s an example of what a decent spec ODM processing node might cost in Amazon: an m5d.12xlarge, which has 48 cores, 192GB of RAM, and a 900GB SSD. All for the low, low price of $5/hour (correct as of March this year, so prices have probably gone up). That’s $120 a day, or $3,600 a month.
Look at the price of off-lease servers in your area - New Zealand is expensive generally so they might be even cheaper where you are, but you can get off lease RAMless diskless R720s now for $120-150. Roughly equivalent to one day of Amazon compute (whether you use it or not), for a twin CPU, 12/16/20 core. For ECC DDR3, RAM is now at around $2-3 per GB. I’ve bought off-lease locally and also new from eBay where I couldn’t find any 16GB sticks locally - 16GB sticks are around $50 NZD on eBay brand new. So to match the Amazon spec, that’s 12X16GB, or (12x$50) = $600 for 192GB.
Disks are cheap now; you can get 1TB SSDs brand new from $200. So for a roughly equivalent compute node, you’re looking at about $920 - roughly a week (about 8 days) of AWS compute time for an m5d.12xlarge.
Electricity and cooling come in at roughly $2-3 a day per server. Dual power/network datacenter rack costs are $1200/month for an ENTIRE 42U rack in Wellington (NZ), and Wellington is generally expensive compared to other parts of the world for rack space - but you’re still well under the cost of AWS, and the gap only widens with time as your one-off hardware costs are carried over several years.
So if you consider a full 42U rack of the same spec machines for a year:
42/2 (2U servers) = 21
21x$920 (compute) = $19,320
21x$3x365 (power) = $22,995
12 x $1200 (rack) = $14,400
21x$3,600x12 (m5d.12xlarge) = $907,200
A MILLION DOLLARS!
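The totals above are two-minute maths, so here they are as a few lines of Python for anyone who wants to sanity-check them (all figures are the example prices from this post, not quotes):

```python
# Rough one-year comparison: a 42U rack of own hardware vs the same number
# of m5d.12xlarge instances, using the example prices above.
servers = 42 // 2                    # 2U servers per 42U rack -> 21

hardware = servers * 920             # $920 per off-lease server (chassis + RAM + SSD)
power    = servers * 3 * 365         # ~$3/day/server for power and cooling
rack     = 12 * 1200                 # $1200/month for the full rack

self_hosted = hardware + power + rack
aws = servers * 3600 * 12            # $3,600/month per m5d.12xlarge

print(self_hosted)                   # 56715  (19,320 + 22,995 + 14,400)
print(aws)                           # 907200
print(round(aws / self_hosted, 1))   # 16.0 - cloud costs ~16x more in year one
```

And year two is worse, because the $19,320 of hardware is already paid for.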
Every time I’ve done an AWS calculation like this, the numbers are way out versus doing it yourself. I just don’t understand what calculations people are doing (if any) to conclude that cloud is worth it, despite all of its other shortcomings (like data sovereignty, control, trust, increasing prices, the extreme difficulty of getting your data OUT, etc etc etc). And the calculations above are at the ‘large’ scale where people (marketing echo-chambers) are claiming that AWS is most effective. And don’t be fooled into thinking that AWS are using all current gen hardware - they’re not, because they don’t have to, because people aren’t demanding it. And even if they were, and you bought equivalent spec new machines, you’d still come in way under one year of AWS costs.
It gets worse than that when you start talking tax. AWS are not GST/VAT registered in most countries, while the cost of your own gear is effectively reduced by 15-20% on the above prices because you can reclaim the GST/VAT component. It gets worse still - you can depreciate the cost of your own assets and offset it against the business profits in future years; you can’t depreciate a service that’s provided to you by Amazon. So essentially, even sitting on a shelf, the compute you buy as physical hardware eventually balances out to zero through depreciation, which is treated as a business expense. And consider the fact that AWS prices are climbing while your ongoing compute cost amounts to zero (because you already paid for it), and power cost is generally insignificant - you could have each server on its OWN generator and still come in under AWS. Also consider all the hidden extra bullshit that AWS charge, like network data charges, disk and snapshot charges, etc etc - I really do just want to violently shake people sometimes - “WAKE UP, IT’S A TRAP!”. And I don’t buy the argument that staffing costs to run your own gear are expensive either; for a million bucks a year you can hire or contract a LOT of good people and still have change for a new Lamborghini every year that you can wrap around a tree.
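To put a rough number on the depreciation point - this is an illustrative sketch only, assuming straight-line depreciation over four years and a 28% company tax rate (both assumptions; methods and rates vary by jurisdiction):

```python
# Illustrative only: straight-line depreciation on the $19,320 of rack
# hardware from the example above. The 4-year life and 28% tax rate are
# assumed figures for illustration, not tax advice.
hardware_cost = 19320
useful_life_years = 4
tax_rate = 0.28

annual_depreciation = hardware_cost / useful_life_years   # 4830.0 per year
annual_tax_saving = annual_depreciation * tax_rate        # ~1352.40 per year

print(annual_depreciation, round(annual_tax_saving, 2))
```

The hardware keeps generating a deduction in years two, three and four; a cloud bill is just an expense that repeats in full every year.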
I was in a meeting a couple of years back at a company where the General Manager of IT (salary $250k+) was talking about the ‘cost savings’ of cloud and had two spreadsheets - one for local, the other for cloud. The local sheet was an accurate reflection of what IT costs they had incurred, from every wire to every IT staffer, and the cloud sheet was what he considered to be the equivalent; the numbers were not that dissimilar, but cloud was slightly cheaper. Ignoring for a moment the huge swathe of licence costs for various applications that were missing from the cloud sheet (the second largest cost in the local sheet), I asked about the largest cost: staffing. I asked “Will Amazon be provisioning and managing servers and supporting users on these new servers, or will you need to add all of the existing staff to the cloud sheet too?”. The room fell silent for a rather long time.
From the (generally piss-poor) calculations I’ve seen people do, they’re often including (rightly so) VMware in their operational costs, which is usually one of the larger expenses of local IT. But instead of recognising VMware as a tumour that needs removing from the environment, middle damagers look at replacing the entire environment instead, as though some cult leader told them to. In other words, they haven’t correctly identified the money-haemorrhaging problem as VMware, and looked for alternatives to that specific problem first, like Proxmox.
Ever wondered how Bezos got to be the richest guy in the world by a HUGE margin in just a few years? Look no further than your middle manager deciding that “cloud” sounds like a cool idea because all the other middle damagers said it was a good idea. And don’t even get me started on Office 363!
It really rattles my cage.
But to answer your question: yes, bare metal.
I’ve got the dashboard, the job distributor and the processing nodes all on KVM, with all the storage on Ceph RBD. The reason for this is so that I can move everything around while it’s running to take nodes offline for maintenance or add resources; no need to terminate or pause running tasks. I’m happy with the few percent performance trade-off for the migration benefit KVM provides. It should also scale well with this design, and be fault tolerant. The piece that I haven’t done yet is being able to lock off a processing VM so that new tasks aren’t assigned to it, so that I can also take VMs offline without impacting running jobs. Currently, all jobs are tipped into the API and I allow WebODM to choose the node, but I’ve got a plan to bypass the WebODM task provisioner and instead implement my own pre-queuing system at the front, selecting nodes using processing_node instead, which should allow nodes to be ‘drained’ without general service impact.
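The ‘drain’ idea can be sketched independently of WebODM - a minimal pre-queue selector that skips drained nodes and otherwise picks the least-loaded one. The node names and queue-depth dict here are hypothetical illustrations, not WebODM’s actual task provisioner API:

```python
# Minimal sketch of a drain-aware node selector (hypothetical data model).
drained = set()  # nodes taken out of rotation for maintenance

def pick_node(queue_depth):
    """Return the least-loaded node that isn't drained, or None if all are.

    queue_depth maps node name -> number of queued tasks.
    """
    candidates = {n: d for n, d in queue_depth.items() if n not in drained}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

# Drain node2, then dispatch: its running jobs keep going, new jobs avoid it.
drained.add("node2")
print(pick_node({"node1": 3, "node2": 0, "node3": 1}))   # node3
```

Once a drained node’s queue empties, its VM can be migrated or shut down with no job impact, then removed from the drained set to rejoin the pool.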
The disks under Ceph were originally 2TB spinning disks - 24TB in total spread across the nodes, as 2TB disks were the most cost-effective capacity at the time. However, performance was shockingly poor, so I doubled the number of disks to try to improve this, but it made little difference. I ended up ripping them all out and replacing them with 2TB QVO SSDs, because it was random IO that was killing it; the QVOs should be fine as I don’t need sustained writes. So far they’re OK, but I’m keeping my eye on them. Storage has been the most expensive component in the whole project by a huge margin.
The network is also very simple and inexpensive; it’s just gigabit ethernet connected to a decent sized switch. I’ve got the interfaces bonded and connected to an LACP-capable switch, so I’m getting 2 gigabit of aggregate throughput for Ceph, which is plenty for now. These Dells have 4 ethernet ports, so I have scope to grow to a theoretical 4 gigabit before I need to start looking at 10 gig hardware.
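For reference, an LACP bond on that era of Debian/Ubuntu might look something like this with ifupdown/ifenslave. The interface names, address and config path are assumptions for illustration, and the switch ports need a matching 802.3ad port-channel:

```shell
# Hypothetical ifupdown config for an LACP (802.3ad) bond of two gigabit NICs.
# Requires the ifenslave package; eno1/eno2 and the address are placeholders.
cat << _EOF_ > /etc/network/interfaces.d/bond0
auto bond0
iface bond0 inet static
    address 10.0.0.21
    netmask 255.255.255.0
    bond-slaves eno1 eno2
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit-hash-policy layer3+4
_EOF_
```

The layer3+4 hash policy spreads traffic per-connection, which is how the many Ceph OSD sessions actually use both links; any single TCP stream still tops out at 1 gigabit.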
The VMs themselves are small, ~16GB provisioned as RBD, then the folders are CephFS (kernel) mounted for infinite online growth (exabytes). It’s all scripted, so rolling a new processing node takes minutes:
wget -q -O- 'https://download.ceph.com/keys/release.asc' | apt-key add -
apt-add-repository "deb https://download.ceph.com/debian-luminous/ $(lsb_release -sc) main"
apt-get update
apt-get -y install ceph-common
mkdir -p /etc/ceph/
echo '<Key>' > /etc/ceph/admin.keyring
mkdir -p /www /code /webodm /swap
cat << _EOF_ >> /etc/fstab
10.0.0.5:6789,10.0.0.6:6789,10.0.0.7:6789:/$(hostname -f)/www /www ceph name=admin,secretfile=/etc/ceph/admin.keyring,noatime,_netdev 0 0
10.0.0.5:6789,10.0.0.6:6789,10.0.0.7:6789:/$(hostname -f)/code /code ceph name=admin,secretfile=/etc/ceph/admin.keyring,noatime,_netdev 0 0
10.0.0.5:6789,10.0.0.6:6789,10.0.0.7:6789:/$(hostname -f)/webodm /webodm ceph name=admin,secretfile=/etc/ceph/admin.keyring,noatime,_netdev 0 0
10.0.0.5:6789,10.0.0.6:6789,10.0.0.7:6789:/$(hostname -f)/swap /swap ceph name=admin,secretfile=/etc/ceph/admin.keyring,noatime,_netdev 0 0
_EOF_
mount -a
dd if=/dev/zero of=/swap/swap bs=1G count=300
chmod 600 /swap/swap
mkswap /swap/swap
echo '/swap/swap swap swap sw 0 0' >> /etc/fstab
swapon /swap/swap
echo 'vm.swappiness = 1' >> /etc/sysctl.conf
sysctl -p
# Install from Native script