Production Network Design / Diagram

recoilnetworks · December 18, 2020, 7:33am

Hello ODMers! I am working with a new startup to build an image processing pipeline. The imagery is geared towards agriculture and, to be completely honest, I am a Systems / DevOps Engineer and a NodeJS/JavaScript developer; I am not in any way a drone pilot (yet), nor do I know a lot about imagery. That being said, we’re currently using DroneDeploy to do our image stitching and processing. We know that ODM will do what we need and we have tested WebODM; however, now I need to figure out how to scale it. Here are my initial questions:

Can I load balance WebODM?

We need access to the Tile Map Service that comes with WebODM as we are not in a spot to develop our own Tile Map Service based on the images that NodeODM creates. I do not understand enough about Tile Map Services to write one yet so using the one provided by WebODM to show the imagery using Leaflet is the best option we currently have based on my understanding. However, I do not like having a single server so what I would like to do is have a single point of entry (load balancer) hitting two backend WebODM servers to serve the Tile Maps and what not.
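For anyone else new to tile services, the addressing scheme is simpler than it sounds. The sketch below shows the standard Web Mercator tile math used by XYZ ("slippy map") layers and the row flip that distinguishes TMS numbering. The URL template and host name are made up for illustration and are not WebODM's actual endpoint.

```python
import math
from typing import Tuple


def lonlat_to_xyz(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Return the XYZ tile column and row that contain a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the far east/south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_row(y: int, zoom: int) -> int:
    """TMS counts rows from the south, XYZ from the north, so flip the row."""
    return (2 ** zoom - 1) - y


def tile_url(template: str, zoom: int, x: int, y: int) -> str:
    """Fill a {z}/{x}/{y} style template; the template itself is hypothetical."""
    return template.format(z=zoom, x=x, y=y)


if __name__ == "__main__":
    col, row = lonlat_to_xyz(0.0, 0.0, 2)
    print(tile_url("https://tiles.example.com/ortho/{z}/{x}/{y}.png", 2, col, row))
```

Leaflet's `L.tileLayer` takes the same kind of `{z}/{x}/{y}` template and has a `tms: true` option for servers that use the flipped row numbering, so the browser only ever requests the handful of small tiles covering the current view.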

Can I create the entire environment manually including using an AWS RDS for PostgreSQL?

I would like to roll my own environment using the WebODM and NodeODM components but I do not want to use Docker. I know the advantages and disadvantages and right now, Docker is not what I want to use. I would like to install NodeODM directly on an AWS EC2 instance as well as WebODM and then use an AWS PostgreSQL RDS for WebODM’s database. If I need to have a shared Redis instance for WebODM, can I use AWS ElastiCache or just spin up a custom Redis cluster?

Do the WebODM instances and the NodeODM processing instances need to share storage?

In the diagram provided, I have the WebODM and NodeODM instances all sharing the same AWS EFS storage. I would assume this is required as the WebODM instances would need access to the same imagery and NodeODM would need to process the imagery.

Can I store the images in AWS S3 and still use the WebODM Tile Map Service?

This would not be a replacement for the shared EFS storage but it would be the final resting place for the imagery after it has been processed. Can all of this be automated? For example, the user uploads the imagery via our web application, the imagery is loaded into WebODM using the API and the image processing is kicked off. Once completed, the images are essentially copied to an AWS S3 bucket and removed from the shared EFS storage.
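To make the intended flow concrete, here is a minimal sketch of the control loop only. The four callables (`start_task`, `get_status`, `archive`, `cleanup`) are hypothetical placeholders for whatever wraps the WebODM API calls, the S3 copy and the EFS delete; the numeric status codes are the task status values listed in the WebODM API documentation.

```python
import time
from typing import Callable, Optional

# Task status codes as listed in the WebODM API documentation.
QUEUED, RUNNING, FAILED, COMPLETED, CANCELED = 10, 20, 30, 40, 50
TERMINAL = {FAILED, COMPLETED, CANCELED}


def wait_for_task(get_status: Callable[[], Optional[int]],
                  poll_seconds: float = 30.0,
                  max_polls: int = 10_000,
                  sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until the task reaches a terminal status, then return that status."""
    for _ in range(max_polls):
        status = get_status()  # may be None while the task is still being set up
        if status in TERMINAL:
            return status
        sleep(poll_seconds)
    raise TimeoutError("task did not finish within max_polls")


def process_and_archive(start_task: Callable[[], str],
                        get_status: Callable[[str], Optional[int]],
                        archive: Callable[[str], None],
                        cleanup: Callable[[str], None],
                        **wait_kwargs) -> bool:
    """Start a task, wait for it, then archive and clean up only on success."""
    task_id = start_task()
    status = wait_for_task(lambda: get_status(task_id), **wait_kwargs)
    if status != COMPLETED:
        return False       # leave everything in place for inspection
    archive(task_id)       # e.g. copy the results to the S3 bucket
    cleanup(task_id)       # runs only if archive() did not raise
    return True
```

Keeping the cleanup step after a successful archive means a failed copy never deletes the only copy of the results.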

Here’s a diagram that I have put together of what I believe the environment would look like.

[Diagram: nodeodm]

I have not been able to find any good diagrams or full on tutorials. I would be more than happy to put one together once I have an environment up and running so that others can do the same, should they want or need to. Any other help is appreciated!

Thank you!!

coreysnipes · December 18, 2020, 1:01pm

Hi, you’re going to need to know a lot about the internals of ODM and how things fit together. I am not the right person to answer most of your low-level code questions, but in general:

Yes, you can install directly and run without docker
Yes, ODM processing jobs can be distributed across several servers but you need to use the internal mechanisms (ClusterODM, split/merge). It is also possible to add load balancing on top as you propose, but look at those other features first and decide whether you need it.
The storage scenarios are probably achievable, and will require either a small or large amount of work to implement
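As a rough illustration of the split/merge mechanism mentioned above: ODM exposes `split` (average images per submodel) and `split-overlap` (overlap between submodels, in meters) as task options, and the NodeODM-style API takes task options as a JSON array of name/value pairs. The numbers used here (400 images, 150 m) are arbitrary examples, not recommendations.

```python
import json
import math


def split_merge_options(images_per_submodel: int, overlap_m: int) -> str:
    """Serialize split/merge settings as a NodeODM-style options array."""
    options = [
        {"name": "split", "value": images_per_submodel},
        {"name": "split-overlap", "value": overlap_m},
    ]
    return json.dumps(options)


def estimated_submodels(total_images: int, images_per_submodel: int) -> int:
    """Rough count of submodels a dataset would be divided into."""
    return math.ceil(total_images / images_per_submodel)


if __name__ == "__main__":
    print(split_merge_options(400, 150))
    print(estimated_submodels(4000, 400))
```

The resulting string would go in the `options` form field when creating a task. Since the split is only an average, the real number of submodels can differ from this estimate.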

One more important consideration: if you are planning to build a successful business enterprise with open source software at the core of the offering, be sure you consider the responsibility of that business to contribute back to the community, and improve the software for all users.

Saijin_Naib · December 18, 2020, 1:40pm

I’d recommend using ODM to make geotiff which you then serve out with something like GeoServer, which has GeoWebCache built-in, and very powerful controls. You also can load balance it.

smathermather · December 18, 2020, 6:50pm

Yes, check GitHub - OpenDroneMap/ClusterODM: A NodeODM API compatible autoscalable load balancer and task tracker for easy horizontal scaling ♆

I would look instead to the COGs, or cloud optimized geotiff, and how projects handle the display, caching, and distribution of COGs. This is the cloud native way to handle image display, and doesn’t require you build some way to cluster the entire WebODM application.

Check out how ClusterODM and NodeODM function first to understand clustering of the processing services. Then look at these docs on manual setup of WebODM: GitHub - OpenDroneMap/WebODM: User-friendly, commercial-grade software for processing aerial imagery. 🛩

Once you have it running on an instance and have worked through that process, then you can look to potentially plugging in different components from AWS. Then please document the successes!

No shared storage needed.

Updates to the docs appreciated as you build this out. docs.opendronemap.org is where most of the user facing docs go, and developer specific docs go to READMEs and the lot in the specific projects, specifically WebODM, NodeODM, and ClusterODM for your use case.

Yes: it is best to start out with a specific engagement plan. It is valuable to the community and also ensures you don’t build up technical debt and fragility in deployment and implementation that break as the project changes. Also, you will have some legal obligations: ODM, WebODM, NodeODM, and ClusterODM are all AGPL, which carries with it compliance elements with respect to sharing back the code base (at least) to your end users.

recoilnetworks · December 19, 2020, 3:30am

Thank you for the reply, it’s greatly appreciated. I am unfamiliar with the AGPL license, so I will look that up and ensure that we follow the rules.

That being said, we cannot use GeoTIFF files. My experience so far is that our GeoTIFF files are very large; in fact, our smallest one to date is 1.5GB. Forcing a client to download a file that large is not going to happen and takes way too long. When I built out the MVP of the application in September, we started with having our pilots download the processed images from DroneDeploy and then upload them to our app, which stored the imagery on a cloud-based block storage service. That quickly became an issue, first with having to upload the files and then again when trying to load them as overlays on a map.

Once we realized the GeoTIFFs were an issue, we switched gears and I wrote an integration using DroneDeploy’s API. Essentially, the pilot creates a DroneDeploy project, uploads the images, selects the image types we need and then DroneDeploy processes them. After that, we create a “Report” in our application which pulls the entire project list from DroneDeploy and we create an association. On the map, the user loads a field and can then select the report date, which will allow them to load the images from that flight. This works well and uses the DroneDeploy Tile Map Service so the images are nice and fast (depending on your Internet connection speed, of course). However, DroneDeploy is expensive and we do not control the data nor the environment, so we are looking to replace it, which is where ODM comes into play.

So, all of that being said, I think I am on the right track; I just need to figure out how everything fits together. I would rather not have to use WebODM and instead just build a few NodeODM processing servers and create my own UI within our application for creating tasks, processing imagery, etc. The piece that I am missing is the Tile Map Service, which I can get from WebODM. If that is the route that I go, the main question is: can I load balance WebODM with shared database resources so that I do not have a single point of failure? If there is another JavaScript-based application that I can use to create a Tile Map Service using the images that NodeODM creates, I am all ears!

I will definitely contribute back as I can, but I think this is going to be a larger undertaking than I thought. It seems like no one has done what I am looking to do with ODM (Web/Node/Cluster) and ultimately use it in a full-on production environment that will process some 25k images per day (maybe more). Each 100 acres that we fly produces approximately 4000 images. I cannot go into detail about how many acres we have lined up to fly in the next few months (because I honestly do not know) but I know it’s upwards of 10,000, which means some 400,000 images in the next three to five months.
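For sizing purposes, the figures above can be worked through in a few lines. This assumes the work is spread evenly and uses a 30-day month, neither of which will hold exactly in practice.

```python
import math


def total_images(acres: int, images_per_100_acres: int = 4000) -> int:
    """Images produced for a given acreage at the stated capture density."""
    return acres * images_per_100_acres // 100


def average_per_day(images: int, months: int, days_per_month: int = 30) -> float:
    """Mean daily volume if the images arrive evenly over the period."""
    return images / (months * days_per_month)


def days_at_rate(images: int, images_per_day: int = 25_000) -> int:
    """Days needed to process everything at a fixed daily throughput."""
    return math.ceil(images / images_per_day)


if __name__ == "__main__":
    images = total_images(10_000)
    print(images)                               # total for 10,000 acres
    print(round(average_per_day(images, 3)))    # average per day over three months
    print(round(average_per_day(images, 5)))    # average per day over five months
    print(days_at_rate(images))                 # days of work at 25k images/day
```

On those assumptions the average load comes out to a few thousand images per day, so the 25k per day figure reads as a peak or burst target rather than a sustained rate.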

Which brings me to image storage: if WebODM is serving the images using the Tile Map Service via the WebODM API and the NodeODM instances are processing the imagery, how would they not have to share storage? I would rather not pay the internal AWS bandwidth costs between servers, if at all possible. It would be better if I could just use an EFS (AWS NFS) mount between all servers and have the imagery processed from there.

Sorry for the long-winded reply here, I’m just trying to really figure out all of this and get it all put together. Thank you!

Saijin_Naib · December 19, 2020, 3:35am

Seriously, look into GeoServer and GeoWebCache. They will serve up that geotiff like nothing.

recoilnetworks · December 19, 2020, 3:44am

I suppose I fail to understand how that’s going to solve my problem. The client’s computer still has to download the 1.5GB+ image. I will look into GeoServer, but if the client’s computer still has to download a single 1.5GB+ image, then it will not solve that part of the problem.

Note: I am still reading through all of the ODM documentation and I am waiting for my client to purchase https://odmbook.com/.

pierotofy · December 19, 2020, 2:29pm

You want to use https://www.cogeo.org/ (which is what WebODM uses internally).

To me, you should not worry about load balancers, scaling, etc. until you actually have the need for them. Build something that doesn’t scale first and concentrate on the product. If once you launch you find you need scaling, then worry about the scaling. Otherwise you might fall into the trap of premature optimization. Every component in WebODM can be scaled (perhaps not out of the box for every component, and it might require some work for certain parts). On the processing side we routinely process that many images (and more) on webodm.net, which runs on ClusterODM.

Also, if you’re worried about egress/ingress fees, just drop AWS. Plain and simple. There are plenty of other providers that will give you the computing you need without the fees of AWS.

WebODM downloads the results from NodeODM (it copies them via HTTP download), so the two do not need to share storage. You could modify them to share storage, however.
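For a custom front end talking to NodeODM directly, the same pattern is easy to reproduce. The sketch below is not WebODM's own code: it streams a finished task's archive from NodeODM's `/task/{uuid}/download/{asset}` endpoint to local disk in chunks. The host, port and task UUID are placeholders.

```python
import urllib.request
from typing import BinaryIO


def nodeodm_download_url(base_url: str, task_uuid: str, asset: str = "all.zip") -> str:
    """Build the download URL for a finished NodeODM task asset."""
    return f"{base_url.rstrip('/')}/task/{task_uuid}/download/{asset}"


def copy_stream(src: BinaryIO, dst_path: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy a readable binary stream to a file in chunks; return bytes written."""
    total = 0
    with open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            total += len(chunk)
    return total


def fetch_results(base_url: str, task_uuid: str, dst_path: str) -> int:
    """Download all.zip for a task without holding the whole file in memory."""
    url = nodeodm_download_url(base_url, task_uuid)
    with urllib.request.urlopen(url) as response:
        return copy_stream(response, dst_path)


if __name__ == "__main__":
    # Placeholder host and UUID; point these at a real node and task to try it.
    print(nodeodm_download_url("http://nodeodm.internal:3000", "TASK-UUID"))
```

Chunked copying matters here because the archives can run to several gigabytes.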

This is important, it will require that you share the parts you modify with the end users of your application (at the very least). Don’t skip this one.

smathermather · December 19, 2020, 10:05pm

Yup: per the COG website:

A Cloud Optimized GeoTIFF (COG) is a regular GeoTIFF file, aimed at being hosted on a HTTP file server, with an internal organization that enables more efficient workflows on the cloud. It does this by leveraging the ability of clients issuing HTTP GET range requests to ask for just the parts of a file they need.

Range requests are a more web native way of handling serving image data than a tile server, and importantly can be served in all the modern ways: static, cached if desired, and easily scalable. Don’t think of them as geotiffs but as the grandchildren of geotiffs born in an internet age.
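To show what a range request looks like in practice, here is a small sketch that asks a server for only the first 16 bytes of a (Cloud Optimized) GeoTIFF and decodes the TIFF header from them. The header layout is the standard TIFF/BigTIFF one; any URL passed to `read_remote_header` is an assumption on the caller's side, and the server must support HTTP range requests.

```python
import struct
import urllib.request
from typing import Dict, Union


def range_header(offset: int, length: int) -> Dict[str, str]:
    """HTTP header requesting `length` bytes starting at `offset` (end is inclusive)."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def parse_tiff_header(data: bytes) -> Dict[str, Union[str, bool, int]]:
    """Decode byte order, classic/BigTIFF flag and first IFD offset."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic = struct.unpack(order + "H", data[2:4])[0]
    if magic == 42:    # classic TIFF: 32-bit offset to the first IFD
        offset = struct.unpack(order + "I", data[4:8])[0]
        return {"byte_order": order, "bigtiff": False, "first_ifd_offset": offset}
    if magic == 43:    # BigTIFF: 64-bit offset stored at bytes 8..15
        if len(data) < 16:
            raise ValueError("need 16 bytes for a BigTIFF header")
        offset = struct.unpack(order + "Q", data[8:16])[0]
        return {"byte_order": order, "bigtiff": True, "first_ifd_offset": offset}
    raise ValueError("unknown TIFF magic number")


def read_remote_header(url: str) -> Dict[str, Union[str, bool, int]]:
    """Fetch only the first 16 bytes of a remote file and parse them."""
    request = urllib.request.Request(url, headers=range_header(0, 16))
    with urllib.request.urlopen(request) as response:
        return parse_tiff_header(response.read(16))
```

A COG reader repeats the same trick for the image directory and then for individual internal tiles, which is why a browser viewing a 1.5GB file only transfers the small portions it needs.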

Piero (who has 13k customers at WebODM.net) is right in pointing out premature optimization: get something working and optimize where you need to. And if you aren’t required to use AWS, you will save a lot of $$$ with other providers. The ClusterODM examples here may point you in the direction of affordable alternatives.

system · January 18, 2021, 10:05pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.