ClusterODM, Nodes, and Datasets

What’s the largest dataset you’ve stitched with ClusterODM? What instance sizes worked best for you?

My team has successfully stitched a set of over 7,500 images (nearly 200 acres). The set was very difficult to stitch because it was very homogeneous. To that end, we used 'split': 250 to give it the best chance of success.

I’ve heard from @pierotofy that he’s stitched sets of 13,000+ images.

Same dataset for me (with Piero’s help): 13,000 images. Here’s Piero’s config, including the slug mapping:

{
    "provider": "digitalocean",
    "accessToken": "CHANGEME",
    "s3": {
        "accessKey": "CHANGEME",
        "secretKey": "CHANGEME",
        "endpoint": "nyc3.digitaloceanspaces.com",
        "bucket": "CHANGEME"
    },

    "createRetries": 10,
    "maxRuntime": -1,
    "maxUploadTime": -1,
    "dropletsLimit": 30,
    "region": "sfo2",

    "tags": ["lightning-node"],

    "snapshot": false,

    "imageSizeMapping": [
        {"maxImages": 40, "slug": "s-2vcpu-2gb"},
        {"maxImages": 250, "slug": "s-4vcpu-8gb"},
        {"maxImages": 500, "slug": "s-6vcpu-16gb"},
        {"maxImages": 1500, "slug": "s-8vcpu-32gb"},
        {"maxImages": 2500, "slug": "s-16vcpu-64gb"},
        {"maxImages": 3500, "slug": "s-20vcpu-96gb"},
        {"maxImages": 5000, "slug": "s-24vcpu-128gb"}
    ],

    "addSwap": 1.5,
    "dockerImage": "opendronemap/nodeodm"
}
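
For anyone trying to read the imageSizeMapping above, here is a rough Python sketch of how I'd expect the lookup to behave. It assumes the first entry whose maxImages is at least the task's image count gets picked; that rule and the helper name pick_slug are my assumptions for illustration, not something taken from the ClusterODM source.

```python
# Mapping copied from the config above.
IMAGE_SIZE_MAPPING = [
    {"maxImages": 40, "slug": "s-2vcpu-2gb"},
    {"maxImages": 250, "slug": "s-4vcpu-8gb"},
    {"maxImages": 500, "slug": "s-6vcpu-16gb"},
    {"maxImages": 1500, "slug": "s-8vcpu-32gb"},
    {"maxImages": 2500, "slug": "s-16vcpu-64gb"},
    {"maxImages": 3500, "slug": "s-20vcpu-96gb"},
    {"maxImages": 5000, "slug": "s-24vcpu-128gb"},
]


def pick_slug(image_count, mapping=IMAGE_SIZE_MAPPING):
    """Hypothetical helper: return the slug of the smallest entry that fits.

    Assumes entries are checked in ascending order of maxImages and the
    first one with maxImages >= image_count wins.
    """
    for entry in sorted(mapping, key=lambda e: e["maxImages"]):
        if image_count <= entry["maxImages"]:
            return entry["slug"]
    return None  # no entry in the mapping is large enough


print(pick_slug(800))
```

Under that assumption, a submodel of about 800 images falls between the 500 and 1500 entries and lands on s-8vcpu-32gb, and anything over 5000 images has no matching entry in this mapping.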

I’m pretty sure it was split into submodels of ~800 images with 120 m of overlap, run from a primary node on an s-16vcpu-64gb droplet so that the primary node had enough storage, memory, and so forth. It spun up 16 additional instances of s-8vcpu-32gb.
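
As a quick sanity check on those numbers, here is a back-of-the-envelope estimate in Python. It assumes the submodel count is roughly the total image count divided by the split value, rounded up. The helper estimate_submodels is made up for illustration; real submodels come out uneven and carry extra images from the overlap.

```python
def estimate_submodels(total_images, split):
    """Hypothetical helper: rough submodel count for a given split value.

    Integer ceiling of total_images / split. Ignores overlap, which adds
    images to each submodel without changing how many submodels there are.
    """
    if total_images < 0 or split <= 0:
        raise ValueError("total_images must be >= 0 and split must be > 0")
    return (total_images + split - 1) // split


print(estimate_submodels(13000, 800))
```

That gives 17 for 13,000 images at ~800 per submodel, which is in the same ballpark as the 16 additional instances.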


I am not clear why the region is sfo2 while the s3 bucket is nyc3. I haven’t gotten around to asking.

It turns out the reason is as simple as one might imagine: his bucket is in nyc3, but he tends to have more success spinning up instances in sfo2, and he just hasn’t moved the bucket.

Sometimes you just stick with what works and fix the other stuff when/if you have the time.


Hi, sorry for going so far back in time, but I’m looking for a little guidance on the rationale for split size and overlap. I have been able to successfully split-merge very small datasets, but am working my way up to a very large one. Right now I am attempting a 721-image set, split into 103-image submodels with 25 m of overlap. It’s been running for 10+ hours (92 GB allocated…). Any thoughts or recommendations would be greatly appreciated.

Hi @Scott – let’s start a new thread, but I can share what little I know.

Also, for reference, the same 721-image model completed in roughly 3.5 hours without splitting…

Thank you - I just created a new topic (Split Merge - Settings)
