ClusterODM autoscaling issues when using AWS

Hey, y’all :wave:. First, thanks to the OpenDroneMap community for helping to make these processing tools more accessible to the general public.

I’m running into an issue with AWS autoscaling. I’m fairly certain the problem is with docker-machine, since the security group attached to my ClusterODM and NodeODM instances is wide open. Docker-machine can create the new EC2 instance but then fails to SSH into it. I see that docker-machine has been deprecated for a while. Are there any plans to swap it out soon, or any ideas about a future implementation?

In the meantime, here are a few things I tried:
* Used the --native-ssh flag as described here (although this is not the only reference to this fix)
* Attempted locally (Windows 10)
* Attempted in AWS (Ubuntu)
* Copied the docker-machine command that ClusterODM uses and ran it directly in a terminal
* Allowed all traffic in the security group

Here is my ASR config:

{
  "provider": "aws",
  "accessKey": "my key",
  "secretKey": "my super secrety key",
  "s3": {
    "endpoint": "",
    "bucket": "nodeodm2"
  },
  "vpc": "vpc-09895c2aa7aca4c70",
  "subnet": "subnet-05f672e0a5c14883f",
  "securityGroup": "launch-wizard-1",

  "monitoring": true,
  "maxRuntime": -1,
  "maxUploadTime": -1,
  "region": "us-west-2",
  "zone": "c",
  "tags": ["type,clusterodm"],

  "ami": "ami-0d593311db5abb72b",
  "engineInstallUrl": "\"\"",

  "spot": false,
  "imageSizeMapping": [
    {"maxImages": 40, "slug": "t3a.small", "storage": 60},
    {"maxImages": 80, "slug": "t3a.medium", "storage": 100},
    {"maxImages": 250, "slug": "m5.large", "storage": 160},
    {"maxImages": 500, "slug": "m5.xlarge", "storage": 320},
    {"maxImages": 1500, "slug": "m5.2xlarge", "storage": 640},
    {"maxImages": 2500, "slug": "r5.2xlarge", "storage": 1200},
    {"maxImages": 3500, "slug": "r5.4xlarge", "storage": 2000},
    {"maxImages": 5000, "slug": "r5.4xlarge", "storage": 2500}
  ],

  "addSwap": 1,
  "dockerImage": "opendronemap/nodeodm"
}
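
As a rough illustration of how the `imageSizeMapping` list is read, here is a minimal Python sketch. It assumes the autoscaler picks the first entry, in ascending `maxImages` order, whose `maxImages` is at least the number of images in the task. That selection rule and the `pick_instance` helper are assumptions made for illustration, not ClusterODM's actual code. Under that assumption, a 78-image task (the 78 in the machine name in the command below appears to be the image count) maps to `t3a.medium` with a storage value of 100, which matches the instance type and root size in that command.

```python
from typing import Optional

# Mapping copied from the config above.
IMAGE_SIZE_MAPPING = [
    {"maxImages": 40, "slug": "t3a.small", "storage": 60},
    {"maxImages": 80, "slug": "t3a.medium", "storage": 100},
    {"maxImages": 250, "slug": "m5.large", "storage": 160},
    {"maxImages": 500, "slug": "m5.xlarge", "storage": 320},
    {"maxImages": 1500, "slug": "m5.2xlarge", "storage": 640},
    {"maxImages": 2500, "slug": "r5.2xlarge", "storage": 1200},
    {"maxImages": 3500, "slug": "r5.4xlarge", "storage": 2000},
    {"maxImages": 5000, "slug": "r5.4xlarge", "storage": 2500},
]


def pick_instance(image_count: int, mapping=IMAGE_SIZE_MAPPING) -> Optional[dict]:
    """Hypothetical helper: return the smallest entry that covers image_count.

    Assumed rule: first entry (sorted by maxImages) with maxImages >= image_count.
    Returns None when the task is larger than every entry.
    """
    for entry in sorted(mapping, key=lambda e: e["maxImages"]):
        if image_count <= entry["maxImages"]:
            return entry
    return None


print(pick_instance(78))  # {'maxImages': 80, 'slug': 't3a.medium', 'storage': 100}
```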

Here is the command I tried and its output:

docker-machine create --driver amazonec2 --amazonec2-access-key my-key --amazonec2-secret-key my-super-secret-key --amazonec2-region us-west-2 --amazonec2-ami ami-0d593311db5abb72b --amazonec2-instance-type t3a.medium --amazonec2-root-size 100 --amazonec2-security-group launch-wizard-1 --amazonec2-monitoring --amazonec2-tags type,clusterodm --engine-install-url "" --amazonec2-zone c --amazonec2-vpc-id vpc-09895c2aa7aca4c70 --amazonec2-subnet-id subnet-05f672e0a5c14883f clusterodm-78-urbzNi4bU6pnYeHHBzVmLF

Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available.  Last error: Maximum number of retries (60) exceeded
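
One way to narrow down a timeout like this is to check whether port 22 on the new instance is reachable at all from the machine running docker-machine. The following is a minimal Python sketch of such a check. The `ssh_port_open` helper is a hypothetical name, and the address in the usage comment is a placeholder to be replaced with the instance's actual IP. It only tests TCP reachability, not SSH authentication.

```python
import socket


def ssh_port_open(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (placeholder address, replace with the instance's IP):
# ssh_port_open("203.0.113.10")
```

If this returns False, the problem is likely at the network level (security group, subnet routing, or a missing public IP). If it returns True and docker-machine still times out, it may be worth checking whether the AMI's login user matches the driver's SSH user, which the amazonec2 driver exposes as `--amazonec2-ssh-user`.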

I would love to see the configs of folks who have gotten this to work properly in AWS in the last year or so! Thanks for following along this far. I hope y’all have some answers!


Sorry for the trouble.

Indeed, this is a bit of a tricky configuration it seems.

If you get anything figured out that will improve this process (or can help us replace docker-machine with whatever is currently supported), we’d all be very grateful!
