ODM task freezes at downloading

I have been trying to use ODM and ClusterODM to distribute an image-merging workload across multiple systems. I managed to do it between local PCs, but when I tried AWS autoscaling, it always gets stuck at “Downloading assets for submodel”. Here is what I did:

  1. I have a public S3 bucket.
  2. I am running ODM, ClusterODM and a single dummy locked NodeODM on one instance (m5ad.4xlarge).
  3. I took the default AWS settings and tweaked the security, bucket address and AWS image ID settings. I also added a security group that allows inbound traffic on ports 3000, 3001 and 22.
  4. Then I started a task with 1001 images.
  5. It created two t3a.xlarge spot instances. I monitored CPU usage and such.
  6. After a while, when it started downloading the results from each instance, it got stuck and stayed there.
  7. I checked the S3 bucket. It seems two new folders were created, each containing an all.zip file.
    Also, the instances are getting terminated, so I think the transfer to S3 is successful.
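
The check in step 7 can also be scripted. This is only a rough sketch, not anything that ODM or ClusterODM ships: the function names are made up for illustration, the bucket name in the usage note is hypothetical, and the listing helper assumes boto3 is installed and AWS credentials are configured.

```python
def folders_with_result(keys, result_name="all.zip"):
    """Return the folders (key prefixes) that contain a file named result_name."""
    found = set()
    for key in keys:
        # Split "folder/sub/all.zip" into ("folder/sub", "/", "all.zip").
        folder, _, name = key.rpartition("/")
        if name == result_name and folder:
            found.add(folder)
    return found


def list_bucket_keys(bucket):
    """List every object key in a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so folders_with_result works without it

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page has no objects.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Calling `folders_with_result(list_bucket_keys("my-odm-bucket"))` should then return one folder per submodel whose all.zip reached the bucket.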

I wonder if I am doing something wrong.
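
One thing that can be ruled out fairly quickly is the security group. The sketch below simply tries a TCP connection to each port I opened; the function name is made up for illustration and the host passed in would be the address of one of the spot instances. It says nothing about whether ODM itself is healthy, only whether the port accepts a connection.

```python
import socket


def open_ports(host, ports, timeout=3.0):
    """Return the subset of ports on host that accept a TCP connection."""
    reachable = []
    for port in ports:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            pass
    return reachable
```

For example, `open_ports(node_address, [22, 3000, 3001])` run from the ClusterODM machine should list all three ports if the inbound rules are in effect.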

If I read this right, those instances have only 16GB of RAM.

That’s stretching it for 1k images (maybe with sufficient swap, but even then…).

Add more memory.
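
To confirm how much RAM and swap a node really has, something like the following can be run on it. Treat it as a sketch: it assumes a Linux instance where /proc/meminfo exists, and the function names are just for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def ram_and_swap_gib(text):
    """Return (total RAM, total swap) in GiB from /proc/meminfo-style text."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 ** 2  # MemTotal and SwapTotal are reported in kB
    return (info.get("MemTotal", 0) / kib_per_gib,
            info.get("SwapTotal", 0) / kib_per_gib)


def read_local_memory():
    """Read the values for the machine this runs on (Linux only)."""
    with open("/proc/meminfo") as f:
        return ram_and_swap_gib(f.read())
```

Calling `read_local_memory()` on a node returns the two totals, which makes it easy to see whether any swap is configured at all.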

Thank you for replying.
I will do that. But could the freezing of the download from the S3 bucket be related to it?