ODM task freezes at downloading

Hello,
I have been trying to use ODM and ClusterODM to distribute an image-merging workload across multiple systems. I managed to do it between local PCs, but when I tried AWS autoscaling, the task always gets stuck at “Downloading assets for submodel”. Here is what I did:

  1. I have a public S3 bucket.
  2. I am running ODM, ClusterODM and a single dummy locked NodeODM on one instance (m5ad.4xlarge).
  3. I took the default AWS settings and changed the security, bucket address and AWS image ID settings. I also added a security group that allows inbound traffic on ports 3000, 3001 and 22.
  4. Then I started a task with 1001 images.
  5. It created 2 t3a.xlarge spot instances. I monitored CPU usage and such.
  6. After a while, when it started downloading the results from each instance, it got stuck and stayed there.
  7. I checked the S3 bucket. It seems two new folders were created, containing all.zip files.
    Also, the instances are getting terminated, so I think the upload to S3 was successful.
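For anyone wanting to double-check step 7, one option is to list the bucket and look at the size of each all.zip. This is only a minimal sketch: it assumes boto3 is installed and AWS credentials are configured, `my-odm-bucket` is a placeholder bucket name, and `find_archives` is a hypothetical helper written for this post, not part of ODM or ClusterODM.

```python
def find_archives(client, bucket, suffix="all.zip"):
    """Return (key, size_in_bytes) for every object whose key ends with suffix."""
    found = []
    kwargs = {"Bucket": bucket}
    while True:
        # list_objects_v2 returns at most 1000 keys per call, so follow the pages.
        page = client.list_objects_v2(**kwargs)
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(suffix):
                found.append((obj["Key"], obj["Size"]))
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return found


if __name__ == "__main__":
    import boto3  # third-party; imported here so the helper has no dependencies

    # "my-odm-bucket" is a placeholder, replace it with the real bucket name.
    for key, size in find_archives(boto3.client("s3"), "my-odm-bucket"):
        print(f"{key}: {size / 1e6:.1f} MB")
```

A zero-byte or unexpectedly small all.zip would suggest that the upload from the node was cut short, while full-sized archives would point at the download step on the ClusterODM side instead.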

I wonder if I am doing something wrong.

If I read this right, those instances have only 16 GB of RAM.

That’s stretching it for 1k images (maybe with sufficient swap, but even then…).

Add more memory.
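With ClusterODM autoscaling, "more memory" would normally mean changing which instance type gets launched for a given image count. As far as I understand it, the AWS config has an `imageSizeMapping` list for that, and the first entry whose `maxImages` covers the submodel is used. The sketch below mirrors that idea in Python. The thresholds, prices and storage sizes are made up for illustration, `pick_instance` is a hypothetical helper, and the selection rule is my assumption about how the mapping is applied.

```python
# Hypothetical mapping in the shape of ClusterODM's "imageSizeMapping" setting.
# All numbers are illustrative, not recommendations.
IMAGE_SIZE_MAPPING = [
    {"maxImages": 250, "slug": "t3a.xlarge", "spotPrice": 0.1, "storage": 160},    # 16 GiB RAM
    {"maxImages": 700, "slug": "t3a.2xlarge", "spotPrice": 0.2, "storage": 320},   # 32 GiB RAM
    {"maxImages": 1500, "slug": "r5a.2xlarge", "spotPrice": 0.3, "storage": 640},  # 64 GiB RAM
]


def pick_instance(images_count, mapping=IMAGE_SIZE_MAPPING):
    """Return the slug of the first entry that can hold images_count, else None."""
    for entry in mapping:
        if images_count <= entry["maxImages"]:
            return entry["slug"]
    return None  # no entry is large enough for this many images


if __name__ == "__main__":
    # 1001 images split over two nodes is roughly 500 per submodel, before overlap.
    print(pick_instance(501))
```

With a mapping like this, a submodel of about 500 images would land on a 32 GiB instance instead of a 16 GiB one.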

Thank you for replying.
I shall do that. But could the freeze while downloading from the S3 bucket be related to that?
