"leaked semaphores" error

Hi guys,
after 24 h of processing (5820 pics split into 20 submodels, about 290 pics per submodel) the project was stopped by this error:

[INFO] Computing mask raster: /var/www/data/eb2868e0-6a39-4073-94e5-2c058898ac87/odm_orthophoto/odm_orthophoto_cut.tif
/usr/local/lib/python2.7/dist-packages/joblib/externals/loky/backend/semaphore_tracker.py:198: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
len(cache))
/code/run.sh: line 6: 20418 Killed python $RUNPATH/run.py "$@"

What does the error mean?

You ran out of memory.
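
A bare "Killed" line printed by the shell, like the one in the log above, means the process received SIGKILL, which on Linux is typically what the kernel's out-of-memory killer sends. Below is a minimal sketch for spotting such lines in a saved task log. The name find_killed is made up for illustration, and the pattern only assumes the bash message format seen in this thread.

```python
import re

# Matches bash's report of a child process that was killed, e.g.
# "/code/run.sh: line 6: 20418 Killed python $RUNPATH/run.py ..."
KILLED_RE = re.compile(
    r"^(?P<script>\S+): line (?P<lineno>\d+):\s+(?P<pid>\d+)\s+Killed\s*(?P<cmd>.*)$"
)


def find_killed(log_text):
    """Return a list of (pid, command) for every 'Killed' line in the log."""
    hits = []
    for line in log_text.splitlines():
        match = KILLED_RE.match(line.strip())
        if match:
            hits.append((int(match.group("pid")), match.group("cmd")))
    return hits
```

If this finds a hit, running dmesg on the host that executed the task usually shows a matching out-of-memory entry for the same PID.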

Damn. So even though I have a cluster of two machines (128 GB and 32 GB of RAM), WebODM still needs to handle all submodels on one machine when it starts the merge?
PS: normally I use unmodified pics straight from the DJI, about 8 MB each.
Would it help to reduce their size?

At some point the orthophoto needs to be merged, so yes. But it looks like it failed here though: https://github.com/OpenDroneMap/ODM/blob/master/opendm/orthophoto.py#L90 which is not the merge step. I would try to fetch the results from eb2868e0-6a39-4073-94e5-2c058898ac87 and see if there’s anything wrong with them (you’ll need to extract them via docker cp).

Unfortunately I deleted this project and started again with smaller pics (to be honest, the pixel resolution is still the same, but I changed the JPEG quality from 100% to 84%, so a single pic is now about 4 MB instead of 8 MB).
We will see after the next 24 h. If it stops again, I will be grateful for help.

And finally, after 32 h, it is done… almost done :wink:
Now, after:

[INFO] Finished merge stage
(…)
Computing source raster statistics…
0...10.
(…)
Generating Overview Tiles:
0.10...20...30.
(…)

it hangs… probably?

I run NodeODM with the -v parameter, so I have access to the project files.
As far as I can see, orthophoto_tiles is done, but dtm_tiles contains only a folder for zoom 21. Inside it there are a lot of folders, but half of them are empty, and the last modification time of all those folders is about 6 h ago.

But the graphical system monitor shows that the CPU is still doing something.

The second (strange) thing is that odm_orthophoto.tiff in odm_orthophoto, which represents the entire project, is only 43 MB.
I looked at the submodel folders, and each submodel's odm_orthophoto.tiff is about 10 MB, whereas the submodel odm_orthophoto.tiff files from the last unsuccessful run were about 350-500 MB.
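
For comparing runs like this, a small script can list every orthophoto under a project folder with its size. This is only a sketch: the function names are made up, and it assumes nothing about the folder layout beyond the file being named odm_orthophoto.tif or odm_orthophoto.tiff somewhere below the given directory.

```python
import os


def orthophoto_sizes(project_dir, names=("odm_orthophoto.tif", "odm_orthophoto.tiff")):
    """Map each matching file (path relative to project_dir) to its size in MB."""
    sizes = {}
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            if name in names:
                path = os.path.join(root, name)
                rel = os.path.relpath(path, project_dir)
                sizes[rel] = os.path.getsize(path) / 1e6
    return sizes


def print_report(project_dir):
    """Print one line per orthophoto found, sorted by relative path."""
    for rel, mb in sorted(orthophoto_sizes(project_dir).items()):
        print("%8.1f MB  %s" % (mb, rel))
```

Calling print_report on the mounted project folder of each run gives two lists that can be compared side by side.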

and we are going again… :wink:

again:

[INFO] Generated cutline file: /var/www/data/848a8fd6-55cd-4121-b649-a6e21b3a3c14/odm_orthophoto/grass_cutline_tmpdir/cutline.gpkg --> /var/www/data/848a8fd6-55cd-4121-b649-a6e21b3a3c14/odm_orthophoto/cutline.gpkg
[INFO] Computing mask raster: /var/www/data/848a8fd6-55cd-4121-b649-a6e21b3a3c14/odm_orthophoto/odm_orthophoto_cut.tif
/usr/local/lib/python2.7/dist-packages/joblib/externals/loky/backend/semaphore_tracker.py:198: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
len(cache))
/code/run.sh: line 6: 30441 Killed python $RUNPATH/run.py "$@"
Full log saved at /var/www/data/5e3efac1-cacf-46b1-a8a6-f54056c7d2d8/submodels/submodel_0017/error.log

As far as I know, submodel 0017 is one of the biggest, with 456 pics,
but submodel 0016, for example, has 429 pics and completed fine, so there is no big difference in the number of pics.

But…
I see that submodel_0016 was processed on the main computer (the one with 128 GB of RAM),
while submodel_0017 was processed remotely on the second node, which has only 32 GB of RAM.

@pierotofy could the issue be that the second node cannot handle this stage with only 32 GB of RAM?
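
One way to check this would be to log the memory on each node while the task runs. The sketch below reads /proc/meminfo on a Linux host. The function names are made up for illustration, and the MemAvailable field is assumed to be present (it is on reasonably recent kernels).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: kB} for fields reported in kB."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def read_memory_gib(path="/proc/meminfo"):
    """Return (total, available) RAM in GiB on a Linux host."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    kb_per_gib = 1024.0 * 1024.0
    return info["MemTotal"] / kb_per_gib, info["MemAvailable"] / kb_per_gib
```

Printing read_memory_gib() every few seconds on the 32 GB node during the orthophoto stage would show whether available memory drops toward zero just before the process is killed.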