FileNotFoundError -- ClusterODM

I'm running a dataset through ClusterODM that I previously processed as a single dataset. The dataset has some camera calibration challenges associated with being from a belly-landing fixed-wing with a commodity camera, so I thought localizing the camera errors might be useful, and what better way to do that than with split-merge?

Version: latest versions of WebODM, ClusterODM, and NodeODM, up-to-date as of today (Sept. 12, 2021). Updated using docker pull on latest images.

So here is what I see during the second stage of remote submodel processing:

FileNotFoundError: [Errno 2] No such file or directory: '/var/www/data/caec6ac0-05dd-484d-8a26-56bf7e0fefb8/opensfm/exif/deduped2016-09-13_08.59.58.JPG.exif'
Full log saved at /var/www/data/845a57fa-1b8d-49d2-927e-278898888009/submodels/submodel_0010/error.log
[INFO]    LRE: Cleaning up remote task (caec6ac0-05dd-484d-8a26-56bf7e0fefb8)... OK

If I look for more detail in the log on that remote machine, I see something like this:

useruser@nodex:~$ docker cp hungry_allen:/var/www/data/845a57fa-1b8d-49d2-927e-278898888009/submodels/submodel_0010/error.log .
useruser@nodex:~$ tail error.log 
exif = {image: self.load_exif(image) for image in self.images()}
File "/code/SuperBuild/install/bin/opensfm/opensfm/dataset.py", line 981, in <dictcomp>
exif = {image: self.load_exif(image) for image in self.images()}
File "/code/SuperBuild/install/bin/opensfm/opensfm/dataset.py", line 579, in load_exif
with self.io_handler.open_rt(self._exif_file(image)) as fin:
File "/code/SuperBuild/install/bin/opensfm/opensfm/io.py", line 1349, in open_rt
return cls.open(path, "r", encoding="utf-8")
File "/code/SuperBuild/install/bin/opensfm/opensfm/io.py", line 1341, in open
return open(*args, **kwargs)
FileNotFoundError: [Errno 2] No such file or directory: '/var/www/data/caec6ac0-05dd-484d-8a26-56bf7e0fefb8/opensfm/exif/deduped2016-09-13_08.59.58.JPG.exif'
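
For anyone comparing failures across several runs, here is a minimal sketch for pulling the missing paths out of an error.log like the one above. It only assumes the log contains standard Python FileNotFoundError lines in the format shown; the function name `missing_paths` is made up for illustration and is not part of ODM.

```python
import re

# Matches the quoted path in a standard Python FileNotFoundError message,
# e.g. "FileNotFoundError: [Errno 2] No such file or directory: '/some/path'".
MISSING_RE = re.compile(
    r"FileNotFoundError: \[Errno 2\] No such file or directory: '([^']+)'"
)


def missing_paths(log_text):
    """Return the unique missing paths, in order of first appearance."""
    seen = []
    for match in MISSING_RE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen
```

Reading the copied log with `open("error.log", encoding="utf-8").read()` and passing the text to `missing_paths` should give one entry per missing file, which makes it easier to see whether the same images fail from run to run.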

If I check to see if that exif file seems corrupted or similar, I get the following:

useruser@nodex:~$ docker cp hungry_allen:/var/www/data/845a57fa-1b8d-49d2-927e-278898888009/opensfm/exif/deduped2016-09-13_08.59.58.JPG.exif .
useruser@nodex:~$ more deduped2016-09-13_08.59.58.JPG.exif 
{
    "make": "SONY",
    "model": "DSC-WX220",
    "width": 4896,
    "height": 3672,
    "projection_type": "brown",
    "focal_ratio": 0.0,
    "orientation": 1,
    "capture_time": 1473753586.693,
    "gps": {
        "latitude": -5.788261138888889,
        "longitude": 39.29726327777777,
        "altitude": 261.167,
        "dop": 20.0
    },
    "band_name": "RGB",
    "camera": "v2 sony dsc-wx220 4896 3672 brown 0.0 rgb"
}

I’ve got a couple of these cropping up in a 7k image dataset. I could pull these out of the process and rerun, but I am curious if there are any insights on what is happening to trigger this error.
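
In case it helps narrow things down, here is a rough sketch of how one could list every image that has no matching exif file in a given task directory. It relies on the `<image name>.exif` naming visible in the paths above; the function name `images_without_exif`, the extension list, and the directory arguments are assumptions for illustration, so point them at wherever the images and the opensfm/exif folder actually live. Note that the traceback refers to the task directory caec6ac0-..., while the file inspected above was copied from 845a57fa-..., so it may be worth running the check against each one that still exists.

```python
from pathlib import Path


def images_without_exif(image_dir, exif_dir,
                        extensions=(".jpg", ".jpeg", ".tif", ".tiff")):
    """List image file names in image_dir that lack '<name>.exif' in exif_dir."""
    exif_dir = Path(exif_dir)
    missing = []
    for image in sorted(Path(image_dir).iterdir()):
        # Skip anything that is not an image by extension (case-insensitive).
        if image.suffix.lower() not in extensions:
            continue
        # The exif file name is the full image name plus ".exif".
        if not (exif_dir / (image.name + ".exif")).is_file():
            missing.append(image.name)
    return missing
```

An empty result for a directory would suggest the exif files are all present there, and that the problem lies in what gets transferred to or created on the remote node.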


Mm, not sure, but doesn’t seem to be an isolated case.

If you can reproduce this consistently, can you set up a (small) test case (images, parameters, etc.)?


You don’t want a 7k image test case?

Yeah, let me see what I can do that’s reproducible.


:laughing:


I’m hitting the same issue with a 2677-image dataset, using ClusterODM with four processing nodes (96 vCPUs/768 RAM on the last run). I’m on my third try, and each try fails on a different set of images, although one of the images shows up in two of the runs. It also occurs in the second stage of remote submodel processing, but it does not present itself until ~10 hours into the run. It’s a bit frustrating, and I’d like to know if there are ways to resolve it. I’ve tried with pre-processed images as well. I also added a post: CPU/RAM for processing 2DFull - 2677 Images - #10 by Saijin_Naib.
