What determines the amount of memory used by pdal?

I can’t seem to process a dataset of 635 16.9-megapixel images on a machine with over 100 GB of RAM because of std::bad_alloc.

Last log output is:

[INFO]    Creating odm_meshing/tmp/mesh_dsm_r2.82842712475e-06 [idw] from 1 files
{
    "pipeline": [
        {
            "type": "readers.ply",
            "filename": "/code/smvs/smvs_dense_point_cloud.ply"
        },
        {
            "data_type": "float",
            "type": "writers.gdal",
            "filename": "/code/odm_meshing/tmp/mesh_dsm_r2.82842712475e-06.idw.tif",
            "radius": "2.82842712475e-06",
            "output_type": "idw",
            "resolution": 2e-06
        }
    ]
}
Pipeline file: /tmp/tmpJDSKFY.json
[DEBUG]   running pdal pipeline -i /tmp/tmpJDSKFY.json
PDAL: std::bad_alloc

Traceback (most recent call last):
  File "/code/run.py", line 47, in <module>
    plasm.execute(niter=1)
  File "/code/scripts/odm_meshing.py", line 98, in process
    max_workers=args.max_concurrency)
  File "/code/opendm/mesh.py", line 35, in create_25dmesh
    max_workers=max_workers
  File "/code/opendm/dem/commands.py", line 38, in create_dems
    fouts = list(e.map(create_dem_for_radius, radius))
  File "/usr/local/lib/python2.7/dist-packages/loky/process_executor.py", line 788, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 589, in result_iterator
    yield future.result()
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 433, in result
    return self.__get_result()
  File "/usr/local/lib/python2.7/dist-packages/loky/_base.py", line 381, in __get_result
    raise self._exception
Exception: Child returned 1

This is based on commit 002a6f0; my smvs/smvs_dense_point_cloud.ply is 2.0 GB and has 59376826 vertices.

I used to be able to process this dataset before the introduction of smvs/pdal.

Try running it with --use-opensfm-dense. It will use the dense point cloud from OpenSfM.

With that flag, OpenSfM produced a 1.1 GB opensfm/depthmaps/merged.ply with 18402906 vertices, but unfortunately that didn’t avoid the PDAL: std::bad_alloc error. It’s not clear to me how much memory pdal tries to allocate; before it crashes, while it’s still running, it seems to use only about 2 GB.

Well, "resolution": 2e-06 might be the cause. What is your --dsm-resolution parameter?

I’m not specifying --dsm-resolution directly, but I do specify --orthophoto-resolution 0.0001, as per https://www.opendronemap.org/2018/09/better-everything-announcing-opendronemap-0-4/:

docker run -it --rm -v /test:/datasets/code odm --project-path /datasets --end-with mvs_texturing --use-opensfm-dense --mesh-size 200000 --mesh-octree-depth 12 --force-ccd 13.2 --texturing-keep-unseen-faces --orthophoto-resolution 0.0001 --opensfm-depthmap-method BRUTE_FORCE --max-concurrency 101

I’m guessing your images do not have GPS information embedded, so ODM cannot compute a ground sampling distance estimate and will use the resolution value you ask for. The post didn’t go into detail about this (we should add better documentation at some point), but the bottom line is: for your dataset, set a reasonable --orthophoto-resolution value (or keep the default of 5).
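For reference, a ground sampling distance estimate typically comes from the sensor width, focal length, flying height, and image width. Here is a minimal sketch of that standard formula; this is not ODM’s actual code, and the focal length and altitude below are made-up example values:

# Minimal sketch of the standard GSD formula; not ODM's actual implementation.
def estimate_gsd(sensor_width_mm, focal_length_mm, flight_height_m, image_width_px):
    # Ground sampling distance in meters per pixel.
    return (sensor_width_mm * flight_height_m) / (focal_length_mm * image_width_px)

# Example values: a 13.2 mm sensor (cf. --force-ccd 13.2), 8 mm focal length
# (made up), 100 m flying height (made up), 4000 px wide images.
print(estimate_gsd(13.2, 8.0, 100.0, 4000))  # ~0.041 m/px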

Yes, my dataset doesn’t have any GPS information; it’s just .JPGs from a consumer camera. Thanks for the explanation, I’ll try --orthophoto-resolution 5. I also tried running pdal manually, and it only succeeded with "resolution": 2e-01.

Also, FYI pdal info showed this about my dataset:

pdal info -i /tmp/pipe.json
{
  "filename": "\/tmp\/pipe.json",
  "pdal_version": "1.6.0 (git-version: Release)",
  "stats":
  {
    "bbox":
    {
      "native":
      {
        "bbox":
        {
          "maxx": 2019.5201,
          "maxy": 2638.7615,
          "maxz": 625.8677,
          "minx": -1999.8473,
          "miny": -1072.5405,
          "minz": -259.0711
        },
        "boundary": {
   "coordinates" : [
      [
         [ -1999.8472999999999, -1072.5405000000001 ],
         [ -1999.8472999999999, 2638.7615000000001 ],
         [ 2019.5201, 2638.7615000000001 ],
         [ 2019.5201, -1072.5405000000001 ],
         [ -1999.8472999999999, -1072.5405000000001 ]
      ]
   ],
   "type" : "Polygon"
}

      }
    },
    "statistic":
    [
      {
        "average": -42.45127901,
        "count": 18402906,
        "kurtosis": 1465669631,
        "maximum": 2019.5201,
        "minimum": -1999.8473,
        "name": "X",
        "position": 0,
        "skewness": -4987915723,
        "stddev": 436.457582,
        "variance": 190495.2209
      },
      {
        "average": 254.4188058,
        "count": 18402906,
        "kurtosis": -4.906887827e+11,
        "maximum": 2638.7615,
        "minimum": -1072.5405,
        "name": "Y",
        "position": 1,
        "skewness": 6.489182875e+11,
        "stddev": 241.358195,
        "variance": 58253.77829
      },
      {
        "average": -0.0828036865,
        "count": 18402906,
        "kurtosis": 3.035576832e+12,
        "maximum": 625.8677,
        "minimum": -259.0711,
        "name": "Z",
        "position": 2,
        "skewness": 8.131656185e+11,
        "stddev": 134.3056906,
        "variance": 18038.01853
      },
      {
        "average": 0.3115425265,
        "count": 18402906,
        "kurtosis": -5.538081318e+10,
        "maximum": 0.9980000257,
        "minimum": -0.9150000215,
        "name": "NormalX",
        "position": 3,
        "skewness": 7.446044569e+10,
        "stddev": 0.5138341942,
        "variance": 0.2640255792
      },
      {
        "average": -0.4566033072,
        "count": 18402906,
        "kurtosis": 2.117290482e+13,
        "maximum": 0.824000001,
        "minimum": -0.9990000129,
        "name": "NormalY",
        "position": 4,
        "skewness": -5.619428455e+12,
        "stddev": 0.5471587316,
        "variance": 0.2993826776
      },
      {
        "average": 0.08416621282,
        "count": 18402906,
        "kurtosis": 2422212575,
        "maximum": 0.8220000267,
        "minimum": -0.8460000157,
        "name": "NormalZ",
        "position": 5,
        "skewness": 1.968248987e+10,
        "stddev": 0.3521340629,
        "variance": 0.1239983983
      },
      {
        "average": 124.313986,
        "count": 18402906,
        "kurtosis": -7.843209043e+13,
        "maximum": 255,
        "minimum": 10,
        "name": "diffuse_red",
        "position": 6,
        "skewness": 2.692722237e+13,
        "stddev": 33.38980872,
        "variance": 1114.879326
      },
      {
        "average": 121.0464161,
        "count": 18402906,
        "kurtosis": -6.547406396e+13,
        "maximum": 255,
        "minimum": 10,
        "name": "diffuse_green",
        "position": 7,
        "skewness": 2.072664005e+13,
        "stddev": 34.24753634,
        "variance": 1172.893746
      },
      {
        "average": 113.2194484,
        "count": 18402906,
        "kurtosis": -2.831363589e+13,
        "maximum": 255,
        "minimum": 6,
        "name": "diffuse_blue",
        "position": 8,
        "skewness": 8.95883498e+12,
        "stddev": 39.5131018,
        "variance": 1561.285214
      },
      {
        "average": 0,
        "count": 18402906,
        "maximum": 0,
        "minimum": 0,
        "name": "class",
        "position": 9,
        "stddev": 0,
        "variance": 0
      }
    ]
  }
}

I was told that one could reduce memory usage by telling pdal to use bounds and streaming mode. Maybe the appropriate bounds/resolution values can be computed from that pdal info output?

I’m not familiar with streaming mode, but if you find a way to lower memory usage, please share your findings! We’d love to improve memory usage if possible.

The GDAL writer needs to have bounds supplied to use streaming mode, so adding "bounds": "([minx, maxx], [miny, maxy])" to the writers.gdal block might help.
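For example, plugging in the bbox from the pdal info output above (a sketch, not tested on this dataset; the 0.2 resolution is the value that worked manually, and the radius just mirrors the original pipeline’s resolution × sqrt(2) ratio):

import json
import math

# Sketch: the original pipeline with explicit bounds on writers.gdal.
# Bounds come from the pdal info bbox above; PDAL expects them as the
# string "([minx, maxx], [miny, maxy])".
resolution = 0.2
pipeline = {
    "pipeline": [
        {
            "type": "readers.ply",
            "filename": "/code/smvs/smvs_dense_point_cloud.ply"
        },
        {
            "type": "writers.gdal",
            "filename": "/code/odm_meshing/tmp/mesh_dsm.idw.tif",
            "output_type": "idw",
            "data_type": "float",
            "resolution": resolution,
            "radius": resolution * math.sqrt(2),
            "bounds": "([-1999.8473, 2019.5201], [-1072.5405, 2638.7615])"
        }
    ]
}
with open("/tmp/pipe_bounds.json", "w") as f:
    json.dump(pipeline, f, indent=4)
# Then run it the same way ODM does: pdal pipeline -i /tmp/pipe_bounds.json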

GDAL is possibly exploding at resolution 2e-06 (0.000002) here because it thinks you’re writing out approximately 4000 x 3700 m (see the estimated bounding box) with 0.000002 m cells (~2 billion x 1.9 billion pixels, or 3.7 quintillion-ish pixels - is there a word for an image that big? Could ODM be the first to invent it!?!).

At 0.2 m resolution, it’s saner (~20000 x 18500 ~= 370 000 000 pixels that GDAL is trying to write).
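As a sanity check, the implied raster sizes can be computed directly from the bbox reported by pdal info above (illustrative arithmetic only):

# Raster dimensions implied by a bbox at a given cell size.
def raster_size(minx, maxx, miny, maxy, resolution):
    cols = (maxx - minx) / resolution
    rows = (maxy - miny) / resolution
    return cols, rows, cols * rows

# bbox from the pdal info output above
print(raster_size(-1999.8473, 2019.5201, -1072.5405, 2638.7615, 2e-06))
# -> ~2.0e9 x ~1.9e9 cells, ~3.7e18 pixels
print(raster_size(-1999.8473, 2019.5201, -1072.5405, 2638.7615, 0.2))
# -> ~20097 x ~18557 cells, ~3.7e8 pixels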

HTH