The perspective projection of a photograph is not a map projection, so it is not possible to georeference it in that way without warping the image.
The QGIS raster georeferencer includes the transformation equations for a perspective projection, which let you warp the image and georeference it from 4 control points. The catch is that you must calculate those control points for each image.
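To make the "4 control points" idea concrete, here is a minimal sketch of how a projective (perspective) transformation can be fitted from four pixel/map point pairs. It is not the QGIS code, only the standard 8-parameter formulation; the function names (`solve_linear`, `fit_projective`, `apply_projective`) and the sample coordinates are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate control points (e.g. three are collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_projective(pixels, world):
    """Fit the 8 coefficients a..h from exactly four (u, v) -> (x, y) pairs.

    Model: x = (a*u + b*v + c) / (g*u + h*v + 1)
           y = (d*u + e*v + f) / (g*u + h*v + 1)
    """
    rows, rhs = [], []
    for (u, v), (x, y) in zip(pixels, world):
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    return solve_linear(rows, rhs)


def apply_projective(coef, u, v):
    """Map a pixel position (u, v) to map coordinates (x, y)."""
    a, b, c, d, e, f, g, h = coef
    w = g * u + h * v + 1.0
    return (a * u + b * v + c) / w, (d * u + e * v + f) / w


# Hypothetical example: corners of a 4000 x 3000 image and the map
# positions they should land on (a trapezoid, as for an oblique photo).
pixels = [(0, 0), (4000, 0), (4000, 3000), (0, 3000)]
world = [(100.0, 250.0), (220.0, 260.0), (200.0, 180.0), (110.0, 190.0)]
coef = fit_projective(pixels, world)
print(apply_projective(coef, 2000, 1500))  # map position of the image centre
```

With real map coordinates (large UTM values, for instance) it is probably safer to subtract an offset before fitting and add it back afterwards, to keep the linear system well conditioned.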
It should be possible to build the same equations in a Python script and send GDAL the parameters to do the warping and georeferencing. Instead of control points, it should be possible to use the EXIF tags to calculate the position on the ellipsoid of the four vertices of the image. The focal length, field of view and/or image size are required, as well as the camera location in three dimensions and the three associated rotations.
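As a rough sketch of that calculation, the code below projects the four image corners onto the ground with a simple pinhole model. Everything in it is an assumption for illustration: the values are taken as already read from the EXIF/XMP tags, the ground is flat, coordinates are in a local metric system (UTM, say) rather than on the ellipsoid, the camera looks straight down when all angles are zero, and yaw is measured clockwise from north with roll applied first, then pitch, then yaw. Real cameras and gimbals may use different angle conventions, so check before trusting the signs. The function names are made up.

```python
import math


def rotate_to_world(ray, yaw_deg, pitch_deg, roll_deg):
    """Rotate a ray from the nadir, north-up camera frame to east/north/up."""
    e, n, u = ray
    r, p, y = (math.radians(a) for a in (roll_deg, pitch_deg, yaw_deg))
    # Roll about the forward (north) axis.
    e, u = e * math.cos(r) + u * math.sin(r), -e * math.sin(r) + u * math.cos(r)
    # Pitch about the right (east) axis; positive tilts the view forward.
    n, u = n * math.cos(p) - u * math.sin(p), n * math.sin(p) + u * math.cos(p)
    # Yaw about the vertical axis, clockwise from north.
    e, n = e * math.cos(y) + n * math.sin(y), -e * math.sin(y) + n * math.cos(y)
    return e, n, u


def image_footprint(easting, northing, height, focal_mm, sensor_width_mm,
                    width_px, height_px, yaw_deg=0.0, pitch_deg=0.0, roll_deg=0.0):
    """Return (pixel, line, easting, northing) for the four image corners.

    `height` is the camera height above flat ground, in metres.
    """
    f_px = focal_mm / sensor_width_mm * width_px  # focal length in pixels
    cx, cy = width_px / 2.0, height_px / 2.0      # assume centred principal point
    corners = [(0, 0), (width_px, 0), (width_px, height_px), (0, height_px)]
    points = []
    for col, row in corners:
        # Ray through the pixel: image right = east, image down = south,
        # optical axis = straight down (before any rotation).
        ray = ((col - cx) / f_px, -(row - cy) / f_px, -1.0)
        e, n, u = rotate_to_world(ray, yaw_deg, pitch_deg, roll_deg)
        if u >= 0:
            raise ValueError("corner ray never reaches the ground (too oblique)")
        t = height / -u  # scale the ray until it hits the ground plane
        points.append((col, row, easting + t * e, northing + t * n))
    return points


# Hypothetical camera: 100 m above ground, 10 mm lens, 20 mm wide sensor,
# heading 30 degrees east of north and pitched 10 degrees forward.
for point in image_footprint(500000.0, 4000000.0, 100.0, 10.0, 20.0,
                             4000, 3000, yaw_deg=30.0, pitch_deg=10.0):
    print(point)
```

Each tuple has the same shape as a ground control point (pixel, line, easting, northing), so in principle the four of them could be handed to GDAL, for example through the `-gcp` option of `gdal_translate`, before warping.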
This would not include orthorectification, but since GDAL is included in ODM, I think a warped and georeferenced set of images could be generated.
I am also interested in this functionality. I don't have the time to do it right now, but I think it's likely that a script that does this transformation already exists and it's just a matter of finding it.