Parrot Anafi and embedded metadata in video files

WebODM / Parrot ANAFI

Georeferenced photogrammetry from video

A beginner’s guide

The Parrot ANAFI embeds metadata in the video file when filming. This means the latitude, longitude and altitude from the drone’s GPS can be extracted for each frame, making it possible to build a spatially accurate 3D model or map from the video file. This document outlines a simple working method for doing so.

Tools used:

Parrot ANAFI w/Skycontroller 3 and FreeFlight6

ffmpeg https://www.ffmpeg.org/

vmeta-extract from the Parrot PDRAW SDK (GroundSDK Tools 7.0)

a simple text editor (TextEdit on MacOS)

a simple spreadsheet application (Numbers on MacOS)

WebODM (OpenDroneMap drone software)

Notes to prerequisites:

This tutorial uses video with metadata from a Parrot ANAFI drone (see the Parrot PDRAW documentation for more info; other models are also supported) and the software mentioned above.

There are other options than ffmpeg for exporting individual frames from videos, but it is very easy and quick to use. If you want to use software other than WebODM for model and map creation, you will most likely need to edit the EXIF information of the individual images, or check whether the software supports a georeferencing solution similar to the geo.txt format.

General:

When making a model from video using “classic” photogrammetry*, it is inherently more difficult to obtain a good result than from still pictures. Video is more prone to motion blur, rolling-shutter effects and artifacts from video compression. However, it is possible, and depending on your needs, you may get satisfying results.

Any flight path can potentially be used, but depending on your target, some planning will improve the results. The rules are the same as for ordinary photogrammetry: clear images, good coverage and overlap. If you want a 3D model, you can film at oblique angles (>-90°); for mapping, nadir (straight down) might be better. The level of detail will depend on your video settings, altitude and speed. As the ANAFI has a gimbal with a 180° range, you have a great tool for capturing any 3D object (as long as you have clear footage and good overlap).

*note: By “classic” photogrammetry in this context I’m referring to model creation that is not AI-powered and not of the Simultaneous Localization and Mapping (SLAM) type.

Workflow

  1. Extract the metadata as a Comma Separated Value (CSV) file:

Open a Command Line Interface (CLI), in this case Terminal on macOS

Change directory to your Parrot SDK folder:

cd /Users/Code/groundsdk-tools

From the folder, run the vmeta-extract command in a wrapper:

./out/groundsdk-macos/staging/native-wrapper.sh vmeta-extract --csv destinationfolder/destinationfile.csv sourcefolder/sourcevideo.MP4

The CSV file should now have been created in the destination folder.

*note: If you trim the video using a third-party editor (like QuickTime), it might strip the metadata, so it’s recommended to use the raw file from the ANAFI.

  2. Extract the frames from the video:

Open a Command Line Interface (CLI), in this case Terminal on macOS

Change directory to the folder where you want to keep your individual video frames:

cd /Users/you/yourworkingfolder/Frames

Extract the frames using ffmpeg:

ffmpeg -hide_banner -i path/to/video.mp4 Frame%04d.jpg

After a little while, your folder will be filled with the individual frames from your video.

  3. Prepare your geo.txt file for WebODM:

Open your CSV file. It will be space-delimited, not comma-delimited, so you might have to change the delimiter setting in your spreadsheet when importing:

[Screenshot: CSV import settings in the spreadsheet application]

The CSV file will contain a lot of information about the ANAFI’s state at each moment of the video, like battery percentage, orientation, signal strength etc. We will now delete all columns except three: location_latitude, location_longitude and location_altitude.

We also need to swap the location_latitude and location_longitude columns to get the correct order for WebODM to process the data. After this, create an empty column in front of the GPS data. This is where we will paste the names of all the frames from the video in ascending order. To do this, first open a text editor like TextEdit and make an empty file. Make sure your file is in plain-text format (in TextEdit, select “Format” from the menu and choose “Make Plain Text”). Go to your Frames folder and select all frames in the folder. Copy and paste into your empty text file.

You should now have a list of all the filenames for your frames in the text document. Select all, and copy-paste into the first column of your spreadsheet.

Select all columns, excluding the headers, and copy-paste into your text file. Add a line at the very top and enter the coordinate system for the GPS data, EPSG:4326. Save the file as “geo.txt”, using all lowercase letters. It should look like this:

[Screenshot: the finished geo.txt file]
*note: The values must be tab-delimited to be read correctly by WebODM

This file will be used by WebODM to correctly place the images when creating the map or 3D model.
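The spreadsheet steps above can also be scripted. Here is a minimal Python sketch, assuming the vmeta CSV is space-delimited with column headers named location_latitude, location_longitude and location_altitude (check your own header, the names may differ), and that vmeta-extract wrote one row per video frame:

```python
import csv

def build_geo_txt(vmeta_csv_path, frame_names, out_path="geo.txt"):
    # Read the vmeta-extract CSV (space-delimited in this tutorial).
    with open(vmeta_csv_path, newline="") as f:
        rows = list(csv.DictReader(f, delimiter=" "))
    # frame_names should be sorted in frame order (Frame0001.jpg, ...).
    # zip() stops at the shorter list if the counts don't match exactly.
    with open(out_path, "w") as out:
        out.write("EPSG:4326\n")  # coordinate system header line
        for name, row in zip(frame_names, rows):
            # WebODM's geo.txt wants longitude before latitude,
            # tab-delimited as described in the note above.
            out.write("%s\t%s\t%s\t%s\n" % (
                name,
                row["location_longitude"],
                row["location_latitude"],
                row["location_altitude"],
            ))
```

The function and argument names are my own; adapt the paths and column names to your setup before trusting the output.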

If you haven’t already, now is a good time to create a folder for the data you want to process in WebODM.

  4. Selecting your frames:

When using video, you have many more frames to work with than you typically would with stills, but photogrammetry works best when there is a good change in angle / separation on the subject, as the Structure from Motion (SfM) calculations depend on it to build the 3D model. You don’t necessarily need many frames; you need good-quality ones. And the more frames you have, the longer the processing time and the more resources are needed.

If you have a massive amount of frames, you can divide the data into different sections / blocks depending on the nature of your video segments. Some segments might require more data sampling than others - if there is little to no change between the individual frames, you can safely keep a few and skip the rest.
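As a simple starting point for thinning, here is a Python sketch that keeps only every Nth entry of the geo.txt file (the function name and the default every_nth=10 are arbitrary examples). The matching image files still need to be selected separately, which is why the function returns the kept file names:

```python
def thin_geo_txt(geo_txt_path, out_path, every_nth=10):
    # First line is the EPSG header; the rest are one entry per frame.
    with open(geo_txt_path) as f:
        header, *entries = f.read().splitlines()
    kept = entries[::every_nth]  # keep every Nth frame entry
    with open(out_path, "w") as out:
        out.write(header + "\n")
        out.write("".join(line + "\n" for line in kept))
    # Return the file names of the kept frames (first tab-delimited field),
    # so the matching images can be copied into the processing folder.
    return [line.split("\t")[0] for line in kept]
```

A fixed interval is a blunt tool - for segments where the drone hovers you may want to skip far more, so treat this as a sketch, not a substitute for looking at the frames.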

  5. In WebODM:

Make a new project, upload the images and geo.txt-file and process with the settings you deem appropriate for your project.

End notes

This is a workflow I created after having scoped out the site of a rock fall I wanted to map / 3D model.

At the time I had just taken a panorama at a different location, but it was early morning mid-winter, cold and dark, and I was not sure if the site was within range. I didn’t think the conditions were right for a mapping mission in the first place, but I was curious to see if it was reachable - which it was. However, at that stage I was down to 30% on the battery, and since it was cold I decided to return.

I started recording on the approach, and the video was not as bad as I expected. I decided to quickly make a test model using Pix4Ddiscovery (the free version of Pix4D), and got decent results using only stills from the 2.7K video. Pix4D has an embedded tool that lets you extract every Nth frame from a video, making the amount of frames easier to handle. However, as it was not georeferenced, I was not able to get any good orientation or scaling - and tracing the metadata back to the frames was a very laborious and difficult task.

Similar results to the Pix4D 3D model were successfully recreated with WebODM, but the orientation issues were the same, making the model very difficult to view without further editing in other software. There was also nothing in the area that would make a good reference for scale or orientation.

But I knew the metadata was available in the video file, since I had previously looked into the white papers for the Parrot ANAFI, so I decided to give it a go while waiting for the next opportunity to perform a proper mapping of the area.

I am sure this workflow can be improved on, and would love some feedback from the community. If you have any suggestions for improvement, better suited tools or god-like scripting / programming skills, or just good ideas, please reach out.

[Screenshots of the results]


Beautiful work!

Thanks so much for taking the time to share this with us all!


I wanted to add this to the post, but it seems like I’m not able to edit it anymore:

Appendix 1

Updating GPS EXIF tags using Exiftool

Updating the EXIF data of the extracted frames from the video using Exiftool is very easy when you have a CSV file. Since we already have our geo.txt file, we can use it as the starting point. You can find Exiftool here: https://exiftool.org

  1. Duplicate your geo.txt file and give it a new name.
  2. Add three more columns called “GPSLatitudeRef”, “GPSLongitudeRef” and “GPSAltitudeRef”, and fill them with “North”, “East” and “Above Sea Level” respectively.
  3. Rename your columns to match the EXIF tag names (GPSLatitude, GPSLongitude and GPSAltitude), and check that the headers match the data*. The first column with the file names for your frames must be called “SourceFile”. Your file should look something like this:

[Screenshot: the finished CSV file for Exiftool]

  4. Save your new file, and make sure your delimiter is a comma.
  5. Open your Command Line Interface, navigate to your work folder with the images you want to tag, then type in the following command:

exiftool -csv=/Path/to/Workfolder/filename.csv -o output/ .

This will create a new folder called “output” in your work folder and copy the entire set of images with the new tags to that folder, while keeping the originals untouched. Check your results. You can now use the images without the geo.txt file, or with other applications that need / read GPS tags.
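If you’d rather skip the spreadsheet round-trip, here is a Python sketch that builds the Exiftool CSV straight from geo.txt. It assumes the tab-delimited, longitude-first layout from step 3, and hardcodes northern / eastern hemisphere above sea level - adjust those Ref values if that doesn’t match your location:

```python
import csv

def geo_txt_to_exiftool_csv(geo_txt_path, csv_path):
    with open(geo_txt_path) as f:
        entries = f.read().splitlines()[1:]  # skip the EPSG:4326 header line
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        # Column names are the EXIF tag names Exiftool expects,
        # plus the mandatory SourceFile column.
        writer.writerow(["SourceFile", "GPSLatitude", "GPSLatitudeRef",
                         "GPSLongitude", "GPSLongitudeRef",
                         "GPSAltitude", "GPSAltitudeRef"])
        for line in entries:
            # geo.txt order from step 3: name, longitude, latitude, altitude
            name, lon, lat, alt = line.split("\t")
            writer.writerow([name, lat, "North", lon, "East",
                             alt, "Above Sea Level"])
```

The function name is my own invention; the output should be usable with the exiftool -csv= command shown above, but verify a couple of images before tagging the whole set.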

*note: I forgot I had switched latitude and longitude around in the previous step to make the geo.txt file, and ended up with the wrong tags on my first go.


I love this. This is pretty similar to the workflow we have at ODMax (GitHub: localdevices/ODMax). So far this tool is specific to the GoPro Max, but it seems there’s a general use case for generating georeferenced stills from video.


That looks like the next level for sure! I wouldn’t know where to begin…

The problem with this workflow is that you end up with a lot of useless files. You want a way to pick the good ones, i.e. clear, crisp shots with a certain change in the view, while still keeping track of the metadata associated with those frames.

…which sounds like a perfect job for a machine.
