ZAMBA CLOUD
Computer vision for wildlife conservation

What is Zamba Cloud?

Zamba Cloud makes it easier to handle large amounts of camera trap videos for research and conservation.

Zamba Cloud is an application that automatically detects which species are present in camera trap videos. In short, you upload videos and receive a spreadsheet telling you which species are most likely present in each video, allowing you to weed out false triggers and get straight to the videos of interest.

After creating an account, you can either upload videos directly or point Zamba Cloud to an FTP server where your videos are stored. Then, get back to your day while Zamba Cloud processes your videos. You'll get an email when it's done. Simply log back into your account where a spreadsheet with labels for each video will be waiting for you.

For step-by-step instructions, see "How to use Zamba Cloud" below.

What species groups can Zamba Cloud identify?

Currently Zamba Cloud can identify 23 species groups and also blank videos where no animal appears:

Blank (no animal present)
Bird
Cattle
Chimpanzee
Duiker
Elephant
Forest Buffalo
Gorilla
Hippopotamus
Hog
Human
Hyena
Large Ungulate
Leopard
Lion
Other Non-Primate
Other Primate
Pangolin
Porcupine
Reptile
Rodent
Small Antelope
Small Cat
Wild Dog

Don't see what you're looking for? We are always open to expanding the list of species that Zamba can identify. See the FAQ entry on finer-grained species identification for more information.

How accurate is Zamba Cloud?

To assess the performance of Zamba Cloud, the predictions from the application were compared with labels manually applied by researchers and citizen scientists.

The algorithm was originally developed based on 200,000 videos labeled by citizen scientists on the Chimp&See platform. As of February 2019, when we evaluated the algorithm on a separate set of 90,000 additional Chimp&See videos, we saw the following results:

Top-3 Accuracy: 79%

This means that 79% of the time, the correct label was in the top 3 labels that were predicted. Zamba Cloud outputs the top-3 predicted labels to enable researchers to easily surface videos of interest.
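As a rough illustration of how a top-3 accuracy figure is computed, the sketch below counts how often the true label appears among the first three predictions. The labels and predictions are made-up example data, not values from the Chimp&See evaluation, and the function name is hypothetical.

```python
def top_k_accuracy(true_labels, ranked_predictions, k=3):
    """Fraction of videos whose true label is among the first k predictions."""
    hits = sum(
        1
        for truth, ranked in zip(true_labels, ranked_predictions)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical example data (not from the actual evaluation).
true_labels = ["DUIKER", "BLANK", "HOG", "LEOPARD"]
ranked_predictions = [
    ["DUIKER", "SMALL_ANTELOPE", "BLANK"],  # correct at rank 1
    ["HUMAN", "BLANK", "BIRD"],             # correct at rank 2
    ["DUIKER", "BLANK", "RODENT"],          # true label not in the top 3
    ["SMALL_CAT", "HYENA", "LEOPARD"],      # correct at rank 3
]

top3 = top_k_accuracy(true_labels, ranked_predictions, k=3)
top1 = top_k_accuracy(true_labels, ranked_predictions, k=1)
print(f"top-3 accuracy: {top3:.0%}, top-1 accuracy: {top1:.0%}")
```

Top-3 accuracy is always at least as high as top-1 accuracy, because a video counts as a hit if the right label appears anywhere in the first three predictions.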

In addition, the results showed:

  • Presence of wildlife: 94% - Zamba Cloud can accurately detect the presence of wildlife in videos (whether videos are blank or non-blank) 94% of the time.
  • Presence of humans: 93% - Zamba Cloud can accurately detect the presence of humans in videos (regardless of whether other animals are also present) 93% of the time.
  • Top-1 accuracy for common species: 87-97% - For the species that appeared most often in the dataset—such as chimps, primates, duikers, and hogs—the most likely label for the video was correct 87% to 97% of the time. Generally the algorithm performs better on species that are physically larger (e.g., elephants) than on ones that are smaller (e.g., rodents).

Performance may differ on other datasets, because the Chimp&See dataset is unusual in several ways:

  • Videos on the Chimp&See platform are 15-second segments of longer videos, whereas in practice one-minute videos containing the entire capture are more common.
  • Data labels are the judgments of citizen scientists. Species that are harder for a non-expert to identify are therefore less likely to be correct in this dataset.
  • Chimp&See initially displays nine frames taken from each clip and asks the citizen scientist if an animal is present. This screening process may have led to videos being over-labeled as blanks when in fact animals appeared in other parts of the clip.

With these in mind, we are also exploring how the algorithm works on different datasets. For an expert-labeled dataset of one-minute videos from sites not included in the training dataset, we saw a significantly lower top-3 accuracy of 40%. For now, we expect that top-3 accuracy on new data will fall between these two figures (40% and 79%).

This is just the beginning. Our goal is to continually improve the algorithm, and we have well-defined action items for improving accuracy across all datasets. The most valuable contribution to this effort is additional labeled data. Find out more about how you can submit a correction for the videos that Zamba Cloud got wrong and let us know what the right species are. Or, if you have videos that are already labeled, you can share labeled data directly with us.

How to use Zamba Cloud

Sign up or log in

To begin, either sign up to create an account or log in to your existing account.

Uploads

After logging in, you'll be taken to the Uploads page, where you will upload the videos you want Zamba Cloud to process. There are two ways of submitting videos: Direct upload or FTP upload.

Zamba currently supports the following video file formats: avi, mp4, mpeg, asf.
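Before uploading, it can help to check which files have a supported extension. The following is a minimal sketch; the function name is hypothetical, and matching extensions case-insensitively is an assumption (camera traps often write names such as 12180011.AVI).

```python
from pathlib import Path

# Formats listed above; case-insensitive matching is an assumption.
SUPPORTED_EXTENSIONS = {".avi", ".mp4", ".mpeg", ".asf"}


def split_by_support(filenames):
    """Split filenames into (supported, unsupported) by file extension."""
    supported, unsupported = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


# Hypothetical filenames for illustration.
supported, unsupported = split_by_support(
    ["12180011.AVI", "clip.mp4", "notes.txt", "trail.mov"]
)
print("upload:", supported)
print("skip or convert:", unsupported)
```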

Wondering how long it will take? See "How long will it take to process my videos?" in the FAQ.

Direct upload

If your videos live on your computer, make sure you've selected the Direct Uploads tab in the upper left corner. Then click the + New Direct Upload button.

Direct uploads work best with a fast internet connection, as limited bandwidth can cause the browser to time out partway through the upload. If you have a slow internet connection, consider an FTP upload instead.

Drag and drop videos or click Select File(s) to select the video(s) you want to upload from your computer.

When you are finished selecting videos, click Upload.

You may add an optional title and/or description to this upload. Finally click Begin Processing.

FTP upload

If your videos live on an FTP server, make sure you've selected the FTP Uploads tab in the upper left corner. Then click the + New FTP Upload button to start a new submission.

Enter the FTP server URL path, username, and password in the corresponding fields and click Upload from FTP.

Keep in mind that with an FTP submission, all the videos in the specified folder will be submitted to Zamba Cloud. You will not be able to pick and choose which videos within the folder are processed.

Finally click Begin Processing.

Job status

Once you have submitted videos, you can see their status on the Uploads page under the corresponding "Direct Uploads" or "FTP Uploads" tab. The status will say "Zamba succeeded" when the labels spreadsheet is ready for download. Since it can take a few days to process a large quantity of videos, we'll send you an email when the labels are ready so you don't have to keep checking this page. Feel free to close the webpage and get back to your day.

Downloading species labels

When the species labels are ready, you will receive an email from zamba@drivendata.org. The link in the email will take you to your Uploads page, where there will be a button that says Download Labels. Click it to download the CSV file, which can be opened in Excel, Numbers, or Google Sheets, or read by analysis tools such as R or Python.

You can re-download the labels at any point as they will always be accessible from your Uploads page when you are logged in.

Understanding the labels spreadsheet

For each species label in the list above, Zamba Cloud uses an advanced computer vision model to estimate the probability that that label applies to the video. Probabilities range between 0 and 1. For each species label, 0 means the species is definitely not in the video, 0.5 means there's a 50% chance the species is in the video, and 1 means the species is definitely in the video. For the blank label, 0 means the video is definitely not blank, 0.5 means there’s a 50% chance the video is blank, and 1 means the video is definitely blank.

The labels spreadsheet has a row for each video. The first column contains a Zamba-generated unique ID for the video. The second column contains the original video filename (for direct uploads) or filepath (for FTP uploads). The next six columns contain the name and corresponding probability for the top three "most likely" species (i.e. the labels with probabilities closest to 1). These columns are intended as a shortcut to make filtering for videos of interest easy.

The remaining columns contain the probabilities for each of the 24 labels. These will not sum to 1 as there can be more than one species present in each video. Zamba Cloud individually estimates the probability that each species is present.

A labels spreadsheet could look like this:

original_filename  top_1_label  top_1_probability  top_2_label    top_2_probability  ...  BIRD    BLANK   CATTLE  ...  corrected_label
12180011.AVI       BLANK        0.6446             OTHER_PRIMATE  0.1418             ...  0.0021  0.6446  0.0004  ...
12060005.AVI       HUMAN        0.9460             BLANK          0.0571             ...  0.0001  0.0571  0.0000  ...
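To illustrate how the per-label probability columns relate to the "most likely" columns, here is a minimal sketch that reads a labels CSV with Python's standard csv module and ranks the labels for each video. The sample data is hypothetical and includes only four of the 24 label columns; the helper names are also illustrative.

```python
import csv
import io

# Hypothetical excerpt; a real download has all 24 label columns
# plus the top_N_label / top_N_probability columns.
SAMPLE = """video_uuid,original_filename,BIRD,BLANK,DUIKER,HOG
aaa-111,clip_a.avi,0.0021,0.0300,0.9100,0.4200
bbb-222,clip_b.avi,0.0001,0.9700,0.0200,0.0100
"""

NON_LABEL_COLUMNS = {"video_uuid", "original_filename"}


def label_probabilities(row):
    """Return {label: probability} for the label columns of one row."""
    return {
        name: float(value)
        for name, value in row.items()
        if name not in NON_LABEL_COLUMNS
    }


def top_labels(row, n=3):
    """Return the n labels with the highest probability, most likely first."""
    probs = label_probabilities(row)
    return sorted(probs, key=probs.get, reverse=True)[:n]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ranked = {row["original_filename"]: top_labels(row) for row in rows}
totals = {
    row["original_filename"]: sum(label_probabilities(row).values())
    for row in rows
}
print(ranked)
print(totals)
```

In this made-up data the probabilities for clip_a.avi add up to more than 1, which is expected when a video may contain more than one species.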

Filtering videos by species

There are two ways to filter for videos that have a certain animal in them using the labels spreadsheet.

  • The first is to use the most likely species columns: top_1_label, top_2_label, top_3_label. You could set a filter that checks if the species of interest is in any of those columns.
  • The second is to use the probability columns. Say you wanted to see only videos where there was at least an 80% chance there was a lion in them. You could set a filter on the LION column to see only rows where that value was 0.8 or greater.

A similar thing can be done to filter out blank videos. Either filter based on if "BLANK" does not appear in the most likely species columns, or set a threshold using the probability column BLANK.
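Both filtering approaches can also be scripted. The sketch below uses Python's standard csv module on a small hypothetical excerpt of a labels file; the 0.5 cut-off used for blank videos is an arbitrary example threshold, not a recommendation.

```python
import csv
import io

# Hypothetical excerpt of a labels spreadsheet (most columns omitted).
SAMPLE = """original_filename,top_1_label,top_2_label,top_3_label,BLANK,LION
a.avi,LION,BLANK,HYENA,0.10,0.92
b.avi,BLANK,BIRD,LION,0.88,0.05
c.avi,HYENA,LEOPARD,BLANK,0.20,0.01
"""

TOP_COLUMNS = ("top_1_label", "top_2_label", "top_3_label")


def in_top_labels(row, label):
    """True if the label appears in any of the most likely species columns."""
    return any(row[column] == label for column in TOP_COLUMNS)


def at_least(row, label, threshold):
    """True if the probability column for the label meets the threshold."""
    return float(row[label]) >= threshold


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Approach 1: lion appears among the top three labels.
lion_in_top3 = [r["original_filename"] for r in rows if in_top_labels(r, "LION")]
# Approach 2: at least an 80% chance of a lion.
lion_likely = [r["original_filename"] for r in rows if at_least(r, "LION", 0.8)]
# Dropping likely blanks with an example threshold of 0.5.
not_blank = [r["original_filename"] for r in rows if float(r["BLANK"]) < 0.5]

print(lion_in_top3, lion_likely, not_blank)
```

The first approach is more inclusive, since a label can appear in the top three with a low probability; the threshold approach gives finer control over how many videos you will need to review.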

Submitting corrections

Zamba Cloud relies on user-labeled data to improve its predictions. If you have videos where Zamba Cloud did not predict the right species, we'd like to know what the right species are!

The easiest way to submit corrections is to fill in the corrected_label column, which is the last column in the labels spreadsheet that you download from Zamba Cloud. As you review videos, you can fill in this column with the label of the correct species. The label you put in this column must exactly match the column name for the species (for example, FOREST_BUFFALO or CHIMPANZEE).

The spreadsheet of corrections that you upload should look like:

video_uuid  ...  ...  corrected_label
9ab-c65     ...  ...  HUMAN
89a-000     ...  ...
123-456     ...  ...  DUIKER
123-456     ...  ...  FOREST_BUFFALO

  • The file should be saved as a CSV file (comma separated values), which should be an export option from your spreadsheet tool.
  • Columns other than video_uuid and corrected_label are ignored for the corrections spreadsheet so you can leave these exactly how they were downloaded from Zamba Cloud.
  • If the correct label for a video is blank, then BLANK must be entered in the corrected_label column.
  • If you have multiple species in a video, copy the entire row so the same video appears twice with one corrected_label per row (as shown above for video 123-456).
  • For videos where the top_1_label is correct, the corrected_label column can be left blank or you can confirm the correct label by entering it in that column.
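If you keep review notes outside the spreadsheet, the rules above can be applied in a short script. This is a sketch only: the review data and the label subset are hypothetical, and it writes just the two columns that the rules say are read. Whether Zamba Cloud accepts a file without the other downloaded columns is an assumption, so keeping the original columns is the safer choice.

```python
import csv
import io

# Hypothetical review results: video_uuid -> list of correct labels.
reviewed = {
    "9ab-c65": ["HUMAN"],
    "89a-000": [],  # top_1_label was already correct; leave blank
    "123-456": ["DUIKER", "FOREST_BUFFALO"],  # two species in one video
}

# Illustrative subset; the real check would use all 24 column names.
VALID_LABELS = {"BLANK", "HUMAN", "DUIKER", "FOREST_BUFFALO"}


def correction_rows(reviewed):
    """Build one row per (video, label); repeat the video for extra species."""
    rows = []
    for video_uuid, labels in reviewed.items():
        for label in labels or [""]:
            if label and label not in VALID_LABELS:
                raise ValueError(f"unknown label {label!r} for {video_uuid}")
            rows.append({"video_uuid": video_uuid, "corrected_label": label})
    return rows


rows = correction_rows(reviewed)

buffer = io.StringIO()  # swap for open("corrections.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=["video_uuid", "corrected_label"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```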

To submit corrections, select the Submit Correction tab in the upper right corner. Then drag and drop or click Upload File to select the corrections spreadsheet you want to upload. Finally, click Submit correction.

FAQ

How does Zamba Cloud work?

Under the hood, Zamba Cloud runs a computer vision algorithm trained on thousands of hours of camera trap videos that has learned to estimate the probability that different species are present in the video. For more information on the origin of Zamba, check out https://zamba.drivendata.org/

How much does Zamba Cloud cost?

Currently Zamba Cloud is supported financially through 2019 by the Max Planck Institute for Evolutionary Anthropology with in-kind support from Heroku. It is free to use during this time. Because your labels can always be downloaded as a spreadsheet, you are not locked in to continuing to use this tool to access the labels that the algorithm predicted.

How long will it take to process my videos?

Depending on network conditions, you can expect processing to take about an hour for 120 videos. If you're experiencing issues, try limiting uploads to no more than 1000 videos at a time. Keep in mind that your job may take longer if it is in the queue behind another user's long running job.
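For planning purposes, the figures above can be turned into a back-of-the-envelope estimate. The sketch assumes processing time scales linearly with the number of videos, which is an assumption and ignores queueing behind other users' jobs; the function names are hypothetical.

```python
import math

VIDEOS_PER_HOUR = 120   # from the FAQ: about an hour for 120 videos
MAX_BATCH_SIZE = 1000   # suggested upper limit per upload


def estimate_hours(n_videos):
    """Rough processing time, assuming linear scaling (an assumption)."""
    return n_videos / VIDEOS_PER_HOUR


def batch_sizes(n_videos, max_batch=MAX_BATCH_SIZE):
    """Split a collection into upload batches of at most max_batch videos."""
    n_batches = math.ceil(n_videos / max_batch)
    return [min(max_batch, n_videos - i * max_batch) for i in range(n_batches)]


hours = estimate_hours(3000)
sizes = batch_sizes(2500)
print(f"about {hours:.0f} hours for 3000 videos; batches for 2500 videos: {sizes}")
```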

I'd like more fine grained species identification. Is this possible?

Computer vision algorithms are only as good as the labeled data they're trained on. Zamba Cloud was trained on video data labeled by citizen scientists through the Chimp&See Zooniverse project. Crowdsourcing is great for getting many videos labeled, but the labels are not 100% accurate and they tend to be more general in nature, e.g., "duiker" rather than "red duiker." To train Zamba Cloud to distinguish red duikers from yellow-backed duikers, for example, we would need videos with these species-specific labels.

We're always looking for partners who can share their data to help us improve the accuracy on current species detection as well as expand to new species.

Have data and want to make Zamba Cloud better? Let us know where Zamba Cloud got things wrong by submitting corrected labels. If you have additional data where you already have the species identified, reach out to zamba@drivendata.org.

This application has been developed and made available thanks to the generous support of the Max Planck Institute for Evolutionary Anthropology and the Arcus Foundation.

Our video processing algorithms run on Microsoft Azure thanks to support provided by Microsoft's AI for Earth program.

Created with support from our friends at Heroku

Built and maintained by DrivenData