Reproducible workflow of Colares et al. (2024)
Lucas Colares 2023-08-02
Hey there, nerds! This R markdown file summarizes all the steps needed to reproduce the results reported in the paper Contrasting responses of insects to forest loss promoted by a mega dam in the Amazon, by Lucas Colares and co-authors. First of all, download and unzip (or clone) the GitHub repository of the project at https://github.com/lucas-colares/blow-me-a-fly. Then, open the R project by double-clicking on the HabIns.Rproj file. If this R markdown file does not open automatically, double-click on the full_step_by_step.Rmd file and follow the steps below. Hope it helps!
Okay, after this introduction, we can now get coding! First of all, you will need to set a few configurations and functions for this markdown file to work. We are going to need some (ok, a lot of) packages and other small setups. Don't worry, I already created an R script that installs all required packages and functions. Just run the following code to set up your gear:
source("scripts/00. setup.R")
## Loading required package: imagerExtra
## Loading required package: imager
## Loading required package: magrittr
##
## Attaching package: 'imager'
## The following object is masked from 'package:magrittr':
##
## add
## The following objects are masked from 'package:stats':
##
## convolve, spectrum
## The following object is masked from 'package:graphics':
##
## frame
## The following object is masked from 'package:base':
##
## save.image
## Loading required package: vegan
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.6-4
## Loading required package: XML
Now that you are set up, we need to download the raw images of all 236 sticky traps into the folder insect_imgs/, which should be empty when you download or clone the GitHub repository. You can either manually download the raw images at https://figshare.com/ndownloader/files/41804211 and place the file in the specified folder, or run the following command (please note that the zip file is ~3 GB, so the download may take a while):
#download.file(url = "https://figshare.com/ndownloader/files/41804211",
#              destfile = "insect_imgs/originals.zip", mode = "wb")
Nice! Now we need to unzip the file. Run the following:
#unzip(zipfile = "insect_imgs/originals.zip",exdir = "insect_imgs/")
Now that we have our images, we are going to conduct a process called image segmentation, which basically means selecting the elements of an image we are interested in (in our case, insects). To do this, we will automatically identify the background (the yellow sticky traps) and separate it from the foreground (the insects) using the function `mask_segmentation()`, which was created by our team specifically for this task. Before running it, let's take a look at what exactly this function is doing.

First of all, the `mask_segmentation()` function loads the original image of the sticky trap:
Second, the `mask_segmentation()` function applies a blur to the image, so that the edges of the foreground are softened for the segmentation. This blur is very soft, so you may not even notice it:
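To build intuition for what the blur does, here is a toy sketch in base R: a 3x3 mean filter that spreads each pixel's value over its neighbourhood. This is only an illustration, not the package's implementation; the actual function presumably delegates to an imager blur such as `isoblur()`.

```r
# Toy 3x3 box blur on a numeric matrix (illustration only; the real
# pipeline likely uses imager's blur functions).
box_blur <- function(m) {
  out <- m
  for (i in 2:(nrow(m) - 1)) {
    for (j in 2:(ncol(m) - 1)) {
      # Replace each interior pixel by the mean of its 3x3 neighbourhood
      out[i, j] <- mean(m[(i - 1):(i + 1), (j - 1):(j + 1)])
    }
  }
  out
}

m <- matrix(0, 5, 5)
m[3, 3] <- 1          # a single bright pixel
b <- box_blur(m)
b[3, 3]               # the spike is spread out: now 1/9
```

Softening edges this way is what lets the later thresholding step produce smoother foreground outlines.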
Thirdly, `mask_segmentation()` will resize the image, if you want to. This resize step is useful if you have limited computational power: the function runs much faster with a resized image. You can adjust how much you want to resize the image using the `Resize` argument. The default option is `1`, which means that the full resolution will be used. If you set the `Resize` argument to `2`, the resolution used will be 1/2 of the original image; if you set it to `4`, the image will be resized to 1/4 of the original resolution, and so on. For our images, we recommend setting the `Resize` argument to a number between `1` and `5`, no more than that.
Next, the `mask_segmentation()` function converts the RGB image to grayscale for the following steps:
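Conceptually, grayscale conversion collapses the three RGB channels into one intensity value per pixel. The sketch below uses the common Rec. 601 luma weights as an illustration; whether `imager::grayscale()` uses these exact weights is an assumption, not something stated in the paper.

```r
# Toy RGB-to-grayscale conversion using the common luma weights
# (illustrative; the weights used by imager may differ).
rgb_to_gray <- function(r, g, b) 0.299 * r + 0.587 * g + 0.114 * b

rgb_to_gray(1, 1, 1)  # pure white -> 1
rgb_to_gray(0, 0, 0)  # pure black -> 0
```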
In the next step, the `mask_segmentation()` function applies an adaptive threshold to the image, which means that light-colored pixels are converted to white pixels and darker pixels are converted to black pixels, leaving the insects on the sticky trap evident (figure 4). This adaptive threshold method is useful for images with varying lighting conditions, which is the case for our images. The threshold is applied to small pieces of the original image so that every region has its own threshold values (pieces, i.e., windows, of size 10% relative to the largest side of the image). In the function, you can set the `K` argument to any number between 0 and 1. When `K` is high, local threshold values tend to be lower; when `K` is low, local threshold values tend to be higher. The default is 0.2, which works fine for our images. The image will look like this after the adaptive threshold:
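The idea of a per-window threshold can be sketched in base R as follows. The formula here (each window's threshold is its own mean scaled by `1 - K`, so a higher `K` lowers the local threshold) is an assumption chosen to match the behaviour described above; the actual pipeline uses imagerExtra's adaptive thresholding, whose exact formula may differ.

```r
# Toy adaptive threshold: each window is binarized against its OWN
# mean scaled by (1 - K), so regions with different lighting get
# different thresholds (illustrative formula, not imagerExtra's).
adaptive_threshold <- function(m, K = 0.2, win = 2) {
  out <- m
  for (i in seq(1, nrow(m), by = win)) {
    for (j in seq(1, ncol(m), by = win)) {
      ri <- i:min(i + win - 1, nrow(m))
      cj <- j:min(j + win - 1, ncol(m))
      thr <- mean(m[ri, cj]) * (1 - K)  # higher K -> lower local threshold
      out[ri, cj] <- ifelse(m[ri, cj] >= thr, 1, 0)
    }
  }
  out
}

# A "bright" region (left) next to a "dim" region (right):
m <- matrix(c(0.9, 0.8, 0.2, 0.1,
              0.9, 0.8, 0.2, 0.1,
              0.5, 0.4, 0.6, 0.7,
              0.5, 0.4, 0.6, 0.7), 4, 4, byrow = TRUE)
bin <- adaptive_threshold(m, K = 0.2, win = 2)
# Note that 0.2 passes the threshold of its own dim window, while a
# single global threshold would have dropped it along with 0.1.
bin
```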
Next, the function applies an erode morphological operation to slightly enlarge the edges of the white pixels. Then, white polygons that are too small are cleaned out, and the colors are inverted so that black pixels now represent the foreground and white pixels represent the background:
Figure 5. Image after erosion, cleaning, and inversion.

Then, all connected white polygons are separated into small pieces in an imlist, and the x and y coordinates of the limits of each small piece are extracted to a data frame. Next, a cluster analysis is conducted using the Gower distance to check which sets of white pixels are closest to each other:
Figure 6. Cluster dendrogram representing which sets of white pixels are closest to each other in the image.

In this step, the dendrogram is cut at a specific height so that groups of white pixels that are close to each other are formed. The numeric scalar at which the dendrogram is cut can be set with the `H` argument of the `mask_segmentation()` function. The default value is 0.05, which works fine for our images. The groups formed after cutting the dendrogram represent the white pixels that will end up together in the final image piece; in this way, insects that are close to each other stay in the same image. In the end, the original image is sliced into many small pieces. Note that you can specify a margin for these small pieces using the `Margin` argument of the `mask_segmentation()` function. We suggest you set this margin to 100 pixels (the default); in this way, we can capture details from the image that would otherwise go missing. These small pieces are then saved in a folder specified by the user, which can be set with the `destFolder` argument of the `mask_segmentation()` function. Here, we chose to save all the pieces in the "insect_imgs/slices/all_slices/" folder. An XML file is saved with each small piece, recording the positions where the white pixels (i.e., the insects) were located. The white pixels in all XML files are labelled as "insects", but we will refine these annotations later. We can use these XML files to accelerate the annotation process in the next steps.
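The cluster-and-cut step can be sketched with base R's `hclust()` and `cutree()`. The paper uses the Gower distance (available, for example, through `cluster::daisy()`); plain Euclidean distance on toy bounding-box coordinates is used below only to keep the sketch dependency-free.

```r
# Toy version of the grouping step: cluster polygon coordinates and
# cut the tree at height H, so nearby polygons share a slice.
# (Euclidean distance here; the paper uses the Gower distance.)
coords <- data.frame(
  x = c(0.10, 0.12, 0.80, 0.82),   # hypothetical polygon positions
  y = c(0.20, 0.22, 0.75, 0.74)
)
hc <- hclust(dist(coords))
groups <- cutree(hc, h = 0.05)     # the H argument: cut height
groups                             # polygons 1 & 2 form one group, 3 & 4 another
```

Raising `H` merges more distant polygons into the same slice; lowering it yields more, smaller slices.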
Nice! After this (not so) short explanation of how the function works, let's finally generate these small pieces for all our images. You will need to provide a vector with the paths of all images you want to split into small pieces; we do this in the line `paste0("insect_imgs/originals/",dir("insect_imgs/originals/"))` and store the paths in the `images` object. These are then passed to the function through the `imgs` argument. Next, we specify the folder where we want the images and XML files to be saved in the line `destination.folder="insect_imgs/slices/all_slices/"`; this folder is stored in the `destination.folder` object. Now we can move forward and run the `mask_segmentation()` function with these arguments. You will get a progress bar that indicates when the process is over; this can take several hours. If you want to download the same image slices that we used in the paper, you can skip ahead and download them directly using the code chunk after the next one.
images=paste0("insect_imgs/originals/",dir("insect_imgs/originals/"))[300:469]
destination.folder="insect_imgs/slices/all_slices/"
#mask_segmentation(imgs = images,destFolder = destination.folder,Resize = 4, K=0.2, Blur=5, H=0.05, Margin=100)
Please note that we removed several sliced images from our dataset during annotation because there was nothing in them. In the end, we ended up with the 8926 slices described in the paper. To download these, run the following:
Now that we have all the image pieces we need, it's time to properly label these insects! This process is called annotation and basically consists of naming a considerable number of images to train our object detection model. For this task, we developed an interactive function that uses the XML files we created in the previous step.
slices=paste0("insect_imgs/slices/all_slices/",dir("insect_imgs/slices/all_slices/",pattern = ".jpg"))
destination.folder2="insect_imgs/slices/annotated/"
#auto_annotation(imgs = slices, destFolder = destination.folder2, Blur = 2, K = 0.2, Erode = 5)