DETECTION OF MICROPLASTICS USING MACHINE LEARNING
ZENON CHACZKO1, PETER WAJS-CHACZKO2, DAVID TIEN3, YOUSEF HAIDAR1
1 University of Technology Sydney, Ultimo, 2007, Australia
2 Macquarie University, Macquarie Park, 2109, Australia
3 School of Computing & Mathematics, Charles Sturt University, NSW, Australia
E-MAIL: [email protected], [email protected], [email protected], [email protected]
[email protected]Abstract: characteristics of pollution and the impact of macroplastics
Monitoring the presence of micro-plastics in human and on the aquatic, soils and aerial ecology. However, there is an
animal habitats is fast becoming an important research theme absence of standardised protocols, methodologies, best
due to a need to preserve healthy ecosystems. Microplastics practices and tools for the quantification [4]. And there is an
pollute the environment and can represent a serious threat for
urgent need for the uniformity of methods, procedures and
biological organisms including the human body, as they can be
inadvertently consumed through the food chain. To perceive
protocols related to microplastics presence, extraction and
and understand the level of microplastics pollution threats in their classification (identification). This research aims to
the environment there is a need to design and develop reliable address this need. In order to better understand the level and
methodologies and tools that can detect and classify the impact of microplastics pollution, methods, computational
different types of the microplastics. This paper presents results solutions and sensing devices that can detect and classify the
of our work related to exploration of methods and techniques different types of the microplastics urgently required [4, 5].
useful for detecting suspicious objects in their respective In the field of plastic recycling there are some useful and
ecosystem captured in hyperspectral images and then practical tools present; hyperspectral imaging is on of most
classifying these objects with the use of Neural Networks promising technologies, for it successes [5-9].
technique.
The main objectives and outcomes of this research work
Keywords: project is to explore a reliable approach for finding an
Microplastics; Machine learning; Neural network; effective computing solution for locating objects of interest
Identification; Classification; Hyperspectral image in an image (canvas), as well as, to perform classification of
these objects for correct identification.
1. Introduction
A. Scope, Aims & Objectives
Microplastics are fast becoming persistent contaminants
in both aquatic and terrestrial environments [1-4]. There is At this stage, this research is only concerned about
an increasing scholarly work about the presence and exploring methods of detecting the number of objects along
negative effects of microplastics on marine environments [2, with their respective location in a hyperspectral image,
3]. So far, a relatively small number of reports can be found whilst the classification of objects applies the Neural
about the need for studying contamination sources, pollution Networks. Thus, the main objectives and outcomes of this
characteristics and ecological impact of microplastics. There project is to find a reliable method for:
is mounting evidence that macroplastics can influence soil
biota at different trophic levels [4], and thus pose a serious • locating objects in an image/canvas
threat to animal and human health in the food web. Further • classifying objects
in-depth investigation is required to uncover the full impact
and define ecological threats related to microplastics at The proposed Neural Network-based solution is to
various levels and in all kinds of biological ecosystems. To employ the supervised learning approach which needs data
study the impact of microplastics on various environments to be used as training samples. Even though there are other
and humans, researchers require better analytical methods of training Neural Network models without data,
methodologies and tools which would allow to determine the such as the unsupervised learning approach, they still require
some information about the application or scenario; in the unsupervised learning approach, a way of observing a model or a system is required so that it can adjust, learn and reach the optimal solution. In this project's scenario, it is unrealistic to have a model that synthetically simulates, accurately with respect to the real world, the way plastics react to light. Thus, the supervised learning approach was chosen.

2. Methodology

As previously stated, a dataset is required for training the Neural Network model, but none was to be found in the public domain. Some creative thinking was needed to keep the research alive, and one main idea stood out: to use a dataset of hyperspectral images of objects while discarding the requirement that they be images of microplastics. The motivation was that, at this stage, this research project is concerned with seeing whether hyperspectral images can be used effectively with Neural Networks, in order to later apply that in the field of microplastics pollution research.
The "SpecTex" (University of Eastern Finland n.d.) dataset was used for this project [9]. This dataset includes spectral images of sixty textile samples with different texture patterns [9]. In this research, those samples can be considered as (suspicious contaminant) objects. The images have a size of 640 x 640 x 39, as in image height times image width times number of channels. The channels correspond to snapshots of the object at different wavelengths, ranging from 400 nm to 780 nm in 10 nm increments. To prepare the data for training, a limited number of files (10) that contain the samples were selected for the experiment (see TABLE 1).

TABLE 1. The list of files selected for the experiment

A random point was taken from each file, and 28x28x39 sized images centred on that point were cropped, to simulate plastics that had been broken down into small pieces by the sun and ocean waves. The cropped images were then slightly distorted by applying a circular mask to simulate the shape of a pellet. This process was repeated 100 times for each file, and then another 10 times for each file: the first pass to generate the training samples and the second pass to generate the evaluation/testing samples. However, two separate regions were chosen for the two types of data samples to be generated from (see Fig. 1). The smaller region is used for testing, as fewer samples need to be generated than for training. The reason behind this separation is to further ensure that the evaluation samples are new data that were not seen in the training phase; this gives an indication of how the system will perform on unseen data.

FIGURE 1. A single layer of the 39 layers in the T14.tif file
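The data-preparation procedure just described can be summarised in code. The following is a minimal sketch, assuming one SpecTex file has already been loaded into a 640 x 640 x 39 numpy array; the file-reading step, the exact region boundaries and the helper names are assumptions for illustration, not the project's original code.

import numpy as np

PATCH = 28          # spatial size of each cropped sample
BANDS = 39          # number of spectral channels (400-780 nm, 10 nm steps)

def circular_mask(size):
    """Boolean disc used to distort the square crop into a pellet-like shape."""
    yy, xx = np.ogrid[:size, :size]
    centre = (size - 1) / 2.0
    return (yy - centre) ** 2 + (xx - centre) ** 2 <= (size / 2.0) ** 2

def crop_samples(cube, region, n_samples, rng):
    """Crop n_samples random PATCH x PATCH x BANDS pellets from a region of the cube.

    cube   : (640, 640, 39) array, one SpecTex file
    region : (row_min, row_max, col_min, col_max) bounds to draw centre points from
    """
    r0, r1, c0, c1 = region
    mask = circular_mask(PATCH)[..., None]          # broadcast over the 39 bands
    half = PATCH // 2
    samples = []
    for _ in range(n_samples):
        r = rng.integers(r0 + half, r1 - half)
        c = rng.integers(c0 + half, c1 - half)
        patch = cube[r - half:r + half, c - half:c + half, :].copy()
        samples.append(patch * mask)                # zero the corners -> pellet shape
    return np.stack(samples)

# Example usage with a synthetic cube standing in for one .tif file:
rng = np.random.default_rng(0)
cube = rng.random((640, 640, BANDS))
train = crop_samples(cube, (0, 500, 0, 640), 100, rng)   # larger region, 100 crops
test  = crop_samples(cube, (500, 640, 0, 640), 10, rng)  # smaller region, 10 crops
print(train.shape, test.shape)   # (100, 28, 28, 39) (10, 28, 28, 39)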
A. Tools And Libraries

The following selection of software development environment and analytical tools was explored, integrated and used in the experimental solution:
• Python – main language used
• tensorflow – GPU-enabled library for building Neural Networks
• numpy – data manipulation module (library)
• matplotlib – graphing and plotting
• tensorboard – graphing and plotting
• scikit-learn – used for the confusion matrix and for shuffling data
• scikit-image – used for finding the contours for object detection
• pandas – usually used for data manipulation, but in this case opportunistically used for displaying tables
• Jupyter – a playground for quickly and conveniently experimenting with Python

B. Key Concepts

In 2013, Zhang et al., in their research work [8] on tensor discriminative locality alignment and spectral–spatial feature extraction in hyperspectral images, discussed dimensionality reduction of hyperspectral data using the Principal Component Analysis (PCA) approach. The purpose was to remove the redundancy of information and thus reduce the size of the data. The redundancy comes from the inter-band correlation, which can be very high, and it can be reduced with a rather low chance of losing significant information. However, given that the
size of the data set used in this initial phase was rather small to begin with, PCA was not considered necessary and was not employed in the project. Zhang et al. [8] also introduced a very useful concept of tensor representation of hyperspectral images (HSIs), which underpins their HSI feature extraction method, called the tensor discriminative locality alignment (TDLA) method. In our experimentation, the tensor representation of HSIs was employed. HSIs are composed of several images/layers, each representing a spectral channel/wavelength. Each image/layer can then be represented as a grayscale image, where the value of a pixel represents the intensity of the light wave in that pixel for the corresponding spectral channel.
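To make the tensor representation concrete: a hyperspectral image can be held as a three-dimensional array whose third axis is the spectral channel, and any single channel can be viewed as a grayscale image. The snippet below is a small illustration of this representation and of the band-wise PCA reduction discussed by Zhang et al. [8]; the array contents are synthetic, and PCA itself was not used in this project.

import numpy as np
from sklearn.decomposition import PCA

# A hyperspectral image as a (height, width, channels) tensor; one SpecTex
# sample is 640 x 640 x 39, with channel k holding the (400 + 10*k) nm snapshot.
rng = np.random.default_rng(1)
hsi = rng.random((640, 640, 39))

# Any single channel is just a grayscale image of light intensities.
band_500nm = hsi[:, :, 10]            # 2-D array, viewable with matplotlib's imshow

# Inter-band correlation can be reduced with PCA over the spectral axis:
pixels = hsi.reshape(-1, 39)          # one row per pixel, one column per band
reduced = PCA(n_components=5).fit_transform(pixels)
hsi_reduced = reduced.reshape(640, 640, 5)
print(band_500nm.shape, hsi_reduced.shape)   # (640, 640) (640, 640, 5)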
3. The Design and Implementation of the Machine Learning Solution

The Machine Learning system needs to execute in the deployment state. It is assumed that when the system is deployed, its computing model is already trained. The deployed system (see Fig. 2) executes the following steps:
1. The objects/microplastics are placed on a canvas
2. A hyperspectral image is taken of the whole canvas
3. One layer of the hyperspectral image is supplied to the Object Detector
4. The locations of the objects are supplied to the "crop&Resize" module
5. All the layers are supplied to the "crop&Resize" module
6. The cropped and correctly resized hyperspectral images are supplied to the Neural Network model
7. The Neural Network provides the prediction

FIGURE 2. System Design – Already Trained NN Model

Note that Step 5 can also be placed before Step 3, since the cropping and resizing can only happen after all of the first five steps are completed; the rest of the steps, however, should be kept in their respective order.
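The numbered steps above can also be read as a small orchestration routine. The sketch below illustrates that flow only; the detector and classifier are passed in as callables and replaced here by trivial stand-ins (their real counterparts are described in the following subsections), and all names are assumptions rather than the project's own code.

import numpy as np
from skimage import transform

def run_deployed_system(cube, detect, classify, patch_size=(28, 28)):
    """Steps 2-7: one hyperspectral cube in, one prediction per detected object out.

    cube     : (H, W, 39) array - the hyperspectral image of the whole canvas
    detect   : callable(layer) -> list of (row_min, row_max, col_min, col_max)
    classify : callable(patch) -> predicted class of a resized patch
    """
    layer = cube[:, :, 0]                                  # Step 3: one layer only
    predictions = []
    for r0, r1, c0, c1 in detect(layer):                   # Step 4: object locations
        patch = cube[r0:r1, c0:c1, :]                      # Step 5: crop all layers
        patch = transform.resize(patch, patch_size + (cube.shape[2],))  # Step 6
        predictions.append(classify(patch))                # Step 7
    return predictions

# Stand-ins so the sketch runs on its own:
fake_detect = lambda layer: [(10, 50, 10, 50)]
fake_classify = lambda patch: "class_0"
print(run_deployed_system(np.zeros((500, 500, 39)), fake_detect, fake_classify))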
A. Object Detection

Assume that Figure 3a below is an example of a grayscale image of an object; applying the find contours function from scikit-image to it will yield what is shown in Figure 3b. The contour's maximum and minimum values on both the Y and X axes are calculated to construct the bounding box of the object, and using that bounding box the image is then cropped, as shown in Figure 3c. That cropped image is then resized to fit the input specification of the Neural Network, but that step is not shown here. It is important to note that what is actually cropped is not the grayscale image but all 39 layers of different wavelength data; the grayscale version is shown here just for clarity about what happens.
The goal of this process is to automate the extraction of multiple objects within a single snapshot, so that they can be properly fed to the Neural Network model. For example, Figure 4a below shows a 500 x 500 canvas where objects are randomly placed and noise has been introduced, to test the reliability of the method; Figure 4b shows the result of the find contours function, which detected 6 objects and coloured them differently. Similarly, the maximum and minimum values are used to extract the objects (Fig. 4c). Again, the actual extraction/cropping is applied to all 39 layers of each object (Fig. 5). In Figure 5 it is hard to distinguish the differences between the layers; remember that each layer represents an image at a wavelength ranging from 400 nm to 780 nm in 10 nm increments. However, a computer algorithm, or in this case a trained Neural Network model, could possibly recognise the subtle differences and use them to correctly classify the objects.

FIGURE 3. Object detection example

FIGURE 4. Example of objects extraction
FIGURE 5. Example of all 39 layers of an object

The contours of every object are calculated and used to crop the objects, as shown in LISTING 1. The loop in line 5 is a secondary filter that ensures a found contour has more than 30 points representing it, because otherwise the found contour most likely represents an anomaly or noise in the image rather than an actual object. Lines 13 and 14 show how the bounding box is calculated from the points in the contour.

LISTING 1. Object Detection and Extraction
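The source of LISTING 1 is not reproduced in this text. As an illustration of the routine it describes (the find_contours call, the filter that discards contours of 30 points or fewer, the bounding box taken from the contour's extrema, and the crop applied to all 39 layers), a minimal sketch could look as follows; the threshold level, the 28 x 28 target size and the helper name are assumptions, and its line positions do not correspond to the lines cited above.

import numpy as np
from skimage import measure, transform

def extract_objects(cube, level=0.5, min_points=30, out_size=(28, 28)):
    """Detect objects in one layer of the cube and crop/resize all 39 layers."""
    layer = cube[:, :, 0]
    objects = []
    for contour in measure.find_contours(layer, level):
        # Secondary filter: contours with too few points are treated as noise.
        if len(contour) <= min_points:
            continue
        # Bounding box from the contour's extrema on both axes.
        r0, r1 = int(contour[:, 0].min()), int(np.ceil(contour[:, 0].max()))
        c0, c1 = int(contour[:, 1].min()), int(np.ceil(contour[:, 1].max()))
        crop = cube[r0:r1 + 1, c0:c1 + 1, :]          # crop every wavelength layer
        objects.append(transform.resize(crop, out_size + (cube.shape[2],)))
    return objects

# Example: two bright squares on a faintly noisy 500 x 500 canvas.
rng = np.random.default_rng(2)
canvas = rng.random((500, 500, 39)) * 0.1
canvas[100:150, 100:150, :] += 0.9
canvas[300:360, 240:300, :] += 0.9
patches = extract_objects(canvas)
print(len(patches), patches[0].shape)   # e.g. 2 (28, 28, 39)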
B. Object Classification

This part of the system was implemented using tensorflow; this section describes the implementation details, along with tensorflow's layers and train application programming interfaces (APIs). The structure of the system is as follows: the input layer, three fully-connected dense layers, and lastly the output layer; each layer, along with the training component, is discussed in the sections below. There is also an extra section that describes the other, less important, components. The layers API is a Python module that contains a set of functions for constructing Neural Network layers, such as convolution layers, dense layers, and more. The way it is designed allows for the stacking of layers, as the output of the previous layer can be passed to the next layer. Neural Network programming, especially the backpropagation stage, is highly error prone; the layers API takes care of the low-level implementation details and leaves the high-level design up to the user.
Moreover, tensorflow's train API takes care of the backpropagation process. Using both modules hence eliminates most of the chances of introducing errors into the model. Figure 6 below shows the graph generated by the tensorboard tool; while the tool only allows exporting the graph in "png" format, one can simply copy the SVG code from the browser and convert it to a PDF file for better quality, which is what was done for the figure shown here.

FIGURE 6. Neural Network Implementation Design
All layers, except the third dense layer, have a rectified linear unit (ReLU) as their activation function. The ReLU can be defined with the equation below.

f(x) = x⁺ = max(0, x)    (1)
Also, the weights and biases were initialized with the Xavier initializer provided by tensorflow; it is considered to be a state-of-the-art technique for initializing the weights. Compared with initializing the weights randomly, this function is assumed to give the Neural Network less chance of getting stuck in a local minimum of the cost function during the training stage, and hence a better chance of finding the global minimum.

C. Input Layer

The "target" component, shown in Figure 6, is an input that contains the true label; it is only used in training and not in prediction mode.

TABLE 2. Input Data Shape
LISTING 2. Neural Network Model’s Source Code
D. Densely Connected Layers
TABLE 3. Densely Connected Layers Data Shape
E. Output Layer
TABLE 4. Output Layer Data Shape
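The data shapes of Sections C–E are summarised mainly by the tables above, and LISTING 2 itself is not included in this text. As a consolidated illustration of the structure described so far (an input holding the flattened 28 x 28 x 39 crop, three fully-connected dense layers with ReLU on all but the last, Xavier-initialized weights, and a softmax output), a minimal sketch written against tensorflow's 1.x layers API could look as follows; the hidden-layer widths and the 10-class output are assumptions.

import tensorflow as tf  # written against the 1.x API used in this project

N_FEATURES = 28 * 28 * 39     # one flattened hyperspectral crop
N_CLASSES = 10                # assumed: one class per selected SpecTex file

# Inputs: the flattened crops and, during training only, the true labels ("target").
x = tf.placeholder(tf.float32, [None, N_FEATURES], name="input")
target = tf.placeholder(tf.float32, [None, N_CLASSES], name="target")

init = tf.contrib.layers.xavier_initializer()

# Three fully-connected dense layers; ReLU on all but the last one.
dense1 = tf.layers.dense(x, 512, activation=tf.nn.relu, kernel_initializer=init)
dense2 = tf.layers.dense(dense1, 128, activation=tf.nn.relu, kernel_initializer=init)
logits = tf.layers.dense(dense2, N_CLASSES, kernel_initializer=init)

# Output layer: class probabilities and the predicted class.
probabilities = tf.nn.softmax(logits, name="output")
prediction = tf.argmax(probabilities, axis=1)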
F. Training Components

The "softmax_cross_entropy_with_logits" function supplied by tensorflow's API is used to represent the cost function. The cross entropy function requires the actual true labels in order to calculate the cost. The calculated cost is then reduced by averaging the cost tensor over the batch axis; in other words, the shape of the tensor becomes [1 10] rather than the original [20 10]. The averaging helps in smoothing the learning curve. This is what is called batch mode; no conclusive literature review was carried out for this report, but batch mode was used in most examples on the Internet, so batch mode was chosen over stochastic mode. After that, the cost is supplied to the training component, which is simply an "AdamOptimizer" as implemented by tensorflow. The tensorflow API provides different optimizers, and all of them manage the backpropagation process; the optimizer incorporates the weight-update process, which requires the gradient calculated from the loss, and the gradient calculation is also managed by the optimizer. The optimizer requires a scalar value to define the learning rate, and the value 1 × 10⁻⁴ was picked. The "AdamOptimizer" was picked as it appeared to be the state-of-the-art algorithm used in most articles and tutorials.
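Under the assumptions of the model sketch above (reusing minimal stand-ins for its "target" and "logits" tensors so the snippet runs on its own), the training components just described — the cross-entropy cost averaged over the batch and the Adam optimizer with a 1e-4 learning rate — could be wired up as follows.

import tensorflow as tf  # 1.x API

# Minimal stand-ins for the model sketched earlier: inputs, labels and logits.
x = tf.placeholder(tf.float32, [None, 28 * 28 * 39], name="input")
target = tf.placeholder(tf.float32, [None, 10], name="target")
logits = tf.layers.dense(x, 10)   # stand-in for the dense stack sketched above

# Cost: softmax cross entropy against the true labels, averaged over the batch.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=target, logits=logits)
loss = tf.reduce_mean(cross_entropy, name="loss")

# Training component: AdamOptimizer manages gradient calculation, backpropagation
# and the weight updates, with the scalar learning rate set to 1e-4.
train_step = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)

# Scalar metrics of the kind logged every 10th step by the "logging_saver" component.
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(target, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
tf.summary.scalar("loss", loss)
tf.summary.scalar("accuracy", accuracy)
summaries = tf.summary.merge_all()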
G. Related Components

The "metrics" and the "logging_saver" are the only components left unmentioned. Both components are related, and they are not strictly necessary for the application, but they are used for gathering metrics about the progress of the training phase. Logs are captured on every 10th iteration of training steps, and they contain the accuracy and loss obtained within each training step/batch. The data are shown in Figure 7.

4. Experimental Results

This section focuses only on the object classification part, i.e. the Neural Network model; that is because object detection is a well-known area, the results taken from the experiments were as expected, and they have already been shown in previous sections. Two stages in the building of the Neural Network model are discussed: the optimization stage and the evaluation stage. Both sections include a brief description of how these results were obtained and gathered.

A. Optimization

The sample data used for training had a size of 1000. The data were shuffled into a random order using a function designed for this purpose from the scikit-learn tool; the function merely shuffles the data and there is nothing special about it, but it was used, nonetheless, merely for convenience. As discussed earlier, the results (Fig. 7) are generated using the data collected from the "logging_saver" component, which logs scalar values every 10 iterations of training steps; the data were conveniently accessible from the tensorboard tool. The loss results (Fig. 7) show that these values continue to decrease until they stabilise. The loss value refers to how much of a change is needed to minimize the error and reach the optimal solution. Just because the loss has stabilized, it does not necessarily mean that the optimal solution has been reached; in fact, there is no global method, across all scenarios and situations, that proves that the found solution is optimal. To do that, one would need specific knowledge of the field of interest. On the other hand, the accuracy value (Fig. 7b) refers to the accuracy over the training batch, and it does not necessarily evaluate or prove that the model is perfect, even though it is shown reaching 100% accuracy. However, it does show that, at least within the training dataset, the model is figuring out the relationships between different features in the data and classifying them correctly.

LISTING 3. Optimization Process – Source Code

FIGURE 7. Metrics Over Training
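LISTING 3 above and LISTING 4 below are not reproduced in this text. A sketch of the optimisation loop they describe — shuffling the 1000 training samples with scikit-learn, feeding batches of 20, and logging loss and accuracy every 10 steps — could look as follows; the data here are synthetic stand-ins, and the batch size, iteration count and network width are assumptions.

import numpy as np
import tensorflow as tf                      # 1.x API
from sklearn.utils import shuffle

# Synthetic stand-ins for the 1000 prepared training samples (10 classes).
rng = np.random.default_rng(3)
features = rng.random((1000, 28 * 28 * 39)).astype(np.float32)
labels = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=1000)]

x = tf.placeholder(tf.float32, [None, 28 * 28 * 39])
target = tf.placeholder(tf.float32, [None, 10])
logits = tf.layers.dense(tf.layers.dense(x, 128, activation=tf.nn.relu), 10)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=target, logits=logits))
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)
accuracy = tf.reduce_mean(tf.cast(
    tf.equal(tf.argmax(logits, 1), tf.argmax(target, 1)), tf.float32))

BATCH = 20
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    features, labels = shuffle(features, labels, random_state=0)   # scikit-learn
    for step in range(1000):
        start = (step * BATCH) % len(features)
        batch = {x: features[start:start + BATCH],
                 target: labels[start:start + BATCH]}
        sess.run(train_step, feed_dict=batch)
        if step % 10 == 0:                       # log scalars every 10th iteration
            l, a = sess.run([loss, accuracy], feed_dict=batch)
            print(step, l, a)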
LISTING 4. The Optimization Process Output

B. Evaluation of Results

After a thousand iterations of training, the evaluation resulted in an accuracy of 95%, with 95 correct classifications out of 100 samples that were never seen by the Neural Network model. Figure 8 below shows all 5 misclassified objects. While this may seem impressive, it is not an indication of performing as well in a real-world scenario. That is because it is unclear what the Neural Network model learned during the training phase; for example, the features that it learned about and used may not exist in hyperspectral images of microplastics. The question is then whether the Neural Network, given a proper dataset, can find features that distinguish the different types of microplastics during the training phase. As previously stated, hyperspectral imaging of plastics is used for the classification of plastics in the recycling industry, so it is highly likely that the answer to that question is yes. However, given that no data about hyperspectral imaging of microplastics were publicly available at the time this report was made, it is impossible to draw a comprehensive conclusion, one that is free of assumptions. In the end, these results do indicate that a Neural Network model trained with hyperspectral images of microplastics could be applicable.

LISTING 5. Evaluation Process – Source Code

FIGURE 8. Falsely Classified Images – After Optimization

TABLE 5. Confusion Matrix – After Optimization
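LISTING 5 and the confusion matrix of TABLE 5 are produced from the held-out samples. As an illustration of that evaluation step (the accuracy over the 100 unseen crops and the scikit-learn confusion matrix), a sketch could look as follows; the predictions here come from a stand-in rather than the trained model, purely so the snippet runs on its own.

import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Stand-ins for the 100 held-out evaluation samples (10 per class) and for the
# trained model's predictions; in the project these come from the Neural Network.
rng = np.random.default_rng(4)
true_labels = np.repeat(np.arange(10), 10)              # 100 samples, 10 classes
predicted = true_labels.copy()
predicted[rng.choice(100, size=5, replace=False)] = rng.integers(0, 10, size=5)

print("accuracy:", accuracy_score(true_labels, predicted))      # ~0.95
print(confusion_matrix(true_labels, predicted))                 # cf. TABLE 5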
5. Conclusions

This paper has demonstrated how Machine Learning-based solutions can be designed and built for the effective identification and classification of microplastic objects. A viable, practical and possibly standardised solution for these tasks can be achieved by incorporating and integrating various advanced public domain tools and libraries. However, it was found that one cannot rely solely on such a solution alone. Each time a newer set of data is introduced, it is likely that some further changes and adaptations will still be required. This is because open-source tools in most cases address very generic versions of the algorithms, and these solutions rarely cater for the adaptation and continued improvement needed to satisfy the required accuracy.

References

[1] Browne, M.A., Galloway, T. and Thompson, R., 2007. Microplastic—an emerging contaminant of potential concern? Integrated Environmental Assessment and Management, 3(4), pp. 559-561.
[2] Lusher, A., 2015. Microplastics in the marine environment: distribution, interactions and effects. In: Marine Anthropogenic Litter (pp. 245-307). Springer, Cham.
[3] Hidalgo-Ruz, V., Gutow, L., Thompson, R.C. and Thiel, M., 2012. Microplastics in the marine environment: a review of the methods used for identification and quantification. Environmental Science & Technology, 46(6), pp. 3060-3075.
[4] He, D., Luo, Y., Lu, S., Liu, M., Song, Y. and Lei, L., 2018. Microplastics in soils: analytical methods, pollution characteristics and ecological risks. TrAC Trends in Analytical Chemistry, 109, pp. 163-172.
[5] Chaczko, Z., Kale, A., Santana-Rodríguez, J.J. and Suárez-Araujo, C.P., 2018. Towards an IoT based system for detection and monitoring of microplastics in aquatic environments. IEEE 22nd International Conference on Intelligent Engineering Systems (INES 2018), 21 June 2018, Las Palmas, Spain, pp. 000057-000062.
[6] Kale, A. and Chaczko, Z., 2015. Evolutionary feature optimization and classification for monitoring floating objects. In: Computational Intelligence and Efficiency in Engineering Systems, Studies in Computational Intelligence, vol. 595, Springer, pp. 3-16.
[7] Karaca, A.C., Ertürk, A., Güllü, M.K., Elmas, M. and Ertürk, S., 2013. Plastic waste sorting using infrared hyperspectral imaging system, pp. 1-4.
[8] Zhang, L., Zhang, L., Tao, D. and Huang, X., 2013. Tensor discriminative locality alignment for hyperspectral image spectral–spatial feature extraction. IEEE Transactions on Geoscience and Remote Sensing.
[9] University of Eastern Finland, n.d. SpecTex | UEF. University website, Finland, viewed 28 May 2018, <https://www.uef.fi/web/spectral/spectex>.