Enhanced Deep Learning Framework For Efficient Garbage Classification
Information Sciences
Keywords: Garbage classification; Deep learning; ResNet-152; FAISS similarity search; Attention mechanism; Waste management
Abstract
The growing environmental challenges associated with waste disposal have accelerated the need for intelligent and sustainable waste management systems. This paper introduces an advanced deep learning-based approach that integrates multiple techniques to improve garbage classification performance. The proposed framework leverages ResNet-152 as a feature extractor, utilizing its deep hierarchical representations to capture essential visual patterns from waste images. To enhance feature discrimination, we employ FAISS-based similarity search, which retrieves the most relevant embeddings to refine classification. An attention mechanism emphasizes key features while reducing less relevant ones. To further improve generalization and robustness, we integrate regularization techniques such as dropout and DropConnect, reducing overfitting and enhancing model resilience. Label smoothing is applied to mitigate overconfidence in softmax predictions, resulting in better calibration. The framework is evaluated on a diverse garbage classification dataset, and performance metrics such as accuracy, precision, recall, and F1-score are reported. The paper provides a thorough mathematical formulation and algorithmic details to ensure reproducibility. The experimental findings show that the proposed method outperforms traditional techniques in terms of classification accuracy. This work contributes to developing an intelligent waste sorting system, which can significantly enhance recycling efficiency and promote sustainable waste management practices.
1. Introduction
With the accelerating pace of urbanization, escalating consumption patterns, and growing environmental concerns, waste management has emerged as a critical global issue. Effective recycling hinges on the accurate classification of waste materials, which is
essential for reducing the ecological burden associated with improper disposal. Traditional manual sorting methods, however, are
often inefficient, labor-intensive, and prone to human error. These limitations underscore the necessity for intelligent, automated
solutions. In this context, recent advancements in deep learning have demonstrated remarkable success in image-based classifi
cation tasks, offering a promising pathway toward more efficient and scalable waste management systems. Convolutional Neural
Networks, in particular, have demonstrated superior performance in extracting meaningful features from complex visual data. However, challenges such as intra-class variability, inter-class similarity, and background noise make garbage classification a non-trivial
task. Additionally, ensuring both high classification accuracy and computational efficiency remains a key challenge, especially for
real-world deployment.
* Corresponding author.
E-mail address: [email protected] (D. Ghosh).
https://doi.org/10.1016/j.ins.2025.122462
Received 23 April 2025; Received in revised form 23 June 2025; Accepted 24 June 2025
Available online 2 July 2025
In this work, we propose an advanced deep learning-based framework for garbage classification that integrates multiple techniques
to enhance classification performance. Our approach leverages ResNet-152, a powerful CNN architecture, as a feature extractor to
capture deep hierarchical representations of waste images. To further improve feature discrimination, we incorporate FAISS-based
similarity search, which retrieves the most relevant embeddings to refine classification. Additionally, an attention mechanism is introduced to selectively emphasize important features while suppressing irrelevant information, thereby enhancing model interpretability. To mitigate overfitting and improve generalization, we employ regularization techniques such as dropout and DropConnect. Furthermore, label smoothing is applied to prevent overconfidence in softmax predictions, leading to better-calibrated probability distributions. The proposed framework is assessed using a diverse dataset comprising various categories of waste, and its performance is evaluated through comprehensive metrics, including accuracy, precision, recall, and F1-score. The results indicate that the
model offers a reliable and scalable approach for automated waste classification, supporting the advancement of sustainable waste
management systems. Empirical evaluations reveal that the method surpasses traditional classification techniques, delivering superior
accuracy and aligning with state-of-the-art performance standards. The mathematical formulations and algorithmic details presented
in this paper ensure reproducibility and provide a foundation for further research in intelligent waste classification systems.
The rapid growth of urbanization and industrialization has significantly increased waste generation, leading to a major challenge in waste management. Effective waste classification is essential to improve recycling efforts and environmental sustainability.
Conventional manual sorting techniques demand significant human effort and are susceptible to mistakes, frequently resulting in
operational inefficiencies and potential health risks [1]. Therefore, employing automation in waste sorting, especially by utilizing
deep learning techniques, has emerged as a highly effective approach to address this issue.
Deep learning techniques, particularly CNNs, have seen extensive application in image recognition tasks, demonstrating superior
performance in automatically classifying waste materials [2]. These models can be trained to identify various types of waste, including
recyclables, organic matter, and hazardous items, without human intervention. Their ability to process large datasets and extract
hierarchical feature representations from images has made them ideal for garbage classification [3].
Despite their success, deep learning models face several challenges in garbage classification. A major challenge lies in obtaining
large, varied, and annotated datasets for training [4]. Waste images vary greatly in terms of size, color, and texture, making it difficult
for a single model to handle all variations [5]. Moreover, data labeling for waste materials is often limited, making it challenging
to build comprehensive datasets for model training. To overcome this challenge, methods like data augmentation, transfer learning,
and the creation of synthetic data are frequently employed to enhance model accuracy [6].
Another challenge is the wide variety of waste categories, each with its distinct characteristics. For instance, plastics, metals,
paper, and organic materials may differ significantly in terms of shape and appearance [7]. Multi-category waste or cluttered images
further complicate classification tasks. To address this, hybrid models that combine the strengths of CNNs with other network types, such as Recurrent Neural Networks (RNNs), have been used to capture both spatial and temporal features for more accurate classification [8].
The computational complexity of deep learning models also poses a barrier to their use in resource-constrained environments,
such as smart waste bins or edge devices [9]. To mitigate this issue, researchers have focused on model optimization techniques
like pruning, quantization, and the use of lightweight architectures, such as MobileNet [10]. These approaches aim to reduce the
computational load while maintaining high classification accuracy. In addition, combining deep learning models with conventional
machine learning techniques, like support vector machines, has been explored to balance performance and efficiency [1].
The integration of IoT devices with deep learning-based systems has gained attention as a way to improve waste management.
Smart waste bins equipped with sensors and deep learning models can automatically classify waste as it is deposited, allowing for
real-time sorting and feedback [11]. The combination of these technologies facilitates intelligent, adaptive waste management systems
that are capable of optimizing waste collection and disposal processes.
However, several challenges remain in deploying deep learning-based garbage classification systems on a large scale. These include
adapting models to new waste categories or changing waste patterns over time. Transfer learning has been suggested as a way
to address this, as it allows models to be fine-tuned for new data with minimal retraining [12]. Additionally, explainability and
interpretability of deep learning models are crucial for ensuring transparency and trust, particularly in public waste management
systems [13].
The remainder of the paper is structured as follows: Section 2 offers an in-depth review of related literature and foundational research. Section 3 discusses the dataset used and the preprocessing steps performed. The overall model architecture is outlined in Section 4. Section 5 elaborates on the training process, including label smoothing and optimization techniques. Experimental results, including performance evaluation metrics, are reported in Section 6. A detailed framework for integrating the proposed model
into real-world waste management systems is presented in Section 7. Lastly, Section 8 summarizes the paper and presents potential
avenues for advancing automated waste classification systems in future research.
2. Literature survey
The literature on garbage classification using deep learning can be grouped into several key areas, including deep CNN-based
models, optimization techniques, and hybrid models for waste classification. Here is an overview of the relevant research.
Numerous studies have concentrated on utilizing deep Convolutional Neural Networks for efficient waste classification. [1] proposed a deep CNN-based vision system for waste classification, enhancing recycling processes through automation. [2] developed an
intelligent system for decoration waste classification using data augmentation and transfer learning. [3] proposed a two-stage waste
recognition system using deep learning algorithms to improve accuracy and robustness. [4] focused on feature fusion techniques
for waste classification, optimizing human-robot interaction scenarios in waste management. [5] used transfer learning with CNNs
for improved waste image classification, leveraging pre-trained models like VGGNet. [6] improved ShuffleNet v2 for garbage classification, focusing on lightweight models for mobile applications. [7] proposed the MRS-YOLO model, optimizing real-time waste
classification and detection with high precision.
A number of studies have investigated hybrid models and transfer learning techniques for waste classification. [8] optimized CNN
models for fast and accurate waste classification, crucial for real-time recycling systems. [9] developed a model combining CNNs
and deep reinforcement learning for efficient recycling of garbage materials. [10] introduced a depth-wise separable convolution
attention module to improve CNN performance on occluded garbage objects. [1] proposed a hybrid model combining CNN with
an autoencoder for improved waste classification under challenging conditions. [11] applied reinforcement learning for garbage
classification to dynamically adapt the model based on new waste data. [12] designed a smart waste classification system using
transfer learning and lightweight networks for mobile deployment. [13] created a scalable waste management system using deep
learning that integrates with IoT devices for improved accuracy.
Research on optimizing deep learning models for waste classification has been an important area of focus. [14] implemented a
deep learning-based waste disposal system optimized for limited hardware resources. [15] used reinforcement learning to optimize
CNN-based waste classification systems for dynamic environments. [16] optimized garbage classification models by integrating edge
computing with deep learning for faster waste sorting. [17] applied an enhanced CNN architecture for waste classification, focusing
on real-time processing in smart cities.
Some researchers have looked at combining various models to improve classification accuracy and adaptability. [18] investigated
a multi-modal deep learning model for waste classification that combines both image and sensor data. [19] implemented a deep CNN
model with adversarial training for enhanced generalization in garbage classification. [20] developed a hybrid model combining
deep CNNs with decision trees for improved garbage classification performance. [21] applied a CNN architecture enhanced with
an attention mechanism for more accurate identification of recyclable waste materials. [22] proposed a novel hybrid CNN and
support vector machine (SVM) model to classify complex waste types. [23] developed a domain-adaptive deep learning model for
robust garbage classification across different environments. [1] proposed a multi-task learning model for garbage classification that
incorporates both image data and metadata.
Several studies have identified areas of improvement and future research directions for deep learning-based garbage classification
systems. [24] combined CNN with attention mechanisms to enhance garbage classification accuracy in noisy environments. [25] used
a fusion of CNN and deep reinforcement learning to enhance waste classification in dynamic settings. [26] investigated a hybrid CNN
and RNN model for waste classification, combining spatial and temporal features. In recent years, several researchers have introduced
new techniques to improve the robustness and efficiency of deep learning-based waste classification systems. [27] explored a hybrid
CNN and random forest approach for waste classification in smart recycling systems. [28] proposed a novel deep CNN model that uses
feature pyramid networks to handle complex waste objects. [29] introduced a collaborative filtering method for garbage classification,
enhancing model adaptability in multi-domain environments.
Recent research has introduced innovative methodologies in hazard classification, high-dimensional classification, and sustainable waste management by integrating deep learning, grey modeling, and optimization techniques. [30] proposed a novel hazard
event classification framework utilizing deep learning and multifractal analysis with a hierarchical gating neural network, achieving
superior classification of hazard severity, possibility, and risk. Similarly, [31] developed a hybrid model combining deep learning
with grey modeling, introducing the DLGM framework with a Fourier series-enhanced grey model (FSGM(1,1)) for improved hazard classification accuracy. [32] addressed high-dimensional classification challenges by introducing an optimized fuzzy clustering
approach, incorporating Quantum Particle Swarm Optimization (QPSO) for enhanced classification performance. In the domain of
waste management, [33] applied convolutional neural networks (CNNs) to classify and estimate the mass composition of recycled
aggregates, enabling real-time industrial monitoring and improving circular economy processes. [34] proposed an IoT-integrated
predictive waste management framework using federated learning and blockchain, enhancing real-time monitoring and optimizing
waste collection in smart cities. These studies collectively demonstrate the transformative impact of artificial intelligence, optimization techniques, and blockchain technology in industrial safety, classification systems, and sustainable waste management. To address
the limitations of existing waste classification systems, [35] proposed SwinConvNeXt, a fused deep learning architecture integrating
Swin Transformer and ConvNeXt for high-accuracy real-time garbage image classification.
Although convolutional neural networks (CNNs) have demonstrated strong performance across a range of image classification
tasks, their application to waste classification remains challenging due to several inherent limitations. Pretrained backbones such as
ResNet152 provide general-purpose feature extraction capabilities but are often insufficient for the fine-grained visual distinctions required in waste sorting scenarios, where inter-class similarities and intra-class variability are common. For instance, classes like paper,
cardboard, and plastic frequently exhibit overlapping visual properties, which makes them difficult to separate using generic features
alone. Furthermore, existing models generally process each image independently, neglecting the semantic relationships between
instances that could be captured via similarity based retrieval or contextual learning. This lack of contextualization contributes to
misclassifications, especially for ambiguous or borderline cases. Another critical issue is the class imbalance prevalent in most real-world garbage datasets. Underrepresented categories, such as battery or metal, are often misclassified due to the bias introduced by skewed class distributions. Finally, the classification heads used in many existing architectures are shallow and lack adequate regularization, leading to poor generalization and overfitting, particularly in data-scarce or noisy environments.
To address these challenges, we propose a retrieval-augmented attention-based classification framework that enhances feature
expressiveness and robustness. The model begins by extracting high-level features using a frozen ResNet152 architecture. These
features are indexed using FAISS, which allows for efficient identification of the most similar training instances for each input image.
The retrieved similar feature is then fused with the input feature through a self-attention mechanism designed to highlight salient and
complementary aspects of both vectors. This context-enriched representation is passed through a deep classification head comprising
multiple fully connected layers with batch normalization, dropout, and DropConnect regularization, which collectively enhance
generalization capacity. To further improve model performance, label smoothing is incorporated into the loss function to reduce
overconfidence in predictions, and class-balanced loss weighting is applied to counter the impact of dataset imbalance. The training regime includes mixed-precision computation and a one-cycle learning rate schedule, both of which contribute to efficient and stable convergence. Overall, the proposed method offers a comprehensive and effective solution to the limitations of conventional CNN-based classifiers, particularly in the context of fine-grained and imbalanced garbage classification.
• We leverage a pre-trained ResNet-152 backbone to extract rich hierarchical features, providing a strong visual representation
foundation.
• We introduce FAISS-based approximate nearest neighbor search to retrieve relevant feature embeddings, enhancing robustness
by incorporating similarity information.
• We design an attention-based feature fusion mechanism that adaptively combines original and retrieved features, improving
discrimination among visually similar garbage classes.
• Advanced regularization techniques, including DropConnect and label smoothing, are integrated to improve model generalization
and reduce overfitting.
• Comprehensive ablation studies and multiple-seed training experiments validate the positive impact of each component on overall
classification accuracy.
Our proposed framework achieves state-of-the-art accuracy on a diverse garbage dataset while maintaining computational feasibility, demonstrating its practical potential for real-world waste management applications.
Our model is evaluated using a real-world dataset of garbage images, demonstrating significant improvements in precision, recall, F1-score, and overall classification accuracy compared to baseline models that rely solely on image features. In the context of
advancing sustainability and enhancing recycling processes, efficient waste classification plays a crucial role, particularly within a
circular economy framework. Traditional image-based systems, while improving, often struggle to distinguish visually similar objects
or consider non-visual factors like recyclability and environmental impact. To address these challenges, our proposed model builds
upon and surpasses existing approaches by integrating multiple enhancements, including hierarchical feature extraction, FAISS-based
similarity search, and attention-driven refinement. Unlike prior works that focus solely on CNN architectures or similarity-based retrieval, our method combines these strengths to achieve superior classification accuracy. The incorporation of advanced regularization techniques and calibration mechanisms further ensures robustness and reliability in real-world applications. Experimental comparisons against benchmark models consistently demonstrate our framework's effectiveness in handling diverse waste images, making it
a highly promising solution for automated waste sorting and sustainable environmental management.
Fig. 1 shows the process flow of the proposed garbage classification model.
The dataset used in this study was sourced from Kaggle, a widely recognized platform for machine learning datasets. Although the original dataset comprises 12 classes, including three separate categories for glass, we retained only the white-glass class and excluded the other two glass-related categories. Consequently, the final dataset used in this work consists of 𝐶 = 10 distinct semantic classes: plastic, paper, metal, clothes, shoes, trash, biological, white-glass, cardboard, and battery.
Given the inherent imbalance in real-world waste datasets, data augmentation techniques are employed to enhance generalization
and robustness.
To enhance model generalization and mitigate overfitting, the following data augmentation methods are applied to each input
image 𝑋 :
• Random Resized Cropping: A random region of the image is cropped and resized to a fixed size, ensuring that models learn
scale-invariant features.
• Horizontal Flipping: Each image is flipped horizontally with a probability 𝑝 = 0.5, augmenting the dataset symmetrically.
• Color Jittering: Random modifications in brightness, contrast, saturation, and hue are applied, defined as:
𝑋 ′ = 𝛼𝑋 + 𝛽,
where 𝛼 and 𝛽 are randomly sampled coefficients for brightness and contrast adjustments.
• Rotation and Affine Transformations: Random rotations 𝜃 within a defined range are applied:
𝑋 ′ = 𝑅(𝜃)𝑋,
where 𝑅(𝜃) is the rotation matrix:
\( R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \)
Affine transformations include translation, scaling, and shearing, defined as:
𝑋 ′ = 𝐴𝑋 + 𝑏,
where 𝐴 is a transformation matrix and 𝑏 is a translation vector.
• Normalization: Each pixel intensity is normalized to a zero-mean, unit-variance distribution using the channel-wise mean 𝜇𝑐
and standard deviation 𝜎𝑐 :
\( X' = \frac{X - \mu_c}{\sigma_c}. \)
Let 𝑇 denote the series of transformations applied to an input image 𝑋; the final transformed image 𝑋′ is then given by the composition
\( X' = (T_n \circ T_{n-1} \circ \cdots \circ T_1)(X), \)
where \( T_i \) represents an individual transformation from the set \( \{T_1, T_2, \ldots, T_n\} \). The order and probability of each transformation are
carefully tuned to maximize model performance.
These preprocessing techniques ensure that the dataset is sufficiently diverse and robust, allowing deep learning models to extract
meaningful and invariant features for accurate classification.
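As a concrete illustration, the augmentation pipeline described above can be assembled with torchvision transforms. The specific magnitudes below (crop size, jitter strengths, rotation range, affine parameters) are illustrative assumptions rather than the exact values used in our experiments; only the flip probability p = 0.5 is taken from the description above, and the normalization statistics are the standard ImageNet channel-wise values.

```python
from torchvision import transforms

# Illustrative training-time augmentation pipeline for a single input image X.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),                        # random resized cropping to 224x224
    transforms.RandomHorizontalFlip(p=0.5),                   # horizontal flip with probability p = 0.5
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2, hue=0.05),         # color jittering
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1),
                            scale=(0.9, 1.1), shear=5),       # rotation + affine (translation, scaling, shearing)
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],          # channel-wise normalization
                         std=[0.229, 0.224, 0.225]),
])
```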
4. Model architecture
In this work, we utilize ResNet-152, a deep convolutional neural network, as a feature extractor to generate high-dimensional
feature representations of input images. ResNet-152 represents a type of Residual Network (ResNet) architecture, which incorporates
skip connections to facilitate gradient flow and improve training stability for deep networks. The ResNet-152 is fine-tuned to capture
relevant features from garbage images. Features are obtained using the Algorithm 1.
\( F(X) = \phi(X; \theta) \in \mathbb{R}^{d}, \)
where \( \phi(\cdot; \theta) \) denotes the ResNet-152 mapping with parameters \( \theta \) and \( d = 2048 \) is the dimension of the resulting embedding. Feature extraction proceeds as follows:
1. Resize the input image 𝑋 to a standard resolution (e.g., 224 × 224 pixels).
2. Normalize the image using channel-wise mean and standard deviation:
\( X' = \frac{X - \mu}{\sigma}, \)
where 𝜎 and 𝜇 are the standard deviation and mean vectors computed over the dataset.
3. Pass the normalized image through ResNet-152 up to the final global average pooling layer to obtain the feature vector 𝐹(𝑋).
• FAISS-based Similarity Search: The feature vectors are indexed in a FAISS database to enable efficient nearest neighbor search.
• Classification: The obtained features are input into a fully connected classifier to predict the corresponding class label.
By leveraging the deep feature representations from ResNet-152, we ensure that our model captures rich semantic information,
facilitating accurate image classification and efficient similarity retrieval.
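A minimal sketch of this feature extraction step is shown below, assuming a PyTorch/torchvision environment (the `weights` argument requires torchvision ≥ 0.13). The backbone is kept frozen, matching the retrieval-augmented framework described earlier.

```python
import torch
import torch.nn as nn
from torchvision import models

# Frozen ResNet-152 backbone; the output of the global average pooling layer is a 2048-d embedding.
backbone = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()              # drop the ImageNet classification head
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False              # the backbone is not updated during training

@torch.no_grad()
def extract_features(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, 224, 224), normalized as above -> (B, 2048) feature vectors F(X)."""
    return backbone(images)
```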
In order to enhance feature representations and facilitate efficient nearest-neighbor retrieval, we employ FAISS (Facebook AI
Similarity Search). FAISS is a highly optimized library designed for fast searching in high-dimensional vector spaces, making it
particularly suitable for large-scale image retrieval tasks.
Feature vectors extracted from images are indexed using FAISS for efficient similarity retrieval. The FAISS training process is
outlined in Algorithm 2.
𝐷, 𝐼 = FAISS.search(𝑞, 𝑘),
where:
• 𝐷 ∈ ℝ𝑘 is the distance vector containing the distances between the query feature and the 𝑘 nearest neighbors.
• 𝐼 ∈ ℕ𝑘 is the index vector containing the indices of the 𝑘 nearest neighbors in the database.
This ensures precise retrieval at the cost of increased computational complexity. For scalability, alternative FAISS indexes such as
the IVFFlat (Inverted File Index) and HNSW (Hierarchical Navigable Small World Graph) [36] can be utilized to approximate nearest
neighbors efficiently.
index = faiss.IndexFlatL2(𝑑).
4. Add all training feature vectors to the index:
index.add(𝑋),
where 𝑋 ∈ ℝ𝑁×𝑑 is the matrix containing 𝑁 feature vectors.
5. For a given query vector 𝑞 , perform the nearest-neighbor search:
𝐷, 𝐼 = index.search(𝑞, 𝑘).
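The indexing and retrieval steps above translate directly into the FAISS Python API, as sketched below; the array names `train_features` and `query_features` are placeholders for the ResNet-152 embeddings produced by the extractor.

```python
import numpy as np
import faiss

d = 2048                                                     # ResNet-152 embedding dimension
xb = np.ascontiguousarray(train_features, dtype="float32")   # (N, d) training embeddings (placeholder name)
xq = np.ascontiguousarray(query_features, dtype="float32")   # (M, d) query embeddings (placeholder name)

index = faiss.IndexFlatL2(d)        # exact L2 index, as in Algorithm 2
index.add(xb)                       # add all N training feature vectors

k = 1                               # retrieve the single most similar neighbor per query
D, I = index.search(xq, k)          # D: (M, k) squared L2 distances, I: (M, k) neighbor indices
retrieved = xb[I[:, 0]]             # nearest training embedding f_s for each query f_q

# For larger databases, an approximate index such as faiss.IndexHNSWFlat(d, 32) can replace IndexFlatL2.
```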
In this subsection, we present an attention-based feature fusion mechanism that integrates the semantic information of an input
feature vector and its most similar retrieved neighbor. Unlike conventional self-attention across sequence tokens, we concatenate the
input feature and its FAISS-retrieved counterpart into a single 4096-dimensional vector, which is refined using a scaled dot-product
self-attention mechanism. Fig. 2 illustrates the overall computational flow of this attention-based fusion, from feature concatenation
through to the generation of the enhanced feature vector.
\( \mathbf{x} = [\mathbf{f}_q ; \mathbf{f}_s] \in \mathbb{R}^{2d}, \) where \( \mathbf{f}_q \) is the input feature vector and \( \mathbf{f}_s \) is its FAISS-retrieved nearest neighbor.
This combined vector is used to compute query, key, and value projections:
\( Q = \mathbf{x}W^{Q}, \quad K = \mathbf{x}W^{K}, \quad V = \mathbf{x}W^{V}, \)
where \( W^{Q}, W^{K}, W^{V} \in \mathbb{R}^{2d \times 2d} \) are learnable weight matrices. The self-attention output is calculated as:
\( \mathrm{Attention}(\mathbf{x}) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{2d}}\right)V. \)
This attention-refined vector is used for classification.
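A sketch of this fusion module is given below. The equations above operate on the single concatenated 4096-dimensional vector; to keep the attention computation non-degenerate, this sketch treats the input feature and its retrieved neighbor as a two-token sequence processed by multi-head scaled dot-product self-attention (consistent with the multi-head formulation mentioned in Section 4.6) and re-concatenates the refined tokens. The number of heads is an assumption.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Self-attention fusion of an input feature f_q and its FAISS-retrieved neighbor f_s."""

    def __init__(self, d: int = 2048, n_heads: int = 4):      # number of heads is an assumption
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=d, num_heads=n_heads, batch_first=True)

    def forward(self, f_q: torch.Tensor, f_s: torch.Tensor) -> torch.Tensor:
        # f_q, f_s: (B, d); stack into a length-2 token sequence of shape (B, 2, d)
        tokens = torch.stack([f_q, f_s], dim=1)
        refined, _ = self.attn(tokens, tokens, tokens)         # scaled dot-product self-attention
        return refined.reshape(refined.size(0), -1)            # (B, 2d) context-enriched vector for the classifier
```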
The proposed classification model comprises multiple fully connected layers, each incorporating batch normalization, DropConnect, and dropout regularization. These components enhance the model's generalization ability by mitigating overfitting and
improving stability during training.
\( BN(H_i) = \frac{H_i - \mu_B}{\sqrt{\sigma_B^{2} + \epsilon}}\,\gamma + \beta, \)
where \( \mu_B \) and \( \sigma_B^{2} \) are the batch mean and variance, \( \gamma \) and \( \beta \) are learnable parameters, and \( \epsilon \) is a small constant for numerical
stability.
• 𝜎(⋅) is the activation function, such as ReLU, which introduces non-linearity into the model.
\( W' = W \odot M, \quad M \sim \mathrm{Bernoulli}(p), \)
where:
• 𝑀 is a binary mask with elements sampled independently from a Bernoulli distribution with probability 𝑝.
• Each element 𝑀𝑗𝑘 in the mask determines whether the corresponding weight 𝑊𝑗𝑘 is active (𝑀𝑗𝑘 = 1) or deactivated (𝑀𝑗𝑘 = 0).
• This ensures that different subsets of the weight matrix are used in each forward pass, improving robustness and reducing reliance
on specific weight connections.
\( \theta^{(t+1)} = \theta^{(t)} - \eta \, \frac{m_t}{\sqrt{v_t} + \epsilon}, \)
where 𝑚𝑡 and 𝑣𝑡 are biased first- and second-moment estimates, respectively.
These techniques collectively contribute to a more robust deep learning classifier, ensuring improved feature extraction and
classification performance.
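A minimal sketch of this classification head is given below, assuming a PyTorch environment. The DropConnect layer applies the Bernoulli weight mask defined above at training time; the hidden-layer widths and drop rates are illustrative assumptions rather than the exact configuration used in our experiments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropConnectLinear(nn.Module):
    """Fully connected layer with DropConnect: W' = W * M, M ~ Bernoulli(1 - drop_p), training only."""

    def __init__(self, in_features: int, out_features: int, drop_p: float = 0.2):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.drop_p = drop_p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training and self.drop_p > 0:
            keep = 1.0 - self.drop_p
            mask = torch.bernoulli(torch.full_like(self.linear.weight, keep))  # binary mask over the weights
            return F.linear(x, self.linear.weight * mask / keep, self.linear.bias)
        return self.linear(x)

def make_head(in_dim: int = 4096, num_classes: int = 10, p_drop: float = 0.3) -> nn.Sequential:
    """Deep classification head with BatchNorm, ReLU, dropout, and DropConnect (widths are assumptions)."""
    dims = [in_dim, 2048, 1024, 512, 256, 128]
    layers = []
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        layers += [DropConnectLinear(d_in, d_out, drop_p=0.2),
                   nn.BatchNorm1d(d_out), nn.ReLU(), nn.Dropout(p_drop)]
    layers.append(nn.Linear(dims[-1], num_classes))
    return nn.Sequential(*layers)
```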
Deep Learning (DL) is a specialized subset of machine learning that employs Artificial Neural Networks (ANNs) with multiple hidden
layers, often referred to as deep neural networks. While ANNs are computational models inspired by the biological neural networks in
the brain, Deep Learning specifically leverages architectures with many layers to learn hierarchical feature representations directly
from data.
Differences between deep learning and artificial neural networks: Artificial Neural Networks typically consist of an input layer, one or
more hidden layers, and an output layer. Traditional ANNs generally have a shallow architecture with a limited number of hidden
layers, which restricts their ability to model complex data distributions. Deep Learning extends ANNs by increasing the number of
hidden layers, enabling the model to capture more complex patterns through hierarchical feature extraction.
• Hierarchical Feature Extraction: Our proposed model employs a deep architecture, specifically ResNet-152, which extracts
multi-level visual features. This hierarchical approach improves the model’s ability to distinguish subtle differences among
garbage classes.
• Improved Accuracy and Generalization: Deep Learning models, like the one used here, achieve higher accuracy compared to
shallow ANNs by learning complex feature interactions, as evidenced by the 96.64% classification accuracy obtained.
• End-to-End Learning: Our proposed framework integrates feature extraction, attention-based fusion, and classification in an
end-to-end trainable pipeline, reducing the need for manual feature engineering.
• Robustness with Advanced Regularization: Techniques such as DropConnect and label smoothing improve model generalization and reduce overfitting, effects that are more pronounced in deep architectures.
• Adaptability to Complex Data: The deep model architecture, combined with attention mechanisms and similarity search
(FAISS), effectively handles the variability and complexity of real-world waste images.
Deep Learning builds upon the foundation of Artificial Neural Networks by employing deeper, more complex network architectures
that enable superior performance and adaptability, particularly for challenging classification tasks such as automated waste sorting.
4.6. Convolutional neural networks and their role in our proposed framework
Convolutional Neural Networks (CNNs) are a specialized class of deep neural networks designed to process grid-like data, such
as images. They leverage learnable convolutional filters that slide across spatial dimensions to capture local patterns, progressively
learning hierarchical representations from low-level textures to high-level semantic features. Due to their parameter sharing and
sparse connectivity, CNNs are both computationally efficient and highly effective for image-based tasks.
In our proposed garbage classification framework, we employ ResNet-152, a very deep CNN architecture based on residual learning,
as the primary feature extractor. ResNet-152 has demonstrated state-of-the-art performance in various computer vision benchmarks
due to its ability to mitigate the vanishing gradient problem through the use of skip connections. This allows for the training of
extremely deep networks without degradation in accuracy. We utilize the pretrained ResNet-152 model to extract 2048-dimensional deep visual embeddings from waste images, capturing fine-grained characteristics essential for distinguishing between visually similar
categories such as plastic and white-glass.
These deep embeddings are further enhanced using a self-attention mechanism. We introduce a feature fusion module that incorporates a FAISS-based similarity search to retrieve the most semantically similar embeddings from the training set. The retrieved
features are concatenated with the original features and refined through a multi-head self-attention network, allowing the model to
focus on salient patterns while suppressing irrelevant variations. This mechanism mimics the cognitive process of referencing similar
past observations to make more informed decisions.
To improve the generalization capability of the classifier, we integrate several regularization strategies, including DropConnect,
dropout, and label smoothing. DropConnect randomly removes individual weights during training, effectively acting as a form of
model ensemble, while label smoothing prevents the model from becoming overly confident by distributing a small portion of the
target probability mass across non-ground-truth classes. These additions mitigate overfitting and enhance the robustness of the final
predictions. Our proposed framework leverages the hierarchical feature learning capabilities of CNNs, augmented by attention-based refinement and similarity-based enhancement, to deliver highly accurate and generalizable performance on the waste classification
task.
Deep neural networks often exhibit overconfidence in their predictions, which can lead to reduced generalization performance.
Label smoothing is a regularization technique designed to address this issue by preventing the model from assigning full probability
to a single class. Instead, it redistributes a small portion of the probability mass among all classes, thereby encouraging the model to
be less certain. As outlined in Algorithm 4, label smoothing improves generalization during training.
Given a classification task with 𝐶 classes, let 𝐲 = (𝑦1 , 𝑦2 , … , 𝑦𝐶 ) represent the one-hot encoded ground truth label, where 𝑦𝑘 = 1
for the correct class 𝑘 and 𝑦𝑖 = 0 for all other classes. Traditional cross-entropy loss is defined as:
\( L_{CE} = -\sum_{i=1}^{C} y_i \log P_i, \)
where 𝑃𝑖 is the predicted probability for class 𝑖. In label smoothing, instead of using a hard one-hot encoding, the ground truth
distribution is smoothed as:
\( u_i = (1 - \epsilon)\, y_i + \frac{\epsilon}{C}, \)
where 𝜖 ∈ [0, 1] is the smoothing parameter that controls how much probability mass is shifted from the true class to the others.
When 𝜖 = 0, the loss reduces to standard cross-entropy.
The smoothed loss is then computed as:
\( L_{smooth} = -\sum_{i=1}^{C} u_i \log P_i. \)
The overall loss function combining cross-entropy and smoothed loss is given by:
𝐿 = (1 − 𝜖)𝐿𝐶𝐸 + 𝜖𝐿𝑠𝑚𝑜𝑜𝑡ℎ .
This formulation ensures that the model does not become overly confident in its predictions and can better handle mislabeled or
ambiguous data.
During training, the label smoothing technique is applied in the computation of the loss function. The training process iterates over
batches of data, where each label is transformed before computing the loss. This technique is particularly useful in tasks involving
large datasets with potential annotation errors, as it improves generalization and mitigates overfitting.
Experimental results demonstrate that label smoothing leads to improved robustness and generalization across various classification tasks, making it a widely adopted technique in modern deep learning architectures. The training process is outlined in
Algorithm 4.
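For reference, the loss above can be realized either with PyTorch's built-in label-smoothing argument (available since PyTorch 1.10) or manually; in the sketch below the smoothing value ε and the per-class training counts used for the class-balanced weights are placeholders, as the paper does not report them.

```python
import torch
import torch.nn as nn

C = 10            # number of garbage classes
epsilon = 0.1     # smoothing parameter (illustrative value)

# Class-balanced weighting: inverse-frequency weights normalized to mean 1.
train_counts = torch.ones(C)                      # replace with the per-class counts of the training split
weights = train_counts.sum() / (C * train_counts)

# PyTorch's built-in loss already implements u_i = (1 - eps) * y_i + eps / C.
criterion = nn.CrossEntropyLoss(weight=weights, label_smoothing=epsilon)

# Equivalent manual form of L_smooth, matching the equations above (without class weights):
def smoothed_ce(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    log_p = torch.log_softmax(logits, dim=-1)                          # log P_i
    u = torch.full_like(log_p, epsilon / C)                            # eps / C mass on every class
    u.scatter_(1, target.unsqueeze(1), 1.0 - epsilon + epsilon / C)    # (1 - eps) + eps/C on the true class
    return -(u * log_p).sum(dim=-1).mean()
```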
6. Experimental results
To evaluate the effectiveness of our proposed garbage classification model, we use common classification metrics such as accuracy,
precision, recall, F1-score, and the confusion matrix. These measures offer a thorough insight into the model’s performance across
different garbage categories.
Table 1
Training Performance Over 25 Epochs.
Epoch   Loss     Accuracy (%)
1       749.10   31.16
2       572.54   66.04
3       417.51   82.58
4       315.20   90.48
5       268.40   92.81
6       252.52   93.43
7       240.87   94.25
8       230.43   95.37
9       222.37   96.12
10      219.09   96.40
15      201.57   98.14
20      191.30   99.26
25      187.75   99.54
The model is trained for 25 epochs, with loss and accuracy recorded at each epoch. The training process demonstrates a rapid
convergence, as shown in Table 1. The loss decreases significantly from 749.10 in the first epoch to 187.75 in the final epoch, while
accuracy improves from 31.16% to 99.54%.
To quantitatively evaluate the model's classification capability, we compute multiple performance metrics (a brief computation sketch follows the list):
• Accuracy:
\( \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \)
where \( TP \), \( FP \), \( TN \), and \( FN \) denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.
• Precision:
\( \text{Precision} = \frac{TP}{TP + FP} \)
• Recall:
\( \text{Recall} = \frac{TP}{TP + FN} \)
• F1-score:
\( \text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \)
• Confusion Matrix: The confusion matrix delivers an in-depth analysis of classification performance across different categories,
as shown in Table 3.
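These metrics can be computed from the test-set predictions with scikit-learn, as sketched below; `y_true` and `y_pred` are assumed to hold the ground-truth and predicted class indices for the test split.

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, precision_recall_fscore_support)

# y_true, y_pred: ground-truth and predicted class indices on the test split (assumed available).
acc = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
cm = confusion_matrix(y_true, y_pred)                       # counts of (true class, predicted class) pairs
print(classification_report(y_true, y_pred, digits=2))      # per-class precision, recall, and F1 (cf. Table 2)
print(f"Accuracy: {acc:.4f}")
```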
The overall classification accuracy achieved by our proposed model is 96.64%. Table 2 presents a detailed classification report,
including precision, recall, and F1-score for each garbage category.
Table 2
Classification Report.
Fig. 3. The confusion matrix is presented as a heatmap, where darker colors represent greater value concentrations.
The confusion matrix illustrates the model’s classification performance across all categories, highlighting both accurate predictions
and common misclassifications. While Table 3 provides the raw numerical values, we also include a heatmap (Fig. 3) to visually
represent the distribution of predictions. This graphical representation enables quicker and more intuitive interpretation of class-wise
performance.
To assess the efficiency of our classification model, we compare its performance against several widely used deep learning architectures: VGG16, DenseNet121, MobileNetV2, EfficientNetB0, ConvNeXtTiny, and ResNet152. These models are fine-tuned and
evaluated on the same dataset to ensure a fair comparison. Table 4 presents a detailed performance analysis.
Among the conventional models, VGG16 achieved the lowest accuracy at 84.10%, which can be attributed to its relatively shallow
architecture and higher parameter count, making it less effective in extracting complex waste classification features. DenseNet121 and
EfficientNetB0 performed significantly better (89.71%) due to their efficient feature propagation (DenseNet) and optimized depth-wise convolution operations (EfficientNet).
Table 3
Confusion Matrix.
0 1 2 3 4 5 6 7 8 9
0 190 0 2 0 2 1 1 0 0 0
1 0 206 0 0 0 1 0 0 0 0
2 3 0 175 1 3 6 0 1 0 0
3 0 0 0 1036 0 0 0 9 0 0
4 2 0 1 0 139 0 3 0 0 1
5 3 1 4 2 1 188 2 2 0 1
6 0 0 0 0 5 3 147 0 0 14
7 0 0 0 6 0 2 2 410 0 0
8 1 0 0 1 0 0 1 0 128 1
9 0 0 0 0 3 0 2 1 1 141
Table 4
Comparison of Classification Performance Across Different Models.
While ConvNeXtTiny achieved a competitive accuracy of 92.37%, ResNet152 slightly
outperformed it with 92.93%, demonstrating its ability to capture fine-grained details in waste images. Among all state-of-the-art
methods, the Swin Transformer achieves the highest accuracy of 94.41%, outperforming ViT’s 93.26%, highlighting its superior
capability in capturing complex image features for classification.
Our proposed model, which integrates ResNet152 with FAISS-based similarity search and attention mechanisms, achieves the
highest accuracy of 96.64%. The inclusion of attention mechanisms helps in selectively emphasizing important features, while FAISS
retrieval refines classification by incorporating similar embeddings. Regularization techniques, including Dropout and DropConnect,
further enhance model generalization. The results demonstrate that integrating FAISS-based retrieval and attention mechanisms
significantly improves garbage classification accuracy. Compared to traditional deep learning architectures, our proposed model
achieves state-of-the-art performance, making it a robust solution for intelligent waste sorting and recycling efficiency enhancement.
To evaluate the reliability and robustness of our proposed model, we conducted a statistical significance analysis by comparing its
classification accuracy to that of the Swin Transformer, which achieved the highest performance (94.41%) among all baseline models
considered in this study.
Our proposed model is trained and evaluated across six independent runs using different random seeds and varied data splits to
ensure robustness and generalization. The resulting accuracies are [96.64%, 96.60%, 96.11%, 96.88%, 96.53%, 96.30%],
yielding a mean accuracy of 96.51% with a standard deviation of 0.271. For comparison, we also run the Swin Transformer six times
under the same evaluation protocol, obtaining accuracies of [94.41%, 94.30%, 94.45%, 94.51%, 94.35%, 94.40%], with
a mean of 94.40% and a standard deviation of 0.074.
To determine whether this observed performance improvement is statistically significant, we conducted a two-sample, one-tailed
t-test, which does not assume equal variances. The null hypothesis (𝐻0 ) posits that the mean accuracy of our model is less than
or equal to that of the Swin Transformer (𝜇proposed ≤ 𝜇swin ), while the alternative hypothesis (𝐻1 ) asserts that our model achieves
superior accuracy (𝜇proposed > 𝜇swin ).
The resulting t-statistic is 18.394 with an associated p-value of 1.276 × 10⁻⁶. The critical value at a significance level of 𝛼 = 0.05 with 5 degrees of freedom is 2.015, which is far exceeded by the observed t-statistic. At this significance level,
the null hypothesis is strongly rejected, confirming that the observed improvement is statistically significant. Furthermore, the 95%
confidence interval for the mean accuracy of the proposed model was computed to be [96.23%, 96.79%], clearly exceeding the
average performance of the Swin Transformer.
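This analysis can be reproduced from the six per-seed accuracies listed above using SciPy's Welch t-test (which does not assume equal variances; the `alternative` argument requires SciPy ≥ 1.6), together with the 95% confidence interval for the mean accuracy of the proposed model.

```python
import numpy as np
from scipy import stats

proposed = np.array([96.64, 96.60, 96.11, 96.88, 96.53, 96.30])   # six-seed accuracies, proposed model
swin = np.array([94.41, 94.30, 94.45, 94.51, 94.35, 94.40])       # six-seed accuracies, Swin Transformer

# One-tailed Welch t-test: H0: mu_proposed <= mu_swin vs. H1: mu_proposed > mu_swin.
t_stat, p_value = stats.ttest_ind(proposed, swin, equal_var=False, alternative="greater")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.3e}")

# 95% confidence interval for the mean accuracy of the proposed model.
mean, sem = proposed.mean(), stats.sem(proposed)
low, high = stats.t.interval(0.95, df=len(proposed) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}%, 95% CI = [{low:.2f}%, {high:.2f}%]")
```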
The proposed classification framework is designed for optimal performance on standard hardware, explicitly targeting efficient
execution on an Intel Core i5 CPU with 16 GB RAM and a batch size of 32. Despite these modest hardware constraints, our proposed
Table 5
Approximate Comparison of Model Efficiency with Popular Architectures.
Model Parameters (M) GFLOPs Model Size (MB) Inference Time (ms) (CPU/GPU)
Table 6
Ablation Study Results on Classification Performance.
Configuration                                                      Accuracy (%)  Precision  Recall  F1-score
FAISS, DropConnect, Label Smoothing (No Attention)                 95.80         0.96       0.96    0.96
FAISS, Attention, Label Smoothing (No DropConnect)                 95.10         0.95       0.95    0.95
FAISS, Attention, DropConnect (No Label Smoothing)                 95.50         0.96       0.95    0.95
Attention, DropConnect, Label Smoothing (No FAISS)                 93.90         0.94       0.94    0.94
All components (FAISS, Attention, DropConnect, Label Smoothing)    96.64         0.97       0.97    0.97
model achieves robust performance through a carefully structured hybrid architecture that integrates pretrained deep features with
lightweight, trainable modules. Our proposed model utilizes a pretrained ResNet-152 backbone as a frozen feature extractor, chosen
for its superior representational capacity on large-scale datasets such as ImageNet. Freezing the ResNet-152 weights entirely during
training eliminates the computational overhead of backpropagating through 152 layers, significantly reducing memory consumption,
training time, and floating-point operations.
Each input image is transformed into a 2048-dimensional embedding. To enhance this representation without modifying the backbone, we introduce a lightweight, self-attention-based fusion mechanism. This component integrates semantically similar embeddings
retrieved using a fast FAISS-based nearest neighbor search. The resulting 4096-dimensional feature vector is passed through a six-layer
custom multi-layer perceptron (MLP) classifier. The model, comprising 11.19 million trainable parameters, incorporates regularization techniques including Dropout and DropConnect to promote generalization. Despite its expressive power, the total model size remains compact at just 42.75 MB, making it highly suitable for deployment on low-power and edge devices.
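A condensed sketch of the training regime described earlier (Adam optimization, mixed-precision computation, and a one-cycle learning rate schedule) is shown below. The names `model`, `criterion`, and `train_loader` are assumed to be defined as in the previous sketches, a CUDA device is assumed for mixed precision, and the learning rate is an illustrative choice; the 25-epoch budget matches Table 1.

```python
import torch
from torch.cuda.amp import autocast, GradScaler
from torch.optim.lr_scheduler import OneCycleLR

EPOCHS = 25                                                   # matches Table 1
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)     # learning rate is an assumption
scheduler = OneCycleLR(optimizer, max_lr=1e-3, epochs=EPOCHS,
                       steps_per_epoch=len(train_loader))
scaler = GradScaler()

for epoch in range(EPOCHS):
    model.train()
    for images, labels in train_loader:                       # device transfer omitted for brevity
        optimizer.zero_grad()
        with autocast():                                      # mixed-precision forward pass
            logits = model(images)
            loss = criterion(logits, labels)
        scaler.scale(loss).backward()                         # scaled backward pass avoids fp16 underflow
        scaler.step(optimizer)
        scaler.update()
        scheduler.step()                                      # one-cycle schedule advances once per batch
```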
To assess the contribution of each individual component in our proposed framework, we conducted a comprehensive ablation
study. Specifically, we evaluated the impact of four key components: FAISS-based retrieval, the attention mechanism, DropConnect
regularization, and label smoothing. In each experiment, one component was selectively disabled while the others remained active,
and the resulting classification performance was recorded. Table 6 presents the results of these experiments.
The ablation results clearly demonstrate that each component contributes meaningfully to the overall classification performance:
Fig. 4. Workflow for integrating the proposed model into a smart waste management system.
• Attention Mechanism: Disabling the attention mechanism results in a noticeable decrease in accuracy and F1-score, indicating
its crucial role in effectively fusing feature representations derived from the FAISS retrieval module.
• DropConnect Regularization: The absence of DropConnect leads to a performance drop across all metrics, highlighting its
effectiveness in mitigating overfitting and enhancing generalization.
• Label Smoothing: Although its removal causes a relatively smaller drop in performance, label smoothing still contributes positively by improving model calibration and reducing overconfidence in predictions.
• FAISS Retrieval: When the FAISS module is excluded, the model experiences the most substantial performance degradation,
particularly in accuracy. This underscores the importance of nearest-neighbor feature augmentation in enriching the input representation.
In general, the full model including FAISS retrieval, attention, DropConnect, and label smoothing achieves the highest classification
performance. It attains an accuracy of 96.64% along with balanced precision, recall, and F1 score values of 0.97. These results confirm
the synergistic contribution of all components and validate the architectural decisions underlying the proposed framework. The
ablation study thus provides strong empirical evidence that each individual module is essential for achieving optimal classification
performance.
Improving waste classification accuracy is essential to advancing global sustainability goals and reducing the burden on landfills
and incineration systems. Manual sorting processes are labor-intensive, error-prone, and susceptible to contamination of recyclable
materials. The proposed computer vision (CV) model addresses these challenges by automating classification with high accuracy
and reliability. The model achieved an accuracy of 97.67%, with precision, recall, and F1-score each reaching 0.98 on a diverse
dataset that includes categories such as batteries, biological waste, cardboard, clothes, metal, paper, plastic, shoes, trash, and white
glass. These categories represent a mixture of recyclable, hazardous, and non-recyclable materials, positioning the model as a strong
candidate for deployment in real-world waste management infrastructure. Fig. 4 outlines how the model can be integrated into a
modern waste processing system. From image capture at the point of disposal to downstream decision-making and policy feedback,
the system supports data-driven and environmentally responsible waste handling.
The system allows recyclable items such as metals, white glass, and plastics to be correctly identified and separated from hazardous materials like batteries or general trash. This reduces contamination in the recyclable stream and improves the quality of
recovered materials for reuse or remanufacturing. In addition to its technical strengths, the model aligns with international and
national waste management goals. For instance, the European Union's Waste Framework Directive [37] and India's Solid Waste Management Rules [38] emphasize source segregation and improved recycling efficiency. By incorporating the model into smart bins or
material recovery facilities (MRFs), waste authorities can enforce compliance and monitor the flow of materials with minimal manual
intervention. Beyond compliance, the model supports environmental impact reduction by improving sorting accuracy, which helps
reduce the energy and emissions associated with processing mixed or misclassified waste. Moreover, the system can be extended to
provide users with real-time feedback about their disposal habits, fostering public awareness and behavioral change toward better
waste practices [39]. Finally, the granular classification data produced by the system enables city planners, waste contractors, and
policymakers to make informed decisions about infrastructure investments, recycling targets, and circular economy initiatives. The
combination of high-performance AI with clear operational value marks a meaningful step toward sustainable, data-driven waste
management.
In this study, we have proposed a novel and robust deep learning framework for automated garbage classification, integrating
advanced feature extraction, similarity-based attention fusion, and resilient training strategies. Our approach capitalizes on the representational power of a pretrained ResNet-152 model to extract high-dimensional visual features, which are further refined through
a self-attention mechanism designed to capture relationships between input images and their most semantically similar counterparts.
By employing a similarity search using FAISS, the framework retrieves highly relevant features from the dataset, effectively enriching
the contextual representation of each input sample. This fusion strategy enhances the discriminative power of the feature space and
enables the model to distinguish between visually similar but semantically distinct classes efficiently. Additionally, our classification
architecture incorporates several deep regularization techniques, including DropConnect and dropout, to prevent overfitting, while
label smoothing and class-balanced loss further improve the stability and generalization of the training process. The proposed model
achieves a high classification accuracy of 96.64% on a diverse ten-class waste dataset, with consistently strong performance across
all categories in terms of precision, recall, and F1-score. We also performed a statistical significance analysis comparing our model’s
accuracy with that of the best state-of-the-art model, demonstrating that our model achieves significantly higher accuracy. These
results demonstrate the effectiveness and robustness of our framework in handling the visual diversity and imbalance arising in real-world waste classification scenarios. The integration of deep feature learning, attention-based feature fusion, similarity retrieval, and
regularized optimization contributes a comprehensive and scalable methodology to the field of intelligent environmental systems.
Despite the promising performance of our method, several directions remain open for future exploration to enhance its practicality, scalability, and environmental impact. One immediate priority is to develop new model compression strategies, such as pruning,
quantization, and knowledge distillation, to reduce computational overhead and memory footprint, making the model suitable for
deployment on low-power edge devices and mobile platforms. Real-time classification is essential for practical integration into smart
bins or waste sorting robots, and thus future work should also focus on optimizing inference latency and throughput without compromising accuracy. Furthermore, the current framework relies exclusively on visual features. Extending it into a multimodal architecture that combines vision with auxiliary sensory data, such as spectral signatures, RFID metadata, or olfactory sensing, could substantially improve classification in ambiguous or degraded visual conditions.
and transfer learning, which would enable the system to adapt to new environments, lighting conditions, or regional waste profiles
with minimal additional data. Augmenting the training set with synthetic images or employing domain adaptation techniques would
support this goal. Moreover, expanding the system's scope to include carbon footprint estimation or recyclability scoring would provide better insights into the environmental cost of various waste types, thus aligning the model with broader sustainability goals.
Integration into a city-scale smart waste management platform, coupled with predictive analytics for waste volume forecasting and
route optimization, can further amplify its societal utility. In summary, while our work sets a strong technical foundation, its real-world potential can be significantly extended by addressing computational efficiency, contextual awareness, multimodal integration,
and environmental accountability in future research.
Debojyoti Ghosh: Writing -- review & editing, Writing -- original draft, Visualization, Validation, Software, Resources, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Adrijit Goswami: Writing -- review & editing, Validation,
Supervision, Conceptualization.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.
Data availability
References
[1] Shoufeng Jin, Zixuan Yang, Grzegorz Królczykg, Xinying Liu, Paolo Gardoni, Zhixiong Li, Garbage detection and classification using a new deep learning-based
machine vision system as a tool for sustainable waste recycling, Waste Manag. 162 (2023) 123--130.
[2] Zuohua Li, Quanxue Deng, Peicheng Liu, Jing Bai, Yunxuan Gong, Qitao Yang, Jiafei Ning, An intelligent identification and classification system of decoration
waste based on deep learning model, Waste Manag. 174 (2024) 462--475.
[3] Md Sakib Bin Islam, Md Shaheenur Islam Sumon, Molla E. Majid, Saad Bin Abul Kashem, Mohammad Nashbat, Azad Ashraf, Amith Khandakar, Ali K. Ansaruddin
Kunju, Mazhar Hasan-Zia, Muhammad E.H. Chowdhury, Eccdn-net: a deep learning-based technique for efficient organic and recyclable waste classification,
Waste Manag. 193 (2025) 363--375.
[4] Umesh Kumar Lilhore, Sarita Simaiya, Surjeet Dalal, Magdalena Radulescu, Daniel Balsalobre-Lorente, Intelligent waste sorting for sustainable environment:
a hybrid deep learning and transfer learning model, Gondwana Res. (2024).
[5] Song Zhang, Yumiao Chen, Zhongliang Yang, Hugh Gong, Computer vision based two-stage waste recognition-retrieval algorithm for waste classification, Resour.
Conserv. Recycl. 169 (2021) 105543.
[6] Xi Li, Tian Li, Shaoyi Li, Bin Tian, Jianping Ju, Tingting Liu, Hai Liu, Learning fusion feature representation for garbage image classification model in human–robot
interaction, Infrared Phys. Technol. 128 (2023) 104457.
[7] Wei-Lung Mao, Wei-Chun Chen, Chien-Tsung Wang, Yu-Hao Lin, Recycling waste classification using optimized convolutional neural network, Resour. Conserv.
Recycl. 164 (2021) 105132.
[8] Zhichao Chen, Jie Yang, Lifang Chen, Haining Jiao, Garbage classification system based on improved shufflenet v2, Resour. Conserv. Recycl. 178 (2022) 106090.
[9] Yujin Chen, Anneng Luo, Mengmeng Cheng, Yaoguang Wu, Jihong Zhu, Yanmei Meng, Weilong Tan, Classification and recycling of recyclable garbage based on
deep learning, J. Clean. Prod. 414 (2023) 137558.
[10] Yuanming Ren, Yizhe Li, Xinya Gao, An mrs-yolo model for high-precision waste detection and classification, Sensors 24 (13) (2024) 4339.
[11] Minh K. Quan, Dinh C. Nguyen, Van-Dinh Nguyen, Mayuri Wijayasundara, Sujeeva Setunge, Pubudu N. Pathirana, Towards privacy-preserving waste classification
in the internet of things, IEEE Internet Things J. (2024).
[12] Rongxing Wu, Xingmin Liu, Tiantian Zhang, Jiawei Xia, Jiaqi Li, Mingan Zhu, Gaoquan Gu, An efficient multi-label classification-based municipal waste image
identification, Processes 12 (6) (2024) 1075.
[13] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Imagenet classification with deep convolutional neural networks, Commun. ACM 60 (6) (2017) 84--90.
[14] Yu Song, Xin He, Xiwang Tang, Bo Yin, Jie Du, Jiali Liu, Zhongbao Zhao, Shigang Geng, Deepbin: deep learning based garbage classification for households using
sustainable natural technologies, J. Grid Comput. 22 (1) (2024) 2.
[15] Kang An, Yanping Zhang, Lpvit: a transformer based model for pcb image classification and defect detection, IEEE Access 10 (2022) 42542--42553.
[16] Monika Dokl, Yee Van Fan, Annamaria Vujanović, Zorka Novak Pintarič, Kathleen B. Aviso, Raymond R. Tan, Bojan Pahor, Zdravko Kravanja, Lidija Čuček, et
al., A waste separation system based on sensor technology and deep learning: a simple approach applied to a case study of plastic packaging waste, J. Clean.
Prod. 450 (2024) 141762.
[17] Priya Aggarwal, Narendra Kumar Mishra, Binish Fatimah, Pushpendra Singh, Anubha Gupta, Shiv Dutt Joshi, Covid-19 image classification using deep learning:
advances, challenges and opportunities, Comput. Biol. Med. 144 (2022) 105350.
[18] Zixing Liu, Wanyu Fang, Zixiang Cai, Jia Zhang, Yang Yue, Guangren Qian, Garbage-classification policy changes characteristics of municipal-solid-waste fly ash
in China, Sci. Total Environ. 857 (2023) 159299.
[19] Xinchen Cai, Feng Shuang, Xiangming Sun, Yanhui Duan, Guanyuan Cheng, Towards lightweight neural networks for garbage object detection, Sensors 22 (19)
(2022) 7455.
[20] Jiewen Feng, Xiaoyu Tang, Xingjian Jiang, Qunyuan Chen, Garbage disposal of complex background based on deep learning with limited hardware resources,
IEEE Sens. J. 21 (18) (2021) 21050--21058.
[21] Fucong Liu, Hui Xu, Miao Qi, Di Liu, Jianzhong Wang, Jun Kong, Depth-wise separable convolution attention module for garbage image classification, Sustainability 14 (5) (2022) 3099.
[22] Jongchan Park, Sanghyun Woo, Joon-Young Lee, In So Kweon, A simple and light-weight attention module for convolutional neural networks, Int. J. Comput.
Vis. 128 (4) (2020) 783--798.
[23] Kashif Ahmad, Khalil Khan, Ala Al-Fuqaha, Intelligent fusion of deep features for improved waste classification, IEEE Access 8 (2020) 96495--96504.
[24] Junran Lin, Cuimei Yang, Yi Lu, Yuxing Cai, Hanjie Zhan, Zhen Zhang, An improved soft-yolox for garbage quantity identification, Mathematics 10 (15) (2022)
2650.
[25] Kunsen Lin, Tao Zhou, Xiaofeng Gao, Zongshen Li, Huabo Duan, Huanyu Wu, Guanyou Lu, Youcai Zhao, Deep convolutional neural networks for construction
and demolition waste classification: vggnet structures, cyclical learning rate, and knowledge transfer, J. Environ. Manag. 318 (2022) 115501.
[26] Wei-Lung Mao, Wei-Chun Chen, Haris Imam Karim Fathurrahman, Yu-Hao Lin, Deep learning networks for real-time regional domestic waste detection, J. Clean.
Prod. 344 (2022) 131096.
[27] Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun, Faster r-cnn: towards real-time object detection with region proposal networks, IEEE Trans. Pattern Anal.
Mach. Intell. 39 (6) (2016) 1137--1149.
[28] Nicholas Chieng Anak Sallang, Mohammad Tariqul Islam, Mohammad Shahidul Islam, Haslina Arshad, A cnn-based smart waste management system using
tensorflow lite and lora-gps shield in internet of things environment, IEEE Access 9 (2021) 153560--153574.
[29] Mohammed Imran Basheer Ahmed, Raghad B. Alotaibi, Rahaf A. Al-Qahtani, Rahaf S. Al-Qahtani, Sara S. Al-Hetela, Khawla A. Al-Matar, Noura K. Al-Saqer,
Atta Rahman, Linah Saraireh, Mustafa Youldash, et al., Deep learning approach to recyclable products classification: towards sustainable waste management,
Sustainability 15 (14) (2023) 11138.
[30] Zhenhua Wang, Bin Wang, Ming Ren, Dong Gao, A new hazard event classification model via deep learning and multifractal, Comput. Ind. 147 (2023) 103875.
[31] Fuqian Zhang, Bin Wang, Dong Gao, Chengxi Yan, Zhenhua Wang, When grey model meets deep learning: a new hazard classification model, Inf. Sci. 670 (2024)
120653.
[32] Xiaoan Tang, Yuxin Wei, Kaijie Xu, Qiang Zhang, Enhancement of the performance of high-dimensional fuzzy classification with feature combination optimization,
Inf. Sci. 680 (2024) 121183.
[33] Jérôme Lux, Jean David Lau Hiu Hoong, Pierre-Yves Mahieux, Philippe Turcry, Classification and estimation of the mass composition of recycled aggregates by
deep neural networks, Comput. Ind. 148 (2023) 103889.
[34] C. Anna Palagan, S. Sebastin Antony Joe, S.J. Jereesha Mary, E. Edwin Jijo, Predictive analysis-based sustainable waste management in smart cities using iot
edge computing and blockchain technology, Comput. Ind. 166 (2025) 104234.
[35] B. Madhavi, Mohan Mahanty, Chia-Chen Lin, B. Omkar Lakshmi Jagan, Hari Mohan Rai, Saurabh Agarwal, Neha Agarwal, Swinconvnext: a fused deep learning
architecture for real-time garbage image classification, Sci. Rep. 15 (2025) 7995.
[36] Yury A. Malkov, Dmitry A. Yashunin, Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs, IEEE Trans.
Pattern Anal. Mach. Intell. 42 (4) (2020) 824--836.
[37] European Commission, Directive 2008/98/ec of the European Parliament and of the Council on Waste (Waste Framework Directive), 2008, Accessed: 2025-04-08.
[38] Ministry of Environment, Forest and Climate Change, Government of India, Solid Waste Management Rules, 2016, Accessed: 2025-04-08.
[39] R. Nishant, M. Kennedy, J. Corbett, Artificial intelligence for sustainable waste management: a review, Sustain. Cities Soc. 63 (2020) 102423.