
GAYATRI VIDYA PARISHAD

COLLEGE FOR DEGREE AND PG COURSES (A)

(Affiliated to Andhra University)

RUSHIKONDA, VISAKHAPATNAM-45

Department of Computer Science

VISION

“Creating human excellence for a better society”

MISSION

“Unfold into a world class organization with strong academic and research base
producing responsible citizens to cater to the changing needs of the society”
BLIND ASSISTANCE SYSTEM USING

MACHINE LEARNING

A project report submitted in partial fulfilment of the requirements for the


award of the Degree of

Master of Science in Computer Science

Submitted by

PUREDLA DHUSHYANT DEEPAK CHAND

(Regd. No: PG232406017)


Under the guidance of
Mr. P. Venkata Rao

Associate Professor

Department of Computer Science

GAYATRI VIDYA PARISHAD

COLLEGE FOR DEGREE AND PG COURSES (Autonomous)

(Affiliated to Andhra University)

RUSHIKONDA, VISAKHAPATNAM-45

2023-2025
GAYATRI VIDYA PARISHAD

COLLEGE FOR DEGREE AND PG COURSES (A)


(Affiliated to Andhra University)

RUSHIKONDA, VISAKHAPATNAM-45

Department of Computer Science

CERTIFICATE

This is to certify that the project report titled “BLIND ASSISTANCE


SYSTEM USING MACHINE LEARNING” is the bonafide record of work
carried out by PUREDLA DHUSHYANT DEEPAK CHAND (Regd. No.
PG232406017), a student of this college, during the academic year 2024-2025,
in partial fulfilment of the requirements for the award of the degree of Master of
Science in Computer Science.

Project Guide: Director of MCA


Mr. P. Venkata Rao Prof. I. S. Pallavi

External Examiner
DECLARATION

I, PUREDLA DHUSHYANT DEEPAK CHAND, Regd. No.
PG232406017, hereby declare that the project report entitled “BLIND
ASSISTANCE SYSTEM USING MACHINE LEARNING” is an original
work done at Gayatri Vidya Parishad College for Degree and PG Courses (A),
Visakhapatnam, submitted in partial fulfilment of the requirements for the
award of the degree of Master of Science in Computer Science, Gayatri Vidya
Parishad College for Degree and PG Courses (A), affiliated to Andhra University.
I assure that this project has not been submitted to any other University or College.

Puredla Dhushyant Deepak Chand

PG232406017
ACKNOWLEDGEMENT

I consider it a privilege to thank all those who helped me towards the
successful completion of the project “Blind Assistance System Using
Machine Learning.”

I would like to thank Prof. K. S. Bose, Principal of Gayatri Vidya
Parishad College for Degree and PG Courses (A), who provided a fully
equipped lab and the infrastructure needed for the successful completion of my project work.

I would like to thank the ever-accommodating Prof. I. S. Pallavi, Director of
MCA, who obliged in responding to every request even though she is busy with
her hectic schedule of administration and teaching.

I would like to thank our ever-accommodating project guide Mr. P.


Venkata Rao, Associate Professor, Department of Computer Applications,
who guided me in completing this project successfully.
I thank all the Teaching & Non-Teaching staff, who have been a constant
source of support and help throughout the completion of this project.

Puredla Dhushyant Deepak Chand


BLIND ASSISTANCE SYSTEM
USING MACHINE LEARNING
ABSTRACT

Blind Assist is an innovative mobile application designed to help visually


impaired individuals navigate their environment more safely and independently.
The app uses the smartphone camera and advanced machine learning models to
detect and identify important objects in real time, including walls, curbs,
potholes, chairs, people, and doors.

Once these objects are detected, the app provides immediate voice
feedback, announcing the objects aloud through text-to-speech technology so
the user can be aware of obstacles and surroundings without needing to see
them. The object detection system is powered by TensorFlow Lite, which allows
the app to run models efficiently directly on the mobile device without requiring
an internet connection.

The app processes live camera frames using the CameraX library and
performs object detection with customizable parameters such as detection
threshold, maximum number of detected objects, and the number of processing
threads. This makes it adaptable to various device capabilities and user
preferences, balancing speed and accuracy. Blind Assist’s interface includes
adjustable settings to increase or decrease the detection confidence threshold (to
show only more certain detections) and to switch between supported hardware
delegates (CPU, GPU, or NNAPI) for optimized performance on different
devices. By integrating voice commands and intuitive controls, the app ensures
ease of use for visually impaired users.

TABLE OF CONTENTS

CHAPTER Page No
1. INTRODUCTION

1.1. About the Project 1

1.2. Artificial Intelligence 2

1.2.1. Introduction to Machine Learning 2

1.2.2. Categories of Machine Learning 3

1.2.3. Need for Machine Learning 4

1.2.4. Challenges in Machine Learning 4

1.2.5. Applications of Machine Learning 5

1.2.6. Machine Learning Techniques 5

1.2.7. Future Directions 7

1.2.8. Challenges and Limitations of Machine Learning 9

2. LITERATURE SURVEY

2.1. Introduction 12

2.2. Existing System 12

2.2.1. Disadvantages of Existing Systems 13

2.3. Problem Statement 14

2.4. Proposed System 14

2.4.1. Objectives 15

2.5. Functional Requirements 16

2.6. Non-Functional Requirements 17

2.6.1. Software Requirements 18

2.6.2. Hardware Requirements 18

3. UML MODELING

3.1. Introduction to UML 20

3.2. Goals of UML 20

3.3. UML Standard Diagrams 21

3.3.1. Structural Diagrams 21

3.3.2. Behaviour Diagrams 21

3.4. UML Diagrams 22

3.4.1. Use Case Diagram 22

3.4.2. Sequence Diagram 26

3.4.3. Activity Diagram 29

4. SYSTEM DESIGN

4.1. Design and Goals 33

4.2. System Architecture 34

4.3. System Design 34

4.4. Implementation Of Project 35

4.5. Algorithm 37

4.5.1. Object Detection with TensorFlow Lite 37

4.5.2. Object Detection with EfficientDet 38

5. CODING

5.1. Coding Approach 39

5.2. Verification and Validation 39

5.3. Source Code 40

6. TESTING

6.1. Testing Objectives 57

6.2. Testing Types 57

6.2.1. Unit Testing 57

6.2.2. Integration testing 58

6.2.3. Validation Testing 58

6.2.4. System Testing 58

6.3. Various types of Testing 59

6.3.1. White Box testing 59

6.3.2. Black Box testing 60

6.4. Test Plan 61

6.5. Test Cases 62

7. OUTPUT SCREENS 65

8. CONCLUSION 70

9. REFERENCES 71

10. FUTURE SCOPE 72

11. APPENDIX

11.1 List of Tables 73

11.2 List of Figures 74

1. INTRODUCTION

1.1 About the Project:

The “Blind Assist” app is an Android-based assistive technology designed to help


blind and visually impaired individuals detect and recognize important objects around them
in real time using their smartphone camera. It captures live video through CameraX and
processes each frame using TensorFlow Lite’s object detection models, enabling the app to
identify objects like walls, curbs, potholes, chairs, people, and doors. When a target object is
detected, the app announces it audibly with a clear voice through the phone’s text-to-speech
system, helping users navigate safely without needing to look at the screen.
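
The frame-processing loop described above can be illustrated with a short Kotlin sketch built on CameraX's ImageAnalysis use case. This is a simplified, hypothetical outline rather than the project's actual source (which is listed in the coding chapter); the detectObjects() and speakLabels() helpers are placeholders standing in for the TensorFlow Lite detector and the Text-to-Speech calls.

import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner
import java.util.concurrent.Executors

// Hypothetical stand-ins for the detection and speech steps described in this report.
fun detectObjects(frame: ImageProxy): List<String> = emptyList()
fun speakLabels(labels: List<String>) { /* forward labels to Text-to-Speech */ }

fun startFrameAnalysis(
    context: Context,
    lifecycleOwner: LifecycleOwner,
    previewView: PreviewView
) {
    val cameraProviderFuture = ProcessCameraProvider.getInstance(context)
    cameraProviderFuture.addListener({
        val cameraProvider = cameraProviderFuture.get()

        // Show the live camera feed on screen.
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }

        // Analyse only the latest frame, dropping stale ones to stay real-time.
        val analysis = ImageAnalysis.Builder()
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .build()
        analysis.setAnalyzer(Executors.newSingleThreadExecutor()) { imageProxy ->
            val labels = detectObjects(imageProxy)        // run object detection on the frame
            if (labels.isNotEmpty()) speakLabels(labels)  // announce what was found
            imageProxy.close()                            // release the frame for the next one
        }

        cameraProvider.unbindAll()
        cameraProvider.bindToLifecycle(
            lifecycleOwner, CameraSelector.DEFAULT_BACK_CAMERA, preview, analysis
        )
    }, ContextCompat.getMainExecutor(context))
}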

The app interface provides adjustable controls, allowing users to set detection
thresholds (to filter out low-confidence results), configure the maximum number of detected
objects, and choose the number of threads used for processing—balancing speed and device
performance. It supports multiple hardware delegates (CPU, GPU, NNAPI) to take advantage
of each device’s processing power and ensures smooth operation on a wide range of Android
devices.
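
As a concrete illustration of these settings, the TensorFlow Lite Task Library exposes a builder that accepts a score threshold, a maximum result count, a thread count, and an optional GPU delegate. The sketch below is a hedged example rather than the app's actual configuration code; the model file name blind_assist_model.tflite is a placeholder.

import android.content.Context
import org.tensorflow.lite.task.core.BaseOptions
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Build a TFLite Task Library detector with user-adjustable parameters.
// The model file name is a placeholder; the defaults simply mirror the kinds of
// settings described in this report.
fun buildDetector(
    context: Context,
    threshold: Float = 0.5f,    // minimum confidence required to report a detection
    maxResults: Int = 3,        // cap on the number of objects reported per frame
    numThreads: Int = 2,        // CPU threads used for inference
    enableGpu: Boolean = false  // switch to the GPU delegate when supported
): ObjectDetector {
    val baseOptions = BaseOptions.builder()
        .setNumThreads(numThreads)
        .apply { if (enableGpu) useGpu() }
        .build()

    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setBaseOptions(baseOptions)
        .setScoreThreshold(threshold)
        .setMaxResults(maxResults)
        .build()

    return ObjectDetector.createFromFileAndOptions(
        context, "blind_assist_model.tflite", options
    )
}

A captured frame, converted to a TensorImage, can then be passed to the detector's detect() method, which returns the labelled detections together with their bounding boxes and confidence scores.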

To customize the object detection for the user’s environment, the app can integrate a
custom-trained model. This involves collecting and labeling images of relevant objects (e.g.,
walls, potholes, curbs), training the dataset using TensorFlow’s object detection API, and
converting the trained model into a lightweight TensorFlow Lite format. The final TFLite
model can then replace the default model in the app, enabling it to recognize specific
obstacles or landmarks important for visually impaired navigation.

Unlike traditional, costly hardware solutions for the blind, Blind Assist runs entirely
on a smartphone without needing internet access after setup, making it affordable, portable,
and accessible. By providing timely voice alerts, it reduces the risk of accidents and improves
confidence and independence for users moving through unfamiliar or hazardous
environments. The app’s customizable parameters also make it flexible for different users’
needs, whether they prefer faster detection or higher accuracy. This project demonstrates how
modern computer vision, deep learning, and mobile technology can create practical, real-
world solutions that directly enhance the quality of life for people with disabilities.

1.2 Artificial Intelligence

Artificial Intelligence (AI) is a field of computer science that aims to create


machines and software capable of performing tasks that normally require human intelligence.
These tasks include recognizing images, understanding natural language, making predictions,
and learning from experience. AI systems work by using algorithms and models trained on
large datasets to identify patterns and make decisions, often faster and more accurately than
humans. In recent years, AI has advanced rapidly, leading to applications in healthcare,
transportation, customer service, and accessibility technologies.

In the Blind Assist project, AI is used to make the environment more understandable
and navigable for people who are blind or visually impaired. By combining AI-based object
detection models with a smartphone camera, the app can analyze live video frames in real
time. It detects and identifies important objects like walls, curbs, potholes, chairs, people, and
doors. This information is then converted into audio feedback using text-to-speech
technology, allowing users to hear what is around them instantly. The AI model at the heart of
the system has been trained on thousands of labeled images to recognize these objects
accurately even in challenging conditions.

This AI-powered approach provides a practical and affordable solution for visually
impaired individuals, enabling them to navigate unfamiliar environments safely without
relying on others. It also demonstrates the power of AI to improve lives by bridging the gap
between human perception and machine understanding, showing how technology can create
more inclusive societies.

1.2.1 Introduction to Machine Learning

Before we take a look at the details of various machine learning methods, let's start
by looking at what machine learning is, and what it isn't. Machine learning is often
categorized as a subfield of artificial intelligence, but I find that categorization can often
be misleading at first brush. The study of machine learning certainly arose from research
in this context, but in the data science application of machine learning methods, it's more
helpful to think of machine learning as a means of building models of data.

Fundamentally, machine learning involves building mathematical models to help
understand data. "Learning" enters the fray when we give these models tunable
parameters that can be adapted to observed data; in this way the program can be
considered to be "learning" from the data. Once these models have been fit to previously
seen data, they can be used to predict and understand aspects of newly observed data.
I'll leave to the reader the more philosophical digression regarding the extent to
which this type of mathematical, model-based "learning" is similar to the "learning"
exhibited by the human brain. Understanding the problem setting in machine learning is
essential to using these tools effectively, and so we will start with some broad
categorizations of the types of approaches we'll discuss here.
1.2.2 Categories of Machine Learning

At the most fundamental level, machine learning can be categorized into two main
types: supervised learning and unsupervised learning.

Supervised learning involves somehow modelling the relationship between


measured features of data and some label associated with the data; once this model is
determined, it can be used to apply labels to new, unknown data. This is further subdivided
into classification tasks and regression tasks: in classification, the labels are discrete
categories, while in regression, the labels are continuous quantities. We will see examples
of both types of supervised learning in the following section.

Unsupervised learning involves modelling the features of a dataset without


reference to any label, and is often described as "letting the dataset speak for itself." These
models include tasks such as clustering and dimensionality reduction. Clustering
algorithms identify distinct groups of data, while dimensionality reduction algorithms
search for more succinct representations of the data. We will see examples of both types of
unsupervised learning in the following section.

As its name suggests, supervised machine learning is based on supervision: we train
the machines using a "labelled" dataset, and based on that training, the machine predicts the
output. Here, the labelled data specifies that the inputs are already mapped to outputs. More
precisely, we first train the machine with inputs and their corresponding outputs, and then ask
the machine to predict the output for a test dataset.

Let's understand supervised learning with an example. Suppose we have an input
dataset of cat and dog images. First, we train the machine to understand the images using
features such as the shape and size of the tail, the shape of the eyes, colour, and height
(dogs are taller, cats are smaller). After training is complete, we input the picture of a cat
and ask the machine to identify the object and predict the output. Since the machine is now
well trained, it checks all the features of the object, such as height, shape, colour, eyes, ears,
and tail, finds that it is a cat, and places it in the cat category. This is how a machine
identifies objects in supervised learning.

1.2.3 Need for Machine Learning

Human beings are, at this moment, the most intelligent and advanced species on
earth because they can think, evaluate, and solve complex problems. On the other hand, AI
is still in its initial stage and has not surpassed human intelligence in many aspects. The
question, then, is why we need machines to learn. The most suitable reason for doing this
is "to make decisions, based on data, with efficiency and at scale".

Lately, organizations have been investing heavily in newer technologies like Artificial
Intelligence, Machine Learning, and Deep Learning to extract key information from data and
use it to perform several real-world tasks and solve problems. These data-driven decisions can
be used, instead of explicit programming logic, for problems that cannot easily be programmed
by hand. The fact is that we cannot do without human intelligence, but we also need to solve
real-world problems efficiently and at a huge scale. That is why the need for machine
learning arises.

1.2.4 Challenges in Machine Learning

While Machine Learning is rapidly evolving, making significant strides in
cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to go.
The reason is that ML has not yet been able to overcome a number of challenges. The
challenges that ML is currently facing are:
Quality of Data - Having good-quality data for ML algorithms is one of the biggest
challenges. Use of low-quality data leads to problems related to data preprocessing and
feature extraction.
Time-Consuming Task - Another challenge faced by ML models is the time consumed,
especially by data acquisition, feature extraction and retrieval.
Lack of Specialist Persons - As ML technology is still in its infancy, finding expert
resources is difficult.
No Clear Objective for Formulating Business Problems - Having no clear objective and
well-defined goal for business problems is another key challenge for ML because this
technology is not that mature yet.
Issue of Overfitting & Underfitting - If the model is overfitting or underfitting, it cannot
represent the problem well.
Curse of Dimensionality - Another challenge ML models face is too many features in the
data points. This can be a real hindrance.
Difficulty in Deployment - The complexity of ML models makes them quite difficult to
deploy in real-life systems.

1.2.5 Applications of Machine Learning

Machine Learning is the most rapidly growing technology and, according to
researchers, we are in the golden era of AI and ML. It is used to solve many real-world
complex problems which cannot be solved with a traditional approach. Following are some
real-world applications of ML:
 Emotion analysis
 Sentiment analysis
 Error detection and prevention
 Forecasting and prediction
 Stock market analysis and forecasting
 Speech synthesis
 Speech recognition
 Language Translation
 Object recognition
 Fraud detection
 Fraud prevention
 Recommendation of products to customers in online shopping

1.2.6 Machine Learning Techniques

Artificial Intelligence (AI) and Machine Learning (ML) encompass a wide range of
techniques and methodologies that enable machines to perform tasks that typically require
human intelligence. These techniques are at the heart of many modern applications, from
voice assistants and chatbots to advanced medical diagnostics and autonomous vehicles.

One of the fundamental techniques in AI is the use of algorithms to process and analyze
data. These algorithms can be broadly categorized into traditional AI algorithms and those
used in machine learning. Traditional AI algorithms include search algorithms, which are
used to navigate through data to find specific information or solutions to problems. Examples
include the A* search algorithm, which is used in pathfinding and graph traversal, and the
minimax algorithm, which is used in decision-making processes, particularly in game theory
and competitive environments.

Machine learning algorithms, on the other hand, are designed to enable systems to learn
from data and improve their performance over time. These algorithms can be divided into
three main types: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning involves training a model on a labeled dataset, meaning that each
training example is paired with an output label. The model learns to map inputs to the correct
output based on this training data. Common algorithms used in supervised learning include
linear regression, logistic regression, support vector machines (SVM), and decision trees.
Neural networks, a more advanced form of supervised learning, are particularly powerful for
tasks involving complex and high-dimensional data, such as image and speech recognition.
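
As a small worked example of one of these supervised algorithms, the sketch below fits simple linear regression by ordinary least squares; the sample points are invented for illustration.

// Simple linear regression y = a + b*x fitted by ordinary least squares:
// slope b = cov(x, y) / var(x), intercept a = mean(y) - b * mean(x)
fun fitLine(xs: DoubleArray, ys: DoubleArray): Pair<Double, Double> {
    require(xs.size == ys.size && xs.size >= 2)
    val meanX = xs.average()
    val meanY = ys.average()
    var covXY = 0.0
    var varX = 0.0
    for (i in xs.indices) {
        covXY += (xs[i] - meanX) * (ys[i] - meanY)
        varX += (xs[i] - meanX) * (xs[i] - meanX)
    }
    val slope = covXY / varX
    val intercept = meanY - slope * meanX
    return intercept to slope
}

fun main() {
    val xs = doubleArrayOf(1.0, 2.0, 3.0, 4.0)
    val ys = doubleArrayOf(2.1, 3.9, 6.2, 8.0)        // roughly y = 2x
    val (a, b) = fitLine(xs, ys)
    println("y = %.2f + %.2f * x".format(a, b))       // fitted line
    println("prediction at x = 5: %.2f".format(a + b * 5))
}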

Unsupervised learning deals with unlabeled data, meaning the algorithm tries to find
patterns and relationships within the data without any explicit instructions on what to look
for. Clustering algorithms, such as k-means and hierarchical clustering, are used to group
similar data points together. Dimensionality reduction techniques, like principal component
analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), are used to reduce
the number of variables under consideration and to visualize high-dimensional data.
Unsupervised learning is particularly useful for exploratory data analysis and for finding
hidden patterns or intrinsic structures in the data.
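
A minimal sketch of the k-means idea is shown below, clustering one-dimensional readings into two groups; the data values and the fixed iteration count are arbitrary choices made only for illustration.

import kotlin.math.abs

// Minimal one-dimensional k-means with k = 2: alternate between assigning points
// to the nearest centroid and recomputing each centroid as the mean of its points.
fun kMeans2(points: List<Double>, iterations: Int = 10): Pair<Double, Double> {
    var c1 = points.minOrNull()!!
    var c2 = points.maxOrNull()!!
    repeat(iterations) {
        val (cluster1, cluster2) = points.partition { abs(it - c1) <= abs(it - c2) }
        if (cluster1.isNotEmpty()) c1 = cluster1.average()
        if (cluster2.isNotEmpty()) c2 = cluster2.average()
    }
    return c1 to c2
}

fun main() {
    val readings = listOf(1.0, 1.2, 0.8, 9.7, 10.1, 10.4)
    val (low, high) = kMeans2(readings)
    println("Cluster centres: $low and $high")   // approximately 1.0 and 10.07
}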

Reinforcement learning is a type of machine learning where an agent learns to make


decisions by performing actions in an environment to maximize some notion of cumulative
reward. Unlike supervised learning, where the correct answer is provided during training,
reinforcement learning relies on a reward signal to evaluate the actions taken by the agent.
The agent receives feedback in the form of rewards or penalties and uses this feedback to
learn the best strategy or policy to achieve its goals. This approach is highly effective in
dynamic environments where the optimal actions are not always apparent and must be
discovered through trial and error. Reinforcement learning has been successfully applied to a
wide range of problems, including robotics, game playing, and autonomous driving.
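
The sketch below captures this reward-driven loop in miniature: an epsilon-greedy agent repeatedly chooses between two actions, observes a reward, and updates its value estimates. The hidden reward probabilities are invented for illustration.

import kotlin.random.Random

// Tiny reinforcement-learning sketch: a two-armed bandit with an epsilon-greedy
// agent that learns action values purely from reward feedback.
fun main() {
    val trueRewardProb = doubleArrayOf(0.3, 0.7)   // hidden environment (illustrative values)
    val estimates = doubleArrayOf(0.0, 0.0)        // agent's learned action values
    val counts = IntArray(2)
    val epsilon = 0.1                              // exploration rate

    repeat(1000) {
        // Explore occasionally, otherwise exploit the best-known action.
        val action = if (Random.nextDouble() < epsilon) Random.nextInt(2)
                     else if (estimates[0] >= estimates[1]) 0 else 1

        // The environment returns a reward of 1 with the action's hidden probability.
        val reward = if (Random.nextDouble() < trueRewardProb[action]) 1.0 else 0.0

        // Incremental update of the running average for the chosen action.
        counts[action]++
        estimates[action] += (reward - estimates[action]) / counts[action]
    }

    println("Estimated values: ${estimates.toList()}")  // roughly [0.3, 0.7]
    println("Preferred action: ${if (estimates[1] > estimates[0]) 1 else 0}")
}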

Another critical technique in AI and ML is the use of neural networks, which are
designed to simulate the way the human brain processes information. Neural networks consist
of layers of interconnected nodes, or neurons, that process data in a hierarchical manner. The
most basic form of a neural network is the feedforward neural network, where information
flows in one direction from the input layer to the output layer. More complex architectures,
such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have
been developed to handle specific types of data and tasks. CNNs are particularly effective for
image processing tasks due to their ability to capture spatial hierarchies in images, while
RNNs are suited for sequential data, such as time series or natural language, because they can
maintain information about previous inputs through their recurrent connections.
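
For intuition, the following sketch performs the forward pass of a very small feedforward network with two inputs, two hidden ReLU units, and one output; the weights are arbitrary illustrative values, not trained parameters.

import kotlin.math.max

// Forward pass of a tiny feedforward network: 2 inputs -> 2 hidden (ReLU) -> 1 output.
// Weights and biases are arbitrary illustrative values, not trained parameters.
val hiddenWeights = arrayOf(
    doubleArrayOf(0.5, -0.2),   // weights into hidden unit 0
    doubleArrayOf(0.8, 0.1)     // weights into hidden unit 1
)
val hiddenBias = doubleArrayOf(0.0, -0.1)
val outputWeights = doubleArrayOf(1.0, -0.5)
val outputBias = 0.2

fun forward(x: DoubleArray): Double {
    // Hidden layer: weighted sum followed by the ReLU non-linearity.
    val hidden = DoubleArray(2) { j ->
        max(0.0, hiddenWeights[j][0] * x[0] + hiddenWeights[j][1] * x[1] + hiddenBias[j])
    }
    // Output layer: plain weighted sum (a regression-style output).
    return outputWeights[0] * hidden[0] + outputWeights[1] * hidden[1] + outputBias
}

fun main() {
    println(forward(doubleArrayOf(1.0, 2.0)))   // single scalar prediction
}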

Deep learning, a subset of machine learning, refers to the use of neural networks with
many layers (hence "deep") to model complex patterns in data. The increased depth of these
networks allows them to learn more abstract and high-level features from raw data, which has
led to significant advancements in fields such as computer vision, natural language
processing, and speech recognition. Training deep learning models typically requires large
amounts of data and computational resources, but the results have been groundbreaking,
enabling the development of AI systems that can outperform humans in certain tasks.

Overall, the techniques in AI and ML are diverse and continually evolving, driven by
advances in computational power, availability of data, and innovative research. These
techniques form the backbone of intelligent systems that are transforming industries and
shaping the future of technology.

1.2.7 Future Directions

Artificial Intelligence (AI) and Machine Learning (ML) have profoundly impacted
numerous sectors, revolutionizing how we approach problem-solving and decision-making.
Their influence extends across industries such as healthcare, finance, transportation, and
entertainment, driving significant advancements and efficiencies. In healthcare, AI and ML
are transforming diagnostic and treatment processes. AI-driven diagnostic tools can analyze
medical images with remarkable accuracy, often outperforming human radiologists in
detecting diseases like cancer. Machine learning models predict patient outcomes and
recommend personalized treatment plans based on vast amounts of historical data. These
technologies are also accelerating drug discovery by identifying potential drug candidates and
predicting their efficacy, significantly reducing the time and cost involved in bringing new
medications to market. Moreover, AI-powered wearable devices continuously monitor
patients' vital signs, enabling early detection of health issues and timely medical
interventions.

The financial sector has also witnessed substantial changes due to AI and ML. These
technologies enhance fraud detection by identifying unusual patterns in transaction data,
thereby preventing fraudulent activities. In trading, machine learning algorithms analyze
market data to forecast stock prices and inform investment strategies, often executing trades
at high speeds and with greater precision than human traders. Furthermore, AI-driven
chatbots and virtual assistants provide personalized customer service, handling routine
inquiries and transactions, which frees up human agents for more complex tasks.

Transportation is another field greatly impacted by AI and ML. Autonomous vehicles,


which rely on sophisticated machine learning models to navigate and make real-time
decisions, promise to reduce accidents caused by human error and improve traffic flow. AI
algorithms optimize logistics and supply chain management by predicting demand, managing
inventory, and routing deliveries efficiently. These advancements lead to cost savings and
enhanced customer satisfaction through faster and more reliable services.

In the entertainment industry, AI and ML are used to create personalized user


experiences. Streaming services like Netflix and Spotify employ machine learning algorithms
to analyze users' viewing and listening habits, recommending content tailored to individual
preferences. AI-driven tools are also used in content creation, such as generating realistic
graphics in video games or producing original music and scripts. Additionally, sentiment
analysis tools gauge audience reactions to movies, shows, and advertisements, helping
creators and marketers refine their content to better meet audience expectations.

The impact of AI and ML extends beyond these sectors, influencing areas such as
education, agriculture, and environmental conservation. In education, adaptive learning
platforms use machine learning to tailor educational content to students' individual learning
styles and paces, improving learning outcomes. In agriculture, AI-powered systems monitor
crop health, optimize irrigation, and predict yields, enhancing productivity and sustainability.
Environmental conservation efforts benefit from AI's ability to analyze data from sensors and
satellite images, tracking wildlife populations and detecting illegal activities like poaching
and deforestation.

Looking to the future, AI and ML are poised to drive further innovations and societal
changes. One key area of development is the advancement of explainable AI (XAI), which
aims to make AI systems more transparent and understandable to humans. As AI systems
become more complex, ensuring that their decision-making processes are interpretable and
trustworthy is crucial, particularly in high-stakes domains like healthcare and finance.
Another promising direction is the integration of AI with the Internet of Things (IoT). IoT
devices generate vast amounts of data, and AI can analyze this data to derive insights and
make intelligent decisions in real-time.

AI and ML will also play a significant role in addressing global challenges such as
climate change and pandemics. Machine learning models can predict climate patterns,
optimize renewable energy sources, and improve disaster response strategies. In public
health, AI can assist in monitoring disease outbreaks, developing vaccines, and managing
healthcare resources more effectively. Ethical considerations and regulatory frameworks will
be critical as AI and ML continue to evolve. Ensuring that these technologies are developed
and deployed responsibly, with attention to issues such as bias, privacy, and job displacement,
will be essential to maximizing their benefits while mitigating potential risks. Collaborative
efforts between governments, industry, and academia will be necessary to create policies and
standards that promote ethical AI development and use.

1.2.8 Challenges and Limitations of Machine Learning

Artificial Intelligence (AI) and Machine Learning (ML) have made significant strides in
recent years, but they still face several challenges and limitations. Understanding these issues
is crucial for developing more robust and ethical AI systems.

1. Technical Challenges

 Overfitting and Underfitting: One of the key challenges in ML is achieving a


balance between overfitting and underfitting. Overfitting occurs when a model
learns the noise in the training data rather than the underlying pattern, resulting in
poor performance on new, unseen data. Underfitting happens when a model is too
simple to capture the complexity of the data. Finding the right model complexity
and regularization techniques is essential for optimal performance.

 Scalability: As datasets grow larger and more complex, scaling ML algorithms
becomes challenging. Training models on massive datasets requires substantial
computational resources and time. Ensuring that algorithms can efficiently handle
large-scale data while maintaining performance is a significant hurdle.
 Data Quality and Quantity: The effectiveness of ML models depends heavily on the
quality and quantity of the data. Inadequate or biased data can lead to inaccurate or
skewed predictions. Data preprocessing, cleaning, and augmentation are critical to
ensure that models are trained on high-quality data that accurately represents the
problem domain.

2. Data Privacy and Security

 Privacy Concerns: AI systems often require access to large amounts of personal


data, raising concerns about data privacy. Ensuring that sensitive information is
protected and used responsibly is a major challenge. Techniques like
anonymization and federated learning are being developed to address privacy
concerns, but balancing privacy with model effectiveness remains a complex issue.
 Data Security: AI systems are vulnerable to security threats such as adversarial
attacks, where malicious inputs are designed to fool the model into making
incorrect predictions. Ensuring the robustness of AI systems against such attacks
and protecting them from data breaches is crucial for maintaining trust and
security.

3. Ethical and Bias Issues

 Algorithmic Bias: AI and ML systems can inadvertently perpetuate or even


exacerbate existing biases present in the training data. For instance, biased training
data can lead to biased outcomes in areas such as hiring, law enforcement, and
credit scoring. Identifying and mitigating biases in AI models is essential for
promoting fairness and equity.
 Ethical Considerations: The deployment of AI systems raises various ethical
concerns, including the potential for misuse, impacts on employment, and decision-
making transparency.

4. Interpretability and Explainability

 Black-Box Nature: Many advanced ML models, particularly deep learning models,
operate as "black boxes," meaning their internal decision-making processes are not
easily interpretable. This lack of transparency can hinder trust and make it difficult
to understand how decisions are made. Developing techniques for model
interpretability and explainability is essential for ensuring that AI systems are
transparent and their decisions can be understood and justified.
 Model Interpretability: For AI systems to be widely accepted and trusted, it is
crucial that their predictions and decision-making processes are interpretable by
humans. Efforts to enhance model interpretability involve creating methods and
tools that provide insights into how models arrive at their conclusions, which is
especially important in high-stakes domains like healthcare and finance.

5. Societal Impact and Public Perception

 Job Displacement: The automation of tasks through AI and ML can lead to job
displacement, as machines and algorithms increasingly perform tasks previously
done by humans. Addressing the impact on employment and developing strategies
for workforce retraining and support are important for mitigating the negative
effects of automation.

2. LITERATURE SURVEY

2.1 Introduction:
This project is an advanced Android application named Blind Assist, created to
support visually impaired people by helping them recognize and identify objects around them
in real time using their smartphone’s camera. The app uses TensorFlow Lite, a lightweight
deep learning framework, to run object detection models directly on the mobile device,
allowing fast and offline processing without needing an internet connection.
We developed the app using Android Studio and wrote the code in Kotlin, a modern,
easy-to-read, and powerful programming language designed for Android development. To
capture live video from the camera, we used the CameraX API, which provides reliable and
flexible camera control on Android devices. The detected objects are highlighted with
bounding boxes on the camera preview using a custom overlay view, and the app uses Text-
to-Speech (TTS) to announce the names of detected objects aloud, giving instant audio
feedback to the user.
The application includes a bottom sheet interface that lets users adjust key detection
settings like the confidence threshold, the maximum number of objects to detect, and the
number of threads used for processing, providing customization based on the device’s
performance and the user’s needs. By combining real-time computer vision, voice output, and
a simple user interface, the app offers an effective, accessible tool for blind and visually
impaired users to navigate their surroundings more confidently and independently. The entire
solution runs smoothly on mobile devices, making it a practical and portable assistive
technology.
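
As a hedged sketch of the Text-to-Speech piece described above (not the project's exact source, which appears in the coding chapter), an announcer object can be initialised once and reused to speak each new set of detected labels:

import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

// Small wrapper around Android's TextToSpeech used to announce detected objects.
// Illustrative sketch only; the real app's class and method names may differ.
class SpeechAnnouncer(context: Context) : TextToSpeech.OnInitListener {

    private val tts = TextToSpeech(context, this)
    private var ready = false

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            tts.language = Locale.US   // a default choice; any supported locale could be used
            ready = true
        }
    }

    fun announce(labels: List<String>) {
        if (!ready || labels.isEmpty()) return
        // QUEUE_FLUSH interrupts pending speech so the newest detection is heard promptly.
        tts.speak(labels.joinToString(", "), TextToSpeech.QUEUE_FLUSH, null, "detection")
    }

    fun shutdown() = tts.shutdown()
}

A single SpeechAnnouncer could be created when the camera screen starts and shut down when it is destroyed, so repeated detections reuse one speech engine instead of recreating it per frame.
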
2.2 Existing System:
In the current scenario, visually impaired people face many challenges in identifying objects
around them and navigating unfamiliar environments. Although there are some existing
systems and technologies, they have several limitations:
 Traditional Walking Sticks or White Canes
These are the most commonly used tools for the blind. They help in detecting
obstacles directly in front of the user by physical contact, but they cannot recognize
objects at a distance or provide information about what the objects are.
 Guide Dogs
Guide dogs are trained to help visually impaired people move around safely.
However, having a guide dog is expensive and requires ongoing maintenance, care,
and training. It is not an affordable or practical solution for everyone.
 Mobile Apps Using GPS
Some smartphone apps offer navigation help using GPS, like Google Maps or
specialized apps for blind users. These apps can help with directions but do not detect
obstacles or recognize objects in the user’s immediate surroundings.
 Dedicated Electronic Devices
Certain electronic devices like wearable cameras or specialized obstacle detection
gadgets exist, but they are often bulky, costly, and inconvenient to carry daily. Many
of these devices require frequent calibration or charging, and they may not offer real-
time object recognition.
 Camera-based Applications Without Real-time Feedback
A few apps can identify objects when a photo is taken, but they process images slowly
or require an internet connection to send photos to a server. This delay in response
makes them less effective for real-time assistance.
 Lack of Voice Feedback
Some existing solutions do not include text-to-speech support, forcing users to read
on-screen text, which is not practical for blind people. Even apps with voice feedback
might announce information in an unclear or delayed manner, which reduces their
usefulness.
2.2.1. Disadvantages of Existing Systems
 Limited Detection Range
Most existing tools like white canes or guide dogs only help detect obstacles
very close to the user. They cannot identify objects or hazards at a distance,
making it difficult to plan a safe path ahead.
 No Object Recognition
Traditional aids cannot recognize or describe what an object is — they only
help avoid it. Visually impaired people still don’t know whether an obstacle is
a chair, a wall, or a person.
 High Cost and Maintenance
Devices like smart glasses or guide dogs require a lot of money to buy and
maintain. Many visually impaired people cannot afford these expensive
solutions.
 Slow or Delayed Feedback
Some apps or devices that use photos to identify objects send images to
servers for processing, which causes delays in getting results. Real-time
response is often impossible.
 Dependence on Internet Connectivity
Many solutions require a constant internet connection for object recognition,
which is unreliable or unavailable in many places, especially outdoors or in
rural areas.
 Bulky and Inconvenient Devices
Some electronic aids are large, heavy, or uncomfortable to carry, making them
impractical for daily use.
2.3 Problem Statement:
Visually impaired individuals face numerous difficulties in navigating their
surroundings safely and independently. Traditional mobility aids such as white canes and
guide dogs provide limited assistance—they can help detect obstacles directly in front of the
user but cannot identify or describe the nature of these obstacles or objects beyond the
immediate path. As a result, visually impaired people often remain unaware of important
details about their environment, which can lead to accidents, injuries, or difficulty in finding
essential landmarks like doors or chairs. Although some electronic and wearable assistive
devices exist, they are often prohibitively expensive, complex to use, or heavily reliant on
continuous internet connectivity, which limits their practicality in real-life scenarios.
Moreover, many of these solutions struggle to provide fast and accurate feedback necessary
for real-time navigation, especially in dynamic environments. This highlights a critical need
for an affordable, lightweight, and reliable system that leverages modern computer vision
techniques to detect and identify objects such as walls, curbs, potholes, chairs, doors, and
people in real time. By integrating this technology into a mobile platform and providing
immediate audio feedback through speech, we can significantly improve the confidence,
safety, and independence of visually impaired individuals as they move through their daily
environments.
2.4 Proposed System:
 Real-Time Object Detection with Mobile Camera
The proposed system uses an Android smartphone’s camera to continuously capture
live video frames of the user’s surroundings. These frames are analyzed instantly to
detect objects in real time, helping visually impaired users become aware of obstacles
or important items around them.
 Offline, On-Device Processing
Unlike many solutions that rely on cloud servers or internet connectivity, this system
performs all object detection directly on the device using TensorFlow Lite. This
ensures the app works reliably even in areas without network coverage and protects
user privacy by not uploading images.
 Audio-Based Object Announcements
Once objects like walls, curbs, potholes, chairs, doors, or people are detected, the
system announces them using Text-to-Speech in clear, easy-to-understand language.
This immediate spoken feedback allows visually impaired users to react quickly and
navigate their environment more safely.
 Simple and User-Friendly Interface
The app features an intuitive interface with straightforward controls. Users can adjust
detection sensitivity (threshold), choose the maximum number of objects detected at
once, and select the processing hardware (CPU, GPU, or NNAPI) without needing
technical expertise.
 Modern Android Architecture
The application is developed using Android Studio with Kotlin, leveraging modern
components like Jetpack libraries, CameraX for reliable camera handling, and
ViewBinding for safer, cleaner code. This ensures the app is maintainable, efficient,
and compatible with a wide range of devices.
 Enhanced Safety and Independence
By giving real-time awareness of obstacles and surroundings, the proposed system
aims to empower visually impaired individuals to move around confidently and
independently, reducing their reliance on assistance from others.
2.5 Objectives:
The main objective of this project is to develop a reliable Android application that
helps visually impaired individuals navigate their environment safely and independently
using real-time object detection. By leveraging TensorFlow Lite, the system aims to identify
critical obstacles and common items such as walls, curbs, potholes, chairs, doors, and people
directly on the user’s smartphone without requiring internet connectivity.
Another important objective is to provide immediate audio feedback through Text-to-
Speech so that users can receive timely alerts about detected objects. The app also seeks to
offer a simple, intuitive interface that allows users or caretakers to adjust detection settings,
including the detection threshold, maximum results, and processing hardware.
By achieving these objectives, the proposed system intends to significantly enhance
situational awareness, reduce the risk of accidents, and improve the independence and
confidence of visually impaired users in their daily lives.
2.6 Functional Requirements:
 Real-Time Object Detection
The system shall detect objects such as walls, curbs, potholes, chairs, doors, and
people instantly by processing live video frames from the smartphone’s camera. This
ensures the app provides timely alerts to users navigating their environment.
 Audio Feedback for Detected Objects
The system shall use Text-to-Speech to announce the labels of detected objects. This
audio feedback will enable visually impaired users to understand their surroundings
without needing to look at the screen.
 Adjustable Detection Settings
The app shall allow users to customize detection parameters, such as the minimum
confidence threshold (which affects sensitivity), the maximum number of objects
detected per frame, and the number of threads used for processing. This flexibility lets
users balance accuracy and performance based on their device’s capability.
 Hardware Delegate Selection
The app shall let users choose between CPU, GPU, or NNAPI delegates to perform
detection. This gives users the ability to optimize detection speed and battery usage
depending on their hardware.
 Visual Overlay of Detections
The system shall draw bounding boxes around detected objects directly on the camera
preview. This provides visual confirmation of what the app has detected, helping users
with partial vision or assisting a sighted helper.
 Camera Permission Handling
The app shall check if the camera permission is granted. If not, it will request
permission from the user before launching the camera, ensuring proper functionality
without manual configuration (a minimal sketch of this flow appears after this list).
 Navigation to Camera Screen
The system shall automatically navigate the user from the permission screen to the
camera screen once the required permissions are granted, providing a seamless user
experience.
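
The last two requirements, permission handling and navigation to the camera screen, might be realised along the lines of the following sketch; the fragment layout and the navigation action ID are hypothetical placeholders rather than the project's actual resources.

import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import android.view.View
import androidx.activity.result.contract.ActivityResultContracts
import androidx.core.content.ContextCompat
import androidx.fragment.app.Fragment
import androidx.navigation.fragment.findNavController

// Hypothetical permission fragment: checks the CAMERA permission and, once it is
// granted, navigates straight to the camera screen. R.layout.fragment_permissions and
// R.id.action_permissions_to_camera are placeholder resources, not the project's own.
class PermissionsFragment : Fragment(R.layout.fragment_permissions) {

    private val requestPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) navigateToCamera()
        }

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        super.onViewCreated(view, savedInstanceState)
        val alreadyGranted = ContextCompat.checkSelfPermission(
            requireContext(), Manifest.permission.CAMERA
        ) == PackageManager.PERMISSION_GRANTED

        if (alreadyGranted) navigateToCamera()
        else requestPermission.launch(Manifest.permission.CAMERA)
    }

    private fun navigateToCamera() {
        findNavController().navigate(R.id.action_permissions_to_camera)
    }
}
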
2.7 Non-Functional Requirements:
The non-functional requirements outline how the Blind Assistance Android
application should perform, focusing on quality attributes like performance, reliability,
usability, and maintainability. These characteristics ensure that the system is efficient, user-
friendly, and robust enough for visually impaired users to rely on daily.
 Performance Requirements
The app must analyse each camera frame and generate object detection
results within 200 milliseconds per frame. Quick processing ensures smooth,
uninterrupted real-time voice feedback to the user as they navigate their environment.
 Reliability and Stability
The application should remain stable during prolonged use, such as when
assisting a user on long walks or in crowded places. It must handle unexpected
conditions (e.g., sudden light changes or temporary camera interruptions) without
crashing.
 User Interface Usability
The app interface should use large, high-contrast buttons and clear text
labels, making it easy for visually impaired users to find and activate controls. Voice
prompts and intuitive navigation should guide users effectively throughout the app’s
features.
 Device Compatibility
The system must work on Android smartphones and tablets running
Android 8.0 (Oreo) or later versions, ensuring that a wide range of affordable devices
can support the app.
 Future Scalability
The architecture should allow for the addition of new detection models,
voice languages, or extra features without requiring a complete rewrite of the app. This
flexibility helps the app adapt to evolving user needs or advancements in object
detection technology.
 Security and Privacy
The app must only request essential permissions, such as camera access,
and avoid collecting or storing any user-identifiable data. All processing should occur
locally on the device to protect user privacy, with no images or recordings uploaded to
external servers.

2.7.1 Software Requirements:


The software requirements define the essential tools, libraries, and
platforms necessary to develop, build, and maintain the Blind Assistance Android application
efficiently; a sample Gradle declaration of these dependencies follows the list below.
 Operating System:
Windows 10 or later / macOS / Linux (for development environment)
 Programming Language:
Kotlin (for Android app development)
Java (for some Android dependencies)
 Libraries and Frameworks:
Android Jetpack Components for UI, navigation, and lifecycle management
CameraX API for accessing and managing the camera
TensorFlow Lite Task Library for on-device object detection
Android Text-to-Speech (TTS) API for audio feedback
View Binding for type-safe UI references
ConstraintLayout and Material Components for responsive, accessible UI design
 Development Tools:
Android Studio Electric Eel or later (recommended IDE for Android development)
Android SDK (version 33 or later) and build tools
Gradle for dependency management and builds
Git for version control and project collaboration
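
To show how this stack comes together, the Gradle (Kotlin DSL) fragment below declares the main libraries listed above; the version numbers are plausible examples only and would need to match the project's actual build files.

// Module-level build.gradle.kts (illustrative; versions are examples only).
dependencies {
    // CameraX for live camera preview and per-frame analysis
    implementation("androidx.camera:camera-camera2:1.3.0")
    implementation("androidx.camera:camera-lifecycle:1.3.0")
    implementation("androidx.camera:camera-view:1.3.0")

    // TensorFlow Lite Task Library for on-device object detection
    implementation("org.tensorflow:tensorflow-lite-task-vision:0.4.4")

    // Jetpack Navigation and lifecycle-aware components
    implementation("androidx.navigation:navigation-fragment-ktx:2.7.0")
    implementation("androidx.lifecycle:lifecycle-runtime-ktx:2.6.1")

    // Material Components and ConstraintLayout for the accessible UI
    implementation("com.google.android.material:material:1.9.0")
    implementation("androidx.constraintlayout:constraintlayout:2.1.4")
}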

2.7.2 Hardware Requirements:

The hardware requirements ensure that the application runs smoothly on both the
development system and the target Android devices, delivering reliable real-time object
detection and audio feedback.

 Development Machine:

Processor: Multi-core processor (Intel i5/Ryzen 5 or higher recommended)

RAM: Minimum 8 GB; 16 GB or more recommended for seamless Android Studio


performance

Storage: At least 256 GB of free disk space (SSD recommended for faster project
builds and emulation)

 Target Android Device:

Processor: Octa-core ARM processor or equivalent for real-time processing

RAM: Minimum 4 GB for stable camera and detection performance

Storage: At least 1 GB of free internal storage for app installation and caching

Camera: Rear camera with at least 8 MP resolution for clear object capture

Audio: Built-in speaker or connected headphones for audible feedback using Text-to-
Speech

3. UML MODELLING

3.1 Introduction to UML:


UML is a standard language for specifying, visualizing, constructing, and
documenting the artifacts of a software system. UML was created by the Object
Management Group (OMG), and the UML 1.0 specification draft was proposed to the OMG
in January 1997. The OMG is continuously making efforts to create a truly industry-wide standard.

 UML stands for Unified Modelling Language.


 UML is different from common programming languages such as C++, Java, and COBOL.
 UML is a pictorial language used to make software blueprints.
 UML can be described as a general-purpose visual modelling language to
visualize, specify, construct and document a software system.

 Although UML is generally used to model software systems, it is not limited to
this boundary. It can be used to model non-software systems as well, for example,
the process flow in a manufacturing unit.

UML is not a programming language, but tools can be used to generate code in
various languages using UML diagrams. UML has a direct relation with object-oriented
analysis and design. After some standardization, UML has become an OMG standard.

3.2 Goals of UML:


"A picture is worth a thousand words"; this idiom fits UML perfectly. Object-oriented
concepts were introduced much earlier than UML. At that point in time, there were no
standard methodologies to organize and consolidate object-oriented development. It was
then that UML came into the picture.
There are a number of goals for developing UML, but the most important is to define
a general-purpose modelling language that all modellers can use; it also needs to be
simple to understand and use.

3.3 UML Standard Diagrams:


The elements are like components that can be associated in diverse ways to make a
complete UML picture, which is known as a diagram. Thus, it is very important to understand
the different diagrams in order to apply that knowledge to real-life systems. Any complex system
is best understood by making some kind of diagram or picture, and such diagrams have a
strong impact on our understanding. If we look around, we will realize that the diagram is not
a new concept; it is used widely in different forms in different industries. We prepare
UML diagrams to understand the system in a better and simpler way. A single diagram is not
enough to cover all the aspects of the system, so UML defines various kinds of diagrams to
cover most of the aspects of a system. You can also create your own set of diagrams to meet
your requirements. Diagrams are generally made in an incremental and iterative way. There
are two broad categories of diagrams, and they are again divided into subcategories:
 Structural Diagrams
 Behavioural Diagrams
3.3.1 Structural Diagrams:
Structural diagrams represent the static aspect of the system. These static aspects
represent those parts of the system which form the main structure and are therefore stable.
These static parts are represented by classes, interfaces, objects, components, and nodes. The
four structural diagrams are:
 Class diagram
 Object diagram
 Component diagram
 Deployment diagram
3.3.2 Behavioural Diagrams:
Any system can have two aspects, static and dynamic, so a model is considered
complete only when both aspects are fully covered. Behavioural diagrams capture the
dynamic aspect of a system, which can be further described as the changing or moving
parts of the system. UML has the following five types of behavioural diagrams:
 Use case diagram
 Sequence diagram
 Collaboration diagram
 State chart diagram
 Activity diagram

3.4 UML Diagrams:


3.4.1 Introduction to Use Case Diagram:
Use cases are used during requirement elicitation and analysis to represent the
functionality of the system. Use cases focus on the behaviour of the system from an external
point of view. A use case describes a function provided by the system that yields a visible
result for an actor. An actor describes any entity that interacts with the system.
3.4.1.1 Actors:
Actors represent external entities that interact with the system. An actor can be a human
or an external system. During this activity, developers identify the actors involved in this
system. In this project, the only actor is the User: any person using the application.
3.4.1.2 Use Cases:
The identification of actors and use cases results in the definition of the boundary of
the system, that is, in differentiating the tasks accomplished by the system from the tasks
accomplished by its environment. The actors are outside the boundary of the system, whereas
the use cases are inside the boundary of the system.
Actors are external entities that interact with the system. Use cases describe the
behaviour of the system as seen from an actor's point of view. Actors initiate a use case to
access the system functionality. The use case then initiates other use cases and gathers more
information from the actors. When actors and use cases exchange information, they are said
to communicate.
To describe a use case, we use a template composed of six fields
 Use Case Name: The name of the use case
 Participating Actors: The actors participating in the particular use case
 Flow of events: Sequence of steps describing the function of the use case
 Entry Condition: Condition that must hold before the use case can start
 Exit Condition: Condition for terminating the use case
 Quality Requirements: Requirements that do not belong to the use case but
constrain the functionality of the system

Fig. 3.4.1.2 Use Case Diagram


Description:
System: The system will detect obstacles using the camera, identify objects like
walls, curbs, potholes, chairs, doors, and people, and provide voice guidance to assist
visually impaired users.
Use Case for Start Camera:

Use Case ID: UC001
Use Case Name: Start Camera
Participating Actors: User
Entry Condition: User opens the app and grants camera permission
Flow of Events: 1. User initiates the camera
2. Live camera feed starts
Exit Condition: Camera preview is active
Quality Requirements: Fast camera startup without delays

Table 3.4.1.2(a): Start Camera

Use Case for Detect Objects:

Use Case ID: UC002
Use Case Name: Detect Objects
Participating Actors: System
Entry Condition: Camera feed is active
Flow of Events: 1. Capture video frames
2. Run object detection model
3. Identify objects
Exit Condition: Detected objects identified with bounding boxes
Quality Requirements: Real-time and accurate detection

Table 3.4.1.2(b): Detect Objects

Use Case for Provide Audio Feedback:

Use Case ID: UC003
Use Case Name: Provide Audio Feedback
Participating Actors: System, User
Entry Condition: Objects are detected
Flow of Events: 1. Convert detected object labels to speech
2. Announce detected objects
Exit Condition: User receives audio guidance
Quality Requirements: Clear, immediate, and natural voice output

Table 3.4.1.2(c): Provide Audio Feedback

Use Case for Adjust Detection Settings:

Use Case ID: UC004
Use Case Name: Adjust Detection Settings
Participating Actors: User
Entry Condition: User is on the detection screen
Flow of Events: 1. User changes threshold, max results, or hardware delegate
2. System updates detection configuration
Exit Condition: Settings are applied in real time
Quality Requirements: Instant updates without needing to restart the app

Table 3.4.1.2(d): Adjust Detection Settings

Use Case for View Detection Results:

Use Case ID: UC005
Use Case Name: View Detection Results
Participating Actors: User
Entry Condition: Detection is running
Flow of Events: 1. System overlays bounding boxes
2. Display labels of detected objects on screen
Exit Condition: User sees detection results visually
Quality Requirements: Accurate, clearly visible overlays rendered in real time

Table 3.4.1.2(e): View Detection Results

3.4.2 Sequence Diagram:

A sequence diagram is the most commonly used interaction diagram. It depicts the
interactions between objects in sequential order, i.e., the order in which these interactions take
place. We can also use the terms event diagrams or event scenarios to refer to a sequence
diagram. Sequence diagrams describe how, and in what order, the objects in a system function.
These diagrams are widely used by business people and software developers to document and
understand requirements for new and existing systems, and they can be useful references for
businesses and other organizations. The purposes of sequence diagrams are:

 Model the logic of a sophisticated procedure, function, or operation.

 See how objects and components interact with each other to complete a process.

 Plan and understand the detailed functionality of an existing or future scenario.

1. Lifeline:

In UML diagrams, such as sequence or communication diagrams, lifelines represent


the objects that participate in an interaction. For example, in a banking scenario, lifelines can
represent objects such as a bank system or customer. Each instance in an interaction is
represented by a lifeline.

2. Messages:

A message is an element in a Unified Modelling Language (UML) diagram that
defines a specific kind of communication between instances in an interaction. A message
conveys information from one instance, which is represented by a lifeline, to another instance
in an interaction.

3. Activation Box:

Activation boxes, also known as activation bars or lifeline activations, show the
period of time during which an object or actor is actively engaged in processing a message.
They are drawn as rectangles on the lifeline and indicate the duration of the method or
operation execution.

4. Focus of Control:

This indicates which object or actor has control, i.e. is actively processing the
message, at a given point in time. It is represented by a vertical dashed line extending from
the lifeline to the activation box.

5. Return Message:

Return messages show the flow of information back to the sender after the completion
of a method or operation; they are represented by dashed arrows.

Fig 3.4.2(a) Sequence Diagram of User

Description:

This sequence diagram illustrates the interaction flow in the Blind Assistance Android
app from the moment the user opens the app to logging out. It starts with the user accessing
the homepage, where they log in or register if necessary.

Once logged in, the user can initiate camera-based object detection, adjust settings,
and view detection results with real-time object identification overlaid on the screen.
Throughout the process, the app checks permissions, configures the camera, processes
frames, detects objects, and provides voice feedback.

Finally, the user can log out securely, returning to the login screen. This diagram helps
visualize the main steps of using the app and highlights the key modules and interactions
between user actions and system components.

Fig 3.4.2(b) Sequence Diagram of System

Description:

This sequence diagram details the process of creating and updating the object
detection model for the Blind Assistance app. It begins with the Admin collecting images of
relevant objects like walls, curbs, potholes, and doors.

The Admin then uses an annotation tool to label these images, creating a dataset
suitable for training. Next, the labeled dataset is fed into a training pipeline, where
TensorFlow is used to train the model, producing an updated TFLite file.

Finally, the trained model is deployed to the Android app, enabling improved object
detection performance for end users. This diagram clarifies the stages involved in preparing
the dataset, training, and updating the model for real-world deployment.

3.4.3 Activity Diagram:

Activity diagrams (also called Activity Charts or Flow Diagrams) depict the flow of
control or data from activity to activity within a system. They capture what the system does,
step-by-step, rather than how objects change state. Activity diagrams can model the behaviour
of a single use case, an operation, a business process, or even an entire workflow that spans
multiple systems.

Activity diagrams are especially valuable when you need to visualise complex
sequences that involve parallel processing, branching, loops, human tasks, or data
transformations.

1. Activities / Actions

Activities represent the high-level tasks or procedures performed in a workflow; an


action is the smallest executable step inside an activity. Both are shown as rounded
rectangles (actions are usually atomic, while activities may nest subordinate actions).

2. Control Flows

Control flows are directed arrows that connect actions or activities, indicating the
order in which steps are executed. They model the progression of control from one node to
the next once the preceding action completes.

3. Decision & Merge Nodes

A decision node (diamond shape) splits the flow based on boolean expressions or
guard conditions. Outgoing edges are labelled with conditions such as [valid] or [invalid]. A
merge node (also a diamond) brings multiple alternative flows back into a single path.

4. Fork & Join Nodes

A fork node (thick horizontal or vertical bar) divides the flow into concurrent paths
that execute in parallel. A join node synchronises these parallel branches: all incoming flows
must complete before the diagram proceeds. Forks and joins are critical for modelling
multi-threaded or asynchronous behaviour.

5. Swimlanes / Partitions

Swimlanes partition the diagram into vertical or horizontal zones that assign
responsibility for each action to an actor, class, or subsystem. They clarify who or what
performs each activity, improving traceability across organisational or architectural
boundaries.

6. Object Flows

Object flow arrows carry data objects (shown as rectangles with object names)
between actions. They illustrate how information is produced, consumed, or transformed
throughout the workflow.

7. Initial and Final Nodes

 Initial node (filled black circle): the entry point where control first enters the activity
diagram.

 Activity final node (encircled black dot): signifies the end of all flows in the activity.

 Flow final node (encircled X): terminates its particular path without stopping other
concurrent flows.

Fig 3.4.3 Activity Diagram

Description:

This activity diagram illustrates the workflow of the Blind Assistance Android app
from the moment the user opens the application until the continuous detection loop. The
process begins with the user launching the app, which immediately checks if camera
permissions have been granted. If permission is granted, the camera preview starts, and the
app begins real-time object detection on the video frames.

Detected objects are processed to estimate their distance, and the app uses text-to-
speech to announce them to the user. Bounding boxes and object labels are displayed on-
screen for reference. If the user adjusts detection settings such as detection threshold or
delegate, the new configuration is applied instantly without restarting the app.

If camera permission is denied, the system shows an appropriate error message,


prompting the user to grant permission. The app maintains a continuous detection loop,
allowing uninterrupted assistance as the user moves through their environment.

DESIGN

4.SYSTEM DESIGN

System Design is the process of defining the architecture, components, modules,


interfaces, and data for a system to satisfy specified requirements. In System design,
developers:

 Define design goals of the project


 Decompose the system into smaller sub systems
 Design hardware/software strategies
 Design persistent data management strategies
 Design global control flow strategies
 Design access control policies and
 Design strategies for handling boundary conditions.

System design is not algorithmic. It is decomposed into several activities. They are:

 Identify Design Goals


 Design the initial subsystem decomposition
 Refine the subsystem decomposition to address the design goals.
 System Design is the transform of analysis model into a system design model.

Developers define the design goals of the project and decompose the system into smaller
subsystems that can be realized by individual teams. Developers also select strategies for
building the system, such as the hardware/software platform on which the system will run,
the persistent data management strategy, the global control flow, the access control policy, and
the handling of boundary conditions. The result of system design is a model that includes a
clear description of each of these strategies, the subsystem decomposition, and a UML
deployment diagram representing the hardware/software mapping of the system.

4.1 Design Goals:

Design goals are the qualities that the system should focus on. Many design goals can be
inferred from the non-functional requirements or from the application domain.

User friendly: The system is user friendly because it is easy to use and understand.

Reliability: Proper checks are in place to handle any failures that may occur in the system.

4.2 System Architecture:

As the complexity of systems increases, the specification of the system decomposition


is critical. Moreover, subsystem decomposition is constantly revised whenever new issues are
addressed. Subsystems are merged into a single subsystem, a complex subsystem is split into
parts, and some subsystems are added to take care of new functionality. The first iterations
over the subsystem decomposition can introduce drastic changes in the system design model.

4.3 System Design:

The system design of the Blind Assistance App is structured to provide real-time
object detection and audio guidance for visually impaired users. The application is based on a
modular architecture that integrates camera input, object detection processing, and speech
output, all optimized for Android devices using Kotlin.

The system begins with a user-friendly interface where permissions are checked and
camera access is initiated. The CameraX API is used to handle live camera streams
efficiently. Captured video frames are analyzed by the object detection module, which uses
TensorFlow Lite models to identify objects such as walls, curbs, potholes, chairs, doors, and
people. Bounding boxes and object labels are drawn on the live camera feed for on-screen
reference. To convert detections into accessible information, a Text-to-Speech (TTS) engine is
integrated, which reads out the names of detected objects, helping users understand their
surroundings.
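A minimal sketch of how such an announcement can be wired up with Android's TextToSpeech API.
The class name Announcer, the chosen locale, and the queueing policy are illustrative assumptions,
not the app's exact implementation:

import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

// Illustrative wrapper around Android's TextToSpeech for announcing detected labels.
class Announcer(context: Context) : TextToSpeech.OnInitListener {

    private val tts = TextToSpeech(context, this)
    private var ready = false

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            tts.setLanguage(Locale.US)   // assumed locale; the app could make this configurable
            ready = true
        }
    }

    // Speak a detected label, e.g. "chair"; QUEUE_FLUSH keeps the feedback immediate.
    fun announce(label: String) {
        if (ready) tts.speak(label, TextToSpeech.QUEUE_FLUSH, null, "detection")
    }

    fun shutdown() = tts.shutdown()
}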

Adjustable detection settings are available through a dedicated settings panel,


allowing users to configure the detection threshold, maximum results, and the processing
delegate (CPU, GPU, or NNAPI) based on their device’s capabilities.

The system uses an event-driven approach where the detection process and TTS
announcements operate asynchronously, ensuring smooth real-time performance without lag.
The app also includes error handling for permissions and system errors, providing appropriate
messages to the user. Overall, the system design focuses on low-latency processing, high
detection accuracy, and seamless user experience to deliver effective assistance in dynamic
environments.

Fig 4.3 System’s Block Diagram

4.4 Implementation of Project:

Input Design

The input design defines how users interact with the Blind Assistance Android App
and how the system accepts camera data to process. It creates the bridge between the visually
impaired user and the app’s object detection functionality. The input is primarily the live
video feed from the mobile device’s camera, and it must be designed for ease of access, low
latency, and secure permission handling. The goal is to ensure minimal user intervention
while providing an efficient way to start and control detection.

The input design focuses on controlling errors, avoiding unnecessary steps, and
simplifying the process by automatically handling camera permissions and settings
adjustments. It ensures security by checking user permissions before accessing the camera
and protects user privacy by not storing images.

Input Design considers the following aspects:

 What data should be captured? (Live camera frames)


 How should the data be processed? (Real-time frame analysis and rotation correction)
 User prompts to guide them through permission granting and settings.
 Methods for validating camera input and handling errors (e.g., camera unavailable,
permissions denied).

Objectives

 Convert user actions (like starting detection) into system operations in the app,
ensuring the process is error-free.
 Provide a simple interface where the user can start detection without technical
knowledge.
 Validate that the camera feed is functional, with clear error messages guiding the user
when problems occur.
 Make the input layout intuitive and fast, minimizing user effort.

Output Design

Output design ensures the app presents detection results effectively to visually impaired
users through both audio and visual feedback. The output includes bounding boxes overlaid
on detected objects and voice announcements of object names and estimated distances. Clear,
concise, and timely output is critical to allow the user to navigate safely. Outputs must be
accurate, immediate, and easily understood to improve user trust and decision-making during
movement.

 Outputs include live bounding box overlays on the camera view for sighted helpers
and announcements for the user.
 Outputs must be optimized for low-latency delivery, ensuring real-time assistance.
 Outputs should convey important information like detected object names and
proximity warnings.

Objectives

Display object detection results in a user-friendly overlay on the camera preview.

Announce detected objects promptly with clear and natural speech.

Provide actionable information, e.g., “Person ahead,” “Chair nearby,” allowing the user to
make safe decisions.
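A short sketch of how such phrases could be assembled from detection results. The function name
toAnnouncements and the nearby/ahead heuristic (using relative bounding-box height as a rough
proximity cue) are assumptions made for illustration; the app's exact phrasing may differ:

import org.tensorflow.lite.task.vision.detector.Detection

// Sketch: turn detections into short spoken phrases such as "person ahead" or "chair nearby".
fun toAnnouncements(results: List<Detection>, imageHeight: Int): List<String> =
    results.map { detection ->
        val label = detection.categories.firstOrNull()?.label ?: "object"
        val relativeHeight = detection.boundingBox.height() / imageHeight.toFloat()
        if (relativeHeight > 0.5f) "$label nearby" else "$label ahead"
    }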

4.5 Algorithms:

4.5.1 Object Detection with TensorFlow Lite

Object detection in this app is implemented using TensorFlow Lite models like
MobileNet SSD or EfficientDet Lite, optimized for Android devices. These models
take live camera frames as input, process them with convolutional neural networks
(CNNs), and output detected objects with their labels and confidence scores.

The object detection algorithm identifies relevant objects such as walls, curbs,
potholes, chairs, doors, and people. Once detected, their bounding boxes are
calculated, scaled to the preview size, and displayed. Detected object labels are
converted to speech output for the user.

4.5.1.1 Bounding Box Calculation and Confidence Threshold

Each detection result includes a bounding box (x, y, width, height) and
a confidence score. The app filters detections using a configurable threshold
(e.g., 0.5), ensuring only reliable detections are announced. This threshold
helps reduce false positives and improve usability.

Bounding boxes are then scaled relative to the preview display


dimensions to ensure accurate visual overlay, even with device rotation
corrections applied. The combination of accurate object detection and real-
time feedback is central to the app’s safety and assistance features.
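The two steps described above can be summarised in a few lines: keep only detections at or above
the configured score threshold, then scale each box from image coordinates to preview coordinates.
The function scaleConfidentBoxes below is a hypothetical sketch that mirrors the FILL_START scaling
rule used by OverlayView in Section 5.3; the 0.5f default is simply the threshold used in this report:

import android.graphics.RectF
import org.tensorflow.lite.task.vision.detector.Detection
import kotlin.math.max

// Sketch: filter confident detections and scale their boxes to the preview size.
fun scaleConfidentBoxes(
    results: List<Detection>,
    imageWidth: Int,
    imageHeight: Int,
    previewWidth: Int,
    previewHeight: Int,
    threshold: Float = 0.5f
): List<RectF> {
    // Same scaling rule as OverlayView.setResults(): use the larger of the two ratios.
    val scale = max(previewWidth * 1f / imageWidth, previewHeight * 1f / imageHeight)
    return results
        .filter { (it.categories.firstOrNull()?.score ?: 0f) >= threshold }
        .map { d ->
            RectF(
                d.boundingBox.left * scale,
                d.boundingBox.top * scale,
                d.boundingBox.right * scale,
                d.boundingBox.bottom * scale
            )
        }
}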

Fig 4.5 SSD-MobileNet-v2 architecture

4.5.2 Object Detection with EfficientDet

EfficientDet is a family of object detection models developed by Google that


are designed to be both fast and accurate, making them ideal for real-time applications
on mobile devices. EfficientDet is built on top of the EfficientNet backbone (a highly
efficient image classification network), and uses a special architecture called BiFPN
(Bidirectional Feature Pyramid Network) to combine features at different scales in the
image efficiently. This allows EfficientDet to detect small, medium, and large objects
in a single pass.

High accuracy: Despite being lightweight, EfficientDet achieves performance on par


with or better than much larger detectors like YOLO or Faster R-CNN.

Scalable: Comes in versions D0 through D7 — smaller versions (D0-D2) run fast on


mobile devices, while larger versions (D4-D7) deliver higher accuracy on powerful
hardware.

Efficient: Uses fewer parameters and computations compared to traditional detectors,


which means it’s battery-friendly — critical for Android apps running on mobile
devices.

Feature fusion with BiFPN: Combines information from different resolutions of the
image better than traditional feature pyramids, leading to improved detection of
objects at multiple scales.
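In this app the choice between variants is exposed simply as the model constant passed to
ObjectDetectorHelper (Section 5.3). A hedged usage sketch that builds the helper with
EfficientDet-Lite0; buildEfficientDetHelper is an illustrative name, and context and listener
are assumed to be supplied by the hosting fragment:

// Sketch: select EfficientDet-Lite0 ("efficientdet-lite0.tflite") via the helper's model constant.
fun buildEfficientDetHelper(
    context: android.content.Context,
    listener: ObjectDetectorHelper.DetectorListener
): ObjectDetectorHelper =
    ObjectDetectorHelper(
        threshold = 0.5f,
        maxResults = 3,
        currentDelegate = ObjectDetectorHelper.DELEGATE_CPU,
        currentModel = ObjectDetectorHelper.MODEL_EFFICIENTDETV0,
        context = context,
        objectDetectorListener = listener
    )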

CODING

5.CODING
5.1 Coding Approach:

The objective of the coding or programming phase is to translate the design of the
system produced during the design phase into code in a given programming language, which
can be executed by a computer and that performs the computation specified by the design.
The coding phase affects both testing and maintenance. The goal of coding is not to reduce
the implementation cost, but the goal should be to reduce the cost of later phases. There are
two major approaches for coding any software system: the top-down approach and the
bottom-up approach.

The bottom-up approach is best suited for developing object-oriented systems. During
the system design phase, we decompose the system into an appropriate number of subsystems,
for which objects can be modelled independently. These objects exhibit the way the
subsystems perform their operations.

Once objects have been modelled, they are implemented by means of coding. Even
though the objects belong to the same system, they are implemented independently of each
other, which is why the bottom-up approach is more suitable for coding them. In this
approach, we first code the objects independently and then integrate these modules into the
system to which they belong.

5.2 Verification and Validation:

Verification ensures that the Blind Assistance App is built correctly according to the
design specifications, while validation checks that the app fulfills its purpose of providing
accurate, real-time obstacle detection and guidance for visually impaired users.

During the development of this system, all code modules related to camera
management, object detection, text-to-speech, and settings controls have been thoroughly
verified through detailed testing of their design, integration, and runtime behavior. Various
techniques were applied during validation, as discussed in the testing phase of the system.
Validations were implemented at two primary levels to ensure correctness and reliability:

Screen Level Validation: Validations of all user interactions, such as permission prompts,
detection setting adjustments, and button presses, are handled at the screen level. The system
raises appropriate error dialogs or messages if the camera permission is denied or if
unsupported settings are selected. This ensures users are guided through resolving issues
before detection starts.

Control Level Validation: Validations are applied directly to individual UI controls, such as
spinners for detection thresholds and delegate selections. If an invalid option is chosen or
required permissions are not granted, the system displays clear dialogs or toasts, helping the
user correct their input. This ensures every control behaves predictably and prevents invalid
configurations that could cause system errors or degraded detection performance. Throughout
the app, real-time runtime validations also ensure that camera input is active and object
detection results are reliable before audio announcements occur, maintaining both safety and
usability.

5.3 Source Code:

MainActivity.kt

package org.tensorflow.lite.examples.objectdetection

import android.os.Build

import android.os.Bundle

import androidx.appcompat.app.AppCompatActivity

import org.tensorflow.lite.examples.objectdetection.databinding.ActivityMainBinding

class MainActivity : AppCompatActivity() {

    private lateinit var activityMainBinding: ActivityMainBinding

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        activityMainBinding = ActivityMainBinding.inflate(layoutInflater)
        setContentView(activityMainBinding.root)
    }

    override fun onBackPressed() {
        if (Build.VERSION.SDK_INT == Build.VERSION_CODES.Q) {
            // Workaround for Android Q memory leak issue in IRequestFinishCallback$Stub.
            finishAfterTransition()
        } else {
            super.onBackPressed()
        }
    }
}

ObjectDetectorHelper.kt

package org.tensorflow.lite.examples.objectdetection

import android.content.Context

import android.graphics.Bitmap

import android.os.SystemClock

import android.util.Log

import org.tensorflow.lite.gpu.CompatibilityList

import org.tensorflow.lite.support.image.ImageProcessor

import org.tensorflow.lite.support.image.TensorImage

import org.tensorflow.lite.support.image.ops.Rot90Op

import org.tensorflow.lite.task.core.BaseOptions

import org.tensorflow.lite.task.vision.detector.Detection

import org.tensorflow.lite.task.vision.detector.ObjectDetector

class ObjectDetectorHelper(
    var threshold: Float = 0.5f,
    var numThreads: Int = 2,
    var maxResults: Int = 3,
    var currentDelegate: Int = 0,
    var currentModel: Int = 0,
    val context: Context,
    val objectDetectorListener: DetectorListener?
) {

    // For this example this needs to be a var so it can be reset on changes. If the
    // ObjectDetector will not change, a lazy val would be preferable.
    private var objectDetector: ObjectDetector? = null

    init {
        setupObjectDetector()
    }

    fun clearObjectDetector() {
        objectDetector = null
    }

    // Initialize the object detector using current settings on the
    // thread that is using it. CPU and NNAPI delegates can be used with detectors
    // that are created on the main thread and used on a background thread, but
    // the GPU delegate needs to be used on the thread that initialized the detector.
    fun setupObjectDetector() {
        // Create the base options for the detector using the specified max results and score threshold
        val optionsBuilder =
            ObjectDetector.ObjectDetectorOptions.builder()
                .setScoreThreshold(threshold)
                .setMaxResults(maxResults)

        // Set general detection options, including the number of threads to use
        val baseOptionsBuilder = BaseOptions.builder().setNumThreads(numThreads)

        // Use the specified hardware for running the model. Default to CPU.
        when (currentDelegate) {
            DELEGATE_CPU -> {
                // Default
            }
            DELEGATE_GPU -> {
                if (CompatibilityList().isDelegateSupportedOnThisDevice) {
                    baseOptionsBuilder.useGpu()
                } else {
                    objectDetectorListener?.onError("GPU is not supported on this device")
                }
            }
            DELEGATE_NNAPI -> {
                baseOptionsBuilder.useNnapi()
            }
        }

        optionsBuilder.setBaseOptions(baseOptionsBuilder.build())

        val modelName =
            when (currentModel) {
                MODEL_MOBILENETV1 -> "mobilenetv1.tflite"
                MODEL_EFFICIENTDETV0 -> "efficientdet-lite0.tflite"
                MODEL_EFFICIENTDETV1 -> "efficientdet-lite1.tflite"
                MODEL_EFFICIENTDETV2 -> "efficientdet-lite2.tflite"
                else -> "mobilenetv1.tflite"
            }

        try {
            objectDetector =
                ObjectDetector.createFromFileAndOptions(context, modelName, optionsBuilder.build())
        } catch (e: IllegalStateException) {
            objectDetectorListener?.onError(
                "Object detector failed to initialize. See error logs for details"
            )
            Log.e("Test", "TFLite failed to load model with error: " + e.message)
        }
    }

    fun detect(image: Bitmap, imageRotation: Int) {
        if (objectDetector == null) {
            setupObjectDetector()
        }

        // Inference time is the difference between the system time at the start and finish of the
        // process
        var inferenceTime = SystemClock.uptimeMillis()

        // Create preprocessor for the image.
        // lite_support#imageprocessor_architecture
        val imageProcessor =
            ImageProcessor.Builder()
                .add(Rot90Op(-imageRotation / 90))
                .build()

        // Preprocess the image and convert it into a TensorImage for detection.
        val tensorImage = imageProcessor.process(TensorImage.fromBitmap(image))

        val results = objectDetector?.detect(tensorImage)
        inferenceTime = SystemClock.uptimeMillis() - inferenceTime
        objectDetectorListener?.onResults(
            results,
            inferenceTime,
            tensorImage.height,
            tensorImage.width)
    }

    interface DetectorListener {
        fun onError(error: String)
        fun onResults(
            results: MutableList<Detection>?,
            inferenceTime: Long,
            imageHeight: Int,
            imageWidth: Int
        )
    }

    companion object {
        const val DELEGATE_CPU = 0
        const val DELEGATE_GPU = 1
        const val DELEGATE_NNAPI = 2

        const val MODEL_MOBILENETV1 = 0
        const val MODEL_EFFICIENTDETV0 = 1
        const val MODEL_EFFICIENTDETV1 = 2
        const val MODEL_EFFICIENTDETV2 = 3
    }
}

OverlayView.kt

package org.tensorflow.lite.examples.objectdetection

import android.content.Context

import android.graphics.Canvas

import android.graphics.Color

import android.graphics.Paint

import android.graphics.Rect

import android.graphics.RectF

import android.util.AttributeSet

import android.view.View

import androidx.core.content.ContextCompat

import java.util.LinkedList

import kotlin.math.max

import org.tensorflow.lite.task.vision.detector.Detection

class OverlayView(context: Context?, attrs: AttributeSet?) : View(context, attrs) {

    private var results: List<Detection> = LinkedList<Detection>()
    private var boxPaint = Paint()
    private var textBackgroundPaint = Paint()
    private var textPaint = Paint()
    private var scaleFactor: Float = 1f
    private var bounds = Rect()

    init {
        initPaints()
    }

    fun clear() {
        textPaint.reset()
        textBackgroundPaint.reset()
        boxPaint.reset()
        invalidate()
        initPaints()
    }

    private fun initPaints() {
        textBackgroundPaint.color = Color.BLACK
        textBackgroundPaint.style = Paint.Style.FILL
        textBackgroundPaint.textSize = 50f

        textPaint.color = Color.WHITE
        textPaint.style = Paint.Style.FILL
        textPaint.textSize = 50f

        boxPaint.color = ContextCompat.getColor(context!!, R.color.bounding_box_color)
        boxPaint.strokeWidth = 8F
        boxPaint.style = Paint.Style.STROKE
    }

    override fun draw(canvas: Canvas) {
        super.draw(canvas)

        for (result in results) {
            val boundingBox = result.boundingBox

            val top = boundingBox.top * scaleFactor
            val bottom = boundingBox.bottom * scaleFactor
            val left = boundingBox.left * scaleFactor
            val right = boundingBox.right * scaleFactor

            // Draw bounding box around detected objects
            val drawableRect = RectF(left, top, right, bottom)
            canvas.drawRect(drawableRect, boxPaint)

            // Create text to display alongside detected objects
            val drawableText =
                result.categories[0].label + " " +
                    String.format("%.2f", result.categories[0].score)

            // Draw rect behind display text
            textBackgroundPaint.getTextBounds(drawableText, 0, drawableText.length, bounds)
            val textWidth = bounds.width()
            val textHeight = bounds.height()
            canvas.drawRect(
                left,
                top,
                left + textWidth + Companion.BOUNDING_RECT_TEXT_PADDING,
                top + textHeight + Companion.BOUNDING_RECT_TEXT_PADDING,
                textBackgroundPaint
            )

            // Draw text for detected object
            canvas.drawText(drawableText, left, top + bounds.height(), textPaint)
        }
    }

    fun setResults(
        detectionResults: MutableList<Detection>,
        imageHeight: Int,
        imageWidth: Int,
    ) {
        results = detectionResults

        // PreviewView is in FILL_START mode. So we need to scale up the bounding box to match
        // the size at which the captured images will be displayed.
        scaleFactor = max(width * 1f / imageWidth, height * 1f / imageHeight)
    }

    companion object {
        private const val BOUNDING_RECT_TEXT_PADDING = 8
    }
}

TFObjectDetectionTest.kt

package org.tensorflow.lite.examples.objectdetection

import android.content.res.AssetManager

import android.graphics.Bitmap

import android.graphics.BitmapFactory

import android.graphics.RectF

import androidx.test.ext.junit.runners.AndroidJUnit4

import androidx.test.platform.app.InstrumentationRegistry

import java.io.InputStream

import org.junit.Assert.assertEquals

import org.junit.Assert.assertNotNull

import org.junit.Assert.assertTrue

import org.junit.Test

import org.junit.runner.RunWith

import org.tensorflow.lite.support.label.Category

import org.tensorflow.lite.task.vision.detector.Detection

@RunWith(AndroidJUnit4::class)

class TFObjectDetectionTest {

    val controlResults = listOf<Detection>(
        Detection.create(RectF(69.0f, 58.0f, 227.0f, 171.0f),
            listOf<Category>(Category.create("cat", "cat", 0.77734375f))),
        Detection.create(RectF(13.0f, 6.0f, 283.0f, 215.0f),
            listOf<Category>(Category.create("couch", "couch", 0.5859375f))),
        Detection.create(RectF(45.0f, 27.0f, 257.0f, 184.0f),
            listOf<Category>(Category.create("chair", "chair", 0.55078125f)))
    )

    @Test
    @Throws(Exception::class)
    fun detectionResultsShouldNotChange() {
        val objectDetectorHelper =
            ObjectDetectorHelper(
                context = InstrumentationRegistry.getInstrumentation().context,
                objectDetectorListener =
                    object : ObjectDetectorHelper.DetectorListener {
                        override fun onError(error: String) {
                            // no op
                        }

                        override fun onResults(
                            results: MutableList<Detection>?,
                            inferenceTime: Long,
                            imageHeight: Int,
                            imageWidth: Int
                        ) {
                            assertEquals(controlResults.size, results!!.size)

                            // Loop through the detected and control data
                            for (i in controlResults.indices) {
                                // Verify that the bounding boxes are the same
                                assertEquals(results[i].boundingBox, controlResults[i].boundingBox)

                                // Verify that the detected data and control
                                // data have the same number of categories
                                assertEquals(
                                    results[i].categories.size,
                                    controlResults[i].categories.size
                                )

                                // Loop through the categories
                                for (j in 0 until controlResults[i].categories.size - 1) {
                                    // Verify that the labels are consistent
                                    assertEquals(
                                        results[i].categories[j].label,
                                        controlResults[i].categories[j].label
                                    )
                                }
                            }
                        }
                    }
            )

        // Create Bitmap and convert to TensorImage
        val bitmap = loadImage("cat1.png")

        // Run the object detector on the sample image
        objectDetectorHelper.detect(bitmap!!, 0)
    }

    @Test
    @Throws(Exception::class)
    fun detectedImageIsScaledWithinModelDimens() {
        val objectDetectorHelper =
            ObjectDetectorHelper(
                context = InstrumentationRegistry.getInstrumentation().context,
                objectDetectorListener =
                    object : ObjectDetectorHelper.DetectorListener {
                        override fun onError(error: String) {}

                        override fun onResults(
                            results: MutableList<Detection>?,
                            inferenceTime: Long,
                            imageHeight: Int,
                            imageWidth: Int
                        ) {
                            assertNotNull(results)
                            for (result in results!!) {
                                assertTrue(result.boundingBox.top <= imageHeight)
                                assertTrue(result.boundingBox.bottom <= imageHeight)
                                assertTrue(result.boundingBox.left <= imageWidth)
                                assertTrue(result.boundingBox.right <= imageWidth)
                            }
                        }
                    }
            )

        // Create Bitmap and convert to TensorImage
        val bitmap = loadImage("cat1.png")

        // Run the object detector on the sample image
        objectDetectorHelper.detect(bitmap!!, 0)
    }

    @Throws(Exception::class)
    private fun loadImage(fileName: String): Bitmap? {
        val assetManager: AssetManager =
            InstrumentationRegistry.getInstrumentation().context.assets
        val inputStream: InputStream = assetManager.open(fileName)
        return BitmapFactory.decodeStream(inputStream)
    }
}

Activity_main.xml

<?xml version="1.0" encoding="utf-8"?>

<androidx.coordinatorlayout.widget.CoordinatorLayout

xmlns:android="http://schemas.android.com/apk/res/android"

xmlns:tools="http://schemas.android.com/tools"

xmlns:app="http://schemas.android.com/apk/res-auto"

android:background="@android:color/transparent"

android:layout_width="match_parent"

android:layout_height="match_parent">

<RelativeLayout

android:layout_width="match_parent"

android:layout_height="match_parent"

android:orientation="vertical">

<androidx.fragment.app.FragmentContainerView

android:id="@+id/fragment_container"

android:name="androidx.navigation.fragment.NavHostFragment"

android:layout_width="match_parent"

android:layout_height="match_parent"

android:background="@android:color/transparent"

android:keepScreenOn="true"

app:defaultNavHost="true"

app:navGraph="@navigation/nav_graph"

android:layout_marginTop="?android:attr/actionBarSize"

tools:context=".MainActivity"/>

<androidx.appcompat.widget.Toolbar

android:id="@+id/toolbar"

android:layout_width="match_parent"

android:layout_height="?attr/actionBarSize"

android:layout_alignParentTop="true"

android:background="@color/toolbar_background">

<ImageView

android:layout_width="wrap_content"

android:layout_height="wrap_content"

android:scaleType="fitCenter"

android:src="@drawable/tfl_logo" />

</androidx.appcompat.widget.Toolbar>

</RelativeLayout>

</androidx.coordinatorlayout.widget.CoordinatorLayout>

Fragment_camera.xml

<?xml version="1.0" encoding="utf-8"?>

<androidx.coordinatorlayout.widget.CoordinatorLayout
xmlns:android="http://schemas.android.com/apk/res/android"

xmlns:app="http://schemas.android.com/apk/res-auto"

android:id="@+id/camera_container"

android:layout_width="match_parent"

android:layout_height="match_parent">

<androidx.camera.view.PreviewView

android:id="@+id/view_finder"

android:layout_width="match_parent"

android:layout_height="match_parent"

app:scaleType="fillStart"/>

<org.tensorflow.lite.examples.objectdetection.OverlayView

android:id="@+id/overlay"

android:layout_height="match_parent"

android:layout_width="match_parent" />

<include

android:id="@+id/bottom_sheet_layout"

layout="@layout/info_bottom_sheet" />

</androidx.coordinatorlayout.widget.CoordinatorLayout>

TESTING

6.TESTING

Testing is the process of finding differences between the expected behaviour specified
by the system models and the observed behaviour of the system. Testing plays a critical role in
quality assurance and in ensuring the reliability of development; errors made during development
are reflected in the code, so the application should be thoroughly tested and validated.

Unit testing finds the differences between the object design model and its
corresponding components. Structural testing finds differences between the system design
model and a subset of integrated subsystems. Functional testing finds differences between the
use case model and the system. Finally, performance testing finds differences between non-
functional requirements and actual system performance. From modelling point of view,
testing is the attempt of falsification of the system with respect to the system models. The
goal of testing is to design tests that exercise defects in the system and to reveal problems.

6.1 Testing Activities:

Testing a large system is a complex activity and, like any complex activity, it has to be
broken into smaller activities. Thus, incremental testing was performed on the project, i.e.,
components and subsystems of the system were tested separately before being integrated to
form the complete system for system testing.

6.2 Testing Types:

6.2.1 Unit Testing:

Unit testing focuses on the building blocks of the software system, that is, the objects
and subsystems. There are three motivations behind focusing on components. First, unit
testing reduces the complexity of the overall test activities, allowing focus on smaller units of
the system. Second, unit testing makes it easier to pinpoint and correct faults, given that only a
few components are involved in each test. Third, unit testing allows parallelism in the testing
activities, that is, each component can be tested independently of one another. The following
are some unit testing techniques.

 Equivalence Testing: It is a black box testing technique that minimizes the number of
test cases. The possible inputs are partitioned into equivalence classes and a test case
is selected for each class.
 Boundary Testing: It is a special case of equivalence testing and focuses on the
conditions at the boundary of the equivalence classes. Boundary testing requires that
the elements be selected from the edges of the equivalence classes.
 Path Testing: It is a white box testing technique that identifies faults in the
implementation of the component. The assumption here is that, by exercising all possible
paths through the code at least once, most faults will trigger a failure. This requires
knowledge of the source code. A small boundary-test sketch for the detection threshold
follows this list.
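As an illustration of equivalence and boundary testing applied to this app, consider the detection
threshold, which must lie in the range [0, 1]. The validator isValidThreshold below is a
hypothetical helper used only to demonstrate the techniques; the app itself constrains the
threshold through its settings UI. The test uses plain JUnit assertions:

import org.junit.Assert.assertFalse
import org.junit.Assert.assertTrue
import org.junit.Test

// Hypothetical validator for the detection threshold, used only for illustration.
fun isValidThreshold(value: Float): Boolean = value in 0f..1f

class ThresholdBoundaryTest {
    @Test
    fun boundaryValuesOfThreshold() {
        assertTrue(isValidThreshold(0f))      // lower boundary of the valid class
        assertTrue(isValidThreshold(1f))      // upper boundary of the valid class
        assertTrue(isValidThreshold(0.5f))    // representative of the valid equivalence class
        assertFalse(isValidThreshold(-0.1f))  // representative of the invalid class below
        assertFalse(isValidThreshold(1.1f))   // representative of the invalid class above
    }
}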

6.2.2 Integration Testing:

Integration testing detects faults that have not been detected during unit testing, by
focusing on small groups of components. Two or more components are integrated and tested,
and once the tests do not reveal any new faults, additional components are added to the group.
This procedure allows testing of increasingly complex parts of the system while keeping
the location of potential faults relatively small. I have used the following approach to
implement integration testing.

The top-down testing strategy unit tests the components of the top layer and then
integrates the components of the next layer down. When all components of the new layer
have been tested together, the next layer is selected. This was repeated until all layers were
combined and involved in the test.

6.2.3 Validation Testing:

Once the system is completely assembled as a package and the interfacing errors have been
uncovered and corrected, a final series of software tests, called validation testing, is performed.
Validation succeeds when the system functions in a manner that can reasonably be expected by
the customer. The system validation was done using a series of black-box test methods.

6.2.4 System Testing:

 System testing ensures that the complete system complies with the functional and
non-functional requirements of the system. The following are some system testing techniques:
 Functional testing finds differences between the functional requirements and the
system. This is a black box testing technique. Test cases are derived from the use case
model.
 Performance testing finds differences between the design goals and the actual system
performance.

6.3 Various Types of Testing:

6.3.1 White-Box Testing:

To ensure the internal functionality of the Blind Assistance Android App


operates according to specifications, white-box testing was conducted on its core
components. The testing began with the checkCameraPermission() function to verify
correct handling of Android runtime permissions for camera access. Next, the camera
initialization logic in setUpCamera() was tested to confirm that the app correctly sets
up CameraX and streams live frames. The detect() method in ObjectDetectorHelper
was thoroughly evaluated to ensure it preprocesses images accurately, including
rotation correction, and runs the TensorFlow Lite model correctly with the configured
settings. The filtering logic applying the confidence threshold was tested to verify it
excludes low-confidence detections. The Text-to-Speech module was tested by
passing known object labels to confirm clear, timely, and accurate speech
announcements. Finally, UI components such as the OverlayView’s setResults()
method were tested to ensure bounding boxes are drawn in the correct positions and
scaled appropriately on the camera preview. This rigorous internal testing ensures the
app reliably captures input, processes it, and provides real-time guidance without
crashes or logical errors.

Fig 6.3.1 White Box Testing Diagram

6.3.2 Black-Box Testing:

Black-Box Testing, also called Behavioural Testing, was applied to verify the
functionality of the Blind Assistance App without inspecting its internal code. This
method involved providing inputs such as launching the app, adjusting detection
settings, moving various objects in front of the camera, and observing the outputs
(voice announcements and bounding box overlays). Black-box testing focused on the
following areas:

 Correct detection of supported object classes (e.g., walls, curbs, chairs, people).
 User interface behavior when permissions are granted or denied.
 Handling of camera permission errors or unavailable camera hardware.
 Real-time audio feedback accuracy and clarity.
 UI responsiveness when adjusting detection threshold, max results, and delegate
options.

Testers verified that the app correctly announces detected objects and displays bounding
boxes as the user moves around. They also checked that adjusting detection settings updates
the detection behavior immediately. These tests ensure the app behaves correctly from the
user’s perspective under various real-world scenarios.

Fig 6.3.2 Black Box Testing Diagram

6.4 Test Plan:

The test plan for the Blind Assistance App focuses on validating the app’s
functionality, performance, security, and usability across the entire detection workflow. Input-
handling tests confirm that the app correctly manages camera permissions, responds
gracefully to permission denials, and displays appropriate error dialogs.

Functional accuracy tests involve presenting known objects (e.g., chairs, doors,
potholes) to the camera and verifying that detections are accurate and announced clearly via
text-to-speech. Error-handling tests assess the app’s behavior during conditions like camera
hardware failure, low-light scenarios, or corrupted camera frames, ensuring the system
provides meaningful feedback or recovers smoothly.

Performance testing measures frame processing time and text-to-speech delays to


ensure the system meets real-time constraints. Integration tests verify smooth operation
across modules, from camera capture to object detection, overlay rendering, and speech
output. Security testing checks that the app handles sensitive permissions safely and does not
store or misuse captured images.

Usability testing evaluates the clarity of the interface, intuitiveness of settings


adjustments, and the responsiveness of the app to user actions. This comprehensive plan
ensures the Blind Assistance App is accurate, reliable, user-friendly, and secure in delivering
real-time guidance to visually impaired users.

The plan also covers UI interactions for adjusting detection settings, and robust error handling
in scenarios such as permission denial or device disconnection. The testing strategy involves
unit testing for individual components, integration testing for module interactions, system
testing for end-to-end workflows, regression testing to ensure new changes do not break existing
features, and usability testing to confirm an intuitive experience for visually impaired users.

A detailed schedule aligns testing activities with development milestones, while


responsibilities are assigned among developers, testers, and project leads to ensure
accountability. Deliverables include comprehensive test cases, bug reports, and final test
summary reports, all aimed at guaranteeing the app’s stability, performance, and accessibility
before deployment.

6.5 Test Cases:

Test Case for Camera Permission Request

Test Case ID TC001


Pre-requisites App installed on device
Action Open app without granting camera
permission
Expected Result System requests camera permission and
displays prompt
Test Result Pass

Table 6.5(a) Test Case for Camera Permission Request

Test Case ID TC002


Pre-requisites Camera permission granted
Action Open app and start detection
Expected Result Live camera preview starts without delay
Test Result Pass

Table 6.5(b) Test Case for Camera Preview Start

Test Case ID TC003


Pre-requisites Camera running and object detection
initialized
Action Show supported object (e.g., chair, person)
in camera view
Expected Result Bounding box appears and object label is
displayed
Test Result Pass

Table 6.5(c) Test Case for Object Detection

Test Case for Audio Feedback

Test Case ID TC004


Pre-requisites Object detected with label
Action Wait for detection and listen to app's speech
Expected Result Detected object label is announced clearly
through text-to-speech
Test Result Pass

Table 6.5(d) Test Case for Audio Feedback

Test Case for Detection Threshold Adjustment

Test Case ID TC005


Pre-requisites App running on detection screen
Action Increase or decrease detection threshold
using UI buttons
Expected Result System updates detection threshold
immediately without restarting detection
Test Result Pass

Table 6.5(e) Test Case for Detection Threshold Adjustment Control Feature

Test Case for Changing Delegate (CPU/GPU/NNAPI)

Test Case ID TC006


Pre-requisites App running on detection screen
Action Select a different hardware delegate from
dropdown
Expected Result System switches to selected delegate and
updates detection behavior
Test Result Pass

Table 6.5(f) Test Case for Changing Delegate

Test Case for Continuous Detection Loop

Test Case ID TC007


Pre-requisites Camera and detection active
Action Move various objects in and out of camera
view continuously
Expected Result System consistently detects and announces
objects in real time
Test Result Pass

Table 6.5(g) Test Case for Continuous Detection

Test Case for Error Handling on Camera Unavailable

Test Case ID TC008


Pre-requisites Camera hardware disabled or in use by
another app
Action Open Blind Assistance App
Expected Result App displays appropriate error message and
exits gracefully
Test Result Pass

Table 6.5(h) Test Case for Error Handling on Camera

Screens

7.OUTPUT SCREENS

Fig 7.1 App in Android Device

After opening the Android app, it asks for permission to access the camera of the device.

Fig 7.2 Permission to access Camera of Android Device

After permission to access the camera of the Android device is granted, the app navigates to the
start page. If the maximum results setting is 3, the app can detect two objects at a time.

Fig 7.3 Start Page Fig 7.4 Detect Objects two at a time

According to Fig 7.3, the output shows the starting page of the Android application.

According to Fig 7.4, the output shows the feature of detecting more than one object at a time.

Changing the maximum results to one makes the app detect objects one at a time.

Changing the threshold to more than +0.50 increases the accuracy of the detected objects, but
the maximum number of detected objects decreases.

Fig 7.5 Detecting Object one at a time Fig 7.6 Increased Threshold to +0.60

According to Fig 7.5, the Android application detects only one object at a time.

According to Fig 7.6, the user increases the threshold for better accuracy.

The app also shows how much time was taken to detect the object (the inference time).

Fig 7.7(a) Setting Page

Fig 7.7(b) Inference Time

Fig 7.7(c) Project Page Fig 7.7(d) Project Page

CONCLUSION

8.CONCLUSION

The Blind Assistance Android App was successfully developed and tested to provide
real-time obstacle detection and voice guidance for visually impaired users. By leveraging
advanced computer vision techniques with TensorFlow Lite models, the app accurately
identifies objects such as walls, curbs, potholes, chairs, doors, and people directly from the
smartphone’s camera feed. The detected objects are clearly announced through text-to-
speech, enabling users to navigate unfamiliar or cluttered environments with greater safety
and independence. The implementation of an intuitive interface with adjustable detection
settings allows users or their caregivers to fine-tune the system’s sensitivity and performance
to their needs. The system achieved low-latency performance on mobile devices,
demonstrating its suitability for real-world deployment without requiring specialized
hardware. Comprehensive validation and testing, including both white-box and black-box
techniques, confirmed the app’s reliability, responsiveness, and accuracy under different
environmental conditions. Overall, the Blind Assistance App demonstrates a practical and
affordable solution that combines computer vision, mobile development, and assistive
technology to improve the quality of life and autonomy for visually impaired individuals.

REFERENCES

9.REFERENCES

9.1 Academic Papers and Journals:

TensorFlow Lite Object Detection API.


https://www.tensorflow.org/lite/models/object_detection/overview

Provides lightweight deep learning models optimized for mobile and edge devices.

Android CameraX Jetpack Library. https://developer.android.com/training/camerax

Official documentation for Android’s CameraX API used for capturing live video frames.

Android Developers Guide - Text-to-Speech.


https://developer.android.com/reference/android/speech/tts/TextToSpeech

Official guide to implement text-to-speech functionality on Android.

EfficientDet: Scalable and Efficient Object Detection. Mingxing Tan, Ruoming Pang, Quoc
V. Le. arXiv preprint arXiv:1911.09070 (2019).

Paper describing the EfficientDet architecture used in mobile object detection.

MobileNet SSD: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
Applications. Andrew G. Howard et al. arXiv preprint arXiv:1704.04861 (2017).

Introduces the MobileNet architecture optimized for lightweight detection on mobile devices.

Android Jetpack Navigation Component. https://developer.android.com/guide/navigation

Documentation for Android Navigation framework used in the app for fragment transitions.

Google ML Kit and TensorFlow Lite Model Maker. https://developers.google.com/ml-kit

Used as a reference for on-device ML model conversion and optimization techniques.

CameraX Use Cases: Preview, ImageAnalysis, ImageCapture.


https://developer.android.com/reference/androidx/camera/core/package-summary

Describes the CameraX use cases implemented for live video processing.

FUTURE SCOPE

10.FUTURE SCOPE

The Blind Assistance Android App lays the groundwork for providing real-time
obstacle detection and voice guidance to visually impaired users, and there are several
promising directions for future development. Integrating depth estimation techniques would
allow the app to provide precise distance information to obstacles, greatly enhancing user
safety by announcing how far away hazards are. Support for specialized Edge AI hardware,
such as Google Coral or dedicated DSPs, could significantly improve performance and
reduce battery consumption during prolonged use. Adding offline navigation and indoor
localization features would enable users to receive turn-by-turn assistance both outdoors and
inside buildings. Upgrading the detection module to perform semantic segmentation could
give users a better understanding of complex scenes by recognizing regions like crosswalks
or uneven pavements. Incorporating voice command functionality would allow users to
control the app completely hands-free, further improving accessibility. Cloud connectivity
could enable real-time sharing of obstacle alerts or location updates with caregivers for
additional security. Expanding the training dataset to include more specific obstacles, such as
escalators, bicycles, or temporary construction hazards, would make the app more versatile in
diverse environments. Multilingual support in audio feedback would allow the app to cater to
a global audience. Augmented reality overlays could assist partially sighted users by
highlighting detected objects visually, and further optimizations in processing would extend
battery life, ensuring the app remains practical for all-day use.

APPENDIX

11.APPENDIX
11.1 List of Figures:

Figure No. Figure Name Page No.


Fig 3.4.1.2 Use Case Diagram 23
Fig 3.4.2(a) Sequence Diagram of User 27
Fig 3.4.2(b) Sequence Diagram of System 28
Fig 3.4.3 Activity Diagram 31
Fig 4.3 System’s Block Diagram 35
Fig 4.5 SSD MobileNet-v2 37
Fig 6.3.1 White Box Testing Diagram 59
Fig 6.3.2 Black Box Testing Diagram 60
Fig 7.1 App in Android Device 65
Fig 7.2 Permission to access Camera of Android Device 65
Fig 7.3 Start Page 66
Fig 7.4 Detect Objects two at a time 66
Fig 7.5 Detecting Object one at a time 67
Fig 7.6 Increased Threshold to +0.60 67
Fig 7.7(a) Setting Page 68
Fig 7.7(b) Inference Time 68
Fig 7.7(c) Project Page 69
Fig 7.7(d) Project Page 69

11.2 List of Tables:

Table No. Table Name Page No.

Table 3.4.1.2(a) Start Camera 23
Table 3.4.1.2(b) Detect Objects 24
Table 3.4.1.2(c) Provide Audio Feedback 24
Table 3.4.1.2(d) Adjust Detection Settings 25
Table 3.4.1.2(e) View Detection Results 25
Table 6.5(a) Test Case for Camera Permission Request 62
Table 6.5(b) Test Case for Camera Preview Start 62
Table 6.5(c) Test Case for Object Detection 62
Table 6.5(d) Test Case for Audio Feedback 63
Table 6.5(e) Test Case for Detection Threshold Adjustment Control Feature 63
Table 6.5(f) Test Case for Changing Delegate 63
Table 6.5(g) Test Case for Continuous Detection 64
Table 6.5(h) Test Case for Error Handling on Camera 64

