Blind Assistance System
RUSHIKONDA, VISAKHAPATNAM-45
VISION
MISSION
“Unfold into a world class organization with strong academic and research base
producing responsible citizens to cater to the changing needs of the society”
BLIND ASSISTANCE SYSTEM USING
MACHINE LEARNING
Submitted by
Associate Professor
RUSHIKONDA, VISAKHAPATNAM-45
2023-2025
GAYATRI VIDYA PARISHAD
RUSHIKONDA, VISAKHAPATNAM-45
CERTIFICATE
External Examiner
DECLARATION
PG232406017
ACKNOWLEDGEMENT
I consider it a privilege to thank all those who helped me complete the project
“Blind Assistance System Using Machine Learning” successfully.
ABSTRACT
Once these objects are detected, the app provides immediate voice
feedback, announcing the objects aloud through text-to-speech technology so
the user can be aware of obstacles and surroundings without needing to see
them. The object detection system is powered by TensorFlow Lite, which allows
the app to run models efficiently directly on the mobile device without requiring
an internet connection.
The app processes live camera frames using the CameraX library and
performs object detection with customizable parameters such as detection
threshold, maximum number of detected objects, and the number of processing
threads. This makes it adaptable to various device capabilities and user
preferences, balancing speed and accuracy. Blind Assist’s interface includes
adjustable settings to increase or decrease the detection confidence threshold (to
show only more certain detections) and to switch between supported hardware
delegates (CPU, GPU, or NNAPI) for optimized performance on different
devices. By integrating voice commands and intuitive controls, the app ensures
ease of use for visually impaired users.
TABLE OF CONTENTS
CHAPTER Page No
1. INTRODUCTION
2. LITERATURE SURVEY
2.1. Introduction 12
2.4.1. Objectives 15
2.6.1. Software Requirements 18
3. UML MODELING
4. SYSTEM DESIGN
4.5. Algorithm 37
5. CODING
6. TESTING
7. OUTPUT SCREENS 65
8. CONCLUSION 70
9. REFERENCES 71
10. FUTURE SCOPE 72
11. APPENDIX
1. INTRODUCTION
The app interface provides adjustable controls, allowing users to set detection
thresholds (to filter out low-confidence results), configure the maximum number of detected
objects, and choose the number of threads used for processing—balancing speed and device
performance. It supports multiple hardware delegates (CPU, GPU, NNAPI) to take advantage
of each device’s processing power and ensures smooth operation on a wide range of Android
devices.
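As a rough sketch of how these parameters map onto the TensorFlow Lite Task Library that the app uses, the snippet below builds detector options from a threshold, a result limit, a thread count, and a delegate choice; the default values and the helper name are illustrative assumptions, not settings taken from the project.

import org.tensorflow.lite.task.core.BaseOptions
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Sketch: build detector options from user-adjustable settings (values are illustrative).
fun buildDetectorOptions(
    threshold: Float = 0.5f,   // minimum confidence required to report a detection
    maxResults: Int = 3,       // cap on objects reported per frame
    numThreads: Int = 2,       // CPU threads used for inference
    useGpu: Boolean = false    // switch to the GPU delegate when supported
): ObjectDetector.ObjectDetectorOptions {
    val baseOptions = BaseOptions.builder().setNumThreads(numThreads)
    if (useGpu) baseOptions.useGpu()   // an NNAPI build would call useNnapi() instead
    return ObjectDetector.ObjectDetectorOptions.builder()
        .setScoreThreshold(threshold)
        .setMaxResults(maxResults)
        .setBaseOptions(baseOptions.build())
        .build()
}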
To customize the object detection for the user’s environment, the app can integrate a
custom-trained model. This involves collecting and labeling images of relevant objects (e.g.,
walls, potholes, curbs), training the dataset using TensorFlow’s object detection API, and
converting the trained model into a lightweight TensorFlow Lite format. The final TFLite
model can then replace the default model in the app, enabling it to recognize specific
obstacles or landmarks important for visually impaired navigation.
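A minimal sketch of swapping in such a custom model is shown below; the asset file name is a placeholder, and the only real requirement is that the .tflite file be bundled in the app's assets folder.

import android.content.Context
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Sketch: load a custom-trained TFLite model bundled with the app (file name is hypothetical).
fun createCustomDetector(context: Context): ObjectDetector {
    val customModel = "blind_assist_custom.tflite"
    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setScoreThreshold(0.5f)
        .setMaxResults(3)
        .build()
    // createFromFileAndOptions reads the model from the APK's assets directory.
    return ObjectDetector.createFromFileAndOptions(context, customModel, options)
}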
Unlike traditional, costly hardware solutions for the blind, Blind Assist runs entirely
on a smartphone without needing internet access after setup, making it affordable, portable,
and accessible. By providing timely voice alerts, it reduces the risk of accidents and improves
confidence and independence for users moving through unfamiliar or hazardous
environments. The app’s customizable parameters also make it flexible for different users’
needs, whether they prefer faster detection or higher accuracy. This project demonstrates how
modern computer vision, deep learning, and mobile technology can create practical, real-
world solutions that directly enhance the quality of life for people with disabilities.
In the Blind Assist project, AI is used to make the environment more understandable
and navigable for people who are blind or visually impaired. By combining AI-based object
detection models with a smartphone camera, the app can analyze live video frames in real
time. It detects and identifies important objects like walls, curbs, potholes, chairs, people, and
doors. This information is then converted into audio feedback using text-to-speech
technology, allowing users to hear what is around them instantly. The AI model at the heart of
the system has been trained on thousands of labeled images to recognize these objects
accurately even in challenging conditions.
This AI-powered approach provides a practical and affordable solution for visually
impaired individuals, enabling them to navigate unfamiliar environments safely without
relying on others. It also demonstrates the power of AI to improve lives by bridging the gap
between human perception and machine understanding, showing how technology can create
more inclusive societies.
Before we take a look at the details of various machine learning methods, let's start
by looking at what machine learning is, and what it isn't. Machine learning is often
categorized as a subfield of artificial intelligence, but I find that categorization can often
be misleading at first brush. The study of machine learning certainly arose from research
in this context, but in the data science application of machine learning methods, it's more
helpful to think of machine learning as a means of building models of data.
Fundamentally, machine learning involves building mathematical models to help
understand data. "Learning" enters the fray when we give these models tunable
parameters that can be adapted to observed data; in this way the program can be
considered to be "learning" from the data. Once these models have been fit to previously
seen data, they can be used to predict and understand aspects of newly observed data.
I'll leave to the reader the more philosophical digression regarding the extent to
which this type of mathematical, model-based "learning" is similar to the "learning"
exhibited by the human brain. Understanding the problem setting in machine learning is
essential to using these tools effectively, and so we will start with some broad
categorizations of the types of approaches we'll discuss here.
1.2.2 Categories of Machine Learning
At the most fundamental level, machine learning can be categorized into two main
types: supervised learning and unsupervised learning.
Let's understand supervised learning with an example. Suppose we have an input
dataset of cat and dog images. First, we train the machine to recognise the distinguishing
features of the two classes, such as the shape and size of the tail, the shape of the eyes,
the colour, and the height (dogs are generally taller, cats smaller). After training is complete,
we input a picture of a cat and ask the machine to identify the object and predict the output.
Because the machine is now well trained, it checks all the features of the object, such as
height, shape, colour, eyes, ears, and tail, and concludes that it is a cat, so it places it in the
Cat category. This is how a machine identifies objects in supervised learning.
Human beings are, at this moment, the most intelligent and advanced species on
earth because they can think, evaluate, and solve complex problems. AI, on the other hand,
is still at an early stage and has not surpassed human intelligence in many respects. The
question, then, is why we need machines to learn at all. The most compelling reason is
"to make decisions, based on data, with efficiency and scale".
Lack of Specialist Persons - Because ML technology is still in its infancy, finding expert
resources is difficult.
No Clear Objective for Formulating Business Problems - Having no clear objective and
well-defined goal for business problems is another key challenge for ML, because the
technology is not yet that mature.
Issue of Overfitting & Underfitting - If the model is overfitting or underfitting, it cannot
represent the problem well.
Curse of Dimensionality - Another challenge ML models face is data points with too many
features, which can be a real hindrance to training and generalization.
Difficulty in Deployment - The complexity of ML models makes them difficult to deploy
in real-life systems.
Artificial Intelligence (AI) and Machine Learning (ML) encompass a wide range of
techniques and methodologies that enable machines to perform tasks that typically require
human intelligence. These techniques are at the heart of many modern applications, from
voice assistants and chatbots to advanced medical diagnostics and autonomous vehicles.
One of the fundamental techniques in AI is the use of algorithms to process and analyze
data. These algorithms can be broadly categorized into traditional AI algorithms and those
used in machine learning. Traditional AI algorithms include search algorithms, which are
used to navigate through data to find specific information or solutions to problems. Examples
include the A* search algorithm, which is used in pathfinding and graph traversal, and the
minimax algorithm, which is used in decision-making processes, particularly in game theory
and competitive environments.
Machine learning algorithms, on the other hand, are designed to enable systems to learn
from data and improve their performance over time. These algorithms can be divided into
three main types: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning involves training a model on a labeled dataset, meaning that each
training example is paired with an output label. The model learns to map inputs to the correct
output based on this training data. Common algorithms used in supervised learning include
linear regression, logistic regression, support vector machines (SVM), and decision trees.
Neural networks, a more advanced form of supervised learning, are particularly powerful for
tasks involving complex and high-dimensional data, such as image and speech recognition.
Unsupervised learning deals with unlabeled data, meaning the algorithm tries to find
patterns and relationships within the data without any explicit instructions on what to look
for. Clustering algorithms, such as k-means and hierarchical clustering, are used to group
similar data points together. Dimensionality reduction techniques, like principal component
analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), are used to reduce
the number of variables under consideration and to visualize high-dimensional data.
Unsupervised learning is particularly useful for exploratory data analysis and for finding
hidden patterns or intrinsic structures in the data.
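To make the clustering idea concrete, the following is a minimal, self-contained k-means sketch on two-dimensional points. It is a generic illustration of the algorithm described above, not part of the Blind Assist app.

data class Point(val x: Double, val y: Double)

fun distance(a: Point, b: Point): Double {
    val dx = a.x - b.x
    val dy = a.y - b.y
    return Math.sqrt(dx * dx + dy * dy)
}

// Minimal k-means: assign each point to its nearest centroid, then move each centroid
// to the mean of its assigned points, repeating for a fixed number of iterations.
fun kMeans(points: List<Point>, k: Int, iterations: Int = 10): List<Point> {
    var centroids = points.shuffled().take(k)
    repeat(iterations) {
        val clusters = points.groupBy { p ->
            centroids.indices.minByOrNull { i -> distance(p, centroids[i]) }!!
        }
        centroids = centroids.indices.map { i ->
            val members = clusters[i] ?: return@map centroids[i]   // keep empty clusters in place
            Point(members.map { it.x }.average(), members.map { it.y }.average())
        }
    }
    return centroids
}

fun main() {
    val data = listOf(Point(1.0, 1.0), Point(1.2, 0.8), Point(8.0, 8.0), Point(8.3, 7.9))
    println(kMeans(data, k = 2))   // prints two centroids, one near each group
}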
Reinforcement learning involves an agent that learns by interacting with an environment,
receiving rewards or penalties for its actions, so that effective strategies are
discovered through trial and error. Reinforcement learning has been successfully applied to a
wide range of problems, including robotics, game playing, and autonomous driving.
Another critical technique in AI and ML is the use of neural networks, which are
designed to simulate the way the human brain processes information. Neural networks consist
of layers of interconnected nodes, or neurons, that process data in a hierarchical manner. The
most basic form of a neural network is the feedforward neural network, where information
flows in one direction from the input layer to the output layer. More complex architectures,
such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have
been developed to handle specific types of data and tasks. CNNs are particularly effective for
image processing tasks due to their ability to capture spatial hierarchies in images, while
RNNs are suited for sequential data, such as time series or natural language, because they can
maintain information about previous inputs through their recurrent connections.
Deep learning, a subset of machine learning, refers to the use of neural networks with
many layers (hence "deep") to model complex patterns in data. The increased depth of these
networks allows them to learn more abstract and high-level features from raw data, which has
led to significant advancements in fields such as computer vision, natural language
processing, and speech recognition. Training deep learning models typically requires large
amounts of data and computational resources, but the results have been groundbreaking,
enabling the development of AI systems that can outperform humans in certain tasks.
Overall, the techniques in AI and ML are diverse and continually evolving, driven by
advances in computational power, availability of data, and innovative research. These
techniques form the backbone of intelligent systems that are transforming industries and
shaping the future of technology.
Artificial Intelligence (AI) and Machine Learning (ML) have profoundly impacted
numerous sectors, revolutionizing how we approach problem-solving and decision-making.
Their influence extends across industries such as healthcare, finance, transportation, and
entertainment, driving significant advancements and efficiencies. In healthcare, AI and ML
are transforming diagnostic and treatment processes. AI-driven diagnostic tools can analyze
medical images with remarkable accuracy, often outperforming human radiologists in
detecting diseases like cancer. Machine learning models predict patient outcomes and
recommend personalized treatment plans based on vast amounts of historical data. These
technologies are also accelerating drug discovery by identifying potential drug candidates and
predicting their efficacy, significantly reducing the time and cost involved in bringing new
medications to market. Moreover, AI-powered wearable devices continuously monitor
patients' vital signs, enabling early detection of health issues and timely medical
interventions.
The financial sector has also witnessed substantial changes due to AI and ML. These
technologies enhance fraud detection by identifying unusual patterns in transaction data,
thereby preventing fraudulent activities. In trading, machine learning algorithms analyze
market data to forecast stock prices and inform investment strategies, often executing trades
at high speeds and with greater precision than human traders. Furthermore, AI-driven
chatbots and virtual assistants provide personalized customer service, handling routine
inquiries and transactions, which frees up human agents for more complex tasks.
The impact of AI and ML extends beyond these sectors, influencing areas such as
education, agriculture, and environmental conservation. In education, adaptive learning
platforms use machine learning to tailor educational content to students' individual learning
styles and paces, improving learning outcomes. In agriculture, AI-powered systems monitor
crop health, optimize irrigation, and predict yields, enhancing productivity and sustainability.
Environmental conservation efforts benefit from AI's ability to analyze data from sensors and
satellite images, tracking wildlife populations and detecting illegal activities like poaching
and deforestation.
Looking to the future, AI and ML are poised to drive further innovations and societal
changes. One key area of development is the advancement of explainable AI (XAI), which
aims to make AI systems more transparent and understandable to humans. As AI systems
become more complex, ensuring that their decision-making processes are interpretable and
trustworthy is crucial, particularly in high-stakes domains like healthcare and finance.
Another promising direction is the integration of AI with the Internet of Things (IoT). IoT
devices generate vast amounts of data, and AI can analyze this data to derive insights and
make intelligent decisions in real-time.
AI and ML will also play a significant role in addressing global challenges such as
climate change and pandemics. Machine learning models can predict climate patterns,
optimize renewable energy sources, and improve disaster response strategies. In public
health, AI can assist in monitoring disease outbreaks, developing vaccines, and managing
healthcare resources more effectively. Ethical considerations and regulatory frameworks will
be critical as AI and ML continue to evolve. Ensuring that these technologies are developed
and deployed responsibly, with attention to issues such as bias, privacy, and job displacement,
will be essential to maximizing their benefits while mitigating potential risks. Collaborative
efforts between governments, industry, and academia will be necessary to create policies and
standards that promote ethical AI development and use.
Artificial Intelligence (AI) and Machine Learning (ML) have made significant strides in
recent years, but they still face several challenges and limitations. Understanding these issues
is crucial for developing more robust and ethical AI systems.
1. Technical Challenges
Scalability: As datasets grow larger and more complex, scaling ML algorithms
becomes challenging. Training models on massive datasets requires substantial
computational resources and time. Ensuring that algorithms can efficiently handle
large-scale data while maintaining performance is a significant hurdle.
Data Quality and Quantity: The effectiveness of ML models depends heavily on the
quality and quantity of the data. Inadequate or biased data can lead to inaccurate or
skewed predictions. Data preprocessing, cleaning, and augmentation are critical to
ensure that models are trained on high-quality data that accurately represents the
problem domain.
Black-Box Nature: Many advanced ML models, particularly deep learning models,
operate as "black boxes," meaning their internal decision-making processes are not
easily interpretable. This lack of transparency can hinder trust and make it difficult
to understand how decisions are made. Developing techniques for model
interpretability and explainability is essential for ensuring that AI systems are
transparent and their decisions can be understood and justified.
Model Interpretability: For AI systems to be widely accepted and trusted, it is
crucial that their predictions and decision-making processes are interpretable by
humans. Efforts to enhance model interpretability involve creating methods and
tools that provide insights into how models arrive at their conclusions, which is
especially important in high-stakes domains like healthcare and finance.
Job Displacement: The automation of tasks through AI and ML can lead to job
displacement, as machines and algorithms increasingly perform tasks previously
done by humans. Addressing the impact on employment and developing strategies
for workforce retraining and support are important for mitigating the negative
effects of automation.
2. LITERATURE SURVEY
2.1 Introduction:
This project is an advanced Android application named Blind Assist, created to
support visually impaired people by helping them recognize and identify objects around them
in real time using their smartphone’s camera. The app uses TensorFlow Lite, a lightweight
deep learning framework, to run object detection models directly on the mobile device,
allowing fast and offline processing without needing an internet connection.
We developed the app using Android Studio and wrote the code in Kotlin, a modern,
easy-to-read, and powerful programming language designed for Android development. To
capture live video from the camera, we used the CameraX API, which provides reliable and
flexible camera control on Android devices. The detected objects are highlighted with
bounding boxes on the camera preview using a custom overlay view, and the app uses Text-
to-Speech (TTS) to announce the names of detected objects aloud, giving instant audio
feedback to the user.
The application includes a bottom sheet interface that lets users adjust key detection
settings like the confidence threshold, the maximum number of objects to detect, and the
number of threads used for processing, providing customization based on the device’s
performance and the user’s needs. By combining real-time computer vision, voice output, and
a simple user interface, the app offers an effective, accessible tool for blind and visually
impaired users to navigate their surroundings more confidently and independently. The entire
solution runs smoothly on mobile devices, making it a practical and portable assistive
technology.
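A condensed sketch of how these pieces could fit together is given below. It assumes a recent CameraX version in which ImageProxy.toBitmap() is available; the class name is illustrative, and detector construction, frame rotation, and camera binding are omitted for brevity (the full listings appear in the coding chapter).

import android.speech.tts.TextToSpeech
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Sketch: feed CameraX frames to a TFLite detector and speak the top label.
class DetectionAnalyzer(
    private val detector: ObjectDetector,
    private val tts: TextToSpeech
) : ImageAnalysis.Analyzer {

    override fun analyze(image: ImageProxy) {
        // toBitmap() needs a recent CameraX release; older versions require a YUV-to-RGB conversion.
        val bitmap = image.toBitmap()
        val results = detector.detect(TensorImage.fromBitmap(bitmap))
        results.firstOrNull()?.categories?.firstOrNull()?.let { category ->
            // QUEUE_FLUSH replaces any pending announcement so feedback stays current.
            tts.speak(category.label, TextToSpeech.QUEUE_FLUSH, null, "detection")
        }
        image.close()   // release the frame so the camera can deliver the next one
    }
}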
2.2 Existing System:
In the current scenario, visually impaired people face many challenges in identifying objects
around them and navigating unfamiliar environments. Although there are some existing
systems and technologies, they have several limitations:
Traditional Walking Sticks or White Canes
These are the most commonly used tools for the blind. They help in detecting
obstacles directly in front of the user by physical contact, but they cannot recognize
objects at a distance or provide information about what the objects are.
Guide Dogs
Guide dogs are trained to help visually impaired people move around safely.
However, having a guide dog is expensive and requires ongoing maintenance, care,
and training. It is not an affordable or practical solution for everyone.
Mobile Apps Using GPS
Some smartphone apps offer navigation help using GPS, like Google Maps or
specialized apps for blind users. These apps can help with directions but do not detect
obstacles or recognize objects in the user’s immediate surroundings.
Dedicated Electronic Devices
Certain electronic devices like wearable cameras or specialized obstacle detection
gadgets exist, but they are often bulky, costly, and inconvenient to carry daily. Many
of these devices require frequent calibration or charging, and they may not offer real-
time object recognition.
Camera-based Applications Without Real-time Feedback
A few apps can identify objects when a photo is taken, but they process images slowly
or require an internet connection to send photos to a server. This delay in response
makes them less effective for real-time assistance.
Lack of Voice Feedback
Some existing solutions do not include text-to-speech support, forcing users to read
on-screen text, which is not practical for blind people. Even apps with voice feedback
might announce information in an unclear or delayed manner, which reduces their
usefulness.
2.2.1. Disadvantages of Existing Systems
Limited Detection Range
Most existing tools like white canes or guide dogs only help detect obstacles
very close to the user. They cannot identify objects or hazards at a distance,
making it difficult to plan a safe path ahead.
No Object Recognition
Traditional aids cannot recognize or describe what an object is — they only
help avoid it. Visually impaired people still don’t know whether an obstacle is
a chair, a wall, or a person.
High Cost and Maintenance
Devices like smart glasses or guide dogs require a lot of money to buy and
maintain. Many visually impaired people cannot afford these expensive
solutions.
Slow or Delayed Feedback
Some apps or devices that use photos to identify objects send images to
servers for processing, which causes delays in getting results. Real-time
response is often impossible.
Dependence on Internet Connectivity
Many solutions require a constant internet connection for object recognition,
which is unreliable or unavailable in many places, especially outdoors or in
rural areas.
Bulky and Inconvenient Devices
Some electronic aids are large, heavy, or uncomfortable to carry, making them
impractical for daily use.
2.3 Problem Statement:
Visually impaired individuals face numerous difficulties in navigating their
surroundings safely and independently. Traditional mobility aids such as white canes and
guide dogs provide limited assistance—they can help detect obstacles directly in front of the
user but cannot identify or describe the nature of these obstacles or objects beyond the
immediate path. As a result, visually impaired people often remain unaware of important
details about their environment, which can lead to accidents, injuries, or difficulty in finding
essential landmarks like doors or chairs. Although some electronic and wearable assistive
devices exist, they are often prohibitively expensive, complex to use, or heavily reliant on
continuous internet connectivity, which limits their practicality in real-life scenarios.
Moreover, many of these solutions struggle to provide fast and accurate feedback necessary
for real-time navigation, especially in dynamic environments. This highlights a critical need
for an affordable, lightweight, and reliable system that leverages modern computer vision
techniques to detect and identify objects such as walls, curbs, potholes, chairs, doors, and
people in real time. By integrating this technology into a mobile platform and providing
immediate audio feedback through speech, we can significantly improve the confidence,
safety, and independence of visually impaired individuals as they move through their daily
environments.
2.4 Proposed System:
Real-Time Object Detection with Mobile Camera
The proposed system uses an Android smartphone’s camera to continuously capture
live video frames of the user’s surroundings. These frames are analyzed instantly to
detect objects in real time, helping visually impaired users become aware of obstacles
or important items around them.
Offline, On-Device Processing
Unlike many solutions that rely on cloud servers or internet connectivity, this system
performs all object detection directly on the device using TensorFlow Lite. This
ensures the app works reliably even in areas without network coverage and protects
user privacy by not uploading images.
Audio-Based Object Announcements
Once objects like walls, curbs, potholes, chairs, doors, or people are detected, the
system announces them using Text-to-Speech in clear, easy-to-understand language.
This immediate spoken feedback allows visually impaired users to react quickly and
navigate their environment more safely.
Simple and User-Friendly Interface
The app features an intuitive interface with straightforward controls. Users can adjust
detection sensitivity (threshold), choose the maximum number of objects detected at
once, and select the processing hardware (CPU, GPU, or NNAPI) without needing
technical expertise.
Modern Android Architecture
The application is developed using Android Studio with Kotlin, leveraging modern
components like Jetpack libraries, CameraX for reliable camera handling, and
ViewBinding for safer, cleaner code. This ensures the app is maintainable, efficient,
and compatible with a wide range of devices.
Enhanced Safety and Independence
By giving real-time awareness of obstacles and surroundings, the proposed system
aims to empower visually impaired individuals to move around confidently and
independently, reducing their reliance on assistance from others.
2.5 Objectives:
The main objective of this project is to develop a reliable Android application that
helps visually impaired individuals navigate their environment safely and independently
using real-time object detection. By leveraging TensorFlow Lite, the system aims to identify
critical obstacles and common items such as walls, curbs, potholes, chairs, doors, and people
directly on the user’s smartphone without requiring internet connectivity.
Another important objective is to provide immediate audio feedback through Text-to-
Speech so that users can receive timely alerts about detected objects. The app also seeks to
offer a simple, intuitive interface that allows users or caretakers to adjust detection settings,
including the detection threshold, maximum results, and processing hardware.
By achieving these objectives, the proposed system intends to significantly enhance
situational awareness, reduce the risk of accidents, and improve the independence and
confidence of visually impaired users in their daily lives.
2.6 Functional Requirements:
Real-Time Object Detection
The system shall detect objects such as walls, curbs, potholes, chairs, doors, and
people instantly by processing live video frames from the smartphone’s camera. This
ensures the app provides timely alerts to users navigating their environment.
Audio Feedback for Detected Objects
The system shall use Text-to-Speech to announce the labels of detected objects. This
audio feedback will enable visually impaired users to understand their surroundings
without needing to look at the screen.
Adjustable Detection Settings
The app shall allow users to customize detection parameters, such as the minimum
confidence threshold (which affects sensitivity), the maximum number of objects
detected per frame, and the number of threads used for processing. This flexibility lets
users balance accuracy and performance based on their device’s capability.
Hardware Delegate Selection
The app shall let users choose between CPU, GPU, or NNAPI delegates to perform
detection. This gives users the ability to optimize detection speed and battery usage
depending on their hardware.
Visual Overlay of Detections
The system shall draw bounding boxes around detected objects directly on the camera
preview. This provides visual confirmation of what the app has detected, helping users
with partial vision or assisting a sighted helper.
Camera Permission Handling
The app shall check whether the camera permission is granted. If not, it will request
permission from the user before launching the camera, ensuring proper functionality
without manual configuration (a minimal permission-handling sketch follows this list).
Navigation to Camera Screen
The system shall automatically navigate the user from the permission screen to the
camera screen once the required permissions are granted, providing a seamless user
experience.
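A minimal sketch of the permission check and navigation described above, using the AndroidX Activity Result API (class and function names are illustrative):

import android.Manifest
import android.content.pm.PackageManager
import androidx.activity.result.contract.ActivityResultContracts
import androidx.core.content.ContextCompat
import androidx.fragment.app.Fragment

// Sketch: request the camera permission before moving on to the camera screen.
class PermissionsFragment : Fragment() {

    private val requestPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) navigateToCamera() else showPermissionDeniedMessage()
        }

    fun ensureCameraPermission() {
        val alreadyGranted = ContextCompat.checkSelfPermission(
            requireContext(), Manifest.permission.CAMERA
        ) == PackageManager.PERMISSION_GRANTED

        if (alreadyGranted) navigateToCamera()
        else requestPermission.launch(Manifest.permission.CAMERA)
    }

    private fun navigateToCamera() { /* e.g. navigate to the camera fragment via the nav graph */ }
    private fun showPermissionDeniedMessage() { /* e.g. show a Toast or dialog explaining why */ }
}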
2.7 Non-Functional Requirements:
The non-functional requirements outline how the Blind Assistance Android
application should perform, focusing on quality attributes like performance, reliability,
usability, and maintainability. These characteristics ensure that the system is efficient, user-
friendly, and robust enough for visually impaired users to rely on daily.
Performance Requirements
The app must analyse each camera frame and generate object detection
results within 200 milliseconds per frame. Quick processing ensures smooth,
uninterrupted real-time voice feedback to the user as they navigate their environment
(a latency-measurement sketch follows this list).
Reliability and Stability
The application should remain stable during prolonged use, such as when
assisting a user on long walks or in crowded places. It must handle unexpected
conditions (e.g., sudden light changes or temporary camera interruptions) without
crashing.
User Interface Usability
The app interface should use large, high-contrast buttons and clear text
labels, making it easy for visually impaired users to find and activate controls. Voice
prompts and intuitive navigation should guide users effectively throughout the app’s
features.
Device Compatibility
The system must work on Android smartphones and tablets running
Android 8.0 (Oreo) or later versions, ensuring that a wide range of affordable devices
can support the app.
Future Scalability
The architecture should allow for the addition of new detection models,
voice languages, or extra features without requiring a complete rewrite of the app. This
flexibility helps the app adapt to evolving user needs or advancements in object
detection technology.
Security and Privacy
The app must only request essential permissions, such as camera access,
and avoid collecting or storing any user-identifiable data. All processing should occur
locally on the device to protect user privacy, with no images or recordings uploaded to
external servers.
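To check the 200 millisecond per-frame target stated above, inference latency can be measured around each detection call. The helper below is an illustrative sketch; the log tag and budget value are assumptions.

import android.os.SystemClock
import android.util.Log

// Sketch: time one detection call and flag frames that exceed the 200 ms budget.
inline fun <T> timedDetection(block: () -> T): T {
    val start = SystemClock.uptimeMillis()
    val result = block()
    val elapsed = SystemClock.uptimeMillis() - start
    if (elapsed > 200) {
        Log.w("BlindAssistPerf", "Frame took $elapsed ms, above the 200 ms target")
    }
    return result
}

// Usage (illustrative): val results = timedDetection { detector.detect(tensorImage) }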
The hardware requirements ensure that the application runs smoothly on both the
development system and the target Android devices, delivering reliable real-time object
detection and audio feedback.
Development Machine:
Processor: Multi-core processor (Intel i5/Ryzen 5 or higher recommended)
Storage: At least 256 GB of free disk space (SSD recommended for faster project builds and emulation)
Target Android Device:
Storage: At least 1 GB of free internal storage for app installation and caching
Camera: Rear camera with at least 8 MP resolution for clear object capture
Audio: Built-in speaker or connected headphones for audible feedback using Text-to-Speech
3. UML MODELLING
Although UML is generally used to model software systems, it is not limited to that
boundary. It can also be used to model non-software systems, for example the process
flows in a manufacturing unit.
UML is not a programming language, but tools can be used to generate code in
various languages from UML diagrams. UML has a direct relation with object-oriented
analysis and design. After some standardization, UML became an OMG standard.
"A picture is worth a thousand words" - this idiom fits UML perfectly. Object-oriented
concepts were introduced much earlier than UML, and at that time there were no standard
methodologies to organize and consolidate object-oriented development; that is when UML
came into the picture. There are a number of goals for developing UML, but the most
important is to define a general-purpose modelling language that all modellers can use, and
one that is simple to understand and use.
A model is considered complete when both the static and dynamic aspects of a system are
fully covered. Behavioural diagrams capture the dynamic aspect of a system, which can be
described as its changing or moving parts. UML has the following five types of behavioural
diagrams:
Use case diagram
Sequence diagram
Collaboration diagram
State chart diagram
Activity diagram
Flow of events: Sequence of steps describing the function of the use case
Exit Condition: Condition for terminating the use case
Quality Requirements: Requirements that do not belong to the use case but
constrain the functionality of the system
Entry Condition: Camera permission granted
Flow of Events: 1. User initiates camera 2. Live camera feed starts
Exit Condition: Camera preview is active
Quality Requirements: Fast camera startup without delays
Use Case for Adjust Detection Settings:
3.4.2 Sequence Diagram:
A sequence diagram is the most commonly used interaction diagram. It simply depicts
interaction between objects in a sequential order ie. the order in which these interactions take
place. We can also use the terms event diagrams or event scenarios to refer to a sequence
diagram Sequence diagrams describe how and in what order the objects in a system function.
These diagrams are widely used by businessmen and software developers to document and
understand requirements for new and existing systems. Sequence diagrams can be useful
references for businesses and other organizations. Purpose of sequence diagrams are:
See how objects and components interact with each other to complete a process.
1. Lifeline: A lifeline is a vertical dashed line that represents an individual participant
(object or actor) and its existence over the course of the interaction.
2. Messages: Messages are arrows drawn between lifelines that show the communication
(calls, signals, or replies) exchanged between participants, in the order it occurs.
3. Activation Box:
Activation boxes, also known as activation bars or lifeline activations, show the
period of time during which an object or actor is actively engaged in processing a message.
They are drawn as rectangles on the lifeline and indicates the duration of the methods or
operation execution.
4. Focus of Control:
This indicates which object or actor has control and is actively processing the
message at a given point in time. It is represented by a vertical dashed line extending from
the lifeline to the activation box.
5. Return Message:
Return messages show the flow of information back to the sender after the completion
of a method or operation; they are represented by dashed arrows.
Description:
This sequence diagram illustrates the interaction flow in the Blind Assistance Android
app from the moment the user opens the app to logging out. It starts with the user accessing
the homepage, where they log in or register if necessary.
Once logged in, the user can initiate camera-based object detection, adjust settings,
and view detection results with real-time object identification overlaid on the screen.
Throughout the process, the app checks permissions, configures the camera, processes
frames, detects objects, and provides voice feedback.
Finally, the user can log out securely, returning to the login screen. This diagram helps
visualize the main steps of using the app and highlights the key modules and interactions
between user actions and system components.
Description:
This sequence diagram details the process of creating and updating the object
detection model for the Blind Assistance app. It begins with the Admin collecting images of
relevant objects like walls, curbs, potholes, and doors.
The Admin then uses an annotation tool to label these images, creating a dataset
suitable for training. Next, the labeled dataset is fed into a training pipeline, where
TensorFlow is used to train the model, producing an updated TFLite file.
Finally, the trained model is deployed to the Android app, enabling improved object
detection performance for end users. This diagram clarifies the stages involved in preparing
the dataset, training, and updating the model for real-world deployment.
Activity diagrams (also called Activity Charts or Flow Diagrams) depict the flow of
control or data from activity to activity within a system. They capture what the system does,
step-by-step, rather than how objects change state. Activity diagrams can model the behaviour
of a single use case, an operation, a business process, or even an entire workflow that spans
multiple systems.
Activity diagrams are especially valuable when you need to visualise complex
sequences that involve parallel processing, branching, loops, human tasks, or data
transformations.
1. Activities / Actions
Actions are the individual steps of the workflow, drawn as rounded rectangles; an activity
groups a set of related actions.
2. Control Flows
Control flows are directed arrows that connect actions or activities, indicating the
order in which steps are executed. They model the progression of control from one node to
the next once the preceding action completes.
3. Decision and Merge Nodes
A decision node (diamond shape) splits the flow based on boolean expressions or
guard conditions. Outgoing edges are labelled with conditions such as [valid] or [invalid]. A
merge node (also a diamond) brings multiple alternative flows back into a single path.
4. Fork and Join Nodes
A fork node (thick horizontal or vertical bar) divides the flow into concurrent paths
that execute in parallel. A join node synchronises these parallel branches: all incoming flows
must complete before the diagram proceeds. Forks and joins are critical for modelling
multi-threaded or asynchronous behaviour.
5. Swimlanes / Partitions
Swimlanes partition the diagram into vertical or horizontal zones that assign
responsibility for each action to an actor, class, or subsystem. They clarify who or what
performs each activity, improving traceability across organisational or architectural
boundaries.
6. Object Flows
Object flow arrows carry data objects (shown as rectangles with object names)
between actions. They illustrate how information is produced, consumed, or transformed
throughout the workflow.
Initial node (filled black circle): the entry point where control first enters the activity
diagram.
Activity final node (encircled black dot): signifies the end of all flows in the activity.
Flow final node (encircled X): terminates its particular path without stopping other
concurrent flows.
Fig 3.4.3 Activity Diagram
Description:
This activity diagram illustrates the workflow of the Blind Assistance Android app
from the moment the user opens the application until the continuous detection loop. The
process begins with the user launching the app, which immediately checks if camera
permissions have been granted. If permission is granted, the camera preview starts, and the
app begins real-time object detection on the video frames.
Detected objects are processed to estimate their distance, and the app uses text-to-
speech to announce them to the user. Bounding boxes and object labels are displayed on-
screen for reference. If the user adjusts detection settings such as detection threshold or
delegate, the new configuration is applied instantly without restarting the app.
4. SYSTEM DESIGN
Developers define the design goals of the project and decompose the system into smaller
subsystems that can be realized by individual teams. Developers also select strategies for
building the system, such as the hardware/software platform on which the system will run,
the persistent data management strategy, the global control flow, the access control policy, and
the handling of boundary conditions. The result of system design is a model that includes a
clear description of each of these strategies, the subsystem decomposition, and a UML
deployment diagram representing the hardware/software mapping of the system.
Design goals are the qualities that the system should focus on. Many design goals can be
inferred from the non-functional requirements or from the application domain.
User friendly: The system is user friendly because it is easy to use and understand.
Reliability: Proper checks are in place to detect and handle any failures that occur in the system.
The system design of the Blind Assistance App is structured to provide real-time
object detection and audio guidance for visually impaired users. The application is based on a
modular architecture that integrates camera input, object detection processing, and speech
output, all optimized for Android devices using Kotlin.
The system begins with a user-friendly interface where permissions are checked and
camera access is initiated. The CameraX API is used to handle live camera streams
efficiently. Captured video frames are analyzed by the object detection module, which uses
TensorFlow Lite models to identify objects such as walls, curbs, potholes, chairs, doors, and
people. Bounding boxes and object labels are drawn on the live camera feed for on-screen
reference. To convert detections into accessible information, a Text-to-Speech (TTS) engine is
integrated, which reads out the names of detected objects, helping users understand their
surroundings.
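A minimal sketch of this TTS integration is shown below; the class name and the English locale are assumptions rather than project decisions.

import android.content.Context
import android.speech.tts.TextToSpeech
import java.util.Locale

// Sketch: initialise Text-to-Speech once and announce detected object labels.
class Announcer(context: Context) : TextToSpeech.OnInitListener {

    private val tts = TextToSpeech(context, this)
    private var ready = false

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            tts.setLanguage(Locale.US)   // language choice is an assumption
            ready = true
        }
    }

    fun announce(label: String) {
        if (!ready) return
        // QUEUE_FLUSH keeps announcements current instead of letting them queue up.
        tts.speak(label, TextToSpeech.QUEUE_FLUSH, null, "objectLabel")
    }

    fun shutdown() = tts.shutdown()
}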
The system uses an event-driven approach where the detection process and TTS
announcements operate asynchronously, ensuring smooth real-time performance without lag.
The app also includes error handling for permissions and system errors, providing appropriate
messages to the user. Overall, the system design focuses on low-latency processing, high
detection accuracy, and seamless user experience to deliver effective assistance in dynamic
environments.
Fig 4.3 System’s Block Design
Input Design
The input design defines how users interact with the Blind Assistance Android App
and how the system accepts camera data to process. It creates the bridge between the visually
impaired user and the app’s object detection functionality. The input is primarily the live
video feed from the mobile device’s camera, and it must be designed for ease of access, low
latency, and secure permission handling. The goal is to ensure minimal user intervention
while providing an efficient way to start and control detection.
The input design focuses on controlling errors, avoiding unnecessary steps, and
simplifying the process by automatically handling camera permissions and settings
adjustments. It ensures security by checking user permissions before accessing the camera
and protects user privacy by not storing images.
Objectives
Convert user actions (like starting detection) into system operations in the app,
ensuring the process is error-free.
Provide a simple interface where the user can start detection without technical
knowledge.
Validate that the camera feed is functional, with clear error messages guiding the user
when problems occur.
Make the input layout intuitive and fast, minimizing user effort.
Output Design
Output design ensures the app presents detection results effectively to visually impaired
users through both audio and visual feedback. The output includes bounding boxes overlaid
on detected objects and voice announcements of object names and estimated distances. Clear,
concise, and timely output is critical to allow the user to navigate safely. Outputs must be
accurate, immediate, and easily understood to improve user trust and decision-making during
movement.
Outputs include live bounding box overlays on the camera view for sighted helpers
and announcements for the user.
Outputs must be optimized for low-latency delivery, ensuring real-time assistance.
Outputs should convey important information like detected object names and
proximity warnings.
Objectives
Provide actionable information, e.g., “Person ahead,” “Chair nearby,” allowing the user to
make safe decisions.
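The outputs above pair object names with proximity hints. One simple way to approximate proximity from a single camera is to treat larger bounding boxes as nearer objects; the sketch below is an assumption-laden illustration of that idea, not the method actually used by the app.

import org.tensorflow.lite.task.vision.detector.Detection

// Sketch: turn a detection into a short spoken phrase such as "Person very close".
// The proximity buckets are arbitrary assumptions based on how much of the frame the box fills.
fun describeDetection(detection: Detection, frameHeight: Int): String? {
    val label = detection.categories.firstOrNull()?.label ?: return null
    val boxHeightRatio = detection.boundingBox.height() / frameHeight.toFloat()
    val proximity = when {
        boxHeightRatio > 0.6f -> "very close"
        boxHeightRatio > 0.3f -> "nearby"
        else -> "ahead"
    }
    return "$label $proximity"
}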
4.5 Algorithms:
Object detection in this app is implemented using TensorFlow Lite models like
MobileNet SSD or EfficientDet Lite, optimized for Android devices. These models
take live camera frames as input, process them with convolutional neural networks
(CNNs), and output detected objects with their labels and confidence scores.
The object detection algorithm identifies relevant objects such as walls, curbs,
potholes, chairs, doors, and people. Once detected, their bounding boxes are
calculated, scaled to the preview size, and displayed. Detected object labels are
converted to speech output for the user.
Each detection result includes a bounding box (x, y, width, height) and
a confidence score. The app filters detections using a configurable threshold
(e.g., 0.5), ensuring only reliable detections are announced. This threshold
helps reduce false positives and improve usability.
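A small sketch of this filtering step is shown below; the threshold, result limit, and function name are illustrative.

import org.tensorflow.lite.task.vision.detector.Detection

// Sketch: keep only confident detections and announce the most certain ones first.
fun filterForAnnouncement(
    results: List<Detection>,
    threshold: Float = 0.5f,
    maxAnnounced: Int = 3
): List<Detection> =
    results
        .filter { (it.categories.firstOrNull()?.score ?: 0f) >= threshold }
        .sortedByDescending { it.categories.firstOrNull()?.score ?: 0f }
        .take(maxAnnounced)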
EfficientDet Lite uses a BiFPN (Bidirectional Feature Pyramid Network) to combine features
at different scales in the image efficiently. This allows EfficientDet to detect small, medium,
and large objects in a single pass.
Feature fusion with BiFPN: Combines information from different resolutions of the
image better than traditional feature pyramids, leading to improved detection of
objects at multiple scales.
5. CODING
5.1 Coding Approach:
The objective of the coding or programming phase is to translate the design of the
system produced during the design phase into code in a given programming language, which
can be executed by a computer and that performs the computation specified by the design.
The coding phase affects both testing and maintenance. The goal of coding is not to reduce
the implementation cost, but to reduce the cost of later phases. There are two major
approaches for coding any software system: the top-down approach and the bottom-up
approach.
The bottom-up approach best suits the development of object-oriented systems. During
the system design phase, we decompose the system into an appropriate number of subsystems
for which objects can be modelled independently. These objects exhibit the way the
subsystems perform their operations.
Once the objects have been modelled, they are implemented by means of coding. Even
though the objects are related to the same system, they are implemented independently of each
other, which is why the bottom-up approach is more suitable for coding them. In this approach,
we first code the objects independently and then integrate these modules into the system to
which they belong.
Verification ensures that the Blind Assistance App is built correctly according to the
design specifications, while validation checks that the app fulfills its purpose of providing
accurate, real-time obstacle detection and guidance for visually impaired users.
During the development of this system, all code modules related to camera
management, object detection, text-to-speech, and settings controls have been thoroughly
verified through detailed testing of their design, integration, and runtime behavior. Various
techniques were applied during validation, as discussed in the testing phase of the system.
Validations were implemented at two primary levels to ensure correctness and reliability:
Screen Level Validation: Validations of all user interactions, such as permission prompts,
detection setting adjustments, and button presses, are handled at the screen level. The system
raises appropriate error dialogs or messages if the camera permission is denied or if
unsupported settings are selected. This ensures users are guided through resolving issues
before detection starts.
Control Level Validation: Validations are applied directly to individual UI controls, such as
spinners for detection thresholds and delegate selections. If an invalid option is chosen or
required permissions are not granted, the system displays clear dialogs or toasts, helping the
user correct their input. This ensures every control behaves predictably and prevents invalid
configurations that could cause system errors or degraded detection performance. Throughout
the app, real-time runtime validations also ensure that camera input is active and object
detection results are reliable before audio announcements occur, maintaining both safety and
usability.
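As an illustration of control-level validation (separate from the actual code listings that follow), a settings handler might clamp the requested confidence threshold before it ever reaches the detector options; the range and messages below are assumptions.

// Sketch: reject or clamp an out-of-range threshold so invalid input cannot reach TFLite.
fun applyThresholdChange(requested: Float, onInvalid: (String) -> Unit): Float? {
    if (requested.isNaN()) {
        onInvalid("Threshold must be a number between 0.1 and 0.9")
        return null
    }
    // coerceIn keeps the value inside the range exposed by the bottom-sheet controls.
    return requested.coerceIn(0.1f, 0.9f)
}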
MainActivity.kt
package org.tensorflow.lite.examples.objectdetection

import android.os.Build
import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity
import org.tensorflow.lite.examples.objectdetection.databinding.ActivityMainBinding

class MainActivity : AppCompatActivity() {
    private lateinit var activityMainBinding: ActivityMainBinding

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        activityMainBinding = ActivityMainBinding.inflate(layoutInflater)
        setContentView(activityMainBinding.root)
    }

    override fun onBackPressed() {
        // On Android Q, finish with a transition instead of the default back handling
        if (Build.VERSION.SDK_INT == Build.VERSION_CODES.Q) {
            finishAfterTransition()
        } else {
            super.onBackPressed()
        }
    }
}
ObjectDetectorHelper.kt
package org.tensorflow.lite.examples.objectdetection
import android.content.Context
import android.graphics.Bitmap
import android.os.SystemClock
import android.util.Log
import org.tensorflow.lite.gpu.CompatibilityList
import org.tensorflow.lite.support.image.ImageProcessor
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.support.image.ops.Rot90Op
import org.tensorflow.lite.task.core.BaseOptions
import org.tensorflow.lite.task.vision.detector.Detection
import org.tensorflow.lite.task.vision.detector.ObjectDetector
class ObjectDetectorHelper(
    var threshold: Float = 0.5f,
    var numThreads: Int = 2,
    var maxResults: Int = 3,
    var currentDelegate: Int = 0,
    var currentModel: Int = 0,
    val context: Context,
    val objectDetectorListener: DetectorListener?
) {
    // For this example this needs to be a var so it can be reset on changes. If the ObjectDetector
    // will not change, a lazy val would preserve it for a single session of use.
    private var objectDetector: ObjectDetector? = null

    init {
        setupObjectDetector()
    }

    fun clearObjectDetector() {
        objectDetector = null
    }
    // Initialize the object detector using current settings on the
    // thread that is using it. CPU and NNAPI delegates can be used with detectors
    // that are created on the main thread and used on a background thread, but
    // the GPU delegate needs to be used on the thread that initialized the detector
    fun setupObjectDetector() {
        // Create the base options for the detector using specified max results and score threshold
        val optionsBuilder =
            ObjectDetector.ObjectDetectorOptions.builder()
                .setScoreThreshold(threshold)
                .setMaxResults(maxResults)

        val baseOptionsBuilder = BaseOptions.builder().setNumThreads(numThreads)

        // Use the specified hardware for running the model. Default to CPU
        when (currentDelegate) {
            DELEGATE_CPU -> {
                // Default
            }
            DELEGATE_GPU -> {
                if (CompatibilityList().isDelegateSupportedOnThisDevice) {
                    baseOptionsBuilder.useGpu()
                } else {
                    objectDetectorListener?.onError("GPU is not supported on this device")
                }
            }
            DELEGATE_NNAPI -> {
                baseOptionsBuilder.useNnapi()
            }
        }

        optionsBuilder.setBaseOptions(baseOptionsBuilder.build())
        val modelName =
            when (currentModel) {
                MODEL_MOBILENETV1 -> "mobilenetv1.tflite"
                MODEL_EFFICIENTDETV2 -> "efficientdet-lite2.tflite"
                else -> "mobilenetv1.tflite"
            }

        try {
            objectDetector =
                ObjectDetector.createFromFileAndOptions(context, modelName, optionsBuilder.build())
        } catch (e: IllegalStateException) {
            objectDetectorListener?.onError(
                "Object detector failed to initialize. See error logs for details")
        }
    }

    fun detect(image: Bitmap, imageRotation: Int) {
        if (objectDetector == null) {
            setupObjectDetector()
        }
        // Inference time is the difference between the system time at the start and finish of the
        // process
        var inferenceTime = SystemClock.uptimeMillis()

        // Create preprocessor for the image (rotates the frame to match the model input;
        // see the TFLite support library docs, lite_support#imageprocessor_architecture)
        val imageProcessor =
            ImageProcessor.Builder()
                .add(Rot90Op(-imageRotation / 90))
                .build()

        // Preprocess the image and convert it into a TensorImage for detection
        val tensorImage = imageProcessor.process(TensorImage.fromBitmap(image))

        val results = objectDetector?.detect(tensorImage)
        inferenceTime = SystemClock.uptimeMillis() - inferenceTime

        objectDetectorListener?.onResults(
            results,
            inferenceTime,
            tensorImage.height,
            tensorImage.width)
    }
    interface DetectorListener {
        fun onError(error: String)
        fun onResults(
            results: MutableList<Detection>?,
            inferenceTime: Long,
            imageHeight: Int,
            imageWidth: Int
        )
    }

    companion object {
        const val DELEGATE_CPU = 0
        const val DELEGATE_GPU = 1
        const val DELEGATE_NNAPI = 2
        const val MODEL_MOBILENETV1 = 0
        const val MODEL_EFFICIENTDETV2 = 3
    }
}
OverlayView.kt
package org.tensorflow.lite.examples.objectdetection
import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.graphics.Rect
import android.graphics.RectF
import android.util.AttributeSet
import android.view.View
import androidx.core.content.ContextCompat
import java.util.LinkedList
import kotlin.math.max
import org.tensorflow.lite.task.vision.detector.Detection
class OverlayView(context: Context?, attrs: AttributeSet?) : View(context, attrs) {

    private var results: List<Detection> = LinkedList<Detection>()
    private var boxPaint = Paint()
    private var textBackgroundPaint = Paint()
    private var textPaint = Paint()

    private var scaleFactor: Float = 1f
    private var bounds = Rect()

    init {
        initPaints()
    }

    fun clear() {
        textPaint.reset()
        textBackgroundPaint.reset()
        boxPaint.reset()
        invalidate()
        initPaints()
    }

    private fun initPaints() {
        textBackgroundPaint.color = Color.BLACK
        textBackgroundPaint.style = Paint.Style.FILL
        textBackgroundPaint.textSize = 50f

        textPaint.color = Color.WHITE
        textPaint.style = Paint.Style.FILL
        textPaint.textSize = 50f

        boxPaint.color = ContextCompat.getColor(context!!, R.color.bounding_box_color)
        boxPaint.strokeWidth = 8F
        boxPaint.style = Paint.Style.STROKE
    }
    override fun draw(canvas: Canvas) {
        super.draw(canvas)

        for (result in results) {
            // Scale the bounding box from model coordinates to view coordinates
            val box = result.boundingBox
            val drawableRect = RectF(
                box.left * scaleFactor, box.top * scaleFactor,
                box.right * scaleFactor, box.bottom * scaleFactor)
            canvas.drawRect(drawableRect, boxPaint)

            // Label text: category name plus confidence score
            val drawableText = result.categories[0].label + " " +
                String.format("%.2f", result.categories[0].score)

            // Draw a filled rectangle behind the label, then the label itself
            textBackgroundPaint.getTextBounds(drawableText, 0, drawableText.length, bounds)
            canvas.drawRect(
                drawableRect.left,
                drawableRect.top,
                drawableRect.left + bounds.width() + BOUNDING_RECT_TEXT_PADDING,
                drawableRect.top + bounds.height() + BOUNDING_RECT_TEXT_PADDING,
                textBackgroundPaint)
            canvas.drawText(drawableText, drawableRect.left,
                drawableRect.top + bounds.height(), textPaint)
        }
    }
    fun setResults(
        detectionResults: MutableList<Detection>,
        imageHeight: Int,
        imageWidth: Int,
    ) {
        results = detectionResults

        // PreviewView is in FILL_START mode, so scale the boxes up to the size
        // at which the captured frames are displayed on screen
        scaleFactor = max(width * 1f / imageWidth, height * 1f / imageHeight)
    }

    companion object {
        private const val BOUNDING_RECT_TEXT_PADDING = 8
    }
}
TFObjectDetectionTest.kt
package org.tensorflow.lite.examples.objectdetection
import android.content.res.AssetManager
import android.graphics.Bitmap
import android.graphics.BitmapFactory
import android.graphics.RectF
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.platform.app.InstrumentationRegistry
import java.io.InputStream
import org.junit.Assert.assertEquals
import org.junit.Assert.assertNotNull
import org.junit.Assert.assertTrue
import org.junit.Test
import org.junit.runner.RunWith
import org.tensorflow.lite.support.label.Category
import org.tensorflow.lite.task.vision.detector.Detection
@RunWith(AndroidJUnit4::class)
class TFObjectDetectionTest {
@Test
@Throws(Exception::class)
fun detectionResultsShouldNotChange() {
val objectDetectorHelper =
ObjectDetectorHelper(
context = InstrumentationRegistry.getInstrumentation().context,
objectDetectorListener =
object : ObjectDetectorHelper.DetectorListener {
// no op
results: MutableList<Detection>?,
inferenceTime: Long,
imageHeight: Int,
imageWidth: Int
){
assertEquals(controlResults.size, results!!.size)
for (i in controlResults.indices) {
assertEquals(results[i].boundingBox, controlResults[i].boundingBox)
// data have the same number of categories
assertEquals(
results[i].categories.size,
controlResults[i].categories.size
assertEquals(
results[i].categories[j].label,
controlResults[i].categories[j].label
objectDetectorHelper.detect(bitmap!!, 0)
@Test
@Throws(Exception::class)
fun detectedImageIsScaledWithinModelDimens() {
val objectDetectorHelper =
ObjectDetectorHelper(
context = InstrumentationRegistry.getInstrumentation().context,
objectDetectorListener =
object : ObjectDetectorHelper.DetectorListener {
results: MutableList<Detection>?,
inferenceTime: Long,
imageHeight: Int,
imageWidth: Int
){
assertNotNull(results)
// Create Bitmap and convert to TensorImage
objectDetectorHelper.detect(bitmap!!, 0)
@Throws(Exception::class)
InstrumentationRegistry.getInstrumentation().context.assets
return BitmapFactory.decodeStream(inputStream)
Activity_main.xml
<androidx.coordinatorlayout.widget.CoordinatorLayout
xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
xmlns:app="http://schemas.android.com/apk/res-auto"
android:background="@android:color/transparent"
android:layout_width="match_parent"
android:layout_height="match_parent">
<RelativeLayout
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="vertical">
<androidx.fragment.app.FragmentContainerView
android:id="@+id/fragment_container"
android:name="androidx.navigation.fragment.NavHostFragment"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:background="@android:color/transparent"
android:keepScreenOn="true"
app:defaultNavHost="true"
app:navGraph="@navigation/nav_graph"
android:layout_marginTop="?android:attr/actionBarSize"
tools:context=".MainActivity"/>
<androidx.appcompat.widget.Toolbar
android:id="@+id/toolbar"
android:layout_width="match_parent"
android:layout_height="?attr/actionBarSize"
android:layout_alignParentTop="true"
android:background="@color/toolbar_background">
<ImageView
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:scaleType="fitCenter"
android:src="@drawable/tfl_logo" />
</androidx.appcompat.widget.Toolbar>
</RelativeLayout>
</androidx.coordinatorlayout.widget.CoordinatorLayout>
Fragment_camera.xml
<androidx.coordinatorlayout.widget.CoordinatorLayout
xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:app="http://schemas.android.com/apk/res-auto"
android:id="@+id/camera_container"
android:layout_width="match_parent"
android:layout_height="match_parent">
<androidx.camera.view.PreviewView
android:id="@+id/view_finder"
android:layout_width="match_parent"
android:layout_height="match_parent"
app:scaleType="fillStart"/>
<org.tensorflow.lite.examples.objectdetection.OverlayView
android:id="@+id/overlay"
android:layout_height="match_parent"
android:layout_width="match_parent" />
<include
android:id="@+id/bottom_sheet_layout"
layout="@layout/info_bottom_sheet" />
</androidx.coordinatorlayout.widget.CoordinatorLayout>
TESTING
6.TESTING
Testing is the process of finding differences between the expected behaviour specified
by the system models and the observed behaviour of the system. Testing plays a critical role
in quality assurance and in ensuring the reliability of development: errors made during
development are reflected in the code, so the application should be thoroughly tested and validated.
Unit testing finds the differences between the object design model and its
corresponding components. Structural testing finds differences between the system design
model and a subset of integrated subsystems. Functional testing finds differences between the
use case model and the system. Finally, performance testing finds differences between the
non-functional requirements and the actual system performance. From a modelling point of view,
testing is the attempt to falsify the system with respect to the system models. The goal of
testing is to design tests that exercise defects in the system and reveal problems.
Testing a large system is a complex activity and, like any complex activity, it has to be
broken into smaller activities. Thus, incremental testing was performed on the project, i.e.,
components and subsystems were tested separately before being integrated to form the
subsystems used for system testing.
Unit testing focuses on the building blocks of the software system, that is, the objects
and subsystems. There are three motivations behind focusing on components. First, unit
testing reduces the complexity of the overall test activities, allowing focus on smaller units
of the system. Second, unit testing makes it easier to pinpoint and correct faults, given that
only a few components are involved in each test. Third, unit testing allows parallelism in the
testing activities, that is, each component can be tested independently of the others. The
following are some unit testing techniques.
Equivalence Testing: It is a black box testing technique that minimizes the number of
test cases. The possible inputs are partitioned into equivalence classes and a test case
is selected for each class.
Boundary Testing: It is a special case of equivalence testing and focuses on the
conditions at the boundary of the equivalence classes. Boundary testing requires that
the elements be selected from the edges of the equivalence classes (a short test sketch
illustrating equivalence and boundary testing is given after this list).
Path Testing: It is a white box testing technique that identifies faults in the
implementation of the component. The assumption here is that exercising all possible
paths through the code at least once will trigger most faults. This requires knowledge
of the source code.
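As an illustration of the first two techniques, the sketch below applies equivalence and boundary testing to a small, hypothetical threshold-validation helper of the kind the app could use before applying a user-selected detection threshold. The function isValidThreshold and its valid range of 0.0 to 1.0 are assumptions made for this example and are not taken from the app's source.

import org.junit.Assert.assertFalse
import org.junit.Assert.assertTrue
import org.junit.Test

// Hypothetical validation rule assumed for illustration: a detection
// threshold is valid only if it lies in the closed range [0.0, 1.0].
fun isValidThreshold(value: Float): Boolean = value in 0.0f..1.0f

class ThresholdValidationTest {

    @Test
    fun equivalenceClasses() {
        // One representative value from each equivalence class:
        assertTrue(isValidThreshold(0.5f))    // valid range
        assertFalse(isValidThreshold(-0.3f))  // below the valid range
        assertFalse(isValidThreshold(1.7f))   // above the valid range
    }

    @Test
    fun boundaryValues() {
        // Values taken from the edges of the valid equivalence class:
        assertTrue(isValidThreshold(0.0f))    // lower boundary
        assertTrue(isValidThreshold(1.0f))    // upper boundary
        assertFalse(isValidThreshold(-0.01f)) // just below the lower boundary
        assertFalse(isValidThreshold(1.01f))  // just above the upper boundary
    }
}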
Integration testing detects faults that were not found during unit testing by focusing
on small groups of components. Two or more components are integrated and tested, and once
the tests do not reveal any new faults, additional components are added to the group. This
procedure allows testing of increasingly complex parts of the system while keeping the
location of potential faults relatively small. The following approach was used to implement
integration testing.
The top-down testing strategy unit tests the components of the top layer and then
integrates the components of the next layer down. When all components of the new layer
have been tested together, the next layer is selected. This was repeated until all layers were
combined and involved in the test.
Once the system is completely assembled as a package and the interfacing faults have been
uncovered and corrected, a final series of software tests, the validation tests, is performed.
Validation succeeds when the system functions in a manner that can reasonably be expected
by the customer. System validation was carried out using a series of black-box test methods.
System testing ensures that the complete system complies with the functional and
non-functional requirements of the system. The following are some system testing techniques.
Functional testing finds differences between the functional requirements and the system.
It is a black box testing technique; test cases are derived from the use case model.
Performance testing finds differences between the design goals and the actual performance
of the system.
Fig 6.3.1 White Box Testing Diagram
Black-Box Testing, also called Behavioural Testing, was applied to verify the
functionality of the Blind Assistance App without inspecting its internal code. This
method involved providing inputs such as launching the app, adjusting detection
settings, moving various objects in front of the camera, and observing the outputs
(voice announcements and bounding box overlays). Black-box testing focused on the
following areas:
Correct detection of supported object classes (e.g., walls, curbs, chairs, people).
User interface behavior when permissions are granted or denied.
Handling of camera permission errors or unavailable camera hardware.
Real-time audio feedback accuracy and clarity.
UI responsiveness when adjusting detection threshold, max results, and delegate
options.
Testers verified that the app correctly announces detected objects and displays bounding
boxes as the user moves around. They also checked that adjusting detection settings updates
the detection behavior immediately. These tests ensure the app behaves correctly from the
user’s perspective under various real-world scenarios.
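A minimal instrumented sketch of this black-box style is shown below. It assumes the test is placed in the app's own package so that MainActivity and the view IDs declared in fragment_camera.xml resolve; it launches the main screen with the camera permission pre-granted and checks only externally visible behaviour, namely that the camera preview and the detection overlay are displayed. The project's actual black-box checks may differ.

import android.Manifest
import androidx.test.core.app.ActivityScenario
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.rule.GrantPermissionRule
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class CameraScreenBlackBoxTest {

    // Grant the camera permission up front so the test exercises the normal
    // detection flow rather than the permission dialog.
    @get:Rule
    val cameraPermission: GrantPermissionRule =
        GrantPermissionRule.grant(Manifest.permission.CAMERA)

    @Test
    fun previewAndOverlayAreVisibleAfterLaunch() {
        // Launch the app's main screen exactly as a user would.
        val scenario = ActivityScenario.launch(MainActivity::class.java)
        try {
            // The camera preview and the bounding-box overlay defined in
            // fragment_camera.xml should both be on screen.
            onView(withId(R.id.view_finder)).check(matches(isDisplayed()))
            onView(withId(R.id.overlay)).check(matches(isDisplayed()))
        } finally {
            scenario.close()
        }
    }
}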
6.4 Test Plan:
The test plan for the Blind Assistance App focuses on validating the app’s
functionality, performance, security, and usability across the entire detection workflow. Input-
handling tests confirm that the app correctly manages camera permissions, responds
gracefully to permission denials, and displays appropriate error dialogs.
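To make these input-handling cases concrete, the sketch below shows the kind of permission flow such tests exercise, built on the standard ActivityResultContracts.RequestPermission launcher. The fragment name, the toast message, and the startCamera stub are placeholders for illustration and are not the app's actual code.

import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import android.view.View
import android.widget.Toast
import androidx.activity.result.contract.ActivityResultContracts
import androidx.core.content.ContextCompat
import androidx.fragment.app.Fragment

// Placeholder fragment sketching the permission flow that the input-handling
// tests target; names and messages here are illustrative only.
class PermissionAwareCameraFragment : Fragment() {

    // Launcher that receives the user's grant/deny decision for the camera permission.
    private val requestCameraPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) {
                startCamera()
            } else {
                // Graceful response to denial: explain why detection cannot start.
                Toast.makeText(
                    requireContext(),
                    "Camera permission is required for obstacle detection",
                    Toast.LENGTH_LONG
                ).show()
            }
        }

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        super.onViewCreated(view, savedInstanceState)
        if (hasCameraPermission()) startCamera()
        else requestCameraPermission.launch(Manifest.permission.CAMERA)
    }

    private fun hasCameraPermission(): Boolean =
        ContextCompat.checkSelfPermission(
            requireContext(), Manifest.permission.CAMERA
        ) == PackageManager.PERMISSION_GRANTED

    private fun startCamera() {
        // Bind the CameraX preview and image-analysis use cases here.
    }
}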
Functional accuracy tests involve presenting known objects (e.g., chairs, doors,
potholes) to the camera and verifying that detections are accurate and announced clearly via
text-to-speech. Error-handling tests assess the app’s behavior during conditions like camera
hardware failure, low-light scenarios, or corrupted camera frames, ensuring the system
provides meaningful feedback or recovers smoothly.
The plan also covers UI interactions for adjusting detection settings and robust error
handling in scenarios such as permission denial or device disconnection. The testing strategy
involves unit testing for individual components, integration testing for module interactions,
system testing for end-to-end workflows, regression testing to ensure new changes do not
break existing features, and usability testing to confirm an intuitive experience for visually
impaired users.
6.5 Test Cases:
Test Case for Audio Feedback
Table 6.5(e) Test Case for Detection Threshold Adjustment Control Feature
Test Case for Continuous Detection Loop
OUTPUT SCREENS
7.OUTPUT SCREENS
After the Android app is opened, it asks for permission to access the camera of the device.
Fig 7.2 Permission to access Camera of Android Device
After permission to access the camera of the Android device is granted, the app navigates to
the start page. With the maximum results setting at 3, the app can report several objects at
once; in Fig 7.4 two objects are detected at the same time.
Fig 7.3 Start Page; Fig 7.4 Detecting two objects at a time
Fig 7.3 shows the start page of the Android application.
Fig 7.4 shows the app detecting more than one object at a time.
Changing the maximum results setting to one makes the app detect objects one at a time.
Raising the threshold above 0.50 increases the confidence required for each detection, which
improves the reliability of the reported objects but reduces how many objects are detected.
Fig 7.5 Detecting one object at a time; Fig 7.6 Threshold increased to 0.60
In Fig 7.5, the application detects only one object at a time.
In Fig 7.6, the user has increased the threshold for better accuracy.
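The controls shown in these screens correspond to options of the TensorFlow Lite Task Library detector. The sketch below shows how a 0.6 score threshold, a single maximum result, and two CPU threads could be applied when the detector is created; the helper function and the model filename are assumptions for illustration rather than the app's exact code.

import android.content.Context
import org.tensorflow.lite.task.core.BaseOptions
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Builds an ObjectDetector mirroring the settings shown in Fig 7.5 and Fig 7.6:
// score threshold 0.6 and at most one reported detection.
fun buildDetector(context: Context): ObjectDetector {
    val baseOptions = BaseOptions.builder()
        .setNumThreads(2)           // number of CPU threads used for inference
        .build()

    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setBaseOptions(baseOptions)
        .setScoreThreshold(0.6f)    // only detections with confidence >= 0.6 are reported
        .setMaxResults(1)           // report a single object at a time
        .build()

    // "model.tflite" is a placeholder asset name, not necessarily the file used by the app.
    return ObjectDetector.createFromFileAndOptions(context, "model.tflite", options)
}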
The app also shows how much time was taken to detect the objects (the inference time).
Fig 7.7(b) Inference Time
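The inference time shown in Fig 7.7(b) can be obtained by simply timing the call to the detector, as sketched below. The helper function is illustrative; the app's own pipeline may additionally handle frame rotation and other preprocessing.

import android.graphics.Bitmap
import android.os.SystemClock
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.Detection
import org.tensorflow.lite.task.vision.detector.ObjectDetector

// Runs detection on one frame and returns the results together with the
// wall-clock inference time in milliseconds (the value shown on screen).
fun detectWithTiming(detector: ObjectDetector, frame: Bitmap): Pair<List<Detection>, Long> {
    val tensorImage = TensorImage.fromBitmap(frame)

    val start = SystemClock.uptimeMillis()
    val results = detector.detect(tensorImage)
    val inferenceTime = SystemClock.uptimeMillis() - start

    return results to inferenceTime
}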
CONCLUSION
8.CONCLUSION
The Blind Assistance Android App was successfully developed and tested to provide
real-time obstacle detection and voice guidance for visually impaired users. By leveraging
advanced computer vision techniques with TensorFlow Lite models, the app accurately
identifies objects such as walls, curbs, potholes, chairs, doors, and people directly from the
smartphone’s camera feed. The detected objects are clearly announced through text-to-
speech, enabling users to navigate unfamiliar or cluttered environments with greater safety
and independence. The implementation of an intuitive interface with adjustable detection
settings allows users or their caregivers to fine-tune the system’s sensitivity and performance
to their needs. The system achieved low-latency performance on mobile devices,
demonstrating its suitability for real-world deployment without requiring specialized
hardware. Comprehensive validation and testing, including both white-box and black-box
techniques, confirmed the app’s reliability, responsiveness, and accuracy under different
environmental conditions. Overall, the Blind Assistance App demonstrates a practical and
affordable solution that combines computer vision, mobile development, and assistive
technology to improve the quality of life and autonomy for visually impaired individuals.
REFERENCES
9.REFERENCES
TensorFlow Lite: provides lightweight deep learning models optimized for mobile and edge devices.
Android CameraX: official documentation for Android's CameraX API used for capturing live video frames.
EfficientDet: Scalable and Efficient Object Detection. Mingxing Tan, Ruoming Pang, Quoc V. Le.
arXiv preprint arXiv:1911.09070 (2019).
MobileNet SSD: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
Applications. Andrew G. Howard et al. arXiv preprint arXiv:1704.04861 (2017).
Introduces the MobileNet architecture optimized for lightweight detection on mobile devices.
Android Navigation component: documentation for the Android Navigation framework used in the app for fragment transitions.
CameraX use cases: describes the CameraX use cases implemented for live video processing.
FUTURE SCOPE
10.FUTURE SCOPE
The Blind Assistance Android App lays the groundwork for providing real-time
obstacle detection and voice guidance to visually impaired users, and there are several
promising directions for future development. Integrating depth estimation techniques would
allow the app to provide precise distance information to obstacles, greatly enhancing user
safety by announcing how far away hazards are. Support for specialized Edge AI hardware,
such as Google Coral or dedicated DSPs, could significantly improve performance and
reduce battery consumption during prolonged use. Adding offline navigation and indoor
localization features would enable users to receive turn-by-turn assistance both outdoors and
inside buildings. Upgrading the detection module to perform semantic segmentation could
give users a better understanding of complex scenes by recognizing regions like crosswalks
or uneven pavements. Incorporating voice command functionality would allow users to
control the app completely hands-free, further improving accessibility. Cloud connectivity
could enable real-time sharing of obstacle alerts or location updates with caregivers for
additional security. Expanding the training dataset to include more specific obstacles, such as
escalators, bicycles, or temporary construction hazards, would make the app more versatile in
diverse environments. Multilingual support in audio feedback would allow the app to cater to
a global audience. Augmented reality overlays could assist partially sighted users by
highlighting detected objects visually, and further optimizations in processing would extend
battery life, ensuring the app remains practical for all-day use.
APPENDIX
11.APPENDIX
11.1 List of Figures:
11.2 List of Tables: