
Report Intern

The document outlines a project aimed at developing an AI-driven travel itinerary generator that creates personalized, immersive travel experiences through storytelling and visualization. It discusses the integration of advanced technologies like machine learning and the Liquid Galaxy platform to enhance user interaction and accessibility. The project also emphasizes the importance of user preferences and aims to revolutionize travel planning by making it more engaging and tailored to individual needs.

Uploaded by

jatayu9923
Copyright
© All Rights Reserved

Table of Contents

Declaration
Certificate
Abstract
Acknowledgment
List of Abbreviations
List of Figures

Chapter 1 Introduction
1.1 Problem Statement
1.2 Motivation
1.3 Objective
1.4 Learning Outcomes
1.5 Gemini Model Integration
1.6 Objective of Internship

Chapter 2 Introduction to Organisation

Chapter 3 Tools and Technology Used
3.1 Tech Stack Diagram
3.2 Tools Used in Detail
3.3 Hardware and Software Requirements

Chapter 4 Introduction to Project
4.1 Model Creation Flow Diagram
4.2 Working Flow Diagram
4.3 Use Case Diagram
4.4 Sequence Diagram
4.5 Utility of Internship

Chapter 5 Conclusion and Future Scope
5.1 Conclusion
5.2 Future Scope

References

CHAPTER - 1

Introduction

1.1 Problem Statement

Traveling, especially for tourists, is one of the most enriching experiences.
However, planning an immersive and personalised trip that aligns with one's
specific preferences can be challenging. Traditional travel planning often
involves browsing through numerous websites, reading itineraries, and piecing
together information about locations, accommodations, and activities.
Furthermore, the experience of planning can feel disconnected from the actual
journey.

With advancements in AI, machine learning, and immersive technologies, there
is an opportunity to revolutionise the way people plan their travel. The idea is to
create an intelligent system that not only suggests destinations but also
generates immersive, narrative-driven itineraries that transport users into the
heart of their travel experience—before they even board their flight.

The problem addressed in this project is the lack of personalised, engaging, and
AI-powered travel planning tools that incorporate immersive storytelling.
Existing tools often provide basic suggestions but fail to create dynamic,
customised experiences that cater to individual preferences. This project aims to
fill that gap by developing an AI-driven travel itinerary generator that crafts
fictional stories around Points of Interest (POIs), offering an engaging and
creative perspective for travellers.

The solution needs to:

• Use machine learning to generate relevant and personalised itineraries.
• Present these itineraries in a visually engaging manner through the Liquid
Galaxy platform.
• Provide speech-based interaction for improved accessibility.

By incorporating AI-driven recommendations and immersive narrative
structures, this project will deliver a compelling, highly personalized travel
experience.

1.2 Motivation

The motivation for this project comes from the intersection of my personal
interests in AI, machine learning, and storytelling, with my desire to create
innovative user experiences. Traveling and exploring new places is something
that everyone aspires to, but the process of planning can often be tedious and
impersonal.

Travel planning tools, such as TripAdvisor, Expedia, and Google Travel,
provide basic recommendations based on user preferences, but these platforms
do not have a storytelling element that can truly engage users in the planning
process. I wanted to build something that would allow users to “experience”
their journeys before they even begin their trip.

Additionally, the development of immersive technologies like Liquid Galaxy
provides an exciting opportunity to integrate AI-generated content with cutting-
edge hardware. Liquid Galaxy, which involves multiple screens and
visualisation systems, offers an ideal platform to display interactive, multi-
dimensional itineraries. The integration of AI-generated travel stories with
Liquid Galaxy’s immersive visuals can elevate the travel planning experience,
making it both informative and entertaining.

My long-term goal is to explore how AI can be applied in creative domains such
as travel and storytelling. I believe that with the right tools, we can create
applications that are not only functional but also spark creativity and
imagination.

1.3 Objective

The primary objective of this project is to develop a Fictional Travel Itinerary
Generator that combines Generative Text AI and immersive visual
storytelling. This tool will generate personalized itineraries based on Points of
Interest (POIs) selected by the user and transform them into a fictional narrative
that unfolds interactively on the Liquid Galaxy platform.

More specifically, the objectives of this project include:

1. AI-powered Story Generation:
◦ Develop a machine learning model to generate fictional travel
itineraries based on user input. The model will create stories that
revolve around different POIs, offering users a rich narrative
experience. Each itinerary will consist of several sub-POIs that the
user will "visit" in a narrative sequence.
2. Integration with Liquid Galaxy:
◦ Connect the mobile app with the Liquid Galaxy rig to display the
generated itineraries as immersive visual experiences. Users will
be able to see the stories and POIs on the screens of the Liquid
Galaxy rig as it orbits around the locations. This feature aims to
transform the travel planning process into an immersive and
engaging experience.
3. Voice Interaction and Accessibility:
◦ Use Bark AI or other similar tools to integrate text-to-speech
functionality. This will allow users to hear the generated story,
making the app more interactive and accessible, especially for
users with disabilities.
4. Multi-language Support:
◦ Provide support for multiple languages so that users from various
linguistic backgrounds can access and interact with the app,
expanding the reach of the tool.
5. Customizable Itineraries:
◦ Allow users to modify itineraries based on personal preferences.
This includes reshuffling POIs or removing certain sub-POIs from
the itinerary to suit individual interests.
6. Mobile Application Development:
◦ Build a mobile application using Flutter that integrates the AI-
generated itineraries, visualization, and voice features, providing a
seamless user experience.

By achieving these objectives, the project will offer a unique blend of artificial
intelligence, storytelling, and immersive visualization, making the travel
planning process both fun and informative.
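
Objective 2 describes the rig orbiting around each location. The Python sketch below shows one way such an orbit could be described, assuming the camera is expressed with the standard KML LookAt fields (heading, tilt, range). The function names, the default range and tilt, and the step count are illustrative assumptions, not values taken from the project.

```python
def orbit_views(lat, lon, range_m=800.0, tilt_deg=60.0, steps=12):
    """Return camera views circling one point.

    Every view looks at the same target; only the heading advances,
    so playing the views in order produces a full 360-degree orbit.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    step = 360.0 / steps
    return [
        {
            "latitude": lat,
            "longitude": lon,
            "heading": round(i * step, 2),
            "tilt": tilt_deg,
            "range": range_m,
        }
        for i in range(steps)
    ]


def look_at_kml(view):
    """Render one view as a KML <LookAt> element (string form)."""
    return (
        "<LookAt>"
        f"<longitude>{view['longitude']}</longitude>"
        f"<latitude>{view['latitude']}</latitude>"
        f"<heading>{view['heading']}</heading>"
        f"<tilt>{view['tilt']}</tilt>"
        f"<range>{view['range']}</range>"
        "</LookAt>"
    )


# Approximate coordinates of the Eiffel Tower, four views 90 degrees apart.
views = orbit_views(48.8584, 2.2945, steps=4)
headings = [v["heading"] for v in views]
```

A larger step count gives a smoother orbit at the cost of more camera updates sent to the rig.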

1.4 Learning Outcomes of the Project:

Through the course of this project, I have gained valuable technical and non-
technical learning outcomes.

Technical Skills:

• Machine Learning:
◦ I learned how to work with AI models that can generate creative
content, specifically using text generation models for creating
fictional stories. Understanding the mechanics of fine-tuning
models for a personalized output was a significant aspect of this
learning.
• Flutter Development:
◦ I enhanced my skills in Flutter to build cross-platform mobile
applications. I became proficient in integrating external APIs and
handling complex UI/UX designs in a responsive and user-friendly
manner.
• Voice Integration:
◦ I gained hands-on experience in integrating voice synthesis tools
like Bark AI or similar APIs into the application. This experience
helped me understand how to work with text-to-speech and
improve accessibility in apps.
• Geospatial Data Visualization:
◦ Through working with Google Maps API and KML (Keyhole
Markup Language), I learned how to display geospatial data on
maps, and how to integrate location-based data with immersive
visualization platforms like Liquid Galaxy.
• Immersive Technology:
◦ Understanding how to integrate an AI-based app with the Liquid
Galaxy rig was a major part of my learning. I had to familiarize
myself with Liquid Galaxy’s architecture and how it processes
KML files to create 3D visualizations.

Non-technical Skills:

• Project Management:
◦ The project required me to work independently, set timelines, and
manage tasks to meet deadlines. This helped me improve my
organizational and time-management skills.
• Communication and Collaboration:
◦ I learned how to effectively communicate with mentors, seek
feedback, and integrate suggestions into the project. Working on
this open-source project also allowed me to interact with a wider
community and contribute to collaborative efforts.
• Problem-Solving:
◦ Throughout the project, I was faced with several technical
challenges, such as handling the connection between the mobile
app and the Liquid Galaxy rig, as well as generating high-quality
AI narratives. Overcoming these challenges helped sharpen my
problem-solving skills.

1.5 Gemini Model Integration

Google’s Gemini is an advanced Large Language Model (LLM) developed
by DeepMind and Google Research. It represents the latest leap in generative
AI, designed to process and generate human-like text, enabling a wide variety of
applications that span from conversational agents to content creation. As an
LLM, Gemini leverages the underlying transformer architecture (like GPT-style
models) but incorporates several advancements in training methodologies and
scaling to optimize performance across a range of natural language tasks.

Key Features and Capabilities of Gemini

1. Natural Language Understanding (NLU):
◦ Gemini excels at understanding the context of user inputs, which
is critical for applications that require nuanced interactions. In this
project, this allows Gemini to analyze and respond appropriately to
user queries related to travel preferences, whether the user is
interested in cultural experiences, adventure, nature, or historical
sites.
◦ Unlike traditional rule-based systems that require predefined
responses, Gemini uses deep learning to derive context and
meaning from a broader range of data, offering responses that feel
more intuitive and human-like.
2. Contextual Generation:
◦ One of the standout features of Gemini is its ability to generate
contextually rich text based on user inputs. Gemini can create
dynamic, personalized content such as travel itineraries, news
articles, or fictional stories by analyzing the context provided by
users and aligning it with relevant information.
◦ In this project, Gemini generates personalized itineraries based
on user interests and preferences, ensuring that each trip
recommendation is unique and tailored to the individual. For
example, if the user specifies they’re interested in historical
landmarks, Gemini would create a travel story that highlights key
historical sites, events, and personal anecdotes about these places,
adding creativity and narrative depth to the recommendations.
3. Multimodal Capabilities:
◦ Gemini is a multimodal model: in addition to text, it can work with
other types of input such as images and video. Although this
project focuses on generating text-based itineraries, these
multimodal abilities could be leveraged in future iterations to
incorporate interactive or multimedia elements into the generated
content.
◦ For instance, Gemini could integrate visual elements from Google
Maps or other map-based APIs and describe these places in the
generated itineraries, creating an even more immersive experience
for users.
4. Creativity and Storytelling:
◦ One of the most exciting features of Gemini is its creativity. This
allows the model to craft stories that go beyond typical fact-based
responses, making it well-suited for generating itineraries as
fictional travel stories. Instead of simply listing tourist attractions,
Gemini can create a narrative journey, for example by weaving in
historical anecdotes, fictional characters, or imaginative elements
that make the itinerary feel like a story.
◦ By leveraging Gemini’s creativity, the travel application doesn’t
just suggest destinations; it transforms travel planning into a
storytelling experience, engaging the user with rich, context-
aware narratives that are personalized to their preferences.
5. Fine-Tuning and Personalization:
◦ One of the key advantages of Gemini is that its outputs can be
fine-tuned for specific tasks and domains. For this travel
itinerary application, Gemini can be trained or configured to focus
on specific types of travel-related queries and responses.
◦ For example, it can generate itineraries for particular types of
travelers, such as solo travelers, family trips, luxury vacations, or
eco-tourism enthusiasts. With appropriate fine-tuning, Gemini can
adjust its tone, style, and content to reflect the user's desires,
providing a more personalized experience.
◦ It’s also possible to integrate real-time data from external APIs
(like weather, events, or POI databases), which allows Gemini to
generate itineraries that are both imaginative and relevant to
current trends or conditions.
6. Multilingual Support:
◦ Gemini supports multiple languages, enabling the application to
cater to a global audience. In this project, this means users from
different linguistic backgrounds can request travel itineraries in
their native languages, ensuring inclusivity and accessibility.
Whether the user is based in Japan, Brazil, or France, Gemini can
generate content in a way that resonates with their cultural context
and linguistic preferences.

Gemini in Travel Application

For the travel application, Gemini’s text-generation capabilities are at the heart of
the personalization engine. Here’s how it integrates within the system:

1. User Input Processing:
◦ When a user interacts with the app, they provide input about their
travel preferences. This could include preferences like destination
type (e.g., beach, mountain, historical city), budget, duration of
stay, or personal interests (e.g., hiking, food, art).
◦ Gemini takes this input and processes it to understand the user's
goals and interests, ensuring that it can generate an itinerary that
matches their specifications.
2. Personalized Itinerary Generation:
◦ Using the information from the user, Gemini generates a detailed,
personalized travel itinerary that includes a mix of destinations,
activities, and experiences. Each itinerary is not simply a list of
destinations; it is a crafted narrative journey that describes the
experience of traveling through these locations, providing details
on each place, cultural context, and historical insights.
◦ For example, a user interested in historical sites might receive a
travel story detailing a journey through Europe, with stops in
Rome, Athens, and Paris. Gemini might weave in historical
events, character-driven stories (like the life of an ancient Roman
gladiator or a famous Parisian artist), and even fictional dialogues
to make the journey more immersive.
3. Adaptation and Iteration:
◦ Once the itinerary is generated, the user can interact with it,
perhaps by adjusting the destinations or activities, or requesting
additional information on certain places. Gemini is capable of
iterating on the itinerary based on user feedback, tweaking the
travel story, adjusting recommendations, or suggesting alternate
routes that fit better with the user's needs.
◦ For example, if a user prefers to avoid large crowds, Gemini could
refine the itinerary to include off-the-beaten-path locations and
quieter, less-visited attractions.
4. Integration with Liquid Galaxy for Immersive Visualization:
◦ The itineraries generated by Gemini can be displayed visually
through the Liquid Galaxy platform, which supports immersive
multi-screen setups for geographical and travel-related content.
This allows users to explore the locations mentioned in the
itineraries on an interactive, virtual map.
◦ For example, Gemini’s itinerary could be shown in a sequence,
with the user moving through the story visually, seeing images or
video clips of each destination while hearing a voiceover (via text-
to-speech) narrating the story.
5. Voice Interaction:
◦ To further enhance the experience, Gemini’s outputs can be paired
with a voice interface, where the generated itineraries are read
aloud to the user in a natural-sounding voice. This is particularly
useful for users who prefer auditory content or for those with
disabilities. The text-to-speech capabilities could read the travel
stories dynamically, adding an additional layer of interactivity to
the app.
6. Enhanced User Experience:
◦ By leveraging Gemini’s strengths in natural language generation,
the application can create a more dynamic, engaging, and
personalized user experience. Rather than presenting users with a
static list of recommended destinations, Gemini’s narrative-driven
approach transforms the way itineraries are created, making travel
planning feel more like a creative process than a chore.
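
The input-processing and itinerary-generation steps above can be sketched as follows. This is a minimal illustration in Python rather than the app's own code: the `TravelPreferences` fields, the function name, and the prompt wording are all assumptions, and the sketch stops at building the prompt text, without calling Gemini.

```python
from dataclasses import dataclass, field


@dataclass
class TravelPreferences:
    """User input collected by the app (field names are illustrative)."""
    destination_type: str
    duration_days: int
    budget: str = "moderate"
    interests: list[str] = field(default_factory=list)
    language: str = "English"


def build_itinerary_prompt(prefs: TravelPreferences) -> str:
    """Turn structured preferences into one text prompt for an LLM."""
    interests = ", ".join(prefs.interests) if prefs.interests else "no stated interests"
    return (
        f"Write a fictional travel story in {prefs.language} for a "
        f"{prefs.duration_days}-day trip to a {prefs.destination_type} destination.\n"
        f"Budget: {prefs.budget}. Interests: {interests}.\n"
        "Structure the story as a sequence of stops, one paragraph per stop, "
        "and name the real place each stop is set in."
    )


prefs = TravelPreferences("historical city", 3, interests=["art", "food"])
prompt = build_itinerary_prompt(prefs)
print(prompt)
```

Keeping the preferences in a structured object, and only flattening them into text at the last step, makes the adaptation step easier: a change such as "avoid large crowds" becomes one more field or interest, and the prompt is simply rebuilt.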

Advantages of Using Gemini for This Project

1. Scalability: Gemini can scale to accommodate millions of users, each
with unique preferences, without compromising on personalization or the
quality of generated content.
2. Flexibility: Whether generating short 3-day trips or detailed multi-week
itineraries, Gemini is highly adaptable, allowing for a wide range of
travel experiences.
3. Creativity and Personalization: Gemini’s ability to generate creative,
personalized travel stories allows the app to stand out from other travel
platforms, providing users with a memorable experience.
4. Continuous Learning: Since Gemini’s model is continually evolving,
future versions of the model could provide even more advanced features,
such as better multimodal integration (e.g., generating both text and
interactive images), deeper cultural context, and more refined handling of
personal preferences.

1.6 Objective of Internship

The primary objective of my internship was to gain hands-on experience with
the integration of AI-driven applications, immersive technologies, and mobile
development. I aimed to:

1. Develop my technical skills in machine learning, particularly working
with large language models (LLMs) like Gemini.
2. Create an interactive mobile application that combines AI-generated
content, immersive visualization, and voice interaction.
3. Understand the complexities of integrating AI and hardware systems,
such as the Liquid Galaxy visualization rig.
4. Enhance my project management skills by contributing to an open-source
project that is both creative and technically challenging.

By the end of the internship, I aimed to deliver a fully functional mobile app
that provides users with a creative, engaging, and personalized travel planning
experience.

CHAPTER - 2
Introduction to Organisation

In this chapter, we introduce Liquid Galaxy, the organization that provided
the framework, resources, and mentorship for the project. We will explore
the organization’s mission, its technological initiatives, and its pivotal role in
the development of this project. The chapter also highlights the collaboration
between Liquid Galaxy and DeepMind, focusing on how Gemini, a state-
of-the-art AI model developed by Google DeepMind, was utilized to power
the travel itinerary generation system. This collaboration enhanced the
capabilities of the system, ensuring that the final product is innovative,
creative, and highly interactive.

2.1 Overview of Liquid Galaxy

Liquid Galaxy is an open-source project that aims to create immersive and
interactive systems for data visualization, exploration, and engagement.
Originally designed as a multi-display system that can provide panoramic and
interactive views of geospatial data, Liquid Galaxy has evolved into a powerful
tool used in several domains such as virtual tourism, geographic exploration,
education, and interactive storytelling. The system typically uses multiple
monitors (or a rig of interconnected screens) to display panoramic images,
videos, and 3D models of locations, offering users a fully immersive
experience.

Liquid Galaxy’s unique contribution to this project is its capability to integrate
data visualization with AI models, allowing users to not only explore
geographical points of interest (POIs) but also engage with dynamic, AI-
generated narratives. The system’s ability to connect with large-scale AI models
like Gemini adds an additional layer of sophistication to the experience, as it
allows for the automatic generation of personalized, context-aware stories based
on the data input.

Mission and Vision of Liquid Galaxy

The mission of Liquid Galaxy is to democratize access to immersive
experiences by creating innovative solutions that blend cutting-edge hardware
with AI-powered software. The vision is to bridge the gap between human-
computer interaction and the physical world by enabling users to explore new
places, discover information in new ways, and engage with content that feels
both natural and interactive. Liquid Galaxy’s work spans industries such as
education, travel, research, and digital art, providing transformative experiences
that have the potential to change how users interact with complex data sets and
environments.

2.2 The Role of Liquid Galaxy in the Project

Liquid Galaxy played a central role in the development of the Fictional Travel
Itinerary Generator project. Its innovative multi-display hardware was
combined with the power of AI to create a fully immersive, AI-driven
experience. The project aimed to provide users with a personalized, interactive
journey through various Points of Interest (POIs), where they could view and
explore different sub-POIs with the help of a dynamically generated narrative.

Hardware Integration

The hardware setup for Liquid Galaxy consists of multiple synchronized
screens or displays that allow users to visualize geographic data in a panoramic
manner. These rigs are typically connected to the internet and can display
Google Earth imagery, KML (Keyhole Markup Language) files, and other
geospatial data formats. The primary advantage of the Liquid Galaxy system is
its ability to integrate various data sources into a cohesive visual experience,
offering users a seamless transition from one location to the next.

For the Fictional Travel Itinerary Generator, the Liquid Galaxy rig was used
to display AI-generated travel itineraries and stories. The system can take users
through a series of sub-POIs while displaying relevant imagery, textual
descriptions, and even a dynamic map that updates in real time. This immersive
setup allowed for a highly engaging experience where users could explore
locations while interacting with AI-generated narratives.

Data Visualization and Immersive Interaction

By combining Liquid Galaxy’s panoramic data visualization technology with
Gemini, the project aimed to provide an entirely new way of interacting with
geospatial data. For example, as users chose a location (like Paris), they could
“travel” to different sub-POIs such as the Eiffel Tower, Louvre Museum, or
Notre Dame Cathedral, while a personalized AI-generated story unfolded
around each of these places. The AI story narrated the user’s journey,
dynamically adjusting based on user preferences, allowing them to experience
different sub-POIs in a seamless, interconnected sequence.

The integration of Liquid Galaxy’s hardware with DeepMind's AI models made
it possible to offer an unparalleled level of interactivity. As users moved through
the travel itinerary, Liquid Galaxy’s software displayed not just a visual
experience, but also the AI-generated narrative, enhancing the user’s immersion
in the story.

2.3 Introduction to DeepMind and the Role of Gemini

DeepMind, a subsidiary of Alphabet Inc. (Google’s parent company), is
renowned for its pioneering work in artificial intelligence research. Known for
developing groundbreaking AI systems such as AlphaGo and AlphaFold,
DeepMind’s research focuses on creating general-purpose AI systems capable
of solving complex problems across various domains, including healthcare,
energy efficiency, and more.

One of DeepMind’s most recent and notable projects is Gemini, an advanced
large language model (LLM) designed to push the boundaries of natural
language processing (NLP) and understanding. Gemini is a sophisticated AI
system that can generate human-like text, understand complex queries, and
respond in a coherent and contextually appropriate manner.

Gemini’s Capabilities and Features

Gemini is designed to perform a wide range of NLP tasks, such as:

• Text Generation: Creating coherent, contextually rich, and engaging
written content.
• Text Comprehension: Understanding the nuances of user input, even
when context is complex or ambiguous.
• Multimodal Abilities: Integrating different types of data, including text,
images, and video, for richer and more immersive interactions.
• Personalized Interactions: Adapting responses based on user
preferences and the context of the conversation, allowing for dynamic,
personalized story creation.

Gemini’s ability to understand and generate text in a natural, human-like way
made it the perfect tool for powering the Fictional Travel Itinerary
Generator. Using Gemini’s capabilities, the project generated personalized
travel itineraries, providing each user with a unique journey based on their
preferences and input. Gemini not only helped craft the narrative for each
destination but also personalized it based on the traveler’s interests, making
each experience truly unique.

DeepMind's Contribution to the Project

DeepMind’s involvement in the project was crucial in ensuring the integration
of cutting-edge AI into the Liquid Galaxy framework. The Gemini model
served as the backbone for generating the travel itineraries and accompanying
narratives. By leveraging the model’s NLP and text generation capabilities, the
project was able to automatically produce contextually relevant and engaging
content for each sub-POI. Additionally, Gemini's ability to comprehend and
adapt to user input allowed for a level of personalization that made the project
not just a static tour, but an interactive and dynamic experience.

2.4 How Liquid Galaxy and DeepMind Collaborated

The collaboration between Liquid Galaxy and DeepMind was instrumental in
transforming the traditional concept of geographic exploration into an AI-
driven, interactive experience. While Liquid Galaxy provided the hardware and
immersive environment, DeepMind’s Gemini powered the AI components that
drove the content and personalization.

Hardware-Software Integration

Liquid Galaxy’s multi-display system served as the interface through which
users interacted with the AI-generated content. Once the user selected a Point of
Interest (POI), the Gemini model generated personalized stories, which were
then displayed on the Liquid Galaxy rig. The system also provided a dynamic,
real-time visual map, displaying the user’s movement through the sub-POIs.

For example, if the user chose Paris as their starting point, Gemini would
generate sub-POIs such as Eiffel Tower, Arc de Triomphe, and Montmartre,
each of which would be accompanied by a paragraph of narrative text. As the
user "traveled" through each sub-POI, Liquid Galaxy’s software updated the
visuals and synchronized them with the AI-generated content, creating an
immersive and personalized tour experience.
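
The flow described above, in which a selected POI yields a list of sub-POIs that each carry a paragraph of narrative, can be sketched in Python. The sketch assumes the model has been asked to reply in JSON with a `stops` array whose entries have `name`, `lat`, `lon`, and `story` keys; that schema is hypothetical and not taken from the project. The coordinates are approximate.

```python
import json
from dataclasses import dataclass


@dataclass
class SubPOI:
    name: str
    latitude: float
    longitude: float
    story: str


def parse_itinerary(raw_json: str) -> list[SubPOI]:
    """Parse a model response into sub-POIs, skipping malformed entries."""
    stops = []
    for item in json.loads(raw_json).get("stops", []):
        try:
            stops.append(SubPOI(
                name=str(item["name"]),
                latitude=float(item["lat"]),
                longitude=float(item["lon"]),
                story=str(item["story"]),
            ))
        except (KeyError, TypeError, ValueError):
            continue  # drop entries with missing or non-numeric fields
    return stops


# A stand-in for a model response; the second entry has a bad latitude.
sample = json.dumps({"stops": [
    {"name": "Eiffel Tower", "lat": 48.8584, "lon": 2.2945,
     "story": "The tour begins beneath the iron lattice at dusk."},
    {"name": "Montmartre", "lat": "unknown", "lon": 2.3431,
     "story": "A painter waits at the top of the hill."},
]})
stops = parse_itinerary(sample)
```

Validating the response before anything is sent to the rig matters because generated text is not guaranteed to follow the requested format; an entry without usable coordinates cannot be shown on the map, so it is dropped rather than displayed incorrectly.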

CHAPTER - 3
Tools and Technologies Used

3.1 Overview of the Core Technologies

The Fictional Travel Itinerary Generator combines multiple layers of
technology to create a compelling and interactive user experience. These
technologies include:

• Liquid Galaxy’s Immersive Visualization System: A multi-display
setup designed for panoramic data visualization and geographic
exploration.
• Google Earth: A mapping platform that provides high-resolution
imagery and geospatial data of real-world locations.
• DeepMind’s Gemini: A sophisticated AI model designed for natural
language processing (NLP) and generation, enabling personalized and
dynamic travel narratives.

Each of these technologies contributes to different facets of the travel generator,
ensuring an engaging experience from both a visual and content perspective.
Below, we explore each of these core components in greater detail.

3.2 Liquid Galaxy: Immersive Visualization and Data Interaction

The Liquid Galaxy project is a key element of the system, as it provides the
immersive, multi-screen visualization platform on which the user experiences
the itinerary.

Multi-Screen Setup

Liquid Galaxy utilizes a multi-screen setup that allows for the display of
panoramic data across several monitors or even a 360-degree array of screens.
Typically, the rig consists of several displays, often between 3 and 7, arranged in
a semi-circular or panoramic configuration. This setup can create an immersive
experience that simulates the user being "inside" the visualized environment.

For the travel itinerary generator, Liquid Galaxy uses its multi-screen
configuration to show users an expansive, high-resolution view of the
geographic areas included in their travel itinerary. As the user moves through
different Points of Interest (POIs), Liquid Galaxy adjusts the visual display in
real time, offering a fluid transition from one location to the next.
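
To make the panoramic arrangement concrete, the Python sketch below computes how far each screen's view is rotated relative to the centre screen. It assumes every screen covers the same horizontal field of view, that the middle of the rig faces straight ahead, and that negative values mean "to the left"; the function name and the 36-degree figure are illustrative, not values from the project.

```python
def yaw_offsets(num_screens: int, fov_per_screen_deg: float) -> list[float]:
    """Yaw offset in degrees for each screen, left to right, centre at 0."""
    if num_screens < 1:
        raise ValueError("num_screens must be at least 1")
    centre = (num_screens - 1) / 2
    return [(i - centre) * fov_per_screen_deg for i in range(num_screens)]


# A three-screen rig where each screen covers 36 degrees.
offsets = yaw_offsets(3, 36.0)
print(offsets)
```

Because each screen renders the same camera position with a different rotation, the images line up edge to edge and read as one continuous view.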

Interactive Interface

One of the standout features of the Liquid Galaxy system is its interactive
nature. It allows users to not only observe but also engage with the displayed
content. Using mouse or gesture-based controls, users can zoom in on specific
locations, rotate the map, and click on particular landmarks to view additional
information.

When a user interacts with the system to select a Point of Interest (POI),
Liquid Galaxy adjusts the visuals to reflect the new location. The content is
shown in real time, meaning that users can explore and interact with a dynamic
world. For example, selecting the Eiffel Tower might bring up a detailed 3D
model, while additional POIs like nearby restaurants or shops might pop up on
the visual interface.

Integration with Geospatial Data

Liquid Galaxy also integrates with geospatial platforms like Google Earth.
This allows the system to pull in high-quality satellite imagery, terrain maps,
and 3D models of locations around the world. This data serves as the foundation
for the visual representation of the user’s journey through the itinerary.

3.3 Google Earth: Geospatial Data and Real-Time Navigation

At the heart of the travel itinerary system lies Google Earth, which provides the
geographic data that powers the map-based visualizations seen in the Liquid
Galaxy system. Google Earth offers access to rich satellite imagery, terrain
data, street views, and 3D models of locations worldwide, making it an essential
component for any geographically immersive project.

Data Sources from Google Earth

Google Earth provides high-resolution images of most cities, towns, and natural
landmarks around the world, enabling users to view realistic, detailed depictions
of their chosen destinations. This allows the Fictional Travel Itinerary
Generator to pull imagery and location data on demand for any Point of Interest
(POI) selected by the user, keeping the visualization as current as the
underlying imagery.

In addition to satellite images, Google Earth also supports KML (Keyhole
Markup Language) files, which allow users to overlay geospatial data, such as
custom markers for places of interest or specific routes. These KML files were
utilized by Liquid Galaxy to create dynamic and responsive maps, enabling the
seamless transition between sub-POIs as users move through their journey.
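
As a minimal sketch of what such an overlay looks like, the Python snippet below builds a KML document with a single custom marker. The helper name placemark_kml and the sample marker are illustrative assumptions and are not taken from the project, whose KML is produced inside the Flutter app; only the element names follow the KML 2.2 standard.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name: str, lat: float, lon: float, description: str = "") -> str:
    """Return a KML document holding one Placemark (hypothetical helper)."""
    # KML writes coordinates as longitude,latitude,altitude (in that order).
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<kml xmlns="{KML_NS}">\n'
        "  <Placemark>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <description>{escape(description)}</description>\n"
        f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
        "  </Placemark>\n"
        "</kml>\n"
    )


kml = placemark_kml("Eiffel Tower", 48.8584, 2.2945, "Landmark & viewpoint")
# Parsing the result confirms the document is well-formed XML.
root = ET.fromstring(kml.encode("utf-8"))
print(root.find(f"{{{KML_NS}}}Placemark/{{{KML_NS}}}name").text)
```

Escaping the name and description matters because model-generated text may contain characters such as "&" that would otherwise break the XML.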

Street View and 3D Models

One of the key features that enhances the immersive experience is Google
Earth’s Street View and 3D Model capabilities. For example, if the user is
exploring Paris, they can virtually "walk" around the Eiffel Tower using Street
View, offering a richer, more interactive experience.

Additionally, Google Earth’s 3D Models allow users to see iconic landmarks
like the Louvre Museum, Big Ben, and the Great Wall of China in
three-dimensional space. This feature is integrated into the Liquid Galaxy system,
providing users with not just a two-dimensional map, but an interactive, 3D
model of the places they are visiting.

3.4 DeepMind’s Gemini: The AI Driving Personalization

The most cutting-edge technology in the Fictional Travel Itinerary Generator
is Google DeepMind’s Gemini, an advanced large language model (LLM) developed
to handle complex language processing and generation tasks. In the context of
the project, Gemini plays a pivotal role in creating the personalized narratives
and dynamic itineraries that form the backbone of the system.

Text Generation and Personalization

Gemini’s natural language generation (NLG) capabilities allow it to create rich,
human-like narratives that accompany the visual experience. As the user selects
a Point of Interest (POI), Gemini is tasked with generating a description or a
story about that location. These narratives are not static; they change
dynamically based on the user's preferences, making the itinerary feel tailored
and personal.

For example, if a user selects the Louvre Museum as part of their travel
itinerary, Gemini may generate a description that is educational, highlighting
famous art pieces like the Mona Lisa, or it could provide a more cultural
narrative based on the user’s interests, such as historical facts or trivia. Gemini
uses contextual cues, like the user's location or preferences, to adjust the tone
and content of the story.
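
One way to picture this adjustment is as prompt construction: the selected POI and the user's preferences are folded into the text sent to the model. The sketch below is an assumption about how such a prompt could be assembled; the function name build_story_prompt, its parameters, and the prompt wording are hypothetical and do not come from the project. It deliberately stops short of calling any model API.

```python
from typing import Sequence


def build_story_prompt(poi: str, interests: Sequence[str], tone: str = "educational") -> str:
    """Assemble a text prompt from the selected POI and user preferences."""
    # Fall back to a neutral theme when the user has stated no interests.
    interest_text = ", ".join(interests) if interests else "general sightseeing"
    return (
        f"Write a short {tone} travel story about {poi}. "
        f"Focus on these traveler interests: {interest_text}. "
        "Use one paragraph per notable spot."
    )


prompt = build_story_prompt("Louvre Museum", ["art", "history"])
print(prompt)
```

Changing only the tone argument (for example to "cultural") yields a different prompt for the same POI, which mirrors the educational versus cultural variants described above.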

Interactive Dialogue and Query Handling

Another aspect of Gemini’s functionality in the system is its ability to handle
real-time user queries. If the user wants more information about a particular
landmark or asks questions like "When was the Eiffel Tower built?" or "What
are some fun facts about the Colosseum?", Gemini can generate a natural,
contextually appropriate response that fits within the overall narrative. This adds
a layer of interactivity and flexibility to the travel itinerary system, allowing
users to engage with the content on a deeper level.

Contextual Awareness

Gemini is also capable of adapting its responses based on the user’s previous
interactions. If the user previously showed interest in ancient history, Gemini
may tailor the narrative to highlight historical landmarks like Pompeii or the
Pyramids of Giza, offering a more in-depth exploration of those locations. This
context-aware storytelling makes the itinerary feel more like a
personalized guidebook than a one-size-fits-all solution.
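
The report does not say how previous interactions are stored, so the following is only a sketch of one simple possibility: counting the themes a user engages with and feeding the most frequent ones back into later requests. The class name InterestTracker and its methods are hypothetical.

```python
from collections import Counter
from typing import List


class InterestTracker:
    """Counts the themes a user has engaged with during a session."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, theme: str) -> None:
        # Normalize so "History" and "history " count as the same theme.
        self._counts[theme.strip().lower()] += 1

    def top_themes(self, n: int = 2) -> List[str]:
        return [theme for theme, _ in self._counts.most_common(n)]


tracker = InterestTracker()
for theme in ["Ancient history", "food", "ancient history", "ancient history", "food", "art"]:
    tracker.record(theme)

print(tracker.top_themes())
```

The top themes could then be passed along as the stated interests for the next story, so that a user who keeps choosing ancient sites is steered toward places like Pompeii.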

3.5 The Integration: Liquid Galaxy + Gemini + Google Earth

The magic of the Fictional Travel Itinerary Generator lies in the seamless
integration of these three core technologies: Liquid Galaxy, Google Earth, and
Gemini. Together, they form a unified system that offers both immersive
visualization and dynamic content generation.

1. Visualization Layer: Liquid Galaxy displays the geographic locations
and sub-POIs in an interactive, visually engaging way, allowing users to
“travel” through different locations using a multi-screen setup.
2. Geospatial Data: Google Earth provides the geographic imagery, 3D
models, and real-time navigation, ensuring the visual data is rich and
accurate.
3. Narrative Generation: Gemini powers the travel stories, adapting them
based on user input and preferences, creating a unique and personalized
travel experience for each user.
This combination of technologies allows the Fictional Travel Itinerary
Generator to provide a high-quality, immersive, and fully personalized virtual
travel experience. The integration between Liquid Galaxy's visualization
system, Google Earth's geospatial data, and Gemini’s AI-driven narratives
brings together the best of hardware, software, and artificial intelligence to
create a truly cutting-edge travel tool.

CHAPTER - 4
Introduction to Project

4.1 Model Creation Flow Diagram

In this section, we describe the architecture and the flow of the AI model used in
the project, which integrates with Google DeepMind’s Gemini API. Gemini is a
large language model (LLM) developed by Google DeepMind, and its API
is used to generate the fictional travel itineraries in the app.

Flow Diagram Explanation

1. User Input:

◦ The user interacts with the Flutter mobile application, providing a
Point of Interest (POI) they are interested in.
◦ The input can be a city, landmark, or tourist destination, such as
"Paris" or "Great Wall of China".
2. Data Preparation:

◦ The app sends the POI and other contextual information (like travel
preferences) to the Google Gemini API.
◦ The Gemini API processes this input and generates a detailed
response, which includes:
▪ Sub-POIs: Smaller, more specific locations within the main
POI (e.g., Eiffel Tower, Louvre Museum in Paris).
▪ Story Elements: A narrative structure is created, divided
into multiple paragraphs, each corresponding to a different
sub-POI. The story is designed to be immersive, with
engaging details about each location.
3. Story and Sub-POIs Generation:

◦ The Gemini API generates a story by transforming the POI and
sub-POIs into a series of related paragraphs, ensuring that each
sub-POI has a specific part of the story dedicated to it.
◦ The model also creates recommendations for potential tours based
on the user's preferences (e.g., adventurous, cultural, historical).
4. Integration with Liquid Galaxy (LG):

◦ Once the story and sub-POIs are generated, they are formatted into
KML (Keyhole Markup Language) files, which are then sent to
the Liquid Galaxy system for display.
◦ The KML file contains geographical coordinates for the sub-POIs,
and the Liquid Galaxy system uses these coordinates to create
immersive tours of each location, showing the user a "fly-over" or
3D view of the location.
5. Text-to-Speech (TTS) Integration:

◦ The generated story is then converted into audio using the Bark
AI model (integrated with the app). This TTS conversion allows
the user to listen to the story as they follow the visual tour on the
Liquid Galaxy.
◦ The Bark AI model is also capable of adjusting voice parameters
(such as pitch, speed, and accent) based on user preferences.
6. User Interaction:

◦ The user can interact with the app via speech commands or by
using touch gestures on the tablet (or mobile device). Speech-to-
text functionality powered by Flutter’s speech_to_text library
allows the app to understand voice commands to navigate between
sub-POIs, change the narrative, or ask for additional information.
◦ The story continues as each sub-POI is “toured” in sequence on the
Liquid Galaxy, with the corresponding paragraph of the story
displayed alongside it.
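
The pairing of each sub-POI with its own paragraph can be sketched as a small parsing step. The report does not specify the response format, so the JSON layout below (a "sub_pois" list with "name", "lat", "lon", and "paragraph" fields) is purely an assumption for illustration, as are the names SubPoi and parse_itinerary.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SubPoi:
    name: str
    lat: float
    lon: float
    paragraph: str


def parse_itinerary(raw: str) -> List[SubPoi]:
    """Turn the (assumed) JSON response into ordered tour stops."""
    stops = []
    for item in json.loads(raw)["sub_pois"]:
        lat, lon = float(item["lat"]), float(item["lon"])
        # Reject coordinates that cannot exist on the globe.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {item['name']!r}")
        stops.append(SubPoi(item["name"], lat, lon, item["paragraph"].strip()))
    return stops


sample = """
{"sub_pois": [
  {"name": "Eiffel Tower", "lat": 48.8584, "lon": 2.2945,
   "paragraph": "The journey begins beneath the iron lattice of the tower."},
  {"name": "Louvre Museum", "lat": 48.8606, "lon": 2.3376,
   "paragraph": "Next, the glass pyramid of the Louvre comes into view."}
]}
"""
for stop in parse_itinerary(sample):
    print(stop.name, "->", stop.paragraph)
```

Validating coordinates at this stage is a design choice: a language model can return malformed values, and catching them before KML generation avoids sending the rig to a nonsensical location.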

4.2 Working Flow Diagram

This section provides a detailed workflow of the entire system, from user
input to the final output displayed on Liquid Galaxy.

Working Flow Explanation

1. User Interface (UI):

◦ The user opens the Flutter mobile app, where they can either search
for a POI or select from predefined recommendations.
◦ The home page of the app allows the user to interact via voice
commands (using the speech_to_text library) or touch gestures.
2. Interaction with Gemini API:

◦ Upon selecting a POI, the app sends the POI name and user
preferences (e.g., preferred themes for the itinerary, such as history,
adventure, or nature) to the Gemini API.
◦ Gemini processes the input and generates:
▪ Sub-POIs: Smaller locations within the POI (e.g., specific
attractions).
▪ Story Narrative: A well-structured narrative where each
paragraph focuses on a different sub-POI.
3. KML File Generation:

◦ After the Gemini API generates the sub-POIs and story, the Flutter
app creates a KML file, which includes:
▪ Coordinates for each sub-POI.
▪ Links to images, videos, or other multimedia resources for
immersive storytelling.
4. Sending Data to Liquid Galaxy:

◦ The KML file is sent to the Liquid Galaxy system using the
dartssh2 package, establishing an SSH connection.
◦ The Liquid Galaxy system displays the POIs in 3D, and the user
can navigate through them as if they were flying over the locations.
5. Text-to-Speech (TTS) Output:

◦ The generated story is then converted into speech using the Bark
AI model.
◦ The speech output is synchronized with the Liquid Galaxy display,
allowing users to listen to the narrative as they virtually travel
through the locations.
6. User Feedback & Customization:

◦ Users can give feedback using speech commands or gestures (e.g.,
"Tell me more about the Eiffel Tower" or "Show me a different
itinerary").
◦ The app adjusts the story, switching between different POIs or
altering the narrative to match the user’s preferences.
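
The KML generation step can be sketched as follows. This is an illustrative Python version and not the project's Flutter code: the function name tour_kml, the camera tilt and range, and the timing defaults are all assumptions. The gx:Tour, gx:Playlist, gx:FlyTo, and gx:Wait elements are Google's KML extension elements for scripted fly-throughs.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple
from xml.sax.saxutils import escape

KML_NS = "http://www.opengis.net/kml/2.2"
GX_NS = "http://www.google.com/kml/ext/2.2"


def tour_kml(title: str, stops: Iterable[Tuple[str, float, float]],
             fly_seconds: float = 5.0, pause_seconds: float = 3.0) -> str:
    """Build a KML tour that visits each (name, lat, lon) stop in order."""
    marks, steps = [], []
    for name, lat, lon in stops:
        # One marker per sub-POI so it stays visible on the map.
        marks.append(
            "    <Placemark>\n"
            f"      <name>{escape(name)}</name>\n"
            f"      <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
            "    </Placemark>\n"
        )
        # Fly the camera to the stop, then pause there (assumed timings).
        steps.append(
            "        <gx:FlyTo>\n"
            f"          <gx:duration>{fly_seconds}</gx:duration>\n"
            "          <gx:flyToMode>smooth</gx:flyToMode>\n"
            "          <LookAt>\n"
            f"            <longitude>{lon}</longitude>\n"
            f"            <latitude>{lat}</latitude>\n"
            "            <tilt>60</tilt>\n"
            "            <range>1000</range>\n"
            "          </LookAt>\n"
            "        </gx:FlyTo>\n"
            f"        <gx:Wait><gx:duration>{pause_seconds}</gx:duration></gx:Wait>\n"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<kml xmlns="{KML_NS}" xmlns:gx="{GX_NS}">\n'
        "  <Document>\n"
        + "".join(marks)
        + "    <gx:Tour>\n"
        f"      <name>{escape(title)}</name>\n"
        "      <gx:Playlist>\n"
        + "".join(steps)
        + "      </gx:Playlist>\n"
        "    </gx:Tour>\n"
        "  </Document>\n"
        "</kml>\n"
    )


paris = [("Eiffel Tower", 48.8584, 2.2945), ("Louvre Museum", 48.8606, 2.3376)]
tour = tour_kml("Paris highlights", paris)
tour_root = ET.fromstring(tour.encode("utf-8"))
print(len(tour_root.findall(f".//{{{GX_NS}}}FlyTo")), "fly-to steps")
```

The pause after each fly-to is where a narration clip for that sub-POI could play; in practice the pause length would need to match the audio duration, which this sketch does not attempt.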
Working Flow Diagram

This flow illustrates how the different components work together to provide the
user with an immersive, personalized travel itinerary.

This flow illustrates a visual representation of the ask Gemma approach.


4.3 Use Case Diagram

A use case diagram visually represents how different actors interact with the
system. In this case, the actors include the User, the Gemini API, Liquid
Galaxy, and the Bark AI model.

• User: Interacts with the app, selecting POIs and providing preferences
(either via speech or touch).
• Gemini API: Generates travel itineraries, sub-POIs, and the narrative.
• Liquid Galaxy: Displays the immersive 3D tour using KML files.
• Bark AI: Converts the story into speech for an enhanced user experience.
Use Case Diagram

This diagram summarizes the interactions between the user and the system,
where the user selects the POI, receives recommendations, views the tour, and
listens to the story.

4.4 Sequence Diagram

A sequence diagram shows how components in the system interact in a
time-ordered sequence.

Sequence Diagram Explanation

1. The user provides input to the Flutter app (either via text or voice).
2. The Flutter app sends the input to the Gemini API, which processes the
POI and generates sub-POIs and the corresponding story.
3. The app creates a KML file containing the POI data, which is then sent to
Liquid Galaxy.
4. The app uses Bark AI to generate speech for the story, which is
synchronized with the Liquid Galaxy display.
5. The user can interact with the system using speech-to-text, which
influences the content displayed on the Liquid Galaxy or the story.
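
The ordering of steps 2 to 4 can be sketched as a single function that receives each component as a callable. Everything here is hypothetical scaffolding: run_tour and the fake_* stand-ins are invented names, the real components are the Gemini API, the KML builder, the SSH transfer, and Bark, and producing one audio clip per paragraph is an assumption rather than something the report states.

```python
from typing import Callable, List


def run_tour(poi: str,
             generate_story: Callable[[str], List[str]],
             build_kml: Callable[[List[str]], str],
             send_to_rig: Callable[[str], None],
             synthesize_speech: Callable[[str], bytes]) -> List[bytes]:
    """Run steps 2 to 4 of the sequence, in order, for one POI."""
    paragraphs = generate_story(poi)      # step 2: story and sub-POIs
    send_to_rig(build_kml(paragraphs))    # step 3: KML to Liquid Galaxy
    # Step 4: one clip per paragraph, so narration can follow the tour.
    return [synthesize_speech(p) for p in paragraphs]


# Stand-in implementations that only record the call order.
calls: List[str] = []


def fake_story(poi: str) -> List[str]:
    calls.append("story")
    return [f"Welcome to {poi}.", "The tour continues."]


def fake_kml(paragraphs: List[str]) -> str:
    calls.append("kml")
    return "<kml/>"


def fake_send(kml: str) -> None:
    calls.append("send")


def fake_tts(text: str) -> bytes:
    calls.append("tts")
    return text.encode("utf-8")


clips = run_tour("Paris", fake_story, fake_kml, fake_send, fake_tts)
print(calls)
```

Passing the components in as callables keeps the sequence testable without network access or a physical rig, which is the main reason for this shape.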

4.5 Utility of Internship

This section discusses how the GSoC project contributed to both the intern's
personal learning and the goals of the Liquid Galaxy team.

• Personal Learning:
The project provided valuable experience in integrating cutting-edge AI
models (like Gemini), working with Liquid Galaxy for immersive
experiences, and developing a Flutter mobile app.

• Impact on Liquid Galaxy:
The generated travel itineraries, along with immersive 3D visualization,
enhance the user experience of the Liquid Galaxy system, making it
more interactive and personalized.

• Broader Utility:
The project showcases how AI and immersive technology can
revolutionize areas like travel planning, education, and virtual tourism,
providing valuable insights into AI-powered user interfaces and 3D
visualization.

CHAPTER - 5
Conclusion and Future Scope

5.1 Conclusion

The Fictional Travel Itinerary Generator project aimed to create an
immersive, AI-powered travel experience using a combination of cutting-edge
technologies. By integrating Google DeepMind’s Gemini API, Flutter, Liquid
Galaxy, and Bark AI, the project successfully achieved the goal of generating
personalized and interactive travel stories.

Throughout the development process, the following key objectives were met:

1. AI-Powered Storytelling: The integration of the Gemini API enabled the
creation of personalized travel itineraries, generating compelling
narratives about users’ journeys to various points of interest (POIs). Each
story was structured to highlight individual sub-POIs in a coherent,
engaging, and immersive way.

2. Immersive Visuals and Interaction: The project successfully leveraged
the Liquid Galaxy system, allowing users to "fly" through virtual
locations. By using KML files, the application could display interactive
maps and visualizations, creating an enhanced touring experience.

3. Voice and Accessibility Features: By integrating the Bark AI text-to-
speech model, the app converted the generated travel stories into realistic,
natural-sounding voiceovers. This, combined with Flutter’s speech-to-
text capabilities, offered seamless voice interactions, enhancing
accessibility for users with disabilities and providing a more interactive
experience.

4. Cross-Platform Application: The mobile application, built using
Flutter, provided users with a flexible and responsive interface, which
was compatible with both Android and iOS. The user interface was
designed to be intuitive, ensuring ease of use during navigation, story
generation, and interaction with the Liquid Galaxy rig.

5. User Personalization: The app offered customization options for travel
stories, allowing users to modify key story elements such as the traveler’s
name, sub-POIs, and itinerary sequence. This ensured that each user had a
unique and tailored experience.

Overall, the project achieved its goals of blending AI, immersive travel
experiences, and interactive storytelling into a cohesive and engaging
platform, offering users a creative and personalized way to explore new
destinations.

5.2 Future Scope

While the Fictional Travel Itinerary Generator has met its core objectives,
there are several areas where the project can be expanded or improved in the
future:

1. Enhanced AI Models and Personalization

• More Advanced AI Models: The Gemini API was a powerful tool for
generating stories; however, further advancements in natural language
processing (NLP) and machine learning could allow for more
sophisticated and nuanced story generation. Integrating additional
machine learning models could enhance the ability of the system to
understand deeper user preferences and generate even more personalized
itineraries.
• User Behavior Learning: The app could use feedback and data from
previous user interactions to learn about their preferences over time. This
would enable more accurate story generation based on their past choices,
such as preferred travel themes, destinations, and sub-POIs.
2. Augmented Reality (AR) Integration

• AR for Real-World Interaction: Integrating AR technologies would
allow users to overlay the generated stories onto real-world
environments. By pointing their smartphones or AR glasses at specific
locations, users could see real-time visualizations of the travel
destinations and sub-POIs, further enhancing the immersive experience.
• Interactive AR Maps: Using ARKit (for iOS) or ARCore (for Android),
users could interact with the travel stories by navigating the virtual
locations in real-time, enriching the touring experience.
3. Multi-User Collaboration

• Shared Storytelling: Future versions of the app could support multi-user
collaboration, where users can create and share their travel stories with
friends or a community. This could involve collaborative story editing,
shared itineraries, and even group virtual tours where users can interact
with each other while exploring virtual locations together.
• Social Media Integration: By integrating social media platforms, users
could share their generated travel stories and itineraries, allowing them to
inspire others to explore new destinations.
4. Expanded Language Support

• Multilingual Storytelling: Currently, the app supports a limited number
of languages. Expanding language support would make the application
more accessible to a global audience. By utilizing the multilingual
capabilities of AI models such as Gemini, the app could generate travel
stories in multiple languages, further extending its reach.
• Localized Content: The addition of culturally specific travel stories and
sub-POIs could be explored to cater to diverse global audiences,
providing users with content that is both linguistically and culturally
relevant.
5. Integration with Real-Time Data

• Live Travel Data: Future versions of the app could incorporate real-time
data, such as current weather conditions, local events, and updated points
of interest, to provide users with dynamic itineraries. For instance, users
could receive travel suggestions based on the current weather or season,
or they could be alerted about special events occurring at a given POI.
• AI-Powered Dynamic Itineraries: Incorporating real-time AI
algorithms that dynamically adjust the travel itinerary based on user
preferences, availability of sub-POIs, and external factors (such as
weather or traffic) could improve the experience and make the app more
practical for real-world use.
6. Expanding Liquid Galaxy Integration

• More Interactivity with Liquid Galaxy: Future updates could include
additional interactivity with the Liquid Galaxy rig, such as allowing
users to control the virtual tour, navigate different POIs in real-time, or
interact with virtual environments using hand gestures or voice
commands.
• Higher-Resolution Graphics: With advancements in hardware and
software, there is the potential to enhance the visual quality of the virtual
tours, providing users with more realistic graphics, 360-degree views, and
high-quality animations.
7. Cross-Platform Deployment and Compatibility

• Desktop and Web Applications: Extending the app to a desktop
platform or as a web-based application would allow users to access the
service on a broader range of devices, including laptops, desktops, and
web browsers. This would ensure that users without mobile devices could
still experience the app’s features.
• Wearable Devices: Integrating the app with wearable technologies like
smartwatches and smart glasses could enable users to receive travel
suggestions, notifications, and real-time location-based updates on the go,
enhancing the accessibility and functionality of the app.
