UNIVERSIDAD POLITECNICA TERRITORIAL DE YARACUY
DR. RANMER MARCHAN
ENGLISH I (T1 – T2) OBJ. 7: Connectors
VIRTUAL WORKSHOP (20%)
NAME: __________________________________________________________ FILE: _______________
PART I: Insert 20 English connectors into the text.
What is Google Gemini?
Gemini is Google’s large language model (LLM). More broadly, it’s a
family of multimodal AI models designed to process multiple modalities
or types of data, including audio, images, software code, text and video.
Gemini is also the model that powers Google’s generative AI (gen AI)
chatbot of the same name (formerly Bard), much as Anthropic’s Claude
refers to both the chatbot and the family of LLMs behind it. The
Gemini apps on both the web and mobile act as a chatbot interface for
the underlying models.
Google is gradually integrating the Gemini chatbot into its suite of
technologies. For instance, Gemini is the default artificial intelligence
(AI) assistant on the latest Google Pixel 9 and Pixel 9 Pro phones,
replacing Google Assistant. In Google Workspace, Gemini is available on
the Docs side panel to help write and edit content, and on the Gmail side
panel to assist with drafting emails, suggesting responses and searching
a user’s inbox for information.
Other Google apps are also incorporating Gemini. Google Maps, for
example, is drawing on Gemini model capabilities to supply summaries
of places and areas.
How does Google Gemini work?
Gemini has been trained on a massive corpus of multilingual and
multimodal data sets. It employs a transformer model, a neural
network architecture that Google itself introduced in 2017.
Here’s a brief overview of how transformer models work (see the sketch after this list):
Encoders transform input sequences into numerical
representations called embeddings that capture the semantics and
position of tokens in the input sequence.
A self-attention mechanism enables transformers to “focus their
attention” on the most important tokens in the input sequence,
regardless of their position.
Decoders use this self-attention mechanism and the encoders’
embeddings to generate the most statistically probable output
sequence.
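To make the encoder, self-attention and decoder steps above more concrete, here is a minimal, illustrative sketch of scaled dot-product self-attention in Python with NumPy. The toy embeddings, projection matrices and dimensions are made up for illustration; real transformer models such as Gemini stack many layers, use multiple attention heads and learn these weights during training.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(embeddings, W_q, W_k, W_v):
    # embeddings: (seq_len, d_model) numerical representations of the tokens.
    Q = embeddings @ W_q  # queries
    K = embeddings @ W_k  # keys
    V = embeddings @ W_v  # values
    d_k = K.shape[-1]
    # Scores measure how strongly each token attends to every other token,
    # regardless of where it sits in the sequence.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    # Each output vector is a weighted mix of the value vectors.
    return weights @ V

# Toy example: 4 tokens with 8-dimensional embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
embeddings = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = [rng.normal(size=(d_model, d_model)) for _ in range(3)]
print(self_attention(embeddings, W_q, W_k, W_v).shape)  # (4, 8)
```

A decoder applies this same kind of attention (plus attention over the encoder’s embeddings) and then selects the most statistically probable next token.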
Unlike generative pretrained transformer (GPT) models, which take only
text-based prompts, or diffusion models used for image generation, which
take both text and image prompts, Google Gemini supports interleaved
sequences of audio, image, text and video as inputs and can produce
interleaved text and image outputs.
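As an illustration of that multimodal interface, the sketch below sends an interleaved text-and-image prompt to a Gemini model through Google’s google-generativeai Python SDK. The API key, the model name (gemini-1.5-flash) and the image file chart.png are placeholders, and the exact SDK surface can vary between versions; this is a usage sketch, not Google’s reference example.

```python
import google.generativeai as genai
from PIL import Image

# Placeholder credentials and inputs; substitute your own.
genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-1.5-flash")
image = Image.open("chart.png")

# The prompt is an interleaved sequence: text, then an image, then more text.
response = model.generate_content(
    ["Here is a chart:", image, "Summarize what it shows in two sentences."]
)
print(response.text)
```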
A brief history of Google Gemini
Google has been a pioneer in LLM architecture and draws upon its robust
research to develop its own AI models.
2017: Google researchers present the transformer architecture,
which underpins many of today’s LLMs.
2020: The company introduces the Meena chatbot, a neural
network-based conversational agent with 2.6 billion parameters.
2021: Google unveils LaMDA (Language Model for Dialogue
Applications), its conversational LLM.
2022: PaLM (Pathways Language Model) is released, with more
advanced capabilities compared to LaMDA.
2023: Bard launches during the first quarter of the year, backed by a
lightweight, optimized version of LaMDA. The second quarter
sees PaLM 2 released—with enhanced coding, multilingual and
reasoning skills—and adopted by Bard. Google announces Gemini
1.0 in the last quarter of the year.
2024: Google renames Bard as Gemini and upgrades its
multimodal AI models to version 1.5.
The word “Gemini” means “twins” in Latin and is both a zodiac sign and
a constellation. It was an apt name given that the Gemini model is the
brainchild of Google DeepMind, a merging of forces between the teams
at DeepMind and Google Brain. The company also took inspiration from
NASA’s Project Gemini, the two-person spacecraft program that was
integral to the success of the Apollo missions.