Written Assignment 6
1. Data, information and knowledge
Data, information, and knowledge form a hierarchy of values in information systems, each
building on the previous:
Data are raw, unprocessed facts, without context or purpose. They can be quantitative (e.g.,
“34 units”) or qualitative (e.g., “Red” as a car color). By themselves, data points—such as
sensor readings, transaction records, or customer names—lack meaning until organized or
interpreted.
Information results when data are processed and given context, relevance, and purpose. For
example, aggregating daily sales transactions into monthly totals transforms isolated data
points into information that reveals trends, such as a 15% increase in sales during spring and a
drop in winter. This information helps managers make data-driven decisions.
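The step from data to information can be sketched in a few lines of Python. This is only a minimal illustration: the sales records, the function name monthly_totals, and the chosen months are invented for the example and do not come from a real system.

```python
from collections import defaultdict

# Hypothetical daily sales records (ISO date, amount); values are invented.
daily_sales = [
    ("2024-01-15", 120.0),
    ("2024-01-28", 80.0),
    ("2024-04-03", 150.0),
    ("2024-04-19", 80.0),
]

def monthly_totals(records):
    """Aggregate (ISO date, amount) pairs into totals keyed by 'YYYY-MM'."""
    totals = defaultdict(float)
    for day, amount in records:
        month = day[:7]  # "2024-01-15" -> "2024-01"
        totals[month] += amount
    return dict(totals)

totals = monthly_totals(daily_sales)

# Percentage change from January to April: a trend a manager could act on.
change = (totals["2024-04"] - totals["2024-01"]) / totals["2024-01"] * 100
print(totals)
print(f"April vs. January: {change:+.1f}%")
```

The individual records say little on their own; the monthly totals and the percentage change are what carry meaning for a decision-maker.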
Knowledge emerges when information is interpreted, absorbed and applied to make
decisions. It represents the human understanding of relationships among facts. For instance,
recognizing that higher spring sales correlate with specific marketing campaigns and using
this key insight to plan future promotions illustrates knowledge. Knowledge can be explicit
(documented procedures) or tacit (intuition based on experience) and facilitates action,
policy-setting, and innovation.
2. Big Data
Big data refers to extremely large and complex data sets which cannot be handled by
traditional methods and tools. The growth in the volume, variety, and velocity of digital
information has led to the creation of the term “Big Data”. Big data is often described by
these “3 V’s”: huge Volume (massive amounts of data generated), high Variety (many
different data types and sources), and great Velocity (data being produced very rapidly).
These characteristics make big data challenging to store, process, and analyze using
conventional methods. In practice, everyday user activities on social media, online
transactions, and IoT sensor feeds all contribute to this phenomenon. It is estimated that by
2025 about 463 exabytes of data will be generated globally each day, equivalent to over 212
million DVDs (GeeksforGeeks, 2022; OpenAI, 2025).
Businesses and governments can use big data analytics to discover patterns or trends (such as
customer behavior or disease outbreaks), using specialized tools (such as Machine Learning)
and techniques to extract insights from these vast information resources (OpenAI, 2025).
2. Redundant data and data integrity
Redundant data occurs when the exact same information is stored in multiple places. This
leads not only to storage waste but also to inconsistencies. For example, if a student’s details
are stored in both course and payment records, updates may not align, harming data integrity
(openlibrary-repo.ecampusontario.ca). To prevent this, normalization structures data into
related tables, ensuring each fact is stored exactly once. Integrity constraints further enforce
consistency; a referential integrity constraint, for instance, could prevent grade entries for
non-existent students.
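The grade example can be tried out with Python's built-in sqlite3 module. The sketch below is one possible setup and not a prescribed design: the table names Students and Grades, their columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE Students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE Grades (
           student_id INTEGER NOT NULL REFERENCES Students(student_id),
           course     TEXT NOT NULL,
           grade      TEXT NOT NULL
       )"""
)
conn.execute("INSERT INTO Students VALUES (1, 'Ada')")
conn.execute("INSERT INTO Grades VALUES (1, 'Databases', 'A')")  # student 1 exists

try:
    # There is no student 99, so the constraint should refuse this row.
    conn.execute("INSERT INTO Grades VALUES (99, 'Databases', 'B')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("Grade for non-existent student rejected:", rejected)
```

Because the database itself refuses the invalid row, consistency does not depend on every application remembering to check first.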
3. Relational database, Data Organization
A relational database organizes data into tables which are interconnected. For example, a
small bookstore database might have two tables named "Books" and "Authors". The "Books"
table stores information such as book ID, title, genre, price, and author ID, while the
"Authors" table contains author ID, name, nationality, and birth year. Each book is linked to
an author by the unique author ID, which creates a relational link. This structure prevents
redundancy since each author’s details are stored exactly once, regardless of how many books
they have authored. If an author's information changes, the update only needs to occur in one
location. Queries can efficiently retrieve comprehensive data, such as all books by a certain
author, by joining the tables. This relational approach simplifies data maintenance,
reduces duplication, and ensures integrity across related data points (Bourgeois et al., 2019,
p. 83).
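The bookstore structure described above could look roughly as follows when built with Python's sqlite3 module. The column names, the helper function books_by_author, and the prices are assumptions chosen for the sketch; only the two table names and their fields come from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Authors (
        author_id   INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        nationality TEXT,
        birth_year  INTEGER
    );
    CREATE TABLE Books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        genre     TEXT,
        price     NUMERIC,
        author_id INTEGER REFERENCES Authors(author_id)
    );
    -- Sample rows for illustration only; prices are invented.
    INSERT INTO Authors VALUES (1, 'Jane Austen', 'British', 1775);
    INSERT INTO Books VALUES (10, 'Emma', 'Novel', 9.99, 1);
    INSERT INTO Books VALUES (11, 'Persuasion', 'Novel', 8.50, 1);
    """
)

def books_by_author(connection, author_name):
    """Return the titles of all books by one author, found by joining the tables."""
    rows = connection.execute(
        """SELECT Books.title
           FROM Books JOIN Authors ON Books.author_id = Authors.author_id
           WHERE Authors.name = ?
           ORDER BY Books.title""",
        (author_name,),
    ).fetchall()
    return [title for (title,) in rows]

print(books_by_author(conn, "Jane Austen"))
```

The author's name, nationality, and birth year appear in one row only, yet the join makes them available for every book that carries the matching author ID.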
4. Primary key, table field characteristics needed to qualify as a candidate
To qualify as a primary key candidate, a field must hold a unique value for every record and
cannot contain null or invalid values. Ideal
candidates include fields like ID numbers, serial numbers, or codes designed to uniquely
identify and represent each record. For instance, in a "Customers" table, a customer ID would
serve as an effective primary key since it ensures each customer record remains distinct
(Bourgeois et al., 2019, p. 80).
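The two requirements, uniqueness and no missing values, can be expressed as a small check. The function is_candidate_key and the sample customer rows below are hypothetical and serve only to illustrate the rule.

```python
def is_candidate_key(records, field):
    """Return True if `field` is non-null and unique across all records."""
    seen = set()
    for record in records:
        value = record.get(field)
        if value is None or value in seen:
            return False  # a missing or repeated value disqualifies the field
        seen.add(value)
    return True

# Hypothetical "Customers" rows; names and IDs are invented for illustration.
customers = [
    {"customer_id": 101, "name": "Lee", "city": "Oslo"},
    {"customer_id": 102, "name": "Kim", "city": "Oslo"},
    {"customer_id": 103, "name": "Lee", "city": None},
]

for field in ("customer_id", "name", "city"):
    print(field, is_candidate_key(customers, field))
```

Here only customer_id qualifies: the name "Lee" occurs twice, and the city field contains both a repeated and a missing value.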
5. Purpose of Normalization
Normalization structures a database to reduce redundancy and improve data integrity. In
the bookstore example above, without normalization, each book record could
redundantly store the author's full details. With normalization, data is split into related tables,
which store each piece of information just once. Thus, author details are stored in an
"Authors" table, while books refer to authors via an author ID. If an author’s details change,
only one update is required, ensuring consistency and reducing errors. Normalization also
helps databases remain efficient by decreasing storage requirements and simplifying data
management, ultimately ensuring accuracy and reliability across the database (Bourgeois et
al., 2019, p. 86).
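A rough sketch of this split, using plain Python dictionaries in place of database tables, may make the effect of a single update easier to see. The function normalize and the sample rows are assumptions for illustration, not a formal normalization procedure.

```python
# Unnormalized: every book row repeats the author's details (sample values).
flat_books = [
    {"title": "Emma", "author": "Jane Austen", "nationality": "British"},
    {"title": "Persuasion", "author": "Jane Austen", "nationality": "British"},
]

def normalize(rows):
    """Split flat rows into an authors table and a books table linked by author_id."""
    authors = {}      # author_id -> author details, stored once
    ids_by_name = {}  # helper lookup: author name -> author_id
    books = []
    for row in rows:
        name = row["author"]
        if name not in ids_by_name:
            author_id = len(authors) + 1
            ids_by_name[name] = author_id
            authors[author_id] = {"name": name, "nationality": row["nationality"]}
        books.append({"title": row["title"], "author_id": ids_by_name[name]})
    return authors, books

authors, books = normalize(flat_books)

# One update in the authors table is seen by every book that references it.
authors[1]["nationality"] = "English"
for book in books:
    print(book["title"], "->", authors[book["author_id"]]["nationality"])
```

In the flat version the same change would have to be repeated in every book row, and missing one of them would leave the data inconsistent.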
6. Importance of Defining Data Types
Declaring data types ensures that data is stored and manipulated correctly, which prevents
errors and maintains data consistency. For example, numeric fields defined as integers or
floats ensure that mathematical operations are performed properly. If such a field were
declared as a string instead, two numbers would be concatenated rather than added. A "Price"
field is defined as a decimal to allow accurate calculations in financial settings. Text fields
are defined as strings to store names or addresses with the correct format and
length limits. A "Date" field restricts entries to valid dates, preventing nonsensical entries.
Defining appropriate data types also enhances performance by optimizing storage and
retrieval. Precise data types thus improve the overall functionality, reliability, and usability of
databases (Bourgeois et al., 2019, p. 79).
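The effect of the declared type can be shown with Python's own types, which behave much like the field types discussed here. The values and the helper is_valid_date are chosen only for illustration; a database would enforce the same rules through its column definitions.

```python
from datetime import date
from decimal import Decimal

# The same two values behave differently depending on their type.
as_text = "34" + "6"   # strings are concatenated
as_number = 34 + 6     # integers are added

# Binary floats can carry rounding error; Decimal keeps exact decimal amounts.
float_total = 0.1 + 0.2
decimal_total = Decimal("0.10") + Decimal("0.20")

def is_valid_date(text):
    """Return True if `text` is a real calendar date in YYYY-MM-DD form."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False

print(as_text, as_number)
print(float_total == 0.3, decimal_total == Decimal("0.30"))
print(is_valid_date("2025-02-28"), is_valid_date("2025-02-30"))
```

The float comparison illustrates why an exact decimal type is usually preferred for money, and the date check shows how a typed field can reject an entry such as 30 February.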
7. Example of an SQL query that would retrieve all the data from a table in the database
example you described in Question 4.
SELECT * FROM Books;
8. Data Warehouse Concept and Advantages
A data warehouse centralizes large volumes of data from different sources and optimizes it
for analysis and reports. It stores various historical data and integrates multiple data sources
into one cohesive environment, where all the data can be retrieved. Because the data is
time-variant, each entry carries a time stamp, “which allows for comparisons
between different time periods” (Bourgeois et al., 2019, p. 92). One advantage is
enhanced reporting: centralized data simplifies generating
comprehensive reports, as “an organization can generate ‘one version of the truth’”
(Bourgeois et al., 2019, p. 93). Another advantage is analytics, which supports efficient and
strategic decision-making. A third is improved data quality, as centralization
promotes standardization, accuracy, and consistency across datasets and
facilitates reliable analysis. Finally, querying is efficient: optimized
structures and indexing accelerate query performance, so insights can be obtained quickly
even from massive datasets.
9. Knowledge Management Concept
Knowledge management (KM) is the systematic management of an organization's accumulated
knowledge, much of which is not written down, in order to improve efficiency and
innovation. It includes creating, capturing, indexing, storing, sharing, and applying the
company’s knowledge effectively. KM distinguishes between explicit knowledge
(documented procedures, manuals) and tacit knowledge (employee insights, expertise).
Effective KM strategies promote organizational learning, enhance decision-making, and
foster innovation by ensuring valuable knowledge is accessible and applicable. For example,
a company can capture and distribute documented procedures internally (explicit knowledge),
while mentorship programs can transfer experienced workers’ insights (tacit knowledge). Overall,
KM helps organizations leverage their collective knowledge assets strategically (Bourgeois et
al., 2019, p. 95).
References:
Bourgeois, D. T., Smith, J. L., Wang, S., & Mortati, J. (2019). Information systems for
business and beyond (Updated ed.). Biola University.
https://digitalcommons.biola.edu/open-textbooks/1
GeeksforGeeks. (2022, December 2). What is big data? GeeksforGeeks.
https://www.geeksforgeeks.org/what-is-big-data/
OpenAI. (2025). ChatGPT (o4-mini) [Large language model]. Retrieved May 10, 2025, from
https://chat.openai.com/