2023 Assignment
Task 2 – Using the dataset and description below, answer the questions that
follow:
Dataset is available on Kaggle, titled “eBay Product Listing Dataset”, at
https://www.kaggle.com/datasets/promptcloud/ebay-product-listing-dataset?resource=download&select=marketing_sample_for_ebay_com-ebay_com_product_details__20200901_20201031__30k_data.csv
IMPORTANT NOTE:
Total Records Count: 980958
Domain Name: ebay.com
Date Range: 01st Sep 2020 - 31st Oct 2020
File Extension: csv
Available Fields: Uniq Id, Crawl Timestamp, Pageurl, Website, Title, Num Of Reviews,
Average Rating, Number Of Ratings, Model Num, Sku, Upc, Manufacturer, Model
Name, Price, Monthly Price, Stock, Carrier, Color Category, Internal Memory, Screen
Size, Five Star, Four Star, Three Star, Two Star, One Star
Given the description of the dataset, you have been tasked with preparing data and
carrying out an assessment analysis report or dashboard on PowerBI.
Using a minimum of 1,000 words and excluding list of references, explain your work,
interpret the results and reflect on your experiences:
a) Explain how you built your Power BI report service (Microsoft, 2019) and the
issues you faced. In particular, how you achieved the following:
a. Given that Table 1 shows that “Missing Data is Yes”, what would you
recommend for checking the quality of the data? [4]
b. Explain how to upload/retrieve dataset onto Power BI service [6],
c. built your report/dashboard [8], and
d. shared your report/dashboard with tutor and lecturer [2]
b) Interpret the results of running your report/dashboard, using any four (4)
suitable graphs that are interlinked, [12] (support your interpretation with visual
evidence).
c) Reflect on lessons learned, citing any noticeable trends from your findings [8]
(support your answer with visual insights).
Task 2 Resources/List of references:
1) Microsoft (2019) ‘From Excel workbook to stunning report in the Power BI service’.
Available at: https://docs.microsoft.com/en-us/power-bi/service-from-excel-to-stunning-report
(Accessed: 6th March 2020)
2) Data set description from UCI Machine Learning Repository. Available at:
https://archive.ics.uci.edu/ml/datasets/Census-Income+%28KDD%29 (Accessed:
14th March 2022)
3) PowerBI Tutorial Reference [online]: Power BI Tutorial – A Complete Guide on
Introduction to Power BI. Available at: https://data-flair.training/blogs/power-bi-tutorial/
(Accessed: 17th March 2022)
TASK 2
a)
i) To check for missing values in the data set, I recommend using Power BI
to assess data quality. After I uploaded the data set to Power BI,
missing values appeared as "NA". To avoid biasing the results, I removed
the columns with missing values that were not useful for the
analysis.
I also removed redundant and unnecessary columns and removed duplicate entries in the
"Model Name" column to streamline the data. This left 142 rows and 15 columns, a smaller
and cleaner table that is easier to build reports from.
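The cleaning steps described above can be sketched outside Power BI as a quick cross-check. The Python below is only an illustration under stated assumptions: the sample rows are invented, the "more than half missing" threshold is my own choice, and treating both empty cells and "NA" as missing is an assumption about how the file marks gaps.

```python
import csv
import io

# Hypothetical sample rows; the real file has more fields and far more records.
SAMPLE = """Manufacturer,Model Name,Num Of Reviews,Carrier
Samsung,Galaxy S10,12,NA
Samsung,Galaxy S10,12,NA
Apple,iPhone 11,7,
Sony,Xperia 1,NA,NA
"""

MISSING = {"", "NA"}  # markers treated as missing (assumption)


def missing_counts(rows):
    """Count missing cells per column."""
    counts = {}
    for row in rows:
        for col, val in row.items():
            is_missing = (val or "").strip() in MISSING
            counts[col] = counts.get(col, 0) + int(is_missing)
    return counts


def drop_columns(rows, cols):
    """Return copies of the rows without the given columns."""
    return [{k: v for k, v in r.items() if k not in cols} for r in rows]


def dedupe(rows, key):
    """Keep only the first row seen for each value of `key`."""
    seen = set()
    kept = []
    for r in rows:
        if r[key] not in seen:
            seen.add(r[key])
            kept.append(r)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = missing_counts(rows)

# Drop columns where more than half of the values are missing (assumed threshold).
to_drop = {c for c, n in counts.items() if n > len(rows) / 2}
cleaned = dedupe(drop_columns(rows, to_drop), "Model Name")

print(counts)
print(len(cleaned), "rows remain after cleaning")
```

Reporting the per-column missing counts before deleting anything makes it easier to justify which columns were removed and why.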
ii) Power BI was already installed on my computer, so I did not need to
download or install it. After launching the app, I selected the
"import from excel" option and navigated to the "archive" folder in my
Downloads folder. There I located and opened the file titled
"marketing_sample_for_ebay_com-ebay", then clicked the Load button to
upload the data into Power BI.
iii) To build my report, I opened the report view in my workspace and used
the report editor, which has three panes: the Fields pane, the
Visualizations pane, and the Filters pane. I first selected the
"stacked bar chart" icon in the Visualizations pane and placed the
"Manufacturer" field on the Y-axis and the "Num of reviews" field on
the X-axis to create my first visual. I then repeated this process for three
more visuals: a multi-row card using the "Model name", "Internal memory",
and "Screen size" fields; a line chart using the "Manufacturer" and
"Five star" fields; and a pie chart using the "Manufacturer" and
"Number of ratings" fields.
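The chart headings in section b) refer to a "count" of each field by manufacturer. As a rough illustration of what such a grouping does, the Python sketch below groups hypothetical rows by manufacturer and computes both a count of non-missing values and a sum. The rows and figures are invented, and the function name `aggregate` is my own; this is not how Power BI itself is implemented.

```python
# Hypothetical rows: (manufacturer, number of reviews); None marks a missing value.
rows = [
    ("Samsung", 12),
    ("Samsung", 3),
    ("Samsung", None),
    ("Apple", 7),
    ("Sony", None),
]


def aggregate(rows):
    """Return (count of non-missing values, sum of values) per manufacturer."""
    count, total = {}, {}
    for manufacturer, reviews in rows:
        # Make sure every manufacturer appears, even with no usable values.
        count.setdefault(manufacturer, 0)
        total.setdefault(manufacturer, 0)
        if reviews is not None:
            count[manufacturer] += 1
            total[manufacturer] += reviews
    return count, total


count_by_mfr, sum_by_mfr = aggregate(rows)

# List manufacturers from highest to lowest count, as a sorted bar chart would.
for name in sorted(count_by_mfr, key=count_by_mfr.get, reverse=True):
    print(name, "count:", count_by_mfr[name], "sum:", sum_by_mfr[name])
```

A count reports how many listings carry a value, while a sum adds the values themselves, so it is worth stating which aggregation each visual uses when interpreting it.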
b) INTERPRETATION OF MY REPORT
i) Stacked Bar Chart representing Manufacturer and Count of Reviews
Presented above is a stacked bar chart that shows the number of reviews received by each
manufacturer. The bars are color-coded to distinguish between the data segments. Samsung
is the manufacturer with the highest number of reviews (40), which suggests stronger customer
engagement and possibly a higher level of customer satisfaction than other manufacturers.
Conversely, manufacturers such as Sony, OPPO, OnePlus, and others received no reviews, which
may indicate lower customer satisfaction or simply less customer feedback in this data set.
ii) Multi-Row Card representing model name, internal memory, and screen size.
The visual displayed above is called a Multi-Row Card and it presents a list of device model
names, their internal memory, and screen sizes in rows. This layout enables users to easily
compare these features across different products, making visual analysis faster and helping
users identify similarities or differences between them. Additionally, the Multi-Row Card can
be used as a filtering tool. By selecting specific rows or values, users can filter other visuals
or data on the report. For instance, selecting a particular model name will filter other visuals
and display relevant information about that product only.
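The filtering behaviour described above can be modelled in a few lines. The Python sketch below is a simplified illustration, not Power BI's actual mechanism: the rows are hypothetical, and `cross_filter` and `five_star_by_manufacturer` are names I introduced for the example. Selecting a model name narrows the rows, and the linked visual is then recomputed from the narrowed set.

```python
# Hypothetical rows standing in for the cleaned table.
rows = [
    {"Manufacturer": "Samsung", "Model Name": "Galaxy S10", "Five Star": 5},
    {"Manufacturer": "Samsung", "Model Name": "Galaxy A50", "Five Star": 2},
    {"Manufacturer": "Apple", "Model Name": "iPhone 11", "Five Star": 4},
]


def cross_filter(rows, field, value):
    """Keep only the rows whose `field` equals the selected `value`."""
    return [r for r in rows if r[field] == value]


def five_star_by_manufacturer(rows):
    """Total five-star ratings per manufacturer (the data behind a linked chart)."""
    totals = {}
    for r in rows:
        name = r["Manufacturer"]
        totals[name] = totals.get(name, 0) + r["Five Star"]
    return totals


# Without a selection, the linked chart uses every row.
all_totals = five_star_by_manufacturer(rows)

# Selecting one model on the card narrows the rows feeding the linked chart.
selected = five_star_by_manufacturer(cross_filter(rows, "Model Name", "Galaxy S10"))

print(all_totals)
print(selected)
```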
iii) Line chart representing count of Five star by Manufacturer
The above chart depicts a line graph that displays the quantity of five-star ratings that
different manufacturers have received. By utilizing a line chart, it becomes effortless to
compare the different manufacturers. You can easily observe the fluctuation of five-star
ratings among the manufacturers, which helps to determine which manufacturers have a
greater or lesser number of positive ratings. This information can prove to be useful for
decision-making, such as assessing the reputation or customer satisfaction of different
manufacturers. As the line chart shows, Samsung has the highest number of five-star ratings,
which suggests higher customer satisfaction, while manufacturers such as Sony, OPPO, OnePlus,
and others received no five-star ratings, which may indicate lower satisfaction or simply an absence of ratings.
iv) Pie Chart representing count of number of ratings by Manufacturer.
The above diagram presents a visual representation of the distribution of ratings among
various manufacturers in the form of a pie chart. The sections in the chart have been color-
coded to distinguish between different data segments. This chart visually displays the
proportion or percentage of each manufacturer's share of the total ratings. It enables a rapid
and intuitive comparison of the distribution of ratings across different manufacturers. This
depiction makes it easy to identify the manufacturer with the largest portion of the total
ratings, which is Samsung, and those with the smallest portions, which are Sony, OPPO,
OnePlus, and others.
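The proportions a pie chart displays are each category's value divided by the total. The Python sketch below shows that calculation with made-up figures; the counts are hypothetical and are not taken from the report, and `shares` is a helper name introduced only for this example.

```python
def shares(counts):
    """Convert raw counts into percentage shares of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when no ratings exist at all.
        return {name: 0.0 for name in counts}
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


# Hypothetical rating counts per manufacturer.
ratings = {"Samsung": 30, "Apple": 15, "LG": 5, "Sony": 0}
pct = shares(ratings)

for name, value in pct.items():
    print(f"{name}: {value}%")
```

A manufacturer with a zero count has a 0% share and therefore no visible slice, which is worth keeping in mind when reading the chart for manufacturers with no ratings.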
c) REFLECTION
i) While building my report, I observed a consistent trend: Samsung
outperformed the other manufacturers on every measure I charted, namely number of
reviews, five-star ratings, and total number of ratings.
ii) Conclusion:
Based on this consistent trend of Samsung outperforming other manufacturers, it may be
concluded that the company is committed to delivering high levels of customer satisfaction
and to producing superior products compared with its competitors.