Data Set

elective

Uploaded by

Richlyn Mannag
Optimizing Retail Strategies: Insight from Selling Price, Customer Review and Product Category


Erson C. Molate
Jose Rizal University, 3rd Year College
Cainta, Rizal
[email protected]

Abstract—This dataset offers a thorough examination of Adidas products that were available in the U.S. retail market as of October 23, 2021. It comprises 846 records, each containing 20 attributes that describe an individual product. These attributes include essential information such as product URL, name, SKU, price, availability, color, category, and customer reviews.

Compiled by Crawl Feeds, the dataset yields insights for researchers, analysts, and businesses involved in retail, e-commerce, marketing, supply chain management, and data science. Users can draw on this data to analyze market trends, understand customer preferences, optimize inventory management, and make data-driven decisions. However, the dataset's usefulness depends on interpreting it correctly, because insights can only be as valuable as the analysis that supports them. Although it provides a wealth of information, one must take care not to draw conclusions without appropriate context.

Keywords—Adidas, US retail products, product information, pricing, availability, customer reviews, market analysis, e-commerce, marketing, supply chain management, data science

I. INTRODUCTION

This dataset provides a snapshot of Adidas products in the US retail market as of October 23, 2021. It covers 846 unique Adidas products, and this study focuses on six key attributes: selling price, color, category, breadcrumbs, average rating, and review count. By analyzing these attributes, we can examine the relationship between average ratings and selling prices by category, look at variations in color within categories, and compare average selling prices and ratings across price ranges. These comparisons let us address the research questions posed in Section III. The resulting insights help retailers identify customer preferences, improve inventory management, and develop data-driven strategies. Using this dataset, businesses and analysts can gain a deeper understanding of the US sportswear market, allowing them to make better decisions and stay ahead of the curve.

II. DOMAIN DESCRIPTION

The retail industry, whose name is rooted in the French term "retailler," meaning "to cut off a piece" or "to break bulk," is a critical sector in the economic chain. It specializes in transactions that handle smaller quantities of goods for direct consumer use. Unlike wholesale, which focuses on large-scale purchases, retail represents the final stage of the product journey, directly connecting goods and services to customers.

Retail serves as a bridge between producers and end users, providing easy access to a wide variety of products and services while catering to customer needs and evolving expectations. This industry not only offers convenience but also supplies valuable information about product features, benefits, and usage, enabling informed purchasing decisions. Retailers play a crucial role in consumers' daily lives, fulfilling needs across categories and creating memorable shopping experiences.

Furthermore, retail drives employment, contributing significantly to economic growth by creating jobs in sales, customer service, logistics, and more. Ultimately, retailing's purpose extends beyond transactions; it builds a community of consumers, serving and connecting people through the exchange of products and services.

III. PROBLEM STATEMENT (3 QUESTIONS)

A. Is there a significant difference in average rating and review count for products that belong to similar subcategories but are sold at different price tiers?

B. How do customer ratings (average rating) and review counts vary across products with different feature combinations (such as color and category) within a specific price tier?

C. What is the impact of subcategory (from breadcrumbs) on the correlation between selling price and average rating?

V. DATA COLLECTION

A. Dataset Description

The Adidas US retail products dataset provides a detailed description of Adidas products available in the United States market, with 20 attributes across 846 records. It serves as a helpful resource for analyzing current developments in apparel and footwear in the sportswear market. Each record lists the important product features, starting with a URL that links users to the specific product page for more information, along with the product name and SKU for identification and stock monitoring.

The dataset was created by Crawl Feeds, an organization focused on collecting and presenting comprehensive datasets from different sources. Crawl Feeds concentrates on data gathering, giving users the ability to benefit from comprehensive market data. By creating structured datasets such as this one on Adidas products, it provides researchers, analysts, and businesses with critical information for understanding
consumer trends and optimizing inventory.

This dataset was finalized before it was uploaded and last updated on October 23, 2021. Customer satisfaction and product performance are among the indicators the dataset captures, and they underlie the research questions we derived from it. These questions explore whether customer satisfaction and product performance have an impact on product sales and availability.

The scope of this dataset is Adidas products sold in the U.S. as of October 23, 2021. It contains comprehensive data gathered from Adidas on product pricing, availability, and customer engagement.

There are 20 data fields (variables) in this dataset: URL, Name, sku, selling_price, original_price, currency, availability, color, category, source, source_website, breadcrumbs, description, brand, images, country, language, average_rating, review_count, and crawled_at. One additional attribute, price_tiers, was added for this study, bringing the total to 21. Further details are provided in the data dictionary.

Three datasets are related to this one, most of them from data.world: E Commerce Datasets, the Tesco groceries dataset, and a product dataset from Nike.

This dataset has many applications in fields such as retail, e-commerce, marketing, supply chain management, and data science. People working in these fields are its intended audience, because they can use the data to form strategies and make data-informed decisions.

The dataset also records the selling price and original price of each item, which sheds light on pricing strategies and discounting. All items are priced in US dollars, reflecting the focus on the United States market. The availability attribute shows the stock status of each item. Each product is also labeled by color and category, allowing for better organization and easier access.

The dataset also includes a brand attribute, which confirms that all listed items are part of the Adidas collection, and a list of images that aid visual understanding of each product and improve the online shopping experience. The country and language attributes indicate that the dataset is aligned with the United States market. Lastly, the average rating and review count give insight into customer feedback and satisfaction, which matter for market studies and product quality improvement. The crawled_at timestamp records when the data was collected, so users know how recent it is.

This dataset is especially useful for businesses, analysts, and researchers seeking to understand consumer preferences, inventory management, and pricing strategies in the sportswear market. It supports data-driven decisions related to inventory management, marketing strategies, and product development in the competitive sportswear sector.

The Adidas US retail products dataset is available only in CSV format, a common format for datasets organized in rows and columns. It is well suited to managing large tables such as the 846 records of Adidas products. The dataset is accessible to the public, but you need to sign in first.

B. Data Exploration

Selling Price
Attributes: Selling price - the price at which the retailer sells the product to a customer or buyer. It is the price the customer pays, without any discounts or additional fees.
Sample Data: 84, 28, 52

Color
Attributes: Color - the hue or shade of the product, as found in items such as clothes and shoes.
Sample Data: White, Blue, Red

Category
Attributes: Category - the classification that groups products and items into lists according to shared color, characteristics, or attributes.
Sample Data: Shoes, Accessories, Clothing

Breadcrumbs
Attributes: Breadcrumbs - the page's position in the site's hierarchy. It is a navigational aid that shows the path to the customer or user exploring the site. It also shows the specific classification of the products, such as Women/Clothing, Men/Clothing, and Kids/Clothing.
Sample Data: Women/Shoes, Training/Accessories, Women/Clothing

Average rating
Attributes: Average rating - a numeric value that reflects customer reviews of a product, usually on a scale of 1 to 5 stars.
Sample Data: 5, 4.8, 4.7

Review Count
Attributes: Review count - the total number of customer reviews received by each product.
Sample Data: 6, 206, 249

Price_tiers
Attributes: Price tiers - derived from the selling price attribute; it classifies each selling price as Low, Medium, or High.
Sample Data: Low, Medium, High

C. Data preprocessing

Data Import
The first step I took was to import my raw data, labeled as adidas USA.
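A minimal sketch of this import step in Python, assuming the raw data is a CSV file with the column names listed in the dataset description. The file name in the comment is taken from the reference URL; the three rows below are invented for illustration (they reuse the sample values from the data dictionary but are not records from the dataset), and `load_products` is a hypothetical helper.

```python
import csv
import io

# Hypothetical excerpt standing in for the raw CSV. Only the columns used in
# this study are shown, and the rows are invented for illustration.
SAMPLE_CSV = """selling_price,original_price,color,category,breadcrumbs,average_rating,review_count
84,120,White,Shoes,Women/Shoes,5,6
28,,Blue,Accessories,Training/Accessories,4.8,206
52,65,Red,Clothing,Women/Clothing,4.7,249
"""


def load_products(handle):
    """Read product rows and convert the numeric columns used in the study."""
    rows = []
    for row in csv.DictReader(handle):
        row["selling_price"] = float(row["selling_price"])
        row["average_rating"] = float(row["average_rating"])
        row["review_count"] = int(row["review_count"])
        rows.append(row)
    return rows


# For the real file one would pass
# open("adidas_usa.csv", newline="", encoding="utf-8") instead.
products = load_products(io.StringIO(SAMPLE_CSV))
print(len(products), "records;", len(products[0]), "columns")
```

The original_price column is left as text on purpose, so that blank cells remain visible for the cleaning step.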
Data Cleaning
As I reviewed my raw data, I observed missing values in the original_price attribute. After identifying them, I used the Sort & Filter feature to isolate and review the missing entries in the original_price attribute.

Outlier Detection (1)
To detect outliers in the selling price attribute, I created a box plot, which showed the lowest selling price as 9 and the highest as 240. Additionally, I constructed a pivot table to provide a clearer visualization and deeper understanding of these outliers within the dataset.

Outlier Detection (2)
As the box plot shows, most average ratings fall in the range of 4 to 5. The pivot table further shows that the ratings 1, 3, and 3.8 each appear only once, while 4.8 has the highest count.
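The cleaning and outlier checks described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the three lists are invented stand-ins for the real columns (only the minimum of 9 and maximum of 240 come from the text), and the 1.5 x IQR fence is the conventional box-plot rule, which may differ slightly from the quartile method of the charting tool that was used.

```python
from collections import Counter
from statistics import quantiles

# Invented example values; the real columns hold 846 entries each.
original_price = ["120", "", "65", "", "90"]
selling_price = [9, 28, 35, 52, 60, 69, 84, 124, 240]
average_rating = [4.8, 4.8, 4.7, 5.0, 4.8, 3.0, 1.0, 3.8, 4.5]

# Count blank cells, the same entries a sort-and-filter pass would isolate.
missing = sum(1 for value in original_price if value.strip() == "")

# Box-plot rule: points beyond 1.5 * IQR from the quartiles are flagged.
q1, _, q3 = quantiles(selling_price, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [p for p in selling_price if p < low_fence or p > high_fence]

# Frequency of each rating value, analogous to a pivot table of counts.
rating_counts = Counter(average_rating)

print("missing original_price:", missing)
print("selling_price range:", min(selling_price), "to", max(selling_price))
print("outliers:", outliers)
print("most common rating:", rating_counts.most_common(1)[0])
```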
I then narrowed the 20 attributes down to 6 (selling_price, color, category, breadcrumbs, average_rating, and review_count) and added 1 more attribute (price_tiers) to help me answer Questions 1 and 2.

After that, I formulated 3 questions based on the resulting 7 attributes (selling_price, price_tiers, color, category, breadcrumbs, average_rating, and review_count). The answers to these 3 questions further explain the dataset.

VI. EXPERIMENTAL DESIGN

Row Labels     Average of      Average of       Average of
               selling_price   average_rating   review_count
Clothing       43              4.61             56
  Low          32              4.61             69
  Medium       60              4.65             26
  High         160             4.33             3
Shoes          71              4.51             796
  Low          35              4.55             130
  Medium       69              4.50             719
  High         124             4.50             647
Grand Total    60              4.55             523
(Table: Question 1)

Rating = Average of average rating; Reviews = Sum of review counts.

                 Low                Medium             High               Total
Row Labels       Rating   Reviews   Rating   Reviews   Rating   Reviews   Rating   Reviews
Clothing         4.61     5298      4.65     678       4.33     8         4.61     5984
  Black          4.62     1701      4.61     532       4.50     6         4.61     2239
  Multicolor     4.63     71        4.75     107       N/A      N/A       4.68     178
  White          4.60     3526      4.76     39        4.00     2         4.61     3567
Shoes            4.55     34059     4.50     96365     4.50     13587     4.51     144011
  Black          4.49     15436     4.48     14834     4.49     5796      4.48     36066
  White          4.59     18623     4.51     81531     4.51     7791      4.52     107945
Grand Total      4.60     39357     4.53     97043     4.48     13595     4.55     149995
(Table: Question 2)

Row Labels       Average of       Average of
                 average_rating   selling_price
Men/Clothing     4.70             46
Men/Shoes        4.53             79
Women/Clothing   4.55             41
Women/Shoes      4.49             64
Grand Total      4.55             60
(Table: Question 3)

VII. RESULTS AND DISCUSSION

A. Is there a significant difference in average rating and review count for products that belong to similar subcategories but are sold at different price tiers?
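The grouping behind this comparison can be sketched in Python. This is only an illustration: the paper does not state where the Low, Medium, and High tiers divide, so the cut-offs `LOW_MAX` and `MEDIUM_MAX`, the `price_tier` helper, and the six product rows are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Assumed cut-offs: the paper does not state where the tiers divide.
LOW_MAX = 50
MEDIUM_MAX = 100


def price_tier(selling_price):
    """Map a selling price to Low, Medium, or High (hypothetical thresholds)."""
    if selling_price < LOW_MAX:
        return "Low"
    if selling_price < MEDIUM_MAX:
        return "Medium"
    return "High"


# Invented rows: (category, selling_price, average_rating, review_count).
products = [
    ("Clothing", 30, 4.6, 70),
    ("Clothing", 34, 4.8, 50),
    ("Clothing", 60, 4.7, 26),
    ("Shoes", 35, 4.5, 130),
    ("Shoes", 70, 4.5, 700),
    ("Shoes", 160, 4.4, 3),
]

# Group rows by (category, tier), as the pivot table's row labels do.
groups = defaultdict(list)
for category, price, rating, reviews in products:
    groups[(category, price_tier(price))].append((price, rating, reviews))

# Average each measure within a group, mirroring the pivot table columns.
pivot = {
    key: {
        "avg_price": mean(r[0] for r in rows),
        "avg_rating": round(mean(r[1] for r in rows), 2),
        "avg_reviews": mean(r[2] for r in rows),
    }
    for key, rows in groups.items()
}

for key in sorted(pivot):
    print(key, pivot[key])
```

Grouping on the pair (category, tier) keeps each tier inside its own category, so that Clothing and Shoes are compared tier by tier rather than pooled.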
Based on the pivot table, I can infer that average rating and review count follow a pattern across price tiers within the Category attribute (Clothing and Shoes). In the Low price tier, average ratings are at or near the top in both categories (4.61 for Clothing and 4.55 for Shoes), and in Clothing the Low tier also has the highest average review count (69). These results suggest that customers are satisfied with low-tier products and are likely to leave a review reflecting their experience with the purchase. The Medium price tier matches or exceeds the High price tier in both average rating and review count, which points the same way as the Low tier: customers are more likely to leave high ratings and feedback on these products. Despite its high selling price, the High price tier does not meet the expectations of the customers who bought from it; in Clothing in particular, it has the lowest average rating (4.33) and almost no feedback (an average review count of 3). In conclusion, low and medium price tier products in both categories appear to meet customer satisfaction and expectations, whereas higher-priced products do not always justify their selling price in the eyes of the customer. This answers the question because it highlights the importance of making sure product pricing aligns with customer needs and expectations.

B. How do customer ratings (average rating) and review counts vary across different feature combinations (such as color and category) within a specific price tier?

Based on my pivot table, the Clothing and Shoes categories show that Low price tier products tend to meet customer satisfaction and expectations, with strong customer feedback (review count) and average ratings across different colors. In Clothing, Black and White have their highest review counts in the Low price tier. Multicolor has no High price tier products; its Medium tier reaches an average rating of 4.75, one of the highest in Clothing. For Shoes, the Medium tier leads in review count overall, driven by White products, while the Low tier leads in average rating (4.55), which is surprising because I expected the lower tier to lead in both. The Low and Medium tiers therefore each come out ahead on a different measure. Unsurprisingly, the High price tier shows the lowest average ratings and review counts in both categories (Clothing and Shoes), which suggests that customers perceive better value in the Low and Medium price tiers. These results emphasize the importance of balancing pricing strategies with customer satisfaction to enhance feedback and retail sales. The data provides valuable insights into customer preferences and can inform future product development and marketing strategies.

C. What is the impact of subcategory (from breadcrumbs) on the correlation between selling price and average rating?

From the pivot table, I can see that average rating and selling price follow a pattern across subcategories that helps answer the question. Men/Clothing has the highest average rating (4.70) despite having one of the lowest average selling prices (44). This is not a surprise, because in retail low-priced items often receive high ratings, and good quality is a bonus at a low price. This suggests that customers are more satisfied with the value they receive in this subcategory. Men/Shoes, in contrast, has a higher average selling price (79) and a lower average rating (4.53), which may indicate that a high price leads customers to expect high quality and that satisfaction drops when the product does not deliver it. Meanwhile, Women/Shoes, with a mid-range average selling price (64), has the lowest average rating (4.49). Overall, the grand totals for average rating (4.55) and average selling price (60) show that high prices do not go together with high ratings. In fact, the correlation between selling price and average rating is -0.57, a moderately negative correlation in which higher prices are associated with slightly lower ratings. This indicates that higher-priced items tend not to meet customer expectations and needs, as we previously saw in the pivot table. It also shows that product type and customer expectations contribute substantially to overall customer satisfaction, and that the relation between selling price and average rating is not purely linear.

REFERENCE

Crawl Feeds. (2021, October 23). Adidas US retail products dataset [Data file]. Data World. Retrieved November 14, 2024, from https://data.world/crawlfeeds/adidas-us-retail-products-dataset/workspace/file?filename=adidas_usa.csv
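As a supplement to the discussion of question C, the correlation coefficient can be illustrated with a short Python sketch. It assumes, and this is only an assumption, that the coefficient was computed over the four subcategory averages rather than over individual products. The figures are those quoted in the discussion, with Women/Clothing taken from the Question 3 table; `pearson_r` is a helper written for this example.

```python
from math import sqrt

# Subcategory averages: (average selling price, average rating).
# Men/Clothing uses the price of 44 quoted in the discussion text.
subcategories = {
    "Men/Clothing": (44, 4.70),
    "Men/Shoes": (79, 4.53),
    "Women/Clothing": (41, 4.55),
    "Women/Shoes": (64, 4.49),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


prices = [price for price, _ in subcategories.values()]
ratings = [rating for _, rating in subcategories.values()]
r = pearson_r(prices, ratings)
print(round(r, 2))  # -0.57
```

Under this assumption the result matches the reported value of -0.57. Using the Men/Clothing price of 46 shown in the Question 3 table instead of 44 gives about -0.53, so the coefficient is sensitive to that single figure.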
