Better World Shopping mall is a shopping center that specifically caters to the apparel needs of the urban area residents.

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS
Question 1 

Better World Shopping Mall (BWSM) is a shopping center that specifically caters to the apparel needs of urban area residents. Since last year, its revenue has been declining and many retail shops have decided to move out of the mall. The management of BWSM wants to study the amount of money that its customers spend on shopping and divide them into groups for promotion. The management has decided to use clustering to better profile its customers according to their demographics and spending power. You are given a dataset (BWSM.csv), described in Table 1, to help BWSM carry out this data mining project.

 Table 1. Description of BWSM.csv 

| Attribute | Description | Labels/Values |
|---|---|---|
| CustomerID | Unique identifier of the customer | Unique code |
| Gender | Gender of the customer | “M” for Male / “F” for Female |
| Education | Education level of the customer | “1” for High school or below / “2” for Bachelor’s degree / “3” for Master’s degree or above |
| Age | Age of the customer | Integer measurement |
| Income | Annual income of the customer | Dollar measurement |
| Household | Household type of the customer | “1” for Single / “2” for Couple / “3” for Family with children / “4” for Extended family (i.e. children and grandparents) / “5” for Others |
| VisitFrequency | How many times the customer visits BWSM per month | Integer measurement |
| AvgSpent | The average amount the customer spent in the BWSM per visit | Dollar measurement |

 

(a) With reference to the CRISP-DM framework, discuss how you plan to carry out this data mining project. (24 marks)

 

(b) Analyse the data based on the summary statistics given in Table 2.

Table 2. Summary statistics of the attributes

(i)   Explain if there is a need to perform data transformation. (4 marks) 

(ii) Describe a scenario where z-score normalisation is preferable to min-max normalisation. In your answer, differentiate between these two (2) data normalisation techniques. (4 marks)
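As a rough illustration of the two techniques named in (b)(ii), the sketch below applies min-max and z-score normalisation to the same values. The income figures are hypothetical and are not taken from BWSM.csv or Table 2; the function names are chosen for illustration only, and the values are assumed not to be all identical.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical annual incomes with one extreme value (not from BWSM.csv).
incomes = [30_000, 40_000, 50_000, 60_000, 1_000_000]

print([round(x, 3) for x in min_max(incomes)])
print([round(x, 3) for x in z_score(incomes)])
```

Min-max fixes the output range using the observed minimum and maximum, so a single extreme value compresses the remaining values towards 0. Z-score has no fixed output range and expresses each value as a number of standard deviations from the mean.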

(c)  Assume that you have built two clustering models, Model A and Model B. The details of each model are given in Table 3. In each model, you are able to clearly describe the profile of each cluster. Based on Table 3, identify the model that you believe is better for deployment. Defend your choice by providing good reasons.

 Table 3. Description of Model A and Model B 

| Description | Model A | Model B |
|---|---|---|
| Number of clusters | 3 | 5 |
| Number of clustering criteria | 4 | 4 |
| Ease of interpretation of the profile of each cluster | High | High |
| Average Silhouette coefficient | 0.75 | 0.79 |
| Size of each cluster | Cluster 1: 23%; Cluster 2: 46%; Cluster 3: 31% | Cluster 1: 17%; Cluster 2: 25%; Cluster 3: 26%; Cluster 4: 20%; Cluster 5: 12% |
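For readers unfamiliar with the average Silhouette coefficient reported in Table 3, the following is a minimal sketch of how it can be computed, assuming one-dimensional points and absolute difference as the distance. The toy data and the function name are hypothetical; this is not how Model A or Model B were evaluated.

```python
def silhouette_scores(points, labels):
    """Silhouette coefficient s(i) = (b - a) / max(a, b) for 1-D points.

    a: mean distance from point i to the other members of its own cluster.
    b: smallest mean distance from point i to the members of any other cluster.
    Assumes at least two distinct cluster labels.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, own in enumerate(labels):
        others_in_own = [j for j in clusters[own] if j != i]
        if not others_in_own:
            scores.append(0.0)  # common convention for single-member clusters
            continue
        a = sum(abs(points[i] - points[j]) for j in others_in_own) / len(others_in_own)
        b = min(
            sum(abs(points[i] - points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != own
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

# Hypothetical toy data: two tight, well-separated groups.
points = [1.0, 2.0, 10.0, 11.0]
labels = [0, 0, 1, 1]
average = sum(silhouette_scores(points, labels)) / len(points)
print(round(average, 3))
```

Values close to 1 indicate points that sit much nearer to their own cluster than to the next nearest one; values near 0 indicate overlapping clusters.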

 

Question 2 

You are a data scientist in the Quality Control Department of a wine production company. You would like to understand the factors affecting wine quality by developing a classification tree that can predict whether a wine is of “Low quality” or “High quality”. A dataset related to a particular type of white wine produced by your company was collected. The white wine sample contains 4898 instances.

In the dataset (winequality-white.csv), there are 11 attributes related to physicochemical properties of the wine and 1 attribute “Quality” indicating the quality of the wine. Table 4 shows the attributes in the winequality-white.csv and the range of each attribute.

 Table 4. Description of winequality-white.csv 

| Attribute (unit) | Range |
|---|---|
| fixed acidity (g(tartaric acid)/dm³) | 3.8-14.2 |
| volatile acidity (g(acetic acid)/dm³) | 0.1-1.1 |
| citric acid (g/dm³) | 0-1.7 |
| residual sugar (g/dm³) | 0.6-65.8 |
| chlorides (g(sodium chloride)/dm³) | 0.01-0.35 |
| free sulfur dioxide (mg/dm³) | 2-289 |
| total sulfur dioxide (mg/dm³) | 9-440 |
| density (g/cm³) | 0.987-1.039 |
| pH | 2.7-3.8 |
| sulphates (g(potassium sulphate)/dm³) | 0.2-1.1 |
| alcohol (vol.%) | 8.0-14.2 |
| quality | 3-9 |

 

(a) The quality of the wine is initially determined by 30 wine experts on a scale that ranges from 0 (bad) to 10 (excellent) using blind wine tasting. The attribute “quality” in the dataset is the final wine quality score based on the 30 scores. Discuss whether the attribute “quality” should be the “mode”, “mean” or “median” of the scores given by the wine experts. Provide an explanation to support your answer. (6 marks)
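To make the three candidate summaries in (a) concrete, the sketch below computes each of them on a hypothetical set of 30 expert scores. The scores are invented for illustration and do not come from winequality-white.csv.

```python
from statistics import mean, median, multimode

# 30 hypothetical expert scores on the 0-10 scale, including a few extreme ratings.
scores = [6] * 12 + [7] * 10 + [5] * 5 + [0, 1, 10]

print("mean:  ", round(mean(scores), 2))   # pulled towards extreme ratings
print("median:", median(scores))           # middle value of the sorted scores
print("mode:  ", multimode(scores))        # most frequent score(s); may not be unique
```

Note that the mean is generally not an integer, the median of an even number of scores can fall halfway between two integers, and `multimode` returns a list because several scores can tie for most frequent.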

(b) Based on the attribute “quality”, you decided to perform binning in IBM SPSS Modeler to categorise the wine quality into two classes: Low Quality and High Quality. After the Binning node has been executed, a new attribute “quality_BIN” is created with two bin values: “1” and “2”. Figure 1 shows the settings in the Binning node. Discuss the purpose of binning and describe the meaning of the two bins in this context.
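The general idea of binning a numeric score into two classes can be sketched as follows. This assumes equal-width binning over the observed quality range of 3 to 9 from Table 4; the actual method and cut points are set in Figure 1, which is not reproduced here, and the way SPSS Modeler treats boundary values may differ. The function name is hypothetical.

```python
def equal_width_bin(value, lo, hi, n_bins):
    """Return the 1-based index of the equal-width bin containing value.

    The range [lo, hi] is split into n_bins intervals of equal width.
    A value equal to hi is placed in the last bin.
    """
    width = (hi - lo) / n_bins
    index = int((value - lo) // width) + 1
    return min(index, n_bins)

# Assumed setting: two bins over the observed quality range 3-9.
for quality in range(3, 10):
    print(quality, "->", equal_width_bin(quality, lo=3, hi=9, n_bins=2))
```

Under these assumed settings, bin “1” would hold the lower half of the range and bin “2” the upper half, which is one way a numeric score can be turned into a two-class target for a classification tree.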
