INSTRUCTIONS TO CANDIDATES

1. Bank Direct Marketing Cluster Analysis

Given the inputs, do clusters of customers exist in the bank direct marketing data set? This exercise explores the bank direct marketing data and tries to profile the resulting clusters.

Take screenshots at every stage; you may need to review them or paste them into your answers to several questions:

a. Create a new diagram in your project. Name the diagram Bank Clustering.

b. Use the bank_direct_marketing data as a data source for this clustering and profiling exercise.

c. Determine whether the model roles and measurement levels assigned to the variables are appropriate.

d. Examine the distribution of the values of these variables:

balance

day

previous

duration

age

campaign

The three most heavily skewed distributions are those of balance, campaign, and previous. Although it is not an optimal remedy, taking the log of each of these variables can reduce the skewness of its distribution.
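
As a rough numerical check on the skewness claim, the following Python sketch computes a skewness coefficient before and after a log transformation. It is an illustration only: the sample values are hypothetical and not drawn from bank_direct_marketing, and the name sample_skewness is an assumption of this sketch, not something the tool provides.

```python
import math

def sample_skewness(values):
    """Return the moment coefficient of skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # a constant column has no skew
    return m3 / m2 ** 1.5

# Hypothetical right-skewed counts, standing in for a variable like campaign.
campaign = [1, 1, 1, 2, 2, 3, 4, 6, 12, 30]

# log(v + 1) keeps the transform defined even if a count were zero.
logged = [math.log(v + 1) for v in campaign]

raw_skew = sample_skewness(campaign)
log_skew = sample_skewness(logged)
print(f"raw skewness: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

A positive coefficient indicates a long right tail; a value closer to zero after the transformation suggests the log has pulled the tail in.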

e. Drag a Transform Variables node onto the diagram and connect it to the Input Data source.

f. Apply a log transformation to the following variables:

balance

previous

campaign
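
The logarithm is undefined for zero and negative values, which matters if any of these variables contains them (a count such as previous may include zeros, for example). The Python sketch below shows one common workaround, shifting the values before taking the log. The offset rule and the name shifted_log are assumptions of this sketch; the Transform Variables node may handle such values differently.

```python
import math

def shifted_log(values):
    """Log-transform values, shifting them first if any are zero or negative.

    Assumed rule: when the minimum is <= 0, add (1 - minimum) so the
    smallest value maps to log(1) = 0. Otherwise take the plain log.
    """
    smallest = min(values)
    offset = 1 - smallest if smallest <= 0 else 0
    return [math.log(v + offset) for v in values]

# Hypothetical values, not taken from bank_direct_marketing.
previous = [0, 0, 1, 3]
balance = [-500, 0, 1200]

print(shifted_log(previous))
print(shifted_log(balance))
```

Because the offset depends on the minimum of the data, the same offset would need to be reused when transforming new records.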

g. Connect a Cluster node to the Transform Variables node.

h. Change Maximum Number of Clusters to 6.

i. Change Use for all the variables to No, except for these:

balance

previous

duration

age

campaign
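
To make concrete what clustering on these five inputs involves, here is a minimal Python sketch. It uses plain k-means with a farthest-first choice of starting centers as a stand-in; the Cluster node's actual algorithm and its way of choosing the number of clusters are not described in this exercise. The customer rows, function names, and the MAX_CLUSTERS constant are all hypothetical.

```python
import math

MAX_CLUSTERS = 6  # mirrors the Maximum Number of Clusters setting in step h

def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def farthest_first_centers(rows, k):
    """Start from the first row, then repeatedly add the row that is
    farthest from its nearest already-chosen center."""
    centers = [list(rows[0])]
    while len(centers) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(rows, k, iterations=50):
    """Plain k-means (Lloyd's algorithm); returns one label per row."""
    centers = farthest_first_centers(rows, k)
    labels = []
    for _ in range(iterations):
        # Assign each row to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(row, centers[j]))
                  for row in rows]
        # Move each center to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical customers: (balance, previous, duration, age, campaign).
# In the exercise, balance, previous, and campaign would be log-transformed first.
customers = [
    (100, 0, 60, 25, 1),
    (150, 0, 80, 27, 2),
    (120, 1, 70, 26, 1),
    (5000, 5, 600, 60, 8),
    (5200, 6, 650, 62, 9),
    (4900, 5, 620, 58, 7),
]

scaled = standardize(customers)
labels = kmeans(scaled, 2)
print("k=2 labels:", labels)

# Try every cluster count up to the maximum (limited here by the six rows).
for k in range(2, min(MAX_CLUSTERS, len(scaled)) + 1):
    print(k, kmeans(scaled, k))
```

Standardizing first keeps a wide-ranging variable such as balance from dominating the distance calculation.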

j. Examine the distribution of the values of these variables:

balance

day

previous

duration

age

campaign

(5/5)

