Part A – Descriptive Statistics

Question 1 Find the average sales achieved by BestStyle. Also, find the average sales of stores in tier 1 and tier 2 cities and compare them with the overall average.

Question 2 Affluent locations record better sales, as they have more potential customers. If a store location with a community income level higher than ₹80,000 is considered affluent, identify the difference between the average sales of affluent stores and those of non-affluent stores. Also, what percentage of total sales is contributed by the affluent stores?

Question 3 BestStyle plans to conduct an “end-of-summer” sale in September. To advertise the offers of the upcoming sale, the marketing team wants to launch a nation-wide campaign. The marketing strategist believes that an understanding of customer attributes is the key to a successful campaign. One such useful attribute is ‘customer education level’. Suggest an appropriate education level that can be used to target the maximum number of locations through the campaign.

Question 4 Suppose that, in the campaign, customers are categorised not by ‘education level’ but by ‘age group’. The management of BestStyle believes that most of the customers are in the age group of 35–49 years. Suppose the age of the customers is normally distributed, with a mean of 42 years and a standard deviation of 3.5 years. What percentage of customers falls between 35 and 49 years? List the steps of your working.
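The steps for this kind of calculation can be sketched in R as follows. This is only a minimal illustration using base R's pnorm with the mean and standard deviation stated in the question; the variable names are illustrative and not part of the assignment.

```r
# Parameters stated in the question
mu    <- 42    # mean age (years)
sigma <- 3.5   # standard deviation (years)

# Step 1: convert both age limits to z-scores
z_low  <- (35 - mu) / sigma   # -2
z_high <- (49 - mu) / sigma   #  2

# Step 2: area under the standard normal curve between the two z-scores
p_between <- pnorm(z_high) - pnorm(z_low)

# Step 3: express the probability as a percentage
pct_between <- round(100 * p_between, 2)
pct_between
```

Because the limits sit exactly two standard deviations either side of the mean, the result should agree with the familiar "about 95%" rule of thumb.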

Question 5 In a recent meeting, the customer feedback team reported that some customers feel the prices of clothes at BestStyle are more volatile than those of its competitors. Are the customers right in feeling so? State the statistic you will use to answer this question and present your calculations in R. Also, comment on the merits or demerits of price volatility.

Note: For the R-code, please submit the R script separately as mentioned in the instructions
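One commonly used measure of relative variability is the coefficient of variation (standard deviation divided by mean). The sketch below shows how it might be computed in R. The two price vectors are toy values invented for illustration, not the assignment data, and the helper cv() is a hypothetical name; other statistics may be equally defensible.

```r
# Toy price vectors, invented purely for illustration (NOT the assignment data)
price      <- c(120, 135, 98, 150, 110, 142)   # own prices
comp_price <- c(118, 125, 115, 130, 121, 127)  # competitor prices

# Coefficient of variation: standard deviation relative to the mean, in percent
cv <- function(x) 100 * sd(x) / mean(x)

cv_own  <- cv(price)
cv_comp <- cv(comp_price)

# A larger CV indicates more relative variability in prices
cat("CV own:", round(cv_own, 2), "% | CV competitor:", round(cv_comp, 2), "%\n")
```

Using a relative measure rather than the raw standard deviation keeps the comparison fair when the two price series have different means.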

Question 6 Describe and compare the advertising budget of BestStyle in mall vs non-mall stores. Use R for calculations and share your R script. Also, comment on the advantages or disadvantages of having a store in a mall.

Note: For the R-code, please submit the R script separately as mentioned in the instructions

Question 7 Write R code to plot a histogram illustrating the frequency distribution of the prices of clothes in the stores. Make sure to include the following in your R script:

1. The title of the plot as 'Frequency Distribution of Price charged by BestStyle'

2. The axes labelled as 'Frequency' and 'Price of Clothes'

Note: For the R-code, please submit the R script separately as mentioned in the instructions
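A minimal sketch of such a plot, using base R's hist() with the title and axis labels requested above, might look like this. The price vector is toy data invented for illustration; in practice it would be replaced by the price column of the store data set.

```r
# Toy prices, invented for illustration (NOT the assignment data)
price <- c(98, 105, 110, 112, 118, 120, 121, 125, 130, 135, 142, 150)

# hist() draws the plot and invisibly returns the bin breaks and counts
h <- hist(price,
          main = "Frequency Distribution of Price charged by BestStyle",
          xlab = "Price of Clothes",
          ylab = "Frequency")
```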

Part B – Inferential Statistics

Note: Use alpha = 0.05 if a value is not already mentioned.

Question 1 BestStyle is planning to open new stores across the country. The senior management wishes to make this decision based on the overall average sales of each store rather than a sample average. As part of the analytics team, you have been assigned by your manager to estimate the confidence interval for the overall average sales achieved by BestStyle. Present your workings clearly and also discuss the significance of confidence limits in the decision-making for a new store.

Question 2 In the previous question, will increasing the number of stores in your sample make the confidence interval narrower or wider? What does it signify?

Determine the sample size ‘n’ that is needed to restrict the range of the 95% confidence interval to 450.
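The sample-size calculation can be sketched in R under two stated assumptions: that "range" means the full width of the interval (upper limit minus lower limit), and that a normal critical value is appropriate. The value of sigma below is a placeholder, not a figure from the data; it would be replaced by the standard deviation of sales estimated from the sample.

```r
# ASSUMPTION: placeholder value; substitute the standard deviation of sales from the data
sigma <- 2000
# ASSUMPTION: "range" is read as the full width of the interval
width <- 450
# Two-sided 95% critical value of the standard normal distribution (about 1.96)
z <- qnorm(0.975)

# width = 2 * z * sigma / sqrt(n)   =>   n = (2 * z * sigma / width)^2
n_required <- ceiling((2 * z * sigma / width)^2)
n_required
```

If "range" is instead read as the margin of error on one side of the mean, the factor of 2 in the formula drops out and the required sample size is roughly four times larger.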

Question 3 As part of the marketing campaign, BestStyle has decided to distribute pamphlets along with the local newspapers. The sales team has assumed an average population size of 275 while getting these pamphlets printed. Is this assumption justified? Use hypothesis testing to decide if 275 is a reasonable estimate of the population size.

Question 4 As discussed earlier, a location is considered affluent if its income level exceeds ₹1,00,000. The company uses exclusive marketing strategies for such locations. However, instead of using ₹1,00,000, the marketing team decided to use the mean income level for identifying affluent markets and proposed that the threshold be changed to ₹80,000. Is the team right in estimating the mean income? Suppose the standard deviation of income level is ₹20,000. Now, verify their claim using a 98% confidence level. State the test statistic used and list the steps involved.

Question 5 In a business meeting, a sales representative claims that ‘Not more than 70% of our stores are located in urban areas’. Since the marketing policies and resource allocation depend upon the distribution of stores, the management wants to verify her claim.

State the null and alternative hypotheses that you would use here as well as the type of test applicable. List the steps and state the conclusion of your test.
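The mechanics of a one-sample test of a proportion can be sketched in R as below. The sketch assumes the claim is placed in the null hypothesis (H0: p <= 0.70 against H1: p > 0.70, a right-tailed z-test); the store counts are toy numbers invented for illustration and would be replaced by the counts from the data.

```r
# Toy counts, invented for illustration (NOT the assignment data)
n_stores <- 400   # stores in the sample
n_urban  <- 290   # of which urban

p0    <- 0.70                  # boundary value of the proportion under H0
p_hat <- n_urban / n_stores    # sample proportion

# z statistic for a single proportion, using the standard error under H0
z_stat <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n_stores)

# Right-tailed p-value, because H1 is p > 0.70
p_value <- pnorm(z_stat, lower.tail = FALSE)

# Reject H0 at alpha = 0.05 only if the p-value falls below 0.05
reject_h0 <- p_value < 0.05
c(z = z_stat, p = p_value)
```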

Question 6 BestStyle has adopted a policy of price matching, i.e., pricing your product the same as your competitor’s. Due to different cost structures, the price charged might not be homogeneous across locations. Based on the sample, can you claim that on average, the pricing is consistent with the policy? Support your answer with appropriate reasoning.

Question 7 You are interested in analysing the influence of the mall vs non-mall location of the store in tier 1 and tier 2 cities. Using R, compute the proportion of mall vs non-mall locations in both tier 1 and tier 2, and complete the following table:

Shelving Location    Mall       Non-mall
Tier 1               P(t1,m)    P(t1,nm)
Tier 2               P(t2,m)    P(t2,nm) 0.64

(Hint: You can use either a subset of the data or the table function.)

Further, based on the sample proportions computed above, can you claim that the population proportion of good shelving locations in rural regions is greater than that of urban regions? Give reasons for your answer.

Part C – Analytical Statistics

Question 1 The law of demand states that when the price of a good falls, its quantity demanded increases, and when the price of a good rises, its quantity demanded decreases. Use an appropriate statistic to check if such a negative relationship exists between the price and quantity demanded or sales of clothes. Illustrate the relationship through a scatter plot in R.

Note: For the R-code, please submit the R script separately as mentioned in the instructions
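One way to approach this in R is to compute the Pearson correlation coefficient and draw a scatter plot with a fitted line. The sketch below uses toy price and sales values invented for illustration, not the assignment data; the plot title and axis labels are also just suggestions.

```r
# Toy data, invented for illustration (NOT the assignment data)
price <- c(90, 100, 110, 120, 130, 140)
sales <- c(12.1, 11.4, 10.2, 9.8, 8.1, 7.5)

# Pearson correlation: a value below zero indicates a negative linear relationship
r <- cor(price, sales)
r

# Scatter plot of sales against price, with a least-squares line for reference
plot(price, sales,
     main = "Sales vs Price",
     xlab = "Price of Clothes",
     ylab = "Sales")
abline(lm(sales ~ price))
```

cor.test(price, sales) could additionally be used to check whether the correlation differs significantly from zero.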

Question 2 BestStyle is planning to expand to more locations. Setting up a new outlet involves a fixed cost. The company wants to evaluate the project based on the expected turnover. The management asks the business analyst to build a model for sales prediction using all the variables related to the store excluding variables labelled under customer data. Summarise your results in a regression table containing coefficients, standard errors and t-stats. (Let’s call it Model I; the model name will be used as reference in further questions.)

               Coefficient    Std Error    T-Stats
(Intercept)
Price
CompPrice
StoreLoc
City
Advertising
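A regression table of this shape can be produced in R with lm() and summary(). The sketch below is only an illustration on simulated data: the predictor names follow the table above, but the response name Sales, the factor levels and every number in the data frame are assumptions made up for the example.

```r
set.seed(1)   # make the simulated example reproducible
n <- 40

# Simulated store data (NOT the assignment data); factor levels are assumed
stores <- data.frame(
  Price       = runif(n, 90, 150),
  CompPrice   = runif(n, 90, 150),
  StoreLoc    = factor(sample(c("Mall", "Non-mall"), n, replace = TRUE)),
  City        = factor(sample(c("Tier 1", "Tier 2"), n, replace = TRUE)),
  Advertising = runif(n, 0, 20)
)
# Assumed response column, generated from an arbitrary linear rule plus noise
stores$Sales <- 10 - 0.05 * stores$Price + 0.03 * stores$CompPrice +
  0.1 * stores$Advertising + rnorm(n)

# Model I: sales regressed on the store-related variables only
model_1 <- lm(Sales ~ Price + CompPrice + StoreLoc + City + Advertising,
              data = stores)

# Keep the three columns the table asks for
reg_table <- coef(summary(model_1))[, c("Estimate", "Std. Error", "t value")]
round(reg_table, 3)
```

Each two-level factor contributes one dummy-variable row, so the output has six rows matching the table layout.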

Question 3 A sales analyst goes through the regression output of Model I and interprets these results to mean that the value of competitor prices has no effect on the sales. Is this an appropriate conclusion? Give reasons for your answer.

Question 4 Using Model I, which you built in Question 2, discuss the interpretation of the coefficient of advertising budget. Is it consistent with the belief that stores with higher advertising budgets have higher sales?

