• As work is submitted online, the deadline is midnight on the hand-in date.
• Read the marking scheme carefully
• This is an individual project and no group work is permitted
• This project is part of the required procedure for obtaining the SAS Joint Certificate in Business Intelligence and Data Mining
• The assignment accounts for 40% of your final grade.
To be eligible for automatically passing the Enterprise Data Management part of the SAS Milestone project (see Course Handbook), the student must achieve a grade of over 70% in this assignment.
• Prepare and Load Data
• Query Data
• Produce Reports
The purpose of this assignment is to allow students to apply techniques for accessing, processing, managing and reporting of real-world data and to provide solutions to business problems that today’s organizations face through the use of SAS Enterprise Guide. In order to accomplish the above objectives, you are provided with a set of real-world Point of Sale (POS) data that are related to the operations of a retail company RealPOS. You are asked to prepare and query the provided data through SAS Enterprise Guide and to create a comprehensive analysis report that will be presented to the management team of RealPOS.
In particular, in this assignment, you will:
• Import data to a data management system
• Pre-process the data to improve their quality
• Create and Manipulate SQL queries to retrieve data from multiple files
• Prepare basic descriptive statistics reports
• Prepare advanced analysis reports to gain insight about different aspects of the company
RealPOS is a retail company based in Brazil that sells equipment and accessories for outdoor sports. The company has hired you to analyse their sales and customers and provide key insights that will be utilized to refine their marketing and sales strategies. To this end, the company has provided you with a number of datasets (see Appendix Dataset Information) that were exported by their individual ERP systems.
You are required to import and process the data and then prepare a number of reports to deliver key insights to the company. The individual tasks that you need to perform are outlined below. All tasks should be prepared in a single project but in individual process flows.
Import all raw files inside resources and create SAS Datasets for each file.
Important: In order to avoid errors when transforming data sets to SAS format, all variables (e.g., SKU, BasketID) that will not be used in statistics or similar numerical operations should be read using the string type. Additionally, all the new datasets to be produced in SAS Enterprise Guide during the project should be stored in the SASUSER library.
Prepare a library, named SASMS, which will contain all prepared SAS Datasets in step “Import Raw Data”.
Important: The library should be used for all tasks.
Divide the observations of the table “Invoice” into two new tables, one storing the Sales and the other storing the Returns. This division must be done using the variable “Operation”. The structure of the two tables should be identical to that of the Invoice table, except for the variable “Operation”.
What was the level of Sales and Returns? Create a bar chart with the monetary values to answer the question.
Calculate the customer’s age and store it as a new integer variable, named “Customer Age”. Assume that today’s date is 01/01/2019. Additionally, you should check whether the data are stored by the system in compliance with the GDPR regulation. In particular, the company is not allowed to keep data for under-age persons (i.e., less than 18 years old). A conditional report should be generated in the cases where such customers exist.
A. Based on the “Customer Age” variable, create a new variable named “Age Group” that is calculated as follows:
Condition (years) | Recoded Label
1 - 18 | “Under 18”
18 - 25 | “Very Young”
26 - 35 | “Young”
36 - 50 | “Middle Aged”
51 - 65 | “Mature”
66 - 75 | “Senior”
> 75 | “Very Senior”
B. The supplier code (Supplier_ID) of each product is the 9th digit of its SKU code. For example, for SKU code 58720393450301, the supplier code is 4. Use appropriate functions to store the supplier code as a new column named Supplier_ID.
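The logic behind the age, age group and supplier code variables can be sketched as follows. This is only an illustration in Python of the calculations involved, not the SAS Enterprise Guide steps the assignment expects. The function names are hypothetical, and because the recoding table lists 18 in both of its first two rows, the sketch assumes that age 18 belongs to “Very Young”, in line with the brief's definition of under-age as less than 18.

```python
from datetime import date

# "Today" as fixed by the brief (01/01/2019).
REFERENCE_DATE = date(2019, 1, 1)


def customer_age(birth_date, today=REFERENCE_DATE):
    """Return the age in completed years (an integer) on the reference date."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


def age_group(age):
    """Recode an integer age into the labels of the recoding table.

    Assumption: age 18 is treated as "Very Young", not "Under 18".
    """
    if age < 18:
        return "Under 18"
    if age <= 25:
        return "Very Young"
    if age <= 35:
        return "Young"
    if age <= 50:
        return "Middle Aged"
    if age <= 65:
        return "Mature"
    if age <= 75:
        return "Senior"
    return "Very Senior"


def supplier_id(sku):
    """Return the 9th character of the SKU, kept as a string."""
    return sku[8]  # Python indexes from 0, so position 9 is index 8
```

For the example in the brief, `supplier_id("58720393450301")` gives `"4"`. In a SAS query the same extraction would typically rely on a substring function such as SUBSTR applied to the SKU read as a string.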
Create a report with the Total Items for every invoice.
Create a report with the Total Value (i.e., Quantity x Price) for every invoice. Beware that there exist price discounts that can be seen in the promotions data set. Take into account all the invoices no matter if they are Sales or Returns.
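As a rough illustration of the Total Value calculation, the Python sketch below sums Quantity x Price per invoice after applying a discount. The tuple layout and the way the discount is applied (unit price reduced by the promotion rate) are assumptions made for the example; the actual variable names and the discount rule must be taken from the provided datasets and recorded among your assumptions.

```python
from collections import defaultdict


def invoice_totals(lines):
    """Sum the discounted line values for each invoice.

    `lines` is an iterable of (invoice_id, quantity, price, discount_rate)
    tuples, where discount_rate is a fraction such as 0.10 for 10%.
    This layout is hypothetical and chosen only for the example.
    """
    totals = defaultdict(float)
    for invoice_id, quantity, price, discount_rate in lines:
        # Assumed rule: the promotion reduces the unit price by its rate.
        totals[invoice_id] += quantity * price * (1 - discount_rate)
    return dict(totals)


sample_lines = [
    ("INV1", 2, 10.0, 0.0),   # no promotion
    ("INV1", 1, 20.0, 0.10),  # 10% promotion
    ("INV2", 3, 5.0, 0.20),   # 20% promotion
]
totals = invoice_totals(sample_lines)
```

Sales and Returns are deliberately not separated here, since the task asks for all invoices to be taken into account.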
Create a report with appropriate graphs to demonstrate the contribution of each region of the country to the company’s revenues. For the top region found, show the contribution to the company’s revenues per gender.
Create a report with appropriate graphs that describe the average basket. The report should include at least the number of SKUs, total monetary value, etc. Comment on your findings.
Create a report with the demographic analysis of the customers. Your report should include the analysis (tables and graphs) of appropriate variables and should be performed at a national and regional level. The report should include a pie chart and a frequency table with the percentages of customers that belong to each age group.
Create a report that shows the top products per product type with respect to total sales value. The report should include the ranking of each product. Furthermore, the report should also include the subtotal sales of each product type at the beginning.
Create a report that analyzes the behavioral characteristics of each age group (visits to the stores, number of distinct SKUs purchased, total cost of purchases). Augment your analysis by providing pie charts for the behavioral characteristics for each age group.
Create a report with appropriate graphs to show the percentage of products that are sold with and without promotion. Create a format to display the 0% promotion as “No Promotion” and the 10%, 20% and 30% as “Promotion”. Additionally, create a pie chart to show the percentage of products that are sold on each promotion type (use the description of the promotion and not its code). Do not include the products sold without promotion in the pie chart.
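The effect of the requested format can be pictured with the small Python sketch below, which maps each discount percentage to one of the two labels and computes the share of each label. It is only an illustration of the mapping. In SAS Enterprise Guide the labels would be produced with a user-defined format, and the function names here are hypothetical.

```python
from collections import Counter


def promotion_label(discount_pct):
    """Map a discount percentage (0, 10, 20 or 30) to its display label."""
    return "No Promotion" if discount_pct == 0 else "Promotion"


def label_shares(discounts):
    """Return the percentage of sold products falling under each label."""
    counts = Counter(promotion_label(d) for d in discounts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100 * n / total for label, n in counts.items()}


shares = label_shares([0, 10, 20, 0])
```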
Create a report with appropriate graphs to answer the following managerial question: “Is there any difference among the various days with respect to the number of distinct SKUs per invoice?”
Create a report, which includes appropriate charts, to show an analysis of supplier sales. The report should include the percentage of products that each supplier supplies, the total revenue generated by these products and the weight of that revenue (i.e., percentage of that revenue against all the sales).
Finally, create a cross tabulation table, using the Summary Tables Wizard task, to show the total revenue of the company with respect to the countries of origin of the products sold by each supplier. Use the names of the suppliers and the names of the countries of origin and not their codes. Add the total revenue in the middle of the cross tabulation, the origin in the rows and the suppliers in the columns.
IMPORTANT: The requirements provided in the previous section may not be fully specified. Where this is the case, record your assumptions and how they have influenced your analysis.
The deliverable consists of a compressed folder using the zip format, named as “202021.CO4759.A2.<GNumber>.zip” (e.g., 202021.CO4759.A2.G1234567.zip), with the following files:
1. 1x Enterprise Guide project (.egp) named as 202021.CO4759.A2.G1234567.egp.
2. 1x Document (Microsoft Word format) consisting of the documentation of all parts (i.e., A, B, C) and the screenshots of the results (partial results) and charts/tables generated through the analysis.
Marks will be awarded based on the following criteria. Within each part, aim to complete the work for each section before moving on to the next, as you will not get full credit for later sections if there are significant defects in an earlier section.
In assessing the work within a section, factors such as simplicity, quality and appropriateness of comments will be considered.
Part | Description | Range | Marking | Detailed Marking
A | Data Pre-processing | 0-5 | 0-1 marks for each task | 0 – No attempt or task completed with deficiencies; 1 – Task completed correctly with no deficiencies
B | Basic Analysis | 0-15 | 0-3 marks for each task | 0 – No attempt or serious deficiencies; 1 – Task completed correctly with minor deficiencies; 2 – Task completed correctly with no deficiencies, but no key insights were provided; 3 – Task completed correctly with no deficiencies, and key insights were provided
C | Advanced Analysis | 0-15 | 0-3 marks for each task |
D | Presentation | 0-5 | |
• Anonymous marking is being used. Apart from your University ID number (“G2…”), avoid doing anything that would allow you to be identified from your work.
• Keep a complete copy of the work you hand in.
• Avoid submitting work at the last minute, but if there is a technical problem uploading to Blackboard, email the zip file to me before the deadline and upload the work when Blackboard is available.