In this first assignment, you will apply what you have learned about big data analytics to a business problem. As an example, you are provided with an Excel worksheet of real estate data for Houston. The data were downloaded from the Redfin website in 2021. This assignment concentrates on the first and second steps of the CRISP-DM cycle. Although CRISP-DM was proposed for data mining, we will adopt it for this course because it serves as a good process for data analysis in general.

The assignments each week will have the same general format (one page of writing and a one-page appendix of charts and tables). However, the specifics expected in the assignment will change from week to week. This week you work on Part A of Assignment 1, namely cleaning the data. You produce the graphs and tables in JMP as described in the guide. Next week you work on Part B of Assignment 1, i.e., finding initial relationships between the predictors and the dependent variable, price, and learning how to bin variables of nominal modeling type.

The general business problem we want to address is predicting the price of a real estate property. For instance, suppose you work for Zillow and want to develop an algorithm for predicting the price of a property. The assignments are designed to practice certain tasks in analytics and statistics, as well as critical thinking as it relates to interpreting data and communicating the results effectively.

There are specific variables to which you may pay special attention, such as property type, BEDS, BATHS, SQUARE FEET, and LOT SIZE. Also, city, ZIP Code, LOCATION, and $/Sf must be evaluated, and a decision made about what to include and exclude. For all variables, however, you must provide a rationale in your report next week for excluding, including, or restricting them in modeling price. Think about how Zillow might use a model to predict price, or think about how you would use a model to predict the price of your own home. Which predictors would you expect in a model for predicting price, and why?
Obviously, you would want the model's prediction to be as close as possible to the price the property would actually sell for. On the one hand, that suggests including as many variables and properties as possible. On the other hand, you would also like a more homogeneous group of properties: the price of a house in New York is not the same as the price of the same house in Baton Rouge, and multi-unit houses are different from single-family homes. In general, modeling requires you to set a scope for your model, i.e., which properties should be included, which should be excluded, and why. For instance, do I build a model for the U.S., Texas, or Houston? These are the issues you have to address, and therefore I won't tell you exactly what to do. But hopefully this helps you a bit in making those decisions. In short, the data need to be cleaned, outliers need to be addressed, and the first two steps of the CRISP-DM cycle need to be covered.
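The course work itself is done in JMP, but the two decisions discussed above (setting a scope and dealing with outliers) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the sample rows are invented, the column names simply mirror the variables mentioned in this brief, the scope (single-family homes in Houston) is only one possible choice, and the 1.5 × IQR fence is a common rule of thumb, not a method prescribed by this assignment.

```python
from statistics import quantiles

# Invented sample rows; column names mirror the variables named in the brief.
listings = [
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 200_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 250_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 300_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 350_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 400_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": 5_000_000},
    {"PROPERTY TYPE": "Condo/Co-op", "CITY": "Houston", "PRICE": 180_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Katy", "PRICE": 320_000},
    {"PROPERTY TYPE": "Single Family Residential", "CITY": "Houston", "PRICE": None},
]


def in_scope(row, city="Houston", property_type="Single Family Residential"):
    """Keep rows that match the chosen (hypothetical) scope and have a price."""
    return (
        row.get("CITY") == city
        and row.get("PROPERTY TYPE") == property_type
        and row.get("PRICE") is not None
    )


def iqr_fences(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def flag_price_outliers(rows):
    """Return copies of the rows with a boolean PRICE_OUTLIER field added."""
    low, high = iqr_fences([r["PRICE"] for r in rows])
    return [dict(r, PRICE_OUTLIER=not (low <= r["PRICE"] <= high)) for r in rows]


scoped = [r for r in listings if in_scope(r)]
flagged = flag_price_outliers(scoped)

for row in flagged:
    print(row["PRICE"], "outlier" if row["PRICE_OUTLIER"] else "ok")
```

The sketch flags suspicious rows instead of deleting them, so the decision to exclude a property remains a separate, documented step that can be justified in the report.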