Python Programming

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

Python homework

The purpose of this checkpoint is to test your skills with reading CSV files into Data Frames and writing back to CSV. This is a common task during the Data Preparation phase of the data mining life cycle which you will learn about later.

Complete the following tasks in order by writing Python code:

  1. Download the dataset below: top_100_books.csv
    • This is a real dataset representing the top 100 best-selling books available on Amazon.com
    • It includes the following fields:
      • rank: number in the top 100 list
      • title: the name of the book
        • keep in mind that this is real data; please don't be offended by some of the book titles
      • author: the author who wrote the book
      • rating: the average rating of all verified customers who purchased the book
      • reviews: the number of customers who wrote a review (not the total number who purchased the book)
      • type: the book format; hardcover, paperback, mass market paperback, or board book
      • price: the listed price of the book on amazon.com
  2. Read this dataset into a Pandas DataFrame
  3. There was a clerical error that needs to be fixed: update the price of the book titled "Man's Search for Meaning" to $9.99
  4. Because there is only one case of type = "Mass Market Paperback", let's combine that book with "Paperback" by changing its type to "Paperback". This is a common data cleaning task when there are categorical values with very few instances.
  5. We suspect that, in general, customers rate books relative to the price they had to pay for them. To test this hypothesis (which we will learn how to do later), we must first create a new calculated field representing rating/price. Create this field, call it "ratingByPrice", and add it to the DataFrame as the last column
  6. Write your updated DataFrame to a new CSV file called "top_100_books_cleaned.csv"
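
The cleaning tasks above could be approached roughly as in the sketch below. It is one possible way to do it, not the official solution. The function name `clean_books` and the sample rows are made up for illustration, and the sketch assumes that the price column holds plain numbers with no currency symbol. The column names come from the field list above.

```python
import io

import pandas as pd

# Made-up stand-in rows so the sketch runs without the real file.
# With the real data, use pd.read_csv("top_100_books.csv") instead.
SAMPLE_CSV = """rank,title,author,rating,reviews,type,price
1,Man's Search for Meaning,Author A,4.7,1000,Paperback,8.00
2,Example Book B,Author B,4.5,200,Mass Market Paperback,7.50
3,Example Book C,Author C,4.0,50,Hardcover,20.00
"""


def clean_books(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the books DataFrame (hypothetical helper)."""
    df = df.copy()
    # Fix the clerical error in the price of one title.
    df.loc[df["title"] == "Man's Search for Meaning", "price"] = 9.99
    # Merge the rare category into "Paperback".
    df["type"] = df["type"].replace("Mass Market Paperback", "Paperback")
    # A newly assigned column is appended as the last column.
    df["ratingByPrice"] = df["rating"] / df["price"]
    return df


books = pd.read_csv(io.StringIO(SAMPLE_CSV))
cleaned = clean_books(books)

# Written to an in-memory buffer here; pass "top_100_books_cleaned.csv"
# as the first argument to write the actual file.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
```

Passing `index=False` keeps the DataFrame's row index out of the output, so the cleaned CSV has the same columns as the input plus `ratingByPrice`.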

For this checkpoint, you are going to create a function that accepts any DataFrame and prints out a report of the following information:

  • For each column, print:
    1. The column header name
    2. The number of unique values
    3. The total number of values

NOTE: These tasks can be performed using basic functions for DataFrames. See the table in Chapter 7.3 or find examples online to make this task easier.

Your output for the included dataset may look as simple as this:

Column, Nunique, Count
age: 47, 1338
sex: 2, 1338
bmi: 548, 1338
children: 6, 1338
smoker: 2, 1338
region: 4, 1338
charges: 1337, 1338
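
A report in this format could be produced along the lines of the sketch below. The function name `column_report` and the tiny example DataFrame are made up for illustration. The sketch also assumes that "total number of values" means non-missing values, which is what `Series.count()` returns; use `len(df)` instead if missing values should be counted too.

```python
import pandas as pd


def column_report(df: pd.DataFrame) -> str:
    """Print and return a per-column report (hypothetical helper)."""
    lines = ["Column, Nunique, Count"]
    for col in df.columns:
        # nunique() counts distinct non-missing values;
        # count() counts all non-missing values.
        lines.append(f"{col}: {df[col].nunique()}, {df[col].count()}")
    report = "\n".join(lines)
    print(report)
    return report


# Made-up example data, not the dataset from the assignment.
example = pd.DataFrame({
    "sex": ["female", "male", "male"],
    "children": [0, 1, 0],
})
report = column_report(example)
```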

 

For this checkpoint, use the IMDB dataset to visualize and analyze data about movies from 2006 to 2016. You will use the pandas and matplotlib libraries to create four different charts.

  1. Create a bar chart that shows the top 10 highest grossing films.
    • The title for the graph should be “Top 10 Highest Grossing Films”
    • Use the colors ("red","green","blue","orange","pink")
    • The x label should be 'Film'
    • The y label should be 'Revenue (Millions)'
    • The x ticks should be the title of the movies
    • Make the rotation of the xticks vertical
    • The bar chart should look something like this:
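
The bar chart requirements above could be sketched as follows. The function name `plot_top_grossing` is hypothetical, and the column names `Title` and `Revenue (Millions)` are assumptions about the IMDB file; adjust them to match the actual headers. The example rows are made up so that the sketch runs on its own.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_top_grossing(df, title_col="Title", revenue_col="Revenue (Millions)"):
    """Draw the top 10 films by revenue as a bar chart (hypothetical helper)."""
    # Column names are assumptions; pass different ones if the file differs.
    top10 = df.nlargest(10, revenue_col)
    fig, ax = plt.subplots()
    # The five colors repeat across the ten bars.
    ax.bar(top10[title_col], top10[revenue_col],
           color=["red", "green", "blue", "orange", "pink"])
    ax.set_title("Top 10 Highest Grossing Films")
    ax.set_xlabel("Film")
    ax.set_ylabel("Revenue (Millions)")
    # Rotate the film titles so they read vertically.
    ax.tick_params(axis="x", labelrotation=90)
    fig.tight_layout()
    return ax


# Made-up example data, not the real IMDB dataset.
movies = pd.DataFrame({
    "Title": [f"Film {i}" for i in range(1, 13)],
    "Revenue (Millions)": [float(i * 10) for i in range(1, 13)],
})
ax = plot_top_grossing(movies)
```

Call `plt.show()` afterwards to display the figure in an interactive session.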
