Research Paper

I. INTRODUCTION

The term Big Data refers to data sets that are so large or complex that traditional data-processing application software is inadequate or unable to deal with them. The major differences between traditional data and Big Data are in terms of volume, velocity, and variety: volume means the amount of data that is being generated; velocity refers to the speed at which the data is being generated; and variety means the types of structured and unstructured data.

Nowadays, Big Data is becoming an important research topic in almost every field, especially cyber security. The main sources of this data are social media sites and smart devices. Generating data at this speed raises various concerns regarding the security of the data being created, as it is very important to keep this data safe: it contains important and sensitive information such as bank account numbers, passwords, and credit card details. Also, advances in Big Data analytics provide tools to extract and utilize this data, making violations of privacy easier. As a result, along with developing Big Data tools, it is necessary to create safeguards to prevent abuse [2].

 

II. DEFINING AND ANALYZING BIG DATA

The term Big Data refers to the massive amount of information that is stored and transmitted in computer systems.

 

Big Data is differentiated from traditional technology in three ways:

 

1. The amount of data (Volume) - Size: the volume of the datasets is a critical factor, that is, how much data is being generated.

 

2. The rate of data generation and transmission (Velocity) - Complexity: the structure, behaviour, and permutations of the datasets are a critical factor.

 

3. The types of structured and unstructured data (Variety) - Technologies: the tools and techniques used to process sizable or complex datasets are a crucial factor.

 

III. TECHNOLOGY MEGA TRENDS

Big Data is generating an enormous amount of attention among businesses, the media, and even consumers, along with analytics and cloud-based technologies. These are all part of the current ecosystem created by technology megatrends.

Big Data has become a major theme of the technology media, and it has also made its way into many compliance and internal audit functions. In EY's Global Forensic Data Analysis Survey 2014, 72% of respondents believed that emerging Big Data technologies can play a key role in fraud prevention and detection, yet only about 7% of respondents were aware of any specific Big Data technologies, and only about 2% were actually using them. Forensic data analysis (FDA) technologies are available to help companies keep pace with rapidly growing data volumes, as well as business complexities.

 

Big Data is broad and encompasses many trends and new technology developments. The emerging technologies described below are among those helping users cope with and handle Big Data in a cost-effective manner.

1. Column-oriented databases

Traditional, row-oriented databases are excellent for online transaction processing with high update speeds, but they fall short on query performance as data volumes grow and as data becomes more unstructured. Column-oriented databases instead store data with a focus on columns, which allows for aggressive compression and very fast queries over individual attributes.
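To make the contrast concrete, here is a toy Python sketch (illustrative only; real column stores add compression, vectorized scans, and on-disk layouts) showing why an analytic aggregate touches less data under a columnar layout:

```python
# Toy comparison of row- and column-oriented layouts (illustrative only).
rows = [  # row-oriented: each record stored together (good for OLTP updates)
    {"id": 1, "region": "EU", "sales": 120},
    {"id": 2, "region": "US", "sales": 300},
    {"id": 3, "region": "EU", "sales": 150},
]

columns = {  # column-oriented: each attribute stored contiguously
    "id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "sales": [120, 300, 150],
}

# An analytic query such as "total sales" must walk every full record in
# the row store, but reads only the single relevant column in the column
# store -- far less data as volumes grow.
total_rows = sum(r["sales"] for r in rows)
total_cols = sum(columns["sales"])
assert total_rows == total_cols == 570
```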

 

2. Schema-less databases, or NoSQL databases

There are various database types that fit into this category, such as key/value stores and document stores, which focus on the storage and retrieval of large volumes of unstructured, semi-structured, or even structured data.
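The core idea can be shown with a minimal Python sketch of a key/value document store (a hypothetical in-memory stand-in; real systems in this category include Redis, Riak, MongoDB, and CouchDB): documents are retrieved by key and need not share a schema.

```python
import json

store = {}  # key -> serialized document; no table schema is enforced

def put(key, document):
    store[key] = json.dumps(document)

def get(key):
    return json.loads(store[key])

# Two documents with different fields can live side by side.
put("user:42", {"name": "Ada", "roles": ["admin"]})
put("user:43", {"name": "Bob", "last_login": "2014-05-01"})

print(get("user:42")["name"])  # -> Ada
```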

 

3. MapReduce

This is a programming paradigm that allows for massive job-execution scalability against thousands of servers or clusters of servers. Any MapReduce implementation consists of two tasks: the "Map" task, where an input dataset is converted into a different set of key/value pairs, or tuples; and the "Reduce" task, where several of the outputs of the "Map" task are combined to form a reduced set of tuples.
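The classic illustration is a word count. The following is a minimal in-process Python sketch of the pattern, assuming the whole dataset fits in memory; real frameworks distribute the map and reduce tasks across many machines and shuffle the intermediate tuples between them.

```python
from collections import defaultdict

def map_task(document):
    # Map: convert the input into key/value tuples, here (word, 1).
    return [(word, 1) for word in document.split()]

def reduce_task(key, values):
    # Reduce: combine all values emitted for one key into a single tuple.
    return (key, sum(values))

documents = ["big data is big", "data about data"]

# Shuffle phase: group every mapped value by its key.
groups = defaultdict(list)
for doc in documents:
    for key, value in map_task(doc):
        groups[key].append(value)

print([reduce_task(k, v) for k, v in groups.items()])
# [('big', 2), ('data', 3), ('is', 1), ('about', 1)]
```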

4. Hadoop

Hadoop is the best-known and most popular implementation of MapReduce, being an entirely open-source platform for handling Big Data. It is flexible enough to work with multiple data sources. It has several different applications, but one of the top use cases is for large volumes of constantly changing data, such as location-based data from weather or traffic sensors.
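On Hadoop, the same word-count logic would typically be submitted through an interface such as Hadoop Streaming, which pipes input splits to a user script on stdin and collects key/value output on stdout. A hedged sketch follows (the file name wordcount.py and the data paths are hypothetical; the launch command is abbreviated):

```python
#!/usr/bin/env python3
# Sketch of a Hadoop Streaming job, launched along the lines of:
#   hadoop jar hadoop-streaming.jar -input /data/in -output /data/out \
#       -mapper "wordcount.py map" -reducer "wordcount.py reduce"
import sys

def map_phase():
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")                # emit (word, 1) pairs

def reduce_phase():
    current, total = None, 0
    for line in sys.stdin:                     # input arrives sorted by key
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")   # flush the previous key
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    map_phase() if sys.argv[1] == "map" else reduce_phase()
```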

 

5. Hive

It is a SQL-LIKE bridge that allows conventional BI application to run queries against a Hadoop cluster It was developed originally by Facebook, but has been made open source for some time now, and it's a higher-level abstraction of the Hadoop framework that allows anyone to make queries against data stored in a Hadoop cluster just as if they were manipulating a conventional data store.
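For instance, a BI-style query can be issued from Python through a HiveServer2 client such as PyHive; this is a hedged sketch, and the host and table names are hypothetical.

```python
from pyhive import hive  # third-party HiveServer2 client

conn = hive.Connection(host="hive.example.com", port=10000)
cursor = conn.cursor()

# Hive translates this SQL-like HiveQL into jobs over data in the cluster.
cursor.execute("SELECT region, SUM(sales) FROM sales_2014 GROUP BY region")
for region, total in cursor.fetchall():
    print(region, total)
```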

 

6. Pig

Pig was developed by Yahoo. Like Hive, Pig is a bridge that tries to bring Hadoop closer to the realities of developers and business users. Unlike Hive, however, Pig consists of a "Perl-like" language that allows for query execution over data stored on a Hadoop cluster, instead of a SQL-like language [9].
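A short sketch of how such a dataflow script might be prepared and submitted from Python (the input path and field names are hypothetical, and a local `pig` launcher on the PATH is assumed):

```python
import subprocess

# Illustrative Pig Latin: load a log, group by user, count visits.
pig_script = """
visits  = LOAD '/data/visits.log' USING PigStorage(',')
          AS (user:chararray, url:chararray);
by_user = GROUP visits BY user;
counts  = FOREACH by_user GENERATE group AS user, COUNT(visits) AS n;
DUMP counts;
"""

with open("visits.pig", "w") as f:
    f.write(pig_script)

# Submit the script with the standard `pig` command-line launcher.
subprocess.run(["pig", "visits.pig"], check=True)
```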

 

7. WibiData

WibiData is a combination of web analytics and Hadoop. It is built on top of HBase, which is itself a database layer on Hadoop.
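To give a feel for the HBase layer underneath, here is a hedged Python sketch using the happybase client (an HBase Thrift server is assumed; the host, table, and column names are hypothetical):

```python
import happybase  # third-party client for HBase's Thrift interface

connection = happybase.Connection("hbase.example.com")
table = connection.table("user_events")

# HBase rows are byte-string keys mapping to values in column families.
table.put(b"user42", {b"events:last_page": b"/checkout"})

row = table.row(b"user42")
print(row[b"events:last_page"])  # -> b'/checkout'
```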

 

8. SkyTree

SkyTree is a high-performance machine-learning and data-analytics platform focused specifically on handling Big Data. Machine learning is a very important part of Big Data, since the sheer volume of the data makes manual exploration impractical.
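SkyTree itself is proprietary, so as an illustrative stand-in the sketch below uses the open-source scikit-learn library to show the kind of task (anomaly detection over many records) that such platforms automate at scale:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(1000, 2))    # ordinary observations
outliers = rng.uniform(-6, 6, size=(10, 2))  # rare anomalous points

# Fit on normal traffic, then flag points the model considers anomalous.
model = IsolationForest(random_state=0).fit(normal)
flags = model.predict(outliers)              # -1 marks suspected anomalies
print((flags == -1).sum(), "of 10 points flagged as anomalies")
```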

 
