Cs.furman.edu



What is Big Data?

Every day, we create quintillions of bytes of data from a variety of sources, such as sensors used to gather climate information, posts to social media websites, digital pictures and videos, purchase transaction records, web logs, and cell phone GPS signals. This data is called "Big Data".

A few examples of organizations dealing with Big Data are as follows:

Facebook handles 50 billion photos from its site members.
Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data.
The FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide.

"Big Data", however, does not refer only to the huge amounts of data that appear in many forms. It also refers to the practice of analyzing this data and using the analysis to drive strategic decisions, such as introducing a new category of products, restructuring the organization, or improving customer care services by analyzing customer care recordings and logs.

Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques. It requires different techniques, tools, algorithms, and architectures. Some Big Data tools and technologies are Apache Hadoop, Apache Spark, the R language, and Apache ZooKeeper.

Characteristics of Big Data

After briefly talking about Big Data, let's talk about the various characteristics that define it:

Volume - This represents the size of the data, which determines the value and potential of the data under consideration.
Variety - This means the category to which the data belongs. Knowing the category helps data analysts use the data effectively to their advantage, which upholds the importance of Big Data.
Some examples of Big Data categories are emails, text messages, and documents.
Velocity - This refers to the speed at which data is generated and processed to meet the demands and challenges that lie in the path of growth and development.
Veracity - This characteristic reflects the fact that the quality of the captured data can vary greatly, so the accuracy of any analysis depends on the veracity of the source data.
Complexity - Big Data management is a very complex process, as large volumes of data coming from multiple sources need to be linked, connected, and correlated in order to extract the required information.

Although the above five factors define the characteristics of Big Data, only the first three (Volume, Variety, and Velocity, commonly known as the 3 V's) are widely discussed. The diagram below depicts how data is increasing along these three scales:

[Diagram: data growth along the 3 V's. Legend: MB = megabytes, GB = gigabytes, TB = terabytes, PB = petabytes]

Why Big Data?

Next, one may ask why we suddenly care about this data, since we have had plenty of data available to process and analyze before. In this section, we discuss the following factors that have contributed to the advent of the Big Data practice:

Increase in storage capacities
Increase in processing power
Availability of data

90% of the data in the world has been created in the last couple of years alone.

Challenges with Big Data

Now that we know why we need to deal with Big Data, we will talk about the challenges we face while processing, analyzing, and utilizing it:

Scale - Accessing the level of detail needed from sheer volumes of data at high speed.
Performance - In an online world where nanosecond delays can cost you sales, big data must move at extremely high velocities no matter how much you scale or what workloads your database must perform.
The data handling hoops of RDBMS solutions put a serious drag on performance.
Workload Diversity - Big data comes in all shapes and sizes. Rigid schemas have no place here; you need a more flexible design and a technology that fits this type of data.
Manageability - Staying ahead of big data using RDBMS technology is a costly, time-consuming, and often futile endeavor.
High Availability - Data-driven applications that rely on big data to feed essential revenue-generating business applications need higher availability than traditional high-availability setups provide.
Cost - Meeting the challenges listed above with an RDBMS can cost a pretty penny.

Objectives of Big Data

In this section, we explain the objectives that we strive to achieve while dealing with Big Data:

Cost reduction - Processing power, measured in MIPS (millions of instructions per second), and storage of a terabyte and above for structured data are now cheaply delivered through big data technologies like Hadoop clusters. This is mainly because Big Data technologies can use commodity hardware for processing by employing techniques such as data sharing and distributed computing.
Faster processing speeds - Big data technologies have also helped reduce large-scale analytics processing from hours to minutes. These technologies have also been instrumental in real-time analytics, reducing processing times to seconds.
Big Data based offerings - Big data technologies have enabled organizations to leverage big data in developing new product and service offerings. The best example may be LinkedIn, which has used big data and data scientists to develop a broad array of product offerings and features, including People You May Know, Groups You May Like, Jobs You May Be Interested In, Who has Viewed My Profile, and several others.
These offerings have brought millions of new customers to LinkedIn.
Supporting internal business decisions - Just like traditional data analytics, big data analytics can be employed to support business decisions when there are new and less structured data sources. For example, any data that can shed light on customer satisfaction is helpful, and much of the data from customer interactions is unstructured, such as website clicks, transaction records, and voice recordings from call centers ................
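
The cost reduction objective above mentions distributed computing on commodity hardware. As a rough illustration of the underlying idea (split the data, process each piece independently, then combine the partial results), here is a minimal sketch in Python. It assumes a tiny in-memory click log and uses threads as stand-ins for separate machines. The data and all function names are hypothetical, and this is not how Hadoop or Spark are actually implemented.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical click log: (customer_id, page) pairs standing in for website clicks.
CLICKS = [
    ("c1", "/home"), ("c2", "/pricing"), ("c1", "/support"),
    ("c3", "/home"), ("c2", "/support"), ("c1", "/support"),
    ("c3", "/pricing"), ("c2", "/home"),
]


def split_into_partitions(records, n):
    """Deal records round-robin into n partitions, as if spread over n machines."""
    return [records[i::n] for i in range(n)]


def count_pages(partition):
    """'Map' step: a worker counts page visits in its own partition only."""
    return Counter(page for _, page in partition)


def merge_counts(partial_counts):
    """'Reduce' step: add the per-partition counts into one total."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


def count_pages_distributed(records, n_partitions=3):
    """Count page visits by processing partitions in parallel, then merging."""
    partitions = split_into_partitions(records, n_partitions)
    # Threads are only an assumption for the sketch; a real cluster would
    # run each partition on a different node.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(count_pages, partitions))
    return merge_counts(partials)


if __name__ == "__main__":
    print(count_pages_distributed(CLICKS))
```

Because each partition is counted independently, the final totals do not depend on how many partitions are used. That property is what lets this style of processing spread across many inexpensive machines.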

