Hadoop vs. Parallel Databases
Juliana Freire
The Debate Starts...
The Debate Continues...
• A comparison of approaches to large-scale data analysis. Pavlo et al., SIGMOD 2009
  o Parallel DBMS beats MapReduce by a lot!
  o Many were outraged by the comparison
• MapReduce: A Flexible Data Processing Tool. Dean and Ghemawat, CACM 2010
  o Pointed out inconsistencies and mistakes in the comparison
• MapReduce and Parallel DBMSs: Friends or Foes? Stonebraker et al., CACM 2010
  o Toned down claims...
Outline
• DB 101 - Review
• Background on Parallel Databases (for more detail, see Chapter 21 of Silberschatz et al., Database System Concepts, Fifth Edition)
• Case for Parallel Databases
• Case for MapReduce
• Voice your opinion!
Storing Data: Database vs. File System
• Once upon a time, database applications were built on top of file systems...
• But this has many drawbacks:
  o Data redundancy, inconsistency and isolation
    ▪ Multiple file formats, duplication of information in different files
  o Difficulty in accessing data
    ▪ Need to write a new program to carry out each new task, e.g., search people by zip code or last name; update telephone number
  o Integrity problems
    ▪ Integrity constraints (e.g., num_residence = 1) become part of program code, which makes it hard to add new constraints or change existing ones
• Atomicity of updates
  o Failures may leave the database in an inconsistent state with partial updates carried out, e.g., John and Mary get married, add new residence, update John's entry, and the database crashes while Mary's entry is being updated...
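The John-and-Mary scenario can be made concrete with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical person(name, address) table; the table, the addresses, and the marry helper are illustrative and do not come from the slides. The crash is simulated by raising an exception between the two updates.

```python
import sqlite3


def marry(conn, spouse_a, spouse_b, new_address, crash_midway=False):
    """Move both spouses to new_address as one transaction (hypothetical helper)."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE person SET address = ? WHERE name = ?",
                (new_address, spouse_a),
            )
            if crash_midway:
                # Stand-in for a failure after only the first update ran.
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE person SET address = ? WHERE name = ?",
                (new_address, spouse_b),
            )
        return True
    except RuntimeError:
        return False


def addresses(conn):
    """Return the current name -> address mapping."""
    return dict(conn.execute("SELECT name, address FROM person"))


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, address TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("John", "1 Elm St"), ("Mary", "9 Oak Ave")],
    )
    conn.commit()

    marry(conn, "John", "Mary", "5 Pine Rd", crash_midway=True)
    after_crash = addresses(conn)    # John's partial update was rolled back
    marry(conn, "John", "Mary", "5 Pine Rd")
    after_success = addresses(conn)  # both updates committed together
    conn.close()
    return after_crash, after_success
```

With a file-based program, the equivalent guarantee would have to be hand-written in every application that updates the data; here the DBMS provides it.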
Why use Database Systems?
• Declarative query languages
• Data independence
• Efficient access through optimization
• Data integrity and security
  o Safeguarding data from failures and malicious access
• Concurrent access
• Reduced application development time
• Uniform data administration
Query Languages
• Query languages: allow manipulation and retrieval of data from a database
• Queries are posed with respect to a data model
  o Operations over objects defined in the data model
• The relational model supports simple, powerful QLs:
  o Strong formal foundation based on logic
  o Allows for automatic optimization
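As a rough illustration of what a declarative query with automatic optimization buys, the sketch below runs the same SQL text before and after an index is created. It assumes Python's built-in sqlite3 module; the person table, its rows, and the index name idx_person_zip are hypothetical. The query never says how to find the rows, so the system is free to change its access path while the answer stays the same.

```python
import sqlite3


def plan(conn, sql, params):
    """Return SQLite's query plan as text; the detail string is the last column."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, zip TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [
            ("John", "10003", "555-0100"),
            ("Mary", "11201", "555-0101"),
            ("Ann", "10003", "555-0102"),
        ],
    )
    conn.commit()

    # The query states WHAT is wanted, not HOW to retrieve it.
    query = "SELECT name FROM person WHERE zip = ? ORDER BY name"
    args = ("10003",)

    plan_before = plan(conn, query, args)
    rows_before = [r[0] for r in conn.execute(query, args)]

    # Adding an index changes the physical design, not the query text.
    conn.execute("CREATE INDEX idx_person_zip ON person (zip)")

    plan_after = plan(conn, query, args)
    rows_after = [r[0] for r in conn.execute(query, args)]
    conn.close()
    return plan_before, rows_before, plan_after, rows_after
```

The exact plan wording differs between SQLite versions, so the sketch only relies on the index name appearing in the second plan.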
SQL and Relational Algebra
• Manipulate sets of tuples
• $\sigma_C(R)$ = select: produces a new relation with the subset of the tuples in R that match the condition C
  o $\sigma_{Type = "savings"}(Account)$
  o SELECT * FROM Account WHERE Account.type = 'savings'
• $\pi_{AttributeList}(R)$ = project: deletes attributes that are not in the projection list
  o $\pi_{Number, Owner, Type}(Account)$
  o SELECT number, owner, type FROM Account
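The two operators can be sketched directly over sets of tuples and checked against the SQL queries above. This is a minimal illustration, assuming Python's built-in sqlite3 module; the sample rows, the extra balance attribute, and the helper names select_rows, project_columns, and sql_results are hypothetical.

```python
import sqlite3

# Hypothetical Account relation; "balance" is an assumed extra attribute so
# that the projection has something to drop.
COLUMNS = ("number", "owner", "type", "balance")
ACCOUNT = {
    (101, "Ann", "savings", 500.0),
    (102, "Bob", "checking", 120.0),
    (103, "Cy", "savings", 75.0),
}


def select_rows(relation, condition):
    """Selection: the subset of tuples in the relation that satisfy condition."""
    return {row for row in relation if condition(dict(zip(COLUMNS, row)))}


def project_columns(relation, attributes):
    """Projection: keep only the listed attributes; the result is again a set."""
    positions = [COLUMNS.index(name) for name in attributes]
    return {tuple(row[i] for i in positions) for row in relation}


def sql_results():
    """Run the two SQL queries from the slide against the same data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Account (number INTEGER, owner TEXT, type TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO Account VALUES (?, ?, ?, ?)", sorted(ACCOUNT))
    selected = set(
        conn.execute("SELECT * FROM Account WHERE Account.type = 'savings'")
    )
    projected = set(conn.execute("SELECT number, owner, type FROM Account"))
    conn.close()
    return selected, projected
```

For example, select_rows(ACCOUNT, lambda t: t["type"] == "savings") returns the two savings tuples, matching the first SQL query. One difference worth keeping in mind: relational algebra works on sets, while SQL keeps duplicate rows unless DISTINCT is used; the results coincide here because every account number is unique.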