Structured Data Processing - Spark SQL
Amir H. Payberah
payberah@kth.se 2020-09-15
The Course Web Page
Where Are We?
Motivation
Hive
A system for managing and querying structured data built on top of MapReduce. Converts a query to a series of MapReduce phases. Initially developed by Facebook.
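To make the idea of converting a query into MapReduce phases concrete, here is a minimal pure-Python sketch of how an aggregation such as SELECT address, COUNT(*) FROM customer GROUP BY address could be evaluated as a map phase, a shuffle, and a reduce phase. It only illustrates the principle and is not Hive's actual query planner; the sample rows and the function names (map_phase, shuffle, reduce_phase) are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows of a customer table: (id, name, address).
rows = [
    (1, "Alice", "Stockholm"),
    (2, "Bob", "Uppsala"),
    (3, "Carol", "Stockholm"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for _id, _name, address in rows:
        yield address, 1


def shuffle(pairs):
    """Group all values that share a key (done by the framework between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each group's values; COUNT(*) becomes a sum of ones."""
    return {key: sum(values) for key, values in groups.items()}


counts = reduce_phase(shuffle(map_phase(rows)))
print(counts)  # {'Stockholm': 2, 'Uppsala': 1}
```

A query with several aggregations or joins would need more than one such map/shuffle/reduce round, which is what "a series of MapReduce phases" refers to.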
Hive Data Model
Re-used from RDBMS:
- Database: Set of Tables.
- Table: Set of Rows that have the same schema (same columns).
- Row: A single record; a set of columns.
- Column: provides value and type for a single value.
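The containment hierarchy above can be sketched with a few Python dataclasses. This is only an illustration of the relational model, not how Hive stores its metadata; the class names, the insert method, and the sample values are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Column:
    name: str
    type: str  # type name as a string, e.g. "INT" or "STRING"


@dataclass
class Table:
    name: str
    schema: List[Column]                              # every row shares this schema
    rows: List[Tuple] = field(default_factory=list)   # one tuple per row

    def insert(self, *values):
        """Append a row, checking only that it has one value per column."""
        if len(values) != len(self.schema):
            raise ValueError("row does not match the table schema")
        self.rows.append(values)


@dataclass
class Database:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)  # a database is a set of tables


# Hypothetical usage: one database holding one table with one row.
db = Database("shop")
customer = Table("customer", [Column("id", "INT"), Column("name", "STRING")])
db.tables[customer.name] = customer
customer.insert(1, "Alice")
```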
Hive API (1/2)
HiveQL: a SQL-like query language
Data Definition Language (DDL) operations
- Create, Alter, Drop
-- DDL: creating a table with three columns
CREATE TABLE customer (id INT, name STRING, address STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
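The ROW FORMAT clause says that each line of the underlying file is one row whose fields are separated by tab characters. The following plain-Python sketch shows what that means for a single line, using the three-column customer schema from the statement above. It is an illustration only and does not use Hive; the names SCHEMA, CONVERTERS, and parse_line and the sample line are assumptions, and raising an error on a malformed line is a simplification chosen for this sketch.

```python
# Schema mirroring the CREATE TABLE statement: (column name, Hive type).
SCHEMA = [("id", "INT"), ("name", "STRING"), ("address", "STRING")]

# Python converters for the two Hive types used in this example.
CONVERTERS = {"INT": int, "STRING": str}


def parse_line(line, schema=SCHEMA, delimiter="\t"):
    """Split one delimited text line into a typed record that follows the schema."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(fields)}")
    return {
        name: CONVERTERS[hive_type](value)
        for (name, hive_type), value in zip(schema, fields)
    }


record = parse_line("42\tAlice\tStockholm\n")
print(record)  # {'id': 42, 'name': 'Alice', 'address': 'Stockholm'}
```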