Structured Data Processing - Spark SQL
Amir H. Payberah
payberah@kth.se 2021-09-20
The Course Web Page
Where Are We?
Motivation
Hive
A system for managing and querying structured data built on top of MapReduce. Converts a query to a series of MapReduce phases. Initially developed by Facebook.
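As a rough illustration of what "converting a query to MapReduce phases" means, the sketch below expresses an aggregation such as SELECT address, COUNT(*) FROM customer GROUP BY address as one map phase, a shuffle, and one reduce phase. It is plain Python, not Hive's actual planner; the sample rows, the customer table layout (id, name, address), and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows of a customer table: (id, name, address)
ROWS = [
    (1, "Ann", "Stockholm"),
    (2, "Bob", "Uppsala"),
    (3, "Cat", "Stockholm"),
]


def map_phase(rows):
    """Map: emit one (address, 1) pair per input row."""
    for _id, _name, address in rows:
        yield address, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values of each key, which gives COUNT(*) per address."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_address(rows):
    """Chain the three phases for the GROUP BY query."""
    return reduce_phase(shuffle(map_phase(rows)))


print(count_by_address(ROWS))
```

A more complex query (for example, a join followed by an aggregation) would typically need several such map/reduce rounds chained together, which is the "series of MapReduce phases" mentioned above.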
Hive Data Model
Re-used from RDBMS:
- Database: a set of tables.
- Table: a set of rows that have the same schema (same columns).
- Row: a single record; a set of columns.
- Column: provides value and type for a single value.
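The containment relationships in this data model can be sketched with a few Python types. This is only an illustration of the database/table/row/column hierarchy; the class names, the insert method, and the "shop" database are hypothetical and do not correspond to any Hive API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Column:
    """A column gives a name and a type to a single value of a row."""
    name: str
    type: str  # e.g. "INT" or "STRING"


@dataclass
class Table:
    """A table is a set of rows that all follow the same schema."""
    name: str
    schema: List[Column]
    rows: List[Tuple[Any, ...]] = field(default_factory=list)

    def insert(self, *values):
        # Every row must supply exactly one value per column of the schema.
        if len(values) != len(self.schema):
            raise ValueError("row does not match the table schema")
        self.rows.append(values)


@dataclass
class Database:
    """A database is a set of tables, looked up by name."""
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)


# Usage: one database holding one table with a three-column schema.
db = Database("shop")
customer = Table(
    "customer",
    [Column("id", "INT"), Column("name", "STRING"), Column("address", "STRING")],
)
db.tables[customer.name] = customer
customer.insert(1, "Ann", "Stockholm")
```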
Hive API (1/2)
HiveQL: a SQL-like query language.
Data Definition Language (DDL) operations:
- Create, Alter, Drop

-- DDL: creating a table with three columns
CREATE TABLE customer (id INT, name STRING, address STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
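The ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' clause declares that each row in the table's underlying data is a line of text whose fields are separated by tabs. A minimal Python sketch of reading one such line against the three-column schema follows. It is an illustration only, not Hive's deserializer: parse_customer_line and SCHEMA are hypothetical names, and raising an error on a malformed line is a choice made for this sketch, so Hive's own handling of bad input may differ.

```python
# Column names and Python casts standing in for (id INT, name STRING, address STRING).
SCHEMA = [("id", int), ("name", str), ("address", str)]


def parse_customer_line(line, delimiter="\t"):
    """Split one tab-delimited text line and cast each field to its column type."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(SCHEMA):
        raise ValueError(
            "expected %d fields, got %d" % (len(SCHEMA), len(fields))
        )
    return {name: cast(raw) for (name, cast), raw in zip(SCHEMA, fields)}


print(parse_customer_line("1\tAnn\tStockholm\n"))
```

The point of the sketch is that the table definition carries the schema, while the stored data stays plain delimited text that is interpreted when it is read.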