Big Data Analytics with Hadoop and Spark at OSC
04/13/2017 OSC workshop
Shameema Oottikkal, Data Application Engineer, Ohio Supercomputer Center, email: soottikkal@osc.edu
What is Big Data
Big data is an evolving term that describes any voluminous amount of structured and unstructured data that has the potential to be mined for information.
Ref:
The 3 Vs of Big Data
Data Analytical Tools
Supercomputers at OSC

                          Owens (2016)   Ruby (2014)   Oakley (2012)
Theoretical Performance   ~750 TF        ~144 TF       ~154 TF
# Nodes                   ~820           240           692
# CPU Cores               ~23,500        4800          8304
Total Memory              ~120 TB        ~15.3 TB      ~33.4 TB
Memory per Core           >5 GB          3.2 GB        4 GB
Interconnect              EDR IB         FDR/EN IB     QDR IB

Storage
Home Directory Space: 900 TB usable (Disk) (Allocated to each user, 500 GB quota limit)
Scratch - DDN GPFS: 1 PB with 40-50 GB/s peak performance
Project - DDN GPFS: 3.4 PB
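
The Memory per Core figures for the three clusters can be cross-checked against the Total Memory and # CPU Cores figures with a few lines of Python. This is only an illustrative sketch: the dictionary and function names are hypothetical, the inputs are the rounded ("~") values from the slide, and it assumes decimal units (1 TB = 1000 GB), which the slide does not state.

```python
# Rounded figures from the slide; "~" values are approximate in the source.
CLUSTERS = {
    "Owens": {"total_memory_tb": 120.0, "cpu_cores": 23_500},
    "Ruby": {"total_memory_tb": 15.3, "cpu_cores": 4_800},
    "Oakley": {"total_memory_tb": 33.4, "cpu_cores": 8_304},
}

GB_PER_TB = 1000  # assumption: decimal units, not stated on the slide


def memory_per_core_gb(total_memory_tb: float, cpu_cores: int) -> float:
    """Return the average memory per CPU core in GB."""
    return total_memory_tb * GB_PER_TB / cpu_cores


if __name__ == "__main__":
    for name, spec in CLUSTERS.items():
        per_core = memory_per_core_gb(spec["total_memory_tb"], spec["cpu_cores"])
        print(f"{name}: {per_core:.1f} GB per core")
```

Under these assumptions the computed averages agree with the slide's values: just over 5 GB per core for Owens, about 3.2 GB for Ruby, and about 4 GB for Oakley.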