Spark - Read JSON file to RDD - Example - Tutorial Kart
Spark - Read JSON file to RDD
JSON has become one of the most common data formats exchanged between nodes on the internet and between applications.
In this tutorial, we shall learn how to read a JSON file to an RDD with the help of SparkSession, DataFrameReader and Dataset.toJavaRDD().
Steps to Read JSON file to Spark RDD
To read a JSON file to a Spark RDD:
1. Create a SparkSession.
     SparkSession spark = SparkSession
         .builder()
         .appName("Spark Example - Read JSON to RDD")
         .master("local[2]")
         .getOrCreate();
2. Get the DataFrameReader of the SparkSession.
     spark.read()
3. Use DataFrameReader.json(String jsonFilePath) to read the contents of the JSON file to a Dataset.
     spark.read().json(jsonPath)
4. Use Dataset.toJavaRDD() to convert the Dataset to a JavaRDD.
     spark.read().json(jsonPath).toJavaRDD()
Example: Spark - Read JSON file to RDD
The following Java program reads a JSON file to a Spark RDD and prints its contents.
employees.json
{"name":"Michael", "salary":3000}
{"name":"Andy", "salary":4500}
{"name":"Justin", "salary":3500}
{"name":"Berta", "salary":4000}
{"name":"Raju", "salary":3000}
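Note that employees.json holds one complete JSON object per line rather than a single JSON array. By default, DataFrameReader.json() treats each line of the input as a separate record, so the file has to keep this layout. As a minimal sketch, the sample file could be generated with plain Java as shown here. The class EmployeesJsonWriter and its methods are hypothetical helpers introduced for illustration and are not part of the original tutorial; the sketch also assumes that names contain no characters that need JSON escaping.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original tutorial.
class EmployeesJsonWriter {

    // Builds one line of the file: a complete JSON object with no line breaks.
    // Assumption: the name contains no characters that need JSON escaping.
    static String toJsonLine(String name, int salary) {
        return "{\"name\":\"" + name + "\", \"salary\":" + salary + "}";
    }

    // The five sample records of employees.json, one JSON object per element.
    static List<String> sampleLines() {
        String[] names = {"Michael", "Andy", "Justin", "Berta", "Raju"};
        int[] salaries = {3000, 4500, 3500, 4000, 3000};
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            lines.add(toJsonLine(names[i], salaries[i]));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data", "employees.json");
        Files.createDirectories(path.getParent());
        // Files.write puts each list element on its own line.
        Files.write(path, sampleLines());
        System.out.println("Wrote " + path);
    }
}
```

Running this once from the project directory creates data/employees.json, the path the example program reads from.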
JSONtoRDD.java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JSONtoRDD {

    public static void main(String[] args) {
        // configure spark
        SparkSession spark = SparkSession
                .builder()
                .appName("Spark Example - Read JSON to RDD")
                .master("local[2]")
                .getOrCreate();

        // read JSON file to RDD; each JSON object becomes one Row
        String jsonPath = "data/employees.json";
        JavaRDD<Row> items = spark.read().json(jsonPath).toJavaRDD();

        // print each Row of the RDD
        items.foreach(item -> {
            System.out.println(item);
        });
    }
}
Output
[Michael,3000]
[Andy,4500]
[Justin,3500]
[Berta,4000]
[Raju,3000]
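Each printed entry is the default string form of a Row. To work with individual fields, a Row can be queried by column name. The following is a minimal sketch, not part of the original tutorial: the class RowFieldsSketch and its methods are hypothetical names, and it assumes that Spark infers the salary column as a long, which is how its JSON schema inference normally treats whole numbers.

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical class for illustration, not part of the original tutorial.
class RowFieldsSketch {

    // Turns one Row into a readable line by looking fields up by name.
    static String describe(Row row) {
        String name = row.getAs("name");
        // Assumption: the salary column was inferred as a long.
        long salary = row.<Long>getAs("salary");
        return name + " earns " + salary;
    }

    // Converts the Dataset to a JavaRDD, maps every Row to a String,
    // and collects the results back to the driver.
    static List<String> describeAll(Dataset<Row> employees) {
        JavaRDD<Row> items = employees.toJavaRDD();
        return items.map(RowFieldsSketch::describe).collect();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("Spark Example - Read Row fields")
                .master("local[2]")
                .getOrCreate();

        String jsonPath = "data/employees.json";
        for (String line : describeAll(spark.read().json(jsonPath))) {
            System.out.println(line);
        }

        spark.stop();
    }
}
```

Collecting to the driver before printing is reasonable for a small sample file like this one; for large datasets the results would normally stay distributed.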
Conclusion
In this Spark Tutorial, we have learnt to read a JSON file to a Spark RDD with the help of an example Java program.