Data Processing using Pyspark

In [1]:
#import SparkSession
from pyspark.sql import SparkSession
#create spark session object
spark=SparkSession.builder.appName('data_mining').getOrCreate()

In [2]:
# Load csv Dataset
df=spark.read.csv('adult.csv',inferSchema=True,header=True)
#columns of dataframe
df.columns

In [4]:
#number of records in ...