Apache Spark Notes

distinct() returns a new DataFrame containing only the unique rows
filter(conditionExpr) filters rows based on the given SQL expression
groupBy(col1, cols) groups the DataFrame using the specified columns ...

Spark DataFrame: a programming abstraction in Spark SQL. It is a distributed collection of data organized into named columns and scales to …