Programming and frameworks for ML: Python for data analysis



About Me

Big Data Consultant at Santander / Big Data Lecturer

More than 20 years of experience in different environments, technologies, customers, countries ...

Passionate about data and technology

Enthusiastic about Big Data world and NoSQL



Reading / Writing data
Exploring a DataFrame
Renaming columns
Filtering Columns
Filtering Rows
Sorting Data
Adding new columns
Deleting Data
Grouping Data
Concatenating Data
Joining Data
Pivot Tables


Reading / Writing data

Pandas integrates with many file formats and data sources out of the box (CSV, Excel, SQL, JSON, Parquet, ...).
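As a small illustration of that breadth, the sketch below writes a DataFrame to CSV and JSON text and reads it back. The sample data and variable names are made up for the example, and everything is kept in memory so no files are needed.

```python
import io

import pandas as pd

# Made-up sample data, only for illustration
df = pd.DataFrame({"name": ["Ana", "Luis"], "age": [31, 45]})

# Write to CSV text and read it back
csv_text = df.to_csv(index=False)
from_csv = pd.read_csv(io.StringIO(csv_text))

# Write to JSON (a list of records) and read it back
json_text = df.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

Other formats follow the same read_* / to_* naming pattern, although some of them (Excel and Parquet, for example) need an extra package installed.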


Reading data in text format

The function pd.read_csv() reads a file and stores its contents in a DataFrame.

With the default options, the first row is treated as the header and the separator is a comma.

The file can be on disk or on the network. The file can also be compressed!
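A minimal sketch of the default behaviour, assuming a tiny made-up dataset. The file name people.csv.gz is hypothetical and is written to a temporary directory so the example cleans up after itself.

```python
import io
import os
import tempfile

import pandas as pd

# Made-up CSV content: header in the first row, comma as separator
csv_text = "name,age\nAna,31\nLuis,45\n"

# Default options: first row becomes the column names
df = pd.read_csv(io.StringIO(csv_text))

# Compressed file: pandas infers gzip from the ".gz" extension
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "people.csv.gz")  # hypothetical file name
    df.to_csv(path, index=False)
    df_gz = pd.read_csv(path)

print(df_gz)
```

For a file on the network, a URL string can be passed to pd.read_csv() in place of the local path.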


Sometimes the separator is something other than a comma, for example a pipe (|).

Use the sep argument to change the separator.
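A short sketch of the sep argument, using made-up pipe-separated text held in memory:

```python
import io

import pandas as pd

# Made-up data where fields are separated by a pipe instead of a comma
pipe_text = "name|age\nAna|31\nLuis|45\n"

df_pipe = pd.read_csv(io.StringIO(pipe_text), sep="|")

print(df_pipe)
```

Without sep="|", each line would be read as a single column, because no commas are present.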


We can use the header parameter to tell pandas which row holds the header (None if the data has no header).
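The two cases can be sketched as follows. The sample text is made up: one input has no header at all, the other has a title line above the real header.

```python
import io

import pandas as pd

# Case 1: no header row, so pandas numbers the columns 0, 1, ...
text_no_header = "Ana,31\nLuis,45\n"
df_none = pd.read_csv(io.StringIO(text_no_header), header=None)

# Case 2: the header sits on the second line (row index 1);
# the line above it is skipped
text_title = "exported people list\nname,age\nAna,31\nLuis,45\n"
df_row1 = pd.read_csv(io.StringIO(text_title), header=1)

print(df_none)
print(df_row1)
```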


If the file doesn't have a header, we can specify the column names with the names parameter.
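A minimal sketch of the names parameter, again with made-up data and column names:

```python
import io

import pandas as pd

# Made-up data without a header row
text_no_header = "Ana,31\nLuis,45\n"

# header=None says there is no header row; names supplies the column labels
df_named = pd.read_csv(
    io.StringIO(text_no_header),
    header=None,
    names=["name", "age"],
)

print(df_named)
```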


