csvcols Documentation
Release 0.3
David Ormsbee
September 11, 2012
CONTENTS
1 Recommendations
2 Warnings
3 Reference
4 Indices and tables
Python Module Index
This library takes a column-oriented approach towards CSV data. Everything is stored internally as Unicode, and everything is outwardly immutable. It has support for:
• Parsing CSV files, including some Excel export quirks
• Selecting and renaming columns
• Transforming documents by column
• Re-sorting a document by columns or rows
• Creating new documents by appending old ones together
• Merging rows

CSV files are everywhere, and every language has a library to read them row by row. But sometimes that's not the best way to look at the data. You often want to manipulate, transform, or run rule checks on certain columns. If you keep the row-by-row model, you end up trying to jam everything into a single pass over the data. Or maybe you suck everything up into a 2D data structure and edit it in several passes. But then you start having side effects, and you're not sure what changed what. Then you want to add a new rule that requires data from an earlier pass, and you start making temporary data structures to hold the values of special columns or rows.

I've had the 800 lb gorilla version of this thrown on my lap. It's a maintenance nightmare, and my frustrations with the code base inspired the creation of this library.

The library in a nutshell:
import csvcols
from csvcols import Column, S  # S = shorthand for Selector

# Read Document from file. If encoding is not specified, UTF-8 is assumed.
raw_shipping_doc = csvcols.load("shipping_orders.csv", encoding='latin-1')

# Select a subset of the columns and make them into a new Document. While
# we're doing this, we can rename or transform Columns.
users_doc = raw_shipping_doc.select(
    S("email", transform=unicode.lower),
    S("BILLING_LAST", rename="last_name", transform=unicode.title),
    S("BILLING_FIRST", rename="first_name"),
    ("CUSTOM 1", "special_notes"),  # We can use tuples for renames as well
    "country"  # Or simple strings if we don't want to do any transforms
)

# If the email, last name, and first initial match, merge the records
# together, and keep the longer first name. By default, this sorts as well.
merged_doc = users_doc.merge_rows_on(
    lambda row: (row.email, row.last_name, row.first_name[0]),
    lambda r1, r2: r1 if len(r1.first_name) > len(r2.first_name) else r2
)

# Create a new Column based on existing data.
is_edu_user_col = Column("Y" if s.endswith(".edu") else "N"
                         for s in merged_doc.email)

# Append this new column to the doc (note: this creates a new doc)
final_doc = merged_doc + ("is_edu_user", is_edu_user_col)

print csvcols.dumps(final_doc)
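
To make the merge step concrete, here is a minimal plain-Python sketch of the behaviour that the comments in the example describe: rows are sorted by a key, and rows that share a key are folded into one by a two-argument merge function. It does not use csvcols and makes no claim about how the library implements merge_rows_on. The names User, merge_rows, and keep_longer_first_name, and the sample data, are hypothetical and exist only for this illustration.

```python
from collections import namedtuple
from functools import reduce
from itertools import groupby

# Hypothetical row type for illustration only; not part of csvcols.
User = namedtuple("User", ["email", "last_name", "first_name"])


def merge_rows(rows, key_func, merge_func):
    """Sort rows by key_func, then fold each run of equal keys into one row."""
    ordered = sorted(rows, key=key_func)
    return [
        reduce(merge_func, group)
        for _, group in groupby(ordered, key=key_func)
    ]


def keep_longer_first_name(r1, r2):
    """Of two rows judged to be the same person, keep the longer first name."""
    return r1 if len(r1.first_name) > len(r2.first_name) else r2


# Made-up sample data: the two "Smith" rows share email, last name, and
# first initial, so they should collapse into a single row.
rows = [
    User("dave@example.edu", "Smith", "D."),
    User("ann@example.com", "Jones", "Ann"),
    User("dave@example.edu", "Smith", "David"),
]

merged = merge_rows(
    rows,
    lambda row: (row.email, row.last_name, row.first_name[0]),
    keep_longer_first_name,
)

# Derive a new column from an existing one, one value per merged row.
is_edu_user = tuple("Y" if row.email.endswith(".edu") else "N" for row in merged)

for row, flag in zip(merged, is_edu_user):
    print(row, flag)
```

Because the sketch only builds new lists and tuples and never edits a row in place, each step can be inspected on its own, which is the property the column-oriented, immutable design is after.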