Pyvot – A Python to/from Excel Connector
Click this to view: http://www.youtube.com/watch?v=Gu31s7kXPOg
Pyvot connects familiar data-exploration and visualization tools in Excel with the powerful data analysis and transformation capabilities of Python, with an emphasis on tabular data. It provides a minimal and Pythonic interface to Excel, smoothing over the
pain points in using the existing Excel object model as exposed via COM.
We’ll now present a quick tour of Pyvot’s key features and workflow. For a more thorough treatment, make sure to check out the full
Python to Excel and back again
Pyvot enables easy transit of data between Python and Excel.
For example, we can move a list of Python values to Excel, view and manipulate the data, and retrieve the new version. In a symmetric and equally common usage, we can grab and process Excel values in Python, and display the result to Excel.
>>> xl.Workbook() # Create empty Excel workbook, stored as the 'active' workbook
>>> a = range(1,10)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> xl.view(a) # show Python enumerable in Excel (active workbook)
The Python array is copied into Excel. By default, Pyvot picks an open range on the active worksheet.
Retrieving data from Excel requires an xl.Range object, which represents the cells on which to operate. The
xl.view function returns the range that it selected to store the data:
>>> r = xl.view(range(1,10))
<ColumnVector range object for $A$2:$A$1048576 (visible only)>
[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
>>> sum(r) # ranges are iterable
Note that view chose the entirety of column A (sans header) to store our list, but
Range.get assumed we only wanted its useful data.
Now we can manipulate the Excel range, and fetch the new results back. For example, suppose we delete some rows using the Excel UI and change the values like so:
>>> r.get() # get updated values
[1.0, 20.0, 8.0, 9.0]
The current selection is also available as a Range. This allows Excel to act as an interface for selecting interesting data:
Mapping Excel columns
You can also run map operations on Excel data. This can be useful for writing Python functions to clean the data and return new computed columns.
For example, suppose we want to combine the city and state columns in a single column:
>>> def Merge(city, state):
... return city + ", " + state
>>> wCities = xl.Workbook("cities.xlsx")
>>> xl.map(Merge, wCities.get("City"), wCities.get("State"))
<ColumnVector range object for $N$2:$N$51 (visible only)>
Note that we retrieved Range objects for the table columns by name. Pyvot can obtain ranges for table columns, named ranges, and plain A1:B2-style references.
xl.map works just like the built-in map function, but inserts the results as a new column in Excel. It will infer a column name from the function name, and it will pick a column related to its arguments. Note
below that “Merge” becomes a new column in the source table:
Pyvot supports join operations on Excel tables.
xl.join(rangeDest, rangeSrc) takes in 2 Ranges. It infers the containing tables for each range. For each value in rangeDest, it will do the lookup in
rangeSrc and merge the row into the table in
For example, suppose we want to city crime data from in “cities.xlsx” with population data from “population.xlsx”.
First we load (or connect to) the workbooks containing tables we want to join:
>>> wCities = xl.Workbook(“cities.xlsx”)
>>> wPop = xl.Workbook("Population.xlsx")
Then we compute a primary key column, if necessary. In this case, we’re joining on multiple columns (both city and state), so we’ll merge them into a single column for the join:
>>> xl.map(Merge, wPop.get("City"), wPop.get("State"))
<ColumnVector range object for $D$2:$D$51 (visible only)>
The same operation is then applied on wCities.
The two mapped columns act as keys for the join:
>>> xl.join(wCities.get("Merge"), wPop.get("Merge"))
Destination table after join
Pyvot requires CPython 2.6 or
2.7 with the Python for Windows extensions (pywin32) installed, and at the minimum, Office 2010.
If a clean Python session can import the win32com module, Pyvot is ready to be installed:
PS C:\> python
Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import win32com
Installing from PTVS
Python Tools for Visual Studio now ships an
add-on for Pyvot integration. This add-on includes a Pyvot project template as well as support for installing Pyvot into multiple interpreters.
First, install Pyvot into the desired interpreters from the Tools menu (keep in mind that Pyvot requires CPython):
Then, create a new Pyvot project:
The new project will need to be configured to use an interpreter for which Pyvot has been installed.
Installing with setuptools
If you have setuptools installed, you can install a source or binary distribution of Pyvot with
easy_install. easy_install is usually in
<Python directory, ex. C:\Python27>\Scripts
To install the latest version from PyPI:
To install an already-downloaded source (.zip) or binary (.egg) distribution: