[Note: this project is no longer supported!]

 

NumPy and SciPy for IronPython / .Net

The Python NumPy and SciPy packages have been ported to run under IronPython, Microsoft’s Python implementation for .NET.  These packages implement a fast and flexible multi-dimensional array package (NumPy) and a large collection of scientific and numerical algorithms built on top of the array (SciPy).  Both packages consist of a heavily optimized native core implementation of the algorithms and a custom .NET interface for integrating the packages into IronPython.

The IronPython ports of NumPy and SciPy are full .NET ports and include custom C#/C interfaces to a common native C core. This means that the full functionality is available not only to IronPython but to all .NET languages such as C# or F# by directly accessing the C# interface objects or sometimes by evaluating IronPython expressions from other .NET languages.  This means that a multi-dimensional array object (ndarray) can be passed seamlessly between IronPython and C# or F# code. Further, the ndarray object implements the standard IEnumerable interface, allowing the array object to often be used with existing code that is not specific to NumPy.

This release represents an early version of the architecture for the NumPy 2.0 release where there is a common, platform-independent core and multiple interface layers (currently CPython and IronPython/.NET).  This version attempts to maintain full compatibility with earlier versions of NumPy, and in general this goal has been achieved.  However, some inconsistencies were necessarily as a result of the port and the re-architecting of the NumPy core.  See ‘What You Should Know’ below regarding this release.

The first part of this document provides a quick overview of the capabilities of NumPy and SciPy for those unfamiliar with the packages.  These examples attempt to highlight a range of capabilities of these packages but are by no means complete.  Please see http://numpy.org and http://scipy.org for the complete documentation.  The second part of the document describes the known limitations, incompatibilities, and future work for these packages.

Installing NumPy and SciPy on IronPython

NumPy and SciPy are bundled as pre-built “eggs” common to Python software distributions.  The files include the Python source code and pre-built binary images in an easy-to-install package.

For download and installation instructions please see http://www.enthought.com/repo/.iron/.

NumPy Examples

NumPy is the fundamental package needed for scientific computing with Python. It contains among other things:

  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • useful linear algebra, Fourier transform, and random number capabilities.

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. This allows NumPy to seamlessly and speedily integrate with a wide variety of data formats.

This section provides a brief tour of some of the capabilities of NumPy, particularly the array functionality.

Getting Started

All of the following examples assume that IronPython (2.7rc1 or later) and NumPy for .NET have been installed and the packages loaded into python (“import numpy as np”). Please see http://www.enthought.com/repo/.iron/ for instructions on downloading and installing NumPy.

Indexing and Views

The NumPy package takes full advantage of the flexibility and expressiveness of the Python range syntax to provide powerful indexing capabilities. When a NumPy array is indexed, the package typically returns a view into the original data and avoids copying it.  This allows the selection of subsets of data to be performed very quickly, and modifications to be made using the much clearer syntax of the subset.

The following example creates a 10x10 2-d array and selects one row and one column from it.  Both the row and col variables are views into the array so when col[2] is set to a new value, the original array is modified:

x = np.arange(100).reshape((10,10)) # Now have a 10x10 array

row = x[3]                          # Y is [30..39], view not copy

col = x[:,3]                        # Z is [3, 13, ... 39], not a copy

col[2] = 42

print "x[2, 3] = %d" % x[2, 3]      # x[2,3] has been modified

Note, too, in the first line that the array is originally created as a 1-d, 100-element array and then a new shape is applied.  NumPy provides powerful capabilities to “look at” the same data array in multiple ways as will be shown in later examples.

Continuing the same example, the entire column can be modified.  Now that the 3rd column in the array has been modified:

col += 100                          # Modify third column of X

print "x = %s" % z                  #  0,  1,  2, 103,  4,  5, ...

                                    # 10, 11, 12, 113, 14, 15, ...

NumPy fully supports the Python range syntax of start:step:increment.  Using this syntax, we can select every third element of the array starting with the second:

x = np.arange(100)

y = x[1::3]

Again, this is simply a view into the original array so modifications to this array are reflected in the original.  The view makes it much easier to operate on the selected subset of data.  For example, if we wanted to multiply every third element by 2:

y = x[1::3]

for i in range(len(y)): y[i] *= 2

Or more simply:

x[1::3] *= 2

There are two advantages to this approach.  First, it avoids the off-by-one class of errors common in for-next loops in C by using the array indexing syntax.  Second, in the single-line version, iteration over the array is performed in native code, making it the fastest version, too.

Fancy Indexing

Going further, NumPy supports the use of ‘mask arrays’ as array indices.  Mask arrays are arrays of Boolean values and when used as an array index, NumPy selects elements where the mask is true.

For example:

x = np.arange(100)

y = (x % 3) == 0              # y is bool array w/ true where (x%3 == 0)

z = x[y]                      # z is elements of X where Y is true.

In the second line, the result of applying the double equal operator to the result of the modulus operator is a Boolean array indicating each index where (x%3) is 0.  Using ‘y’ as an index into ‘x’ produces an array containing only the values that are even multiples of 3 (z). In this case it is important to note that ‘z’ is a copy of x because there is no way to simply adjust the number of dimensions and strides to make a new view of ‘x’ since the mask array can be an arbitrary pattern.  In this case, changes to ‘z’ will not be reflected in ‘x’.

A slightly more complex example uses fancy indexing to extract the subset of values that are perfect squares and passes the result as a C# function which can access it through the IEnumerable interface:

x = np.arange(100)

y = np.sqrt(x)

MyCSharpClass.DoSomething(x[y == np.trunc(y)])

C vs. Fortran Ordering and Interoperability

By default NumPy stores the data using a memory layout common to the C language (most rapidly changing index last) vs the standard Fortran memory layout (more rapidly changing index first). However, NumPy supports arrays with both layouts transparently and makes it simple to convert between the two.  Selecting Fortran memory layout makes it easier to integrate with legacy Fortran code:

# 10x10 2d array of ints, convert to Fortran order, floats

x = np.arange(100).reshape((10,10))      

y = np.array(x, dtype=np.float32, order='Fortran')

# Pass IntPtr of array data to C# object

MyCSharpFortWrapr.DoSomething(y.UnsafeAddress)

Record arrays and Type Conversions

Record arrays are a variation on indexing in that they allow each element of the array to consist of multiple fields of different types. For example, the following array contains 10 elements consisting of a string, an integer, and a floating point value:

x = np.zeros(10, dtype=('S,i,f'))

Array fields can be accessed using the string ‘f0’, ‘f1’, … ‘fN’ or the fields can be explicitly named:

x = np.zeros(10, dtype=[('name', 'S'), ('quantity', 'i'), ('weight', 'f')])

Accessing each row of the record array returns the fields as a tuple:

name, qty, weight = np[0]

Alternately, each field can be accessed as a column and extracted into an array.  For example, the weighted average weight of the inventory items:

avgWeight = np.sum(x['quantity'] * x['weight']) / np.sum(x['quantity'])

And similar to indexing earlier, referencing fields of an array produces a view so modifications to the view are reflected in the original.  The weights can all be converted to from kg to lbs like this:

x['weight'] *= 2.206

One of the particularly powerful uses of record arrays is the ability to apply the record structure to an existing memory layout, such as one read from a file.  Say we have a binary file that is a simple dump of an array of the following structure:

struct {

    float latitude;

    float longitude;

    int16 elevation;

    int16 surfaceType;

};

This file can be quickly loaded into a NumPy array for processing.  For example:

x = np.fromfile('data.dat', dtype=[('lat', 'f4'), ('lng', 'f4'),('elev', 'i2'), ('surface', 'i2')])
maxElev = np.max(x['elev'])

Further, it is not uncommon to find data written using a machine architecture different than the current one, such as big- vs little-endian.  By default NumPy uses the native machine format but can be instructed to use big-endian with the greater-than symbol prefix (>) or little-endian with the less-than symbol prefix (<).  The following code reads the above data file assuming it was written on a big-endian machine and extracts a copy of the elevation data into native-format 32-bit integers:

x = np.fromfile('data.dat', dtype=[('lat', '>f4'), ('lng', '>f4'), ('elev', '>i2'), ('surface', '>i2')])

elev = np.array(x['elev'], dtype='i4')          # 'elev' is now a copy

As mentioned before, selecting a field out of an array produces a view so writes to the view are reflected back to the original.  This is true even if the byte format is non-native; the byte swaps are handled transparently:

x['elev'][2] = 1500

Signal Processing, Linear Algebra, and Other Packages

A full description of the FFT, linear algebra (linalg), and other packages provided by NumPy are beyond the scope of this document and the reader is referred to http://numpy.org for complete documentation.  However, a final short example will demonstrate how some of the above concepts can be combined with these packages.

The FFTPack module provides a range of fast Fourier transforms that operate on NumPy arrays.  The following example reads data from a hypothetical data file containing signal data.  The data is stored as a 32-bit int plus two 16-bit ints all in big-endian format.  The following code extracts the middle value and performs a basic, 1-d FFT transform on it. To make it useful without the hypothetical file, it also generates some synthetic data.

SciPy Overview

SciPy (pronounced "Sigh Pie") is large Python library for mathematics, science, and engineering. The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays, and provides many user-friendly and efficient numerical routines such as routines for numerical integration and optimization. NumPy and SciPy are easy to use, but powerful enough to be depended upon by some of the world's leading scientists and engineers. A full description of the SciPy library can be found at http://scipy.org.

Here is a brief look at some of the SciPy modules available on IronPython:

  • Constants: a large collection of physical and numerical constant values
  •  FFTPack: Fourier and inverse Fourier transforms plus convolutions
  • Linalg: Linear algebra library
  • Special: Collection of special function such as Bessel and Elliptic functions
  •  Stats: Discrete and continuous statistics functions
import numpy as np
import numpy.fft as fft

x = np.fromfile('signal.dat', dtype=('>i4,>i2,>i2'))

# Copy, convert to native floats, rescale
sig = np.array(x['f1'], dtype=np.float32) /  32768

# Alternate: since no file is provided, we can make up some data:
data = [math.sin(np.pi / 8 * i) for i in range(320)]
noise = [np.random.rand() * 0.02 for i in range(320)]
sig = np.array(data, dtype='f')
sig += np.array(noise)

f = fft.fft(sig)

print "Strongest frequency = %d n" % np.argmax(np.abs(f))

What You Should Know

Some SciPy packages have not been ported yet - Many of the most important SciPy packages have been ported to IronPython but some packages are still in-progress. The following packages are included in this release: stats, special, linalg, fftpack, constants, and misc.  The remaining packages are in-progress and will be added as the ports are completed and passing regression tests. 

Memory corruption in chararray package - In the NumPy chararray package (numpy.char), there is a memory corruption issue that has not yet been tracked down.  This does not impact arrays of strings in general, just the numpy.char package and surfaces in a couple of that package’s regression tests. 

‘Bytes’ type is not recognized as a string type - In CPython strings can be both unicode and ASCII.  However, in .NET all strings are unicode and the bytes type is represented as a byte[] type, which does not implement the string interface.  In most cases everything works as expected, but occasionally an error will be reported.  In this case, it is necessary to manually convert to either a bytes or string type as needed.  Because of this issue, numpy.compat.asbytes() in IronPython returns a string type instead of a bytes type. 

numpy.fromfile does not support file handles - In CPython numpy.fromfile accepts both file names (strings) and file handles. Because IronPython file handles are implemented as .NET streams and the NumPy file reading code is in the (native) core, these streams can not be passed into the NumPy core.  For now, only file names are accepted.  

numpy.fromstring only support text strings - The CPython version of numpy.fromstring supports both text and binary strings.  Because .NET strings are unicode, only text strings are currently supported. 

Pickling from transposed and dis-continuous arrays is not implemented - This functionality has not yet been implemented. 

numpy._dotblas CBlas library has not yet been ported - This module must still be ported. The SciPy linear algebra package scipy.linalg can be used instead. 

ndarray.resize() can not check for multiple references - IronPython uses a garbage collector instead of reference counting so there is no way for the resize function to check to see whether there are multiple references to the underlying data.  Using resize() on an array referenced by other objects is illegal and can cause memory issues.  This function should only be used on a new copy of an array or avoided. 

The __array_struct__ and __array_interface__ attributes are not implemented - These attributes provide methods for other objects to present an array interface to NumPy. If external classes implement these interfaces it will not cause a problem, but the NumPy array on IronPython does not currently implement these interfaces.  This breaks some regression tests in both NumPy and SciPy. 

Buffer Support is weakly implemented - The current version of IronPython does not implement a full version of the buffer protocol in CPython.  NumPy provides and implements the interface IExtBufferProtocol, but this is not a standard part of IronPython.   

Long double type is not supported - The .NET platform does not have support a long double type so it has not been implemented in this release. 

[would you like to help port NumPy/SciPy to Mono? Let’s talk!]

Last edited Jul 17 at 5:05 PM by Ptools, version 9

Comments

mariantrnka Oct 21, 2013 at 7:46 AM 
Hi Guys, i found a bug the numpy ironpython implementation.
The problem is in "fromstring" function. I used it to read wav data, but it fails sometimes for negative numbers.
Here is example:
np.fromstring('\x00\x80','<i2') -> array([16128], dtype=int16)
but the correct output should be -> array([-32768], dtype=int16)

Please, can you check that problem?

tibel Jul 26, 2012 at 1:27 PM 
Is this project still alive? As there are no new updates since a year.
Also, is there an official development page with source, issue tracker, wiki and forum for "NumPy and SciPy for IronPython"?

BCartolo Jul 19, 2012 at 11:22 PM 
The file http://www.enthought.com/repo/.iron/ironpkg-1.0.0.py does not exist in the server.
Anyone has a backup?

snickell Apr 30, 2012 at 9:06 PM 
FYI to those struggling with 64-bit (or embedding through the DLR in a 64-bit .NET app): the last commit to the numpy-refactor repository was "Updates to support 64-bit builds. Builds but getting crash in test...". That was 10 mos ago. So it looks like 64-bit is not yet working, and its likely nobody is working on it.

You can check for further commits here:
https://github.com/numpy/numpy-refactor/commits/

snickell Apr 30, 2012 at 9:02 PM 
As far as I can tell, 64-bits is NOT supported. Can anyone confirm this? Unfortunately, at least for me, this limits the usefulness of the library for embedding in existing .NET applications. Too bad, I'd really love to wield the power of SciPy from my C# apps!

springdaniel Feb 17, 2012 at 1:54 PM 
The code is open source and available in a public git repository. See this blog post for more information:

http://blog.enthought.com/python/scipy-for-net/

MennaMostafa1988 Dec 6, 2011 at 5:57 PM 
Hello,
I get this error when I import numpy:
Error: The type initializer for 'NumpyDotNet.NpyCoreApi' threw an exception.
at numpy.core.multiarray$89.numpy.core.multiarray(FunctionCode $functionCode)

My machine is Windows Server HPC edition, 64 bits.
Framework 4
IronPython 2.7 (tried it on 2.7.1 as well)
using Sho 2.0.5
I tried adding IronPython paths to sys but in vain.

Tried the same on windows 7 machine 32 bits and the import worked.
But supposedly 64 bits is supported as well, right?

Thanks in advance.

Lysander12 Aug 15, 2011 at 12:22 PM 
Hi,

I am also would like to ask that it is open source or free to use in commercal in future when numpy will be integrated?
Is there any mail list fot it?

Thanks

And a technical question:
np.float32 term works in ironpython console
But using from a c# hosted ironpython scipt I can use it somehow? Or I have to type like this: np.ScalarFloat32
I could figure it using ildasm.exe but there are more methods like np.fromfile whose pair I do not find.

muchado Jul 15, 2011 at 8:55 AM 
I am interested in being able to use the matrix functions with visual basic, preferably from within the free Visual Basic Studio. Is this possible and would I need to install IronPython first? I am hoping that there is a simple way to just call the functions without anything else. I notice that the requirements for development are much less (only Visual Studio 2010 and Windows SDK / Windows Installer SDK). It seems, however, that if I wanted to "use" the library, I would have to install a lot of high performance components. Are they really necessary? Is it not possible to get answers (albeit slowly) using the basic installation?

abhijitr Jun 8, 2011 at 10:02 PM 
This may be a dumb question, but is this project open source? I don't see sources for Numpy/Scipy in the pytools codeplex repository. Is the code hosted somewhere else?

It's been great so far, but I've been running into some bugs and would like to contribute fixes.

jasonmccampbell Jun 2, 2011 at 4:43 PM 
@geniedelsh We just pushed updated beta-2 binaries for NumPy and SciPy for .NET. (See http://enthought.com/repo/.iron/) These are built against the released IronPython 2.7 DLLs and might solve the problems you are seeing. The beta-2 release also includes new SciPy modules (NDImage, ODR, and Special) as well as many bug fixes. I'm not sure if it will address the embedding issue but at least it's one variable we can eliminate.

Regards,
Jason

striges May 18, 2011 at 10:06 AM 
@ghoumard
Try to use assembly binding.
Create the application configuration "ipy.exe.config" for IronPython and add following lines:
<?xml version ="1.0"?>
<configuration>
<runtime>
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<dependentAssembly>
<assemblyIdentity name="IronPython" publicKeyToken="7f709c5b713576e1" culture="neutral" />
<bindingRedirect oldVersion="2.7.0.30"
newVersion="2.7.0.40"/>
</dependentAssembly>
</assemblyBinding>
</runtime>
</configuration>

ghoumard May 13, 2011 at 9:11 PM 
Hi,

it's been a month now and it's still not working with the 2.7.0.40 version :-(
Can we expect something to be fixed ?
Is there workaround of any kind ?

heavenwing Apr 11, 2011 at 3:40 AM 
@geniedelsh
I have same problem

geniedelsh Apr 3, 2011 at 5:13 PM 
Hi Guys. Thanks so much for this. This is awesome. Everything works great thus far, except when I embed IronPython in my application. For some reason, when using this numpy/scipy straight out of the ironpython console (i.e. by calling ipy.exe), it works perfectly. But when I call 'import scipy' from my embedded version (referencing the same installed assemblies, currently RC1), I get a DLL error, as though it has been built with a reference to 'IronPython, Version=2.7.0.20, Culture=neutral, PublicKeyToken=7f709c5b713576e1'. If I use the later version (2.7.0.40), it asks for 2.7.0.30, and if I use 2.7.0.20, it wants one less. Any ideas are welcome...