Wfastcgi, Python 3, Classic ASP and UTF-8

Feb 7 at 8:04 AM
Edited Feb 7 at 8:06 AM
Hello everyone,
this is my question. I hope I'm posting in the right place.
I have an IIS 6 server which runs a classic ASP project. My company now wants to gradually migrate to a Python version of this project, so I've managed to create a WSGI application with 2 files: wfastcgi.py (ported for Python 3) and an "app.py" I wrote that routes the requests received from wfastcgi to my project scripts.
The workflow is something like this:
User Request > IIS 6 > WFASTCGI.PY > APP.PY > (Project Python scripts)

Everything works fine, except for a single, basic, worrying thing: charset encoding.

wfastcgi encodes and decodes requests as "ISO-8859-1", and things go well if I use this charset in app.py and the project scripts (or ISO-8859-15, because in my country we need characters such as the Euro sign). There are, however, some cases of encoding exceptions (I can't tell which character causes the problem at this time).
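
To narrow down which character triggers such an exception, something like the following could help. It is only a sketch: find_unencodable is a hypothetical helper name, not part of wfastcgi, and it simply tries to encode each character on its own with the chosen charset.

```python
def find_unencodable(text, charset):
    """Return (index, character) pairs that cannot be encoded in charset."""
    bad = []
    for index, char in enumerate(text):
        try:
            char.encode(charset)
        except UnicodeEncodeError:
            # This character has no representation in the target charset.
            bad.append((index, char))
    return bad


sample = "10 €"
# The Euro sign exists in ISO-8859-15 but not in ISO-8859-1.
latin1_problems = find_unencodable(sample, "iso-8859-1")
latin9_problems = find_unencodable(sample, "iso-8859-15")
```

Running it over the text that fails should point at the offending characters; the Euro sign is a typical one when the charset is ISO-8859-1.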

Now I have these requirements:
  • must keep IIS 6
  • must have a hybrid project with ASP pages that post requests to Python pages and vice versa
  • must ensure that uploads do not corrupt files
  • must perform CRUD actions on SQL Server 2k/2k8
So, what is the best charset configuration to use Python 3 and IIS the way I need? Can I use UTF-8 encoding for "app.py" and the project scripts, while ensuring that GET/POST/UPLOAD variables are passed correctly between ASP>PY>ASP calls?
Coordinator
Feb 7 at 3:47 PM
This is absolutely the right place. Welcome! (And good on you for choosing Python 3 - migration from Python 2.x is slow, but it will be worth it :) )

The first thing I'll let you know is that we've done quite a bit of work on wfastcgi.py since the last release, so you may want to grab an updated version from source control - when we release 2.1 Alpha, there will also be a dedicated installer for it.

Python 3.3 support was the main reason we made so many changes, as well as better adhering to the WSGI specification. Encoding was right at the top of the list of things to fix, and hopefully we've managed it. It will be great if you can try it out and see if it works.

The most important thing about using it with Python 3 is that you should always deal with bytes, not str. The encoding you mention has to be used for the FastCGI headers, but we were incorrectly applying it to data as well. Now as long as your WSGI handler only returns bytes, we won't modify them, and the wsgi.input and wsgi.data values should always be bytes (now that we don't encode/decode them).
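
As a rough illustration of the bytes-only rule, here is a minimal sketch of a WSGI handler. It assumes the client (for example an ASP page) posts UTF-8 text; call_app is a hypothetical local harness that fakes the environ a server would pass, and is not part of wfastcgi.

```python
import io


def application(environ, start_response):
    """Minimal WSGI handler that reads bytes and returns bytes."""
    # CONTENT_LENGTH may be missing or empty; treat that as no body.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw_body = environ["wsgi.input"].read(length)  # bytes, not str

    # Assumption: the body is UTF-8 text. Decode explicitly, in one place.
    text = raw_body.decode("utf-8")

    # Encode explicitly before returning; never hand str to the server.
    payload = ("Received: " + text).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def call_app(body):
    """Hypothetical harness: call the handler with a faked environ."""
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    chunks = application(environ, start_response)
    return captured["status"], b"".join(chunks)
```

For file uploads the same idea applies: keep the uploaded content as bytes end to end and decode only the parts you know to be text.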

There are a few parameters that are treated differently due to the encoding (such as SCRIPT_NAME). We follow mod_wsgi's approach as described at http://wsgi.readthedocs.org/en/latest/python3.html - in short, SCRIPT_NAME, PATH_INFO and QUERY_STRING are decoded using ISO-8859-1 and the original bytes are stored as wsgi.script_name etc. I believe IIS defaults to ISO-8859-1 for parameters, so the decoding we do should never fail, but if it's a problem then we'll happily help fix it. (Possibly we can pick up the encoding from an environment variable rather than hard-coding it? I haven't looked too far into that.)
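
If the client actually sent UTF-8 in the path or query string, the decoded value can be turned back into the original bytes and decoded again. This is a sketch only: recode_wsgi_str is a hypothetical helper, and it assumes the server decoded the value with ISO-8859-1, which maps every byte to exactly one character and so round-trips without loss.

```python
def recode_wsgi_str(value, target="utf-8", server_encoding="iso-8859-1"):
    """Re-decode a server-decoded WSGI string using the real client encoding."""
    # Step 1: recover the exact bytes the client sent.
    raw = value.encode(server_encoding)
    # Step 2: decode them with the encoding the client really used.
    return raw.decode(target)


# Simulate what the server does with a UTF-8 path.
raw_path = "/città/€".encode("utf-8")
as_server_would = raw_path.decode("iso-8859-1")  # mojibake, but lossless
recovered = recode_wsgi_str(as_server_would)
```

In a handler this would be applied to values such as environ["PATH_INFO"]; a UnicodeDecodeError in step 2 means the client did not send valid UTF-8 after all.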

Feel free to make any modifications you like to the script, and we're happy to accept contributions back if you make any changes.
Coordinator
Feb 14 at 4:32 PM
Just in case you weren't signed up for the announcements, we've released PTVS 2.1 Alpha now, which includes our improved version of WFastCGI.

We'll probably make more changes up until 2.1 final, so any feedback you may have would be greatly appreciated, and any issues you report have an excellent chance of being fixed in the official release.