Tools for Web testing, Web recording/playback, nose unittest extensions, and code coverage.
Posted by patrick on March 17, 2007
Other web testing resources:
- Demo 1: Testing CherryPy
- Demo 2: Testing CherryPy without exec’ing a process
- Demo 3: Basic code coverage analysis with figleaf
- Demo 4: More interesting code coverage analysis with nose and figleafsections
- Demo 5: Writing a simple twill extension to do form “fuzz” testing
- Demo 6: Django fixtures for twill/wsgi_intercept
- Demo 7: Recording and examining a Django session with scotch
- Demo 8: Convert the Django session into a twill script
- Demo 9: Replaying the Django session from the recording
At PyCon ’07, I gave a talk on testing tools in which I “performed” nine live demos of features from twill, scotch, pinocchio, and figleaf. These are tools for Web testing, Web recording/playback, nose unittest extensions, and code coverage.
This is the source code for that talk.
The Python code should work on any UNIX-like machine; I gave my talk with my MacBook, running Python 2.3.
Note that the only demo that didn’t work during the talk was the first one, in which I used ‘subprocess’ to start a CherryPy server. Since this is contraindicated IMO (I suggest using wsgi_intercept; see Demo 2) I was more or less happy that it failed. (It failed because I made a last minute adjustment to the command that ran app.py, and didn’t test it before the talk… ;(
You can get the entire source code (including this text) at
The latest version of this README should always be accessible at
This was billed as an “intermediate” talk at PyCon, and I jumped right into code rather than giving a detailed introduction. This document follows that strategy, so go look at the code and read what I have to say afterwards!
You will need twill 0.9b1 (the latest available through easy_install), nose 0.9.1 (the latest available through easy_install), scotch (latest), figleaf (latest), and pinocchio (latest). You will also need to have CherryPy 3.x and Django 0.95 installed.
You may need to adjust your PYTHONPATH to get everything working. Check out env.sh to see what I put in my path before running everything.
Purpose: show how to use twill to test a (very simple!) CherryPy site.
twill is a simple Web scripting language that lets you automate Web browsing (and Web testing!) Here I’ll show you how to use it to run a few simple tests on a simple CherryPy site.
The CherryPy application under test is in cherrypy/app.py. You can check it out for yourself by running python app.py; this will set up a server on http://localhost:8080/.
The URLs to test are: http://localhost:8080/, which contains a form; http://localhost:8080/form/, which processes the form submit; and http://localhost:8080/page2/, which is just a simple “static” page.
The twill test script for this app is in cherrypy/simple-test.twill. All it does is go to the main page, confirm that it loaded successfully and contains the words “Type something”, fill in the form with the string “python is great”, and submit it. The final command verifies that the output is as expected.
If you wanted to run all of this stuff manually, you would type the following (in UNIX):
python app.py &
twill-sh -u http://localhost:8080/ simple-test.twill
So, how do you do it with twill and nose?
Take a look at the unit test source code in cherrypy/demo1/tests.py. This is a nose test that you can run by typing
nosetests -w demo1/
from within the cherrypy/ subdirectory.
Try running it.
You should see a single ‘.’, indicating success:
.
----------------------------------------------------------------------
Ran 1 test in 2.069s
OK
For more verbose output, add -v:
nosetests -w demo1/ -v
You should see
tests.test ... ok
----------------------------------------------------------------------
Ran 1 test in 2.069s
OK
So yes, it all works!
Briefly, what is happening here is that:
- tests.py is discovered and imported by nose (because it has the magic ‘test’ in its name); then setup(), test(), and teardown() are run (in that order) because they are names understood by nose.
- setup executes the application app.py, capturing its stdout and stderr into a file-like object (which is accessible as pipe.stdout). setup has to wait a second for app.py to bind the port, and then sets the URL of the Web server appropriately.
- test then runs the twill script via twill.execute_file, passing it the initial URL to go to.
- teardown calls a special URL, exit, on the Web app; this causes the app to shut down (by raising SystemExit). It then waits for the app to exit.
A few notes:
1. setup and teardown are each run once, before and after any test functions. If you added in another test function (e.g. test2), it would have access to url and pipe and an established Web server.
2. Note that url is not a hardcoded string in the test; it’s available as a global variable. This lets any function in this module (and any module that can import tests) adjust to a new URL easily.
3. Also note that url is not hardcoded into the twill script, for the same reason. In fact, because this twill script doesn’t alter anything on the server (mainly because the server is incredibly dumb ;) you could imagine using this twill script as a lifebeat detection for the site, too, i.e. to check if the site is minimally alive and processing Web stuff properly.
4. What if the Web server is already running, or something else is running on the port?
5. More generally, what happens when the Popen call goes awry? How do you debug it? (Answer: you’ve got to figure out how to get ahold of the stdout/stderr and print it out to the environment, which can be a bit ugly.)
6. What happens if /exit doesn’t work, in teardown? (Answer: the unit tests hang.)
Notes 4-6 are the reasons why you should think about using the wsgi_intercept module (discussed in Demo 2) to test your Web apps.
Purpose: demonstrate the use of wsgi_intercept.
The use of subprocess in Demo 1 was a big ugly problem: once you shell out to a command, doing good error handling is difficult, and you’re at the mercy of the environment. But you needed to do this to run the Web server, right?
Well, yes and no. If your goal was to test the entire Web stack — from your OS socket recv, through the CherryPy Web server, down to your application — then you really need to do things this way.
But that’s silly. In general, your unit and functional tests should be testing your code, not CherryPy and your OS; the time for testing that everything works together is later, during your staging and end-to-end testing phase(s). Generally speaking, though, your OS and Web server are not going to be simple things to test and you’re better off worrying about them separately from your code. So let’s focus on your code.
Back to the basic question: how do you test your app? Well, there’s a nifty new Python standard for Web app/server interaction called WSGI. WSGI lets you establish a nicely wrapped application object that you can serve in a bunch of ways. Conveniently, twill understands how to talk directly to WSGI apps. This is easier to show than it is to explain: take a look at cherrypy/demo2/tests.py. The two critical lines are in setup(),
wsgi_app = cherrypy.tree.mount(app.HelloWorld(), '/')
twill.add_wsgi_intercept('localhost', 80, lambda: wsgi_app)
The first line asks CherryPy to convert your application into a WSGI application object, wsgi_app. The second line tells twill to talk directly to wsgi_app whenever a twill function asks for localhost:80.
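The idea underneath is that a WSGI app is just a callable, so it can be called in-process with no socket at all. Here is a minimal sketch using only the standard library. This is not wsgi_intercept or twill code; hello_app and call_wsgi are names made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A stand-in WSGI app (hypothetical; not the talk's HelloWorld)."""
    body = b"Type something: <input type='text' name='inp'>"
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, path="/"):
    """Call a WSGI app directly and return (status, headers, body)."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        # WSGI asks callers to close the iterable if it has a close() method.
        close = getattr(chunks, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], body


status, headers, body = call_wsgi(hello_app)
```

No port is bound and no process is started, which is why this style of fixture is both faster and harder to break.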
Does it work?
Well, you can try it easily enough:
nosetests -w demo2/ -v
and you should see
tests.test ... ok
----------------------------------------------------------------------
Ran 1 test in 0.827s
OK
So, yes, it does work!
Note that the test itself is the same, so you can actually use the test script simple-test.twill to do tests however you want — you just need to change the test fixtures (the setup and teardown code).
Note also that it’s quite a bit faster than demo1, because it doesn’t need to wait for the server to start up.
And, finally, it’s much less error prone. There’s really no way for any other process to interfere with the one running the test, and no network port is bound; wsgi_intercept completely shunts the networking code through to the WSGI app.
(For those of you who unwisely use your own Web testing frameworks, wsgi_intercept is a generic library that acts at the level of httplib, and it can work with every Python Web testing library known to mankind, or at least to me. See the wsgi_intercept page for more information.)
Purpose: demonstrate simple code-coverage analysis with figleaf.
Let’s move on to something else — code coverage analysis!
The basic idea behind code coverage analysis is to figure out what lines of code are (and more importantly aren’t) being executed under test. This can help you figure out what portions of your code need to be tested (because they’re not being tested at all).
figleaf does this by hooking into the CPython interpreter and recording which lines of code are executed. Then you can use figleaf’s utilities to do things like output an HTML page showing which lines were and weren’t executed.
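To give a feel for the mechanism, here is a minimal line recorder built on sys.settrace. It sketches the general technique only and is not figleaf's actual implementation; record_lines and page are hypothetical names.

```python
import sys


def record_lines(func, *args, **kwargs):
    """Run func and return the set of (filename, lineno) pairs it executed."""
    executed = set()

    def tracer(frame, event, arg):
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer  # keep tracing inside every new frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # restore whatever tracer was there before
    return executed


def page(name):
    if name == "index":
        return "Type something"
    return "This is page2."


# Line numbers relative to the "def page" line, for readability.
first = page.__code__.co_firstlineno
index_offsets = sorted(lineno - first
                       for _, lineno in record_lines(page, "index"))
```

Lines that never show up in the recorded set are the ones a coverage report would paint red.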
Again, it’s easier to show than it is to explain, so read on!
First, start the app with figleaf coverage:
Now, run the twill script (in another window):
twill-sh -u http://localhost:8080/ simple-test.twill
Then CTRL-C out of the app.py Web server, and run
This will create a directory html/; open html/app.py.html in a Web browser. You should see a bunch of green lines (indicating that these lines of code were executed) and two red lines (the code for page2 and exit). There’s your basic coverage analysis!
Note that class and function definitions are executed on import, which is why def page2(self): is green; it’s just the contents of the functions themselves that aren’t executed.
If you open html/index.html you’ll see a general summary of code files executed by the Python command you ran.
Purpose: demonstrate the figleafsections plugin that’s part of pinocchio.
The figleafsections plugin to pinocchio lets you do a slightly more sophisticated kind of code analysis. Suppose you want to know which of your tests runs which lines of code. (This could be of interest for several reasons, but for now let’s just say “it’s neat”, OK?)
For this demo, I’ve constructed a new pair of unit tests: take a look at cherrypy/demo3/tests.py. The first test function (test()) is identical to Demo 2, but now there’s a new test function — test2(). All that this function does is exercise the page2 code in the CherryPy app.
Now run the following commands in the cherrypy/ directory:
rm .figleaf
nosetests -v --with-figleafsections -w demo3
annotate-sections ./app.py
This runs the tests with a nose plugin that keeps track of which tests are executing what sections of app.py, and then annotates app.py with the results. The annotated file is app.py.sections; take a look at it!
When you look at app.py.sections you should see something like this:
-- all coverage --
| tests.test2
| | tests.test
| | |
      | #! /usr/bin/env python
      | import cherrypy
      |
      | class HelloWorld(object):
      |     def index(self):
    + |         return """<form method='POST' action='/form'>
      | Type something: <input type='text' name='inp'>
      | <input type='submit'>
      | </form>"""
      |     index.exposed = True
      |
      |     def form(self, inp=None, **kw):
    + |         return "You typed: \"%s\"" % (inp,)
      |     form.exposed = True
      |
      |     def page2(self):
  +   |         return "This is page2."
      |     page2.exposed = True
      |
      |     def exit(self):
      |         raise SystemExit
      |     exit.exposed = True
      |
      | if __name__ == '__main__':
      |     cherrypy.quickstart(HelloWorld())
What this output shows is that tests.test executed the index() and form() functions, while tests.test2 executed the page2() function only — just as you know from having read cherrypy/demo3/tests.py. Neat, eh?
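The same kind of per-test breakdown can be sketched with a tracer that is switched on for one test at a time. This is only an illustration of the idea, not how the figleafsections plugin is written; App, functions_run_by, and the two tests are stand-ins invented for the example.

```python
import sys


class App:
    """Stand-in for the HelloWorld app (hypothetical)."""

    def index(self):
        return "Type something"

    def form(self, inp=None):
        return 'You typed: "%s"' % (inp,)

    def page2(self):
        return "This is page2."


def test():
    app = App()
    assert "Type something" in app.index()
    assert "python is great" in app.form("python is great")


def test2():
    assert App().page2() == "This is page2."


def functions_run_by(test_func, targets):
    """Return the names of the target functions whose bodies test_func ran."""
    target_code = {func.__code__ for func in targets}
    seen = set()

    def tracer(frame, event, arg):
        if event == "line" and frame.f_code in target_code:
            seen.add(frame.f_code.co_name)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        test_func()
    finally:
        sys.settrace(previous)
    return seen


targets = [App.index, App.form, App.page2]
report = {t.__name__: sorted(functions_run_by(t, targets))
          for t in (test, test2)}
```

Here report maps each test name to the functions it exercised, which is the information the annotated file shows column by column.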
See my blog post on the subject for some more discussion of how this can be useful.
Purpose: show how easy it is to write twill extensions.
Since twill is written in Python, it’s very easy to extend with Python. All you need to do is write a Python module containing the function(s) that you want to use within twill, and then call extend_with <module>. From that point on, those functions will be accessible from within twill. (Note that extension functions need to take string arguments, because the twill mini-language only operates on strings.)
For example, take a look at cherrypy/demo4/randomform.py. This is a simple extension module that lets you fill in form text fields with random values; the function fuzzfill takes a form name, a min/max length for the values, and an optional alphabet from which to build the values. You can call it like this:
extend_with randomform
fuzzfill <form> 5 15 [ <alphabet> ]
If you look at the randomform.py script, the only real trickiness in the script is where it uses the twill browser API to retrieve the form fields and fill them with text. Conveniently, this entire API is available to twill extension modules.
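The value-generation half of such an extension is easy to sketch. The code below is not the code in randomform.py, and it leaves out the twill browser API calls that actually fill in the form; fuzz_value and its alphanumeric default alphabet are assumptions. Note that the lengths arrive as strings, since that is how twill passes arguments.

```python
import random
import string


def fuzz_value(min_length, max_length,
               alphabet=string.ascii_letters + string.digits):
    """Build one random string; the lengths arrive as strings, as in twill."""
    min_length, max_length = int(min_length), int(max_length)
    if not 0 <= min_length <= max_length:
        raise ValueError("need 0 <= min_length <= max_length")
    length = random.randint(min_length, max_length)  # inclusive on both ends
    return "".join(random.choice(alphabet) for _ in range(length))


value = fuzz_value("5", "15")
```

A real extension function would loop over the text fields of the named form and assign one such value to each.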
Let’s try running it! The twill script cherrypy/fuzz-test.twill is a simple script that takes the CherryPy HelloWorld application and fills in the main page form field with a random alphanumeric string. As in Demo 2, we can put this all together in a simple unit test framework; see cherrypy/demo4/tests.py for the actual code.
You can run the demo code in the usual way:
nosetests -w demo4/ -v
If you run it without output capture, you’ll even see the random text we inserted:
% nosetests -w demo4/ -v -s
tests.test ... 127.0.0.1 - - [15/Mar/2007:19:08:14] "GET / HTTP/1.1" 200 166 "" ""
closing...
==> at http://localhost/
Imported extension module 'randomform'.
(at /Users/t/iorich-dev/talk-stuff/cherrypy/demo4/randomform.pyc)
127.0.0.1 - - [15/Mar/2007:19:08:14] "GET / HTTP/1.1" 200 166 "" ""
closing...
==> at http://localhost/
fuzzfill: widget "inp" using value "0jX0vUXye0"
Note: submit is using submit button: name="None", value=""
127.0.0.1 - - [15/Mar/2007:19:08:14] "POST /form HTTP/1.1" 200 23 "" ""
ok
closing...
You typed: "0jX0vUXye0"
[15/Mar/2007:19:08:14] ENGINE CherryPy shut down
----------------------------------------------------------------------
Ran 1 test in 0.617s
OK
(Look for the text after “You typed”…)
Purpose: show how to use wsgi_intercept and twill to test a simple Django app.
OK, I’ve shown you how to write automated tests for CherryPy Web apps. Let’s try it out for Django, now!
Take a look at django/demo/tests.py for some code.
The first function you should look at is actually the last function in the file: TestDjangoPollSite.test. This function goes to “/polls”, clicks on the “pycon” choice in the poll, submits it, and verifies that “pycon” has received 1 vote. (Unlike the CherryPy demos, here we’re using the twill Python API, rather than the scripting language.)
Behind this fairly simple looking test function lie two layers of fixtures.
The TestDjangoPollSite.setup() function is run before the test() function, and it serves to reset the vote count in the database; it’s very much like a unittest fixture, in that it’s run prior to each test* function in TestDjangoPollSite. (If there were a teardown() function in the class, it would be run after each test* function.)
The tests.setup() and tests.teardown() functions serve the same purpose as their CherryPy analogs in Demo 2: setup() initializes Django and sets up the wsgi_intercept shunt mechanism so that twill can talk to the Django app directly through WSGI. In turn, teardown() cleans up the WSGI shunt.
Demos 1/2 and Demo 6 collectively demonstrate (hah!) how easy it is to use twill to start testing your Django and CherryPy apps. Even the simple level of testing demonstrated here serves an important purpose: you can be sure that, at a minimum, your application is configured properly and handling basic HTTP traffic. (More complicated tests will depend on your application, of course.)
(Thanks to Mick for his post — I swiped his code!)
Purpose: show how to use scotch to record a Django test.
(For this demo, you’re going to need an extra shell window, e.g. an xterm or another ssh session.)
Make sure you have scotch installed, and then run run-recording-proxy. This sets up an HTTP proxy server on port 8000 that records traffic into memory (and saves into a file when done). You should see
** scotch proxy server running on 127.0.0.1 port 8000 ...
** RECORDING to filename 'recording.pickle'
OK, now, in another shell, go into django/mysite/ and run python manage.py runserver localhost:8001. This runs the simple Django polling application on port 8001. You should see
Validating models...
0 errors found.
Django version 0.95.1, using settings 'mysite.settings'
Development server is running at http://localhost:8001/
Quit the server with CONTROL-C.
Now go to your Web browser and open the URL http://localhost:8001/polls/. You should see a page with a link containing the link text “what’s up?” This tells you that the Django app is running.
Set your Web browser’s HTTP proxy to ‘localhost’, ‘8000’. Make sure that your proxy settings forward ‘localhost’ (by default, Firefox does not send localhost requests through the proxy mechanism).
All right, now hit reload! If everything is working right, you should see the same “polls” page, but this time you’ll be going through the scotch proxy server. Check out the window in which you ran scotch; it should say something like
REQUEST ==> http://localhost:8001/polls/
++ RESPONSE: 200 OK
++ (77 bytes of content returned)
++ (response is text/html; charset=utf-8)
(# 1)
If so, great! It’s all working! (If not, well… one of us did something wrong ;).
OK, now go back to your Web browser and click through the poll (select “what’s up?”, choose “pycon”, and then hit “submit”).
You should see a bunch more output on the proxy screen, including something like this (after the form submit):
REQUEST ==> http://localhost:8001/polls/1/vote/
(post form)
choice: "3"
++ RESPONSE: 302 FOUND
++ (0 bytes of content returned)
++ (response is text/html; charset=utf-8)
(# 4)
REQUEST ==> http://localhost:8001/polls/1/results/
Already you can see that this is moderately useful for “watching” HTTP sessions, right? (It gets better!)
OK, now hit CTRL-C in the proxy server shell, to cancel. It should say something like “saved 5 records!” These records are saved into the file recording.pickle by default, and you can look at some of the files in the scotch distribution (especially those under the bin/ directory) for some simple ideas of what to do with them.
All right, so you’ve seen that you can record HTTP traffic. But what can you do with the recording?
Purpose: use scotch to generate a twill script from the recording in Demo 7.
Well, one immediately useful thing you can do with the recording is generate a twill script from it! To do that, type
in the proxy window. You should get the following output:
# record 0
go http://localhost:8001/polls/
# record 2
# referer = http://localhost:8001/polls/
go http://localhost:8001/polls/1/
# record 3
# referer = http://localhost:8001/polls/1/
fv 1 choice '3'
submit
Don’t be shy — save this to a file and run it with twill-sh! It should work.
So that’s pretty convenient, right? It’s not a cure-all — generating tests from recording can get pretty ugly, and with scotch I don’t aim to provide a complete solution, but I do aim to provide you with something you can extend yourself. (There are lots of site-specific issues that make it likely that you’ll need to provide custom translation scripts that understand your URL structure — these aren’t terribly hard to write, but they are site specific.)
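To show what a custom translation script might look like, here is a small sketch that turns a list of recorded requests into twill commands. The record layout (plain dicts with method, url, referer, and form keys) is purely hypothetical, since scotch's real record objects almost certainly look different, and the sketch assumes every POST belongs to form 1 on the current page.

```python
def records_to_twill(records):
    """Translate recorded requests (hypothetical dicts) into a twill script."""
    lines = []
    for number, record in enumerate(records):
        lines.append("# record %d" % number)
        if record.get("referer"):
            lines.append("# referer = %s" % record["referer"])
        if record["method"] == "POST":
            # Assumption: the POST came from form 1 on the page being viewed.
            for name, value in record["form"]:
                lines.append("fv 1 %s '%s'" % (name, value))
            lines.append("submit")
        else:
            lines.append("go %s" % record["url"])
    return "\n".join(lines)


records = [
    {"method": "GET", "url": "http://localhost:8001/polls/", "referer": None},
    {"method": "POST", "url": "http://localhost:8001/polls/1/vote/",
     "referer": "http://localhost:8001/polls/1/",
     "form": [("choice", "3")]},
]
script = records_to_twill(records)
```

Site-specific knowledge would go into the loop: skipping redirects, picking the right form number, or rewriting URLs to match your own structure.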
Purpose: use scotch to play back the Web traffic directly and compare.
OK, and now for the last demo: the ultimate regression test!
Leave the Django site running (or start it up again) and, in the proxy window, type play-recorded-proxy recording.pickle. This literally replays the recorded session directly to the Django Web app and compares the actual output with the expected output.
You should see something like this:
==> http://localhost:8001/polls/ ...
... 200 OK (identical response)
==> http://localhost:8001/polls/1 ...
... 301 MOVED PERMANENTLY (identical response)
==> http://localhost:8001/polls/1/ ...
... 200 OK (identical response)
==> http://localhost:8001/polls/1/vote/ ...
... 302 FOUND (identical response)
==> http://localhost:8001/polls/1/results/ ...
... 200 OK
++ OUTPUT DIFFERS
OLD OUTPUT:
====
<h1>what's up?</h1>
<ul>
<li>the blue sky -- 0 votes</li>
<li>not much -- 0 votes</li>
<li>pycon -- 2 votes</li>
</ul>
====
NEW OUTPUT:
====
<h1>what's up?</h1>
<ul>
<li>the blue sky -- 0 votes</li>
<li>not much -- 0 votes</li>
<li>pycon -- 5 votes</li>
</ul>
====
What’s happening is clear: because we’re not resetting the database to a clean state, the vote counts are being incremented each time we run the recording; after all, in each recording we’re pushing the “submit” button after selecting “pycon”.
Anyway, this is a kind of neat regression test: does your Web site still return the same values it should? Note that it’s very fragile, of course: if your pages have date/time stamps, or other content that changes dynamically due to external conditions, you’re going to have to write custom filter routines that ignore that in the comparisons. But it’s at least a neat concept.
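A custom filter routine of that kind might look something like the sketch below. It is not part of scotch; compare and ignore_vote_counts are hypothetical names, and the filter simply blanks out the vote numbers before the old and new output are compared.

```python
import difflib
import re


def ignore_vote_counts(text):
    """Example filter (hypothetical): mask numbers that legitimately change."""
    return re.sub(r"\d+ votes", "N votes", text)


def compare(old, new, filters=()):
    """Return a unified diff of old vs. new output; empty if they match."""
    for apply_filter in filters:
        old, new = apply_filter(old), apply_filter(new)
    if old == new:
        return []
    return list(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     "old output", "new output",
                                     lineterm=""))


old_output = "<li>not much -- 0 votes</li>\n<li>pycon -- 2 votes</li>"
new_output = "<li>not much -- 0 votes</li>\n<li>pycon -- 5 votes</li>"

raw_diff = compare(old_output, new_output)
filtered_diff = compare(old_output, new_output, [ignore_vote_counts])
```

Without the filter the diff reports the changed vote line; with it, the two responses compare as identical. The trade-off is that a filter this blunt would also hide a genuinely wrong vote count.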
(Again, I should note that this is neat, but it’s not clear to me how useful it is. scotch is very much a programmer’s toolkit at the moment, and I’m still feeling my way through its uses. I do have some other ideas that I will reveal by next year’s PyCon…)
I hope you enjoyed this romp through a bunch of different testing packages. I find them useful and interesting, and I hope you do, too.
Note that this stuff is my hobby, not my job, and so I tend to develop in response to other people’s requests and neat ideas. Send me suggestions!