I'm using Python 2.7.9 on Windows.
I have a UTF-8-encoded Python script file with the following contents:
# coding=utf-8
def test_func():
    u"""
    >>> test_func()
    u'☃'
    """
    return u'☃'
I get a curious failure when I run the doctest:
Failed example:
    test_func()
Expected:
    u'\u2603'
Got:
    u'\u2603'
I see the same failure output whether I launch the doctests through the IDE I usually use (IntelliJ IDEA) or from the command line:
> x:\my_virtualenv\Scripts\python.exe -m doctest -v hello.py
I copied the lines under Expected and Got into WinMerge to rule out some subtle difference in the characters that I couldn't spot; it told me they were identical.
However, if I repeat the command-line run but redirect the output to a text file, like so:
> x:\my_virtualenv\Scripts\python.exe -m doctest -v hello.py > out.txt
the test still fails, but the resulting failure output is a bit different:
Failed example:
    test_func()
Expected:
    u'☃'
Got:
    u'\u2603'
If I put the escaped unicode literal in my doctest:
# coding=utf-8
def test_func():
    u"""
    >>> test_func()
    u'\\u2603'
    """
    return u'☃'
the test passes. But as far as I can tell, u'\u2603' and u'☃' should evaluate to the same thing.
Really I have two questions about the failing case:
... (Expected or Got) incorrect for the value that the doctester has for that case? (i.e. x != eval(repr(x)))
When the tests include values that are likely to change in unpredictable ways, and where the actual value is not important to the test results, you can use the ELLIPSIS option to tell doctest to ignore portions of the expected value.
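As a rough illustration of the ELLIPSIS option, here is a minimal sketch. The `Widget` class and `make_widget` function are hypothetical names invented for this example; the point is only that `...` in the expected output stands in for text that varies between runs, such as a memory address.

```python
import doctest


class Widget(object):
    """Hypothetical class whose default repr includes a memory address."""


def make_widget():
    """
    The module prefix and the address change between runs, so both are elided.

    >>> make_widget()  # doctest: +ELLIPSIS
    <...Widget object at 0x...>
    """
    return Widget()


# Build a DocTest from the docstring and run it without a verbose report.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    make_widget.__doc__,            # text to scan for examples
    {"make_widget": make_widget},   # names visible to the examples
    "make_widget",                  # test name used in reports
    None,                           # no filename
    0,                              # starting line number
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Note that ELLIPSIS only helps when part of the output is genuinely irrelevant; it does not address an encoding mismatch between the expected and actual text.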
The doctest module searches docstrings for pieces of text that look like interactive Python sessions. It then executes those sessions to confirm that they produce exactly the output shown.
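To make that search step concrete, the sketch below uses `doctest.DocTestParser` to pull the examples out of a piece of text. The docstring and the `greet` name are stand-ins made up for illustration; nothing is executed here, the parser only splits each example into its source and its expected output.

```python
import doctest

# A stand-in docstring; the function name and value are illustrative only.
docstring = """
Return a greeting.

>>> greet()
'hello'
"""

parser = doctest.DocTestParser()
examples = parser.get_examples(docstring)

for example in examples:
    # source is the code after the >>> prompt,
    # want is the expected output text that follows it.
    print(repr(example.source), repr(example.want))
```

The expected output is kept as plain text, which is why the comparison later on is a comparison of strings rather than of Python values.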
The doctest module compares the result with the expected result as text, and it uses difflib to report the differences between the two. For example:
>>> import difflib
>>> variation = difflib.unified_diff('x', 'x')
>>> list(variation)
[]
>>> variation = difflib.unified_diff('x', 'y')
>>> list(variation)
['--- \n', '+++ \n', '@@ -1 +1 @@\n', '-x', '+y']
Under the hood, the doctest module formats the result and the expected result several times before reporting them. Your problem seems to be a misreading caused by the string encodings: the docstring holds the literal character ☃, while the result is its repr, the escaped text \u2603. What gets printed to the console has been formatted (using %s), which removes any visible difference and makes the two look identical.