I have a csv parser module that parses a very specific type of .csv file and extracts fields from it. Now to test this module I'm writing some unit tests. Traditionally, to test the module I would create a sample csv file of the same format but with limited entries and then pass in this file to the module and check the output. Obviously that's not very good because there's a dependency on the test file itself.
What is the right way to go about this? I've read about the mock module and how it can mock things. But I have no idea how I can mock a specific file.
You did not supply any test data, so I hope my random examples translate well enough to your problem. In short: if you don't want to create temporary file objects during your tests (which is a reasonable constraint, imo), use StringIO. The mock module has a significant entry hurdle, so unless you want its fancier mocking abilities, there is no need to use it.
from io import StringIO
from csv import reader # this should import your custom parser instead
in_mem_csv = StringIO("""\
col1,col2,col3
1,3,foo
2,5,bar
-1,7,baz""") # in python 2.7, put a 'u' before the test string
test_reader = reader(in_mem_csv, delimiter=',', quotechar='|')
for line in test_reader:
    print(line)
    # whatever you need to test to make sure the csv reader works correctly
Output:
['col1', 'col2', 'col3']
['1', '3', 'foo']
['2', '5', 'bar']
['-1', '7', 'baz']
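To turn the snippet above into an actual unit test, something like the following might work. This is only a sketch: `parse_rows` is a hypothetical stand-in for your own parser, and it assumes the parser accepts an open file-like object rather than a path.

```python
import csv
import unittest
from io import StringIO


def parse_rows(file_obj):
    """Hypothetical stand-in for your parser: returns each row as a dict."""
    return list(csv.DictReader(file_obj))


class ParseRowsTest(unittest.TestCase):
    def test_extracts_fields(self):
        # The "file" lives entirely in memory, so the test needs no fixture on disk.
        in_mem_csv = StringIO("col1,col2,col3\n1,3,foo\n2,5,bar\n")
        rows = parse_rows(in_mem_csv)
        self.assertEqual(len(rows), 2)
        self.assertEqual(rows[0], {"col1": "1", "col2": "3", "col3": "foo"})
        self.assertEqual(rows[1]["col3"], "bar")
```

Run it with `python -m unittest`. If your parser only takes a filename, splitting it into "open the file" and "parse the open file" lets you test the second part this way.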
Alternative string formatting
I just personally prefer triple-quoted strings to represent files; normal strings might be better in your case. The following example shows how to break lines conveniently without changing the string's value:
in_mem_csv = StringIO(
"col1,col2,col3\n"
"1,3,foo\n"
"2,5,bar\n"
"-1,7,baz\n"
)
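One small difference between the two variants: the concatenated version ends with a trailing newline, while the triple-quoted one does not. A quick check (variable names are just for illustration) suggests this does not matter to `csv.reader`, which yields the same rows for both:

```python
from csv import reader
from io import StringIO

triple_quoted = """\
col1,col2,col3
1,3,foo
2,5,bar
-1,7,baz"""

concatenated = (
    "col1,col2,col3\n"
    "1,3,foo\n"
    "2,5,bar\n"
    "-1,7,baz\n"
)

# Parse both in-memory "files" into lists of rows.
rows_a = list(reader(StringIO(triple_quoted)))
rows_b = list(reader(StringIO(concatenated)))

# The strings differ only by the final newline; the parsed rows are identical.
print(concatenated == triple_quoted + "\n")
print(rows_a == rows_b)
```

A custom parser that reads the raw text itself may care about the trailing newline, so it can be worth testing both forms.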
Below is an example of creating a mock CSV file using pandas:
import pandas as pd

rows = []  # avoid naming this "list", which would shadow the built-in
# Build 100k records
for i in range(100000):
    email = f"tester{i}@aeturnum.com"
    phone = str(i).ljust(10, "0")  # pad with trailing zeros to 10 digits
    fname = f"test{i}"
    lname = f"test{i}"
    digits = len(str(i))
    dob = f"199{digits}-{digits}-0{digits}"  # e.g. 1993-3-03 for a three-digit i
    rows.append((fname, lname, email, phone, dob, str(i)))

columns = ['First Name', 'Last Name', 'Email Address', 'Phone Number', 'Date Of Birth', 'Current Loyalty Point Total']
df = pd.DataFrame(rows, columns=columns)
print(df)
df.to_csv('user_data_100k.csv', index=False)
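Writing the generated data to disk brings back the dependency on a test file that the question wanted to avoid. As a sketch, the same kind of data can be written to an in-memory buffer instead. The helper below, `make_sample_csv`, is a hypothetical name and uses only the standard library `csv` module with a reduced set of columns; with pandas, `df.to_csv` also accepts a file-like object such as a `StringIO` in place of the filename.

```python
import csv
from io import StringIO


def make_sample_csv(n_rows):
    """Hypothetical helper: build an in-memory CSV with n_rows generated users."""
    buf = StringIO()
    writer = csv.writer(buf)
    writer.writerow(["First Name", "Last Name", "Email Address"])
    for i in range(n_rows):
        writer.writerow([f"test{i}", f"test{i}", f"tester{i}@aeturnum.com"])
    buf.seek(0)  # rewind so a reader starts at the header row
    return buf


sample = make_sample_csv(3)
rows = list(csv.reader(sample))
print(rows[0])
print(len(rows))
```

The returned buffer can be handed to any parser that accepts a file-like object, so the size of the test input becomes a parameter of the test rather than a file to maintain.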