
How to mock a csv file

I have a CSV parser module that parses a very specific type of .csv file and extracts fields from it. To test this module, I'm writing some unit tests. Traditionally, I would create a sample CSV file in the same format but with only a few entries, pass this file to the module, and check the output. That's not ideal, because the tests then depend on the test file itself.

What is the right way to go about this? I've read about the mock module and how it can mock things, but I have no idea how to mock a specific file.

kronosjt asked Dec 15 '17 10:12


2 Answers

You did not supply any test data, so I hope my made-up examples translate well enough to your problem. In short: if you don't want to create temporary files during your tests (a reasonable constraint, imo), use StringIO. The mock module has a significant entry hurdle, so unless you need its fancier mocking abilities, there is no need to use it.

from io import StringIO
from csv import reader  # this should import your custom parser instead

in_mem_csv = StringIO("""\
col1,col2,col3
1,3,foo
2,5,bar
-1,7,baz""")  # in python 2.7, put a 'u' before the test string
test_reader = reader(in_mem_csv, delimiter=',', quotechar='|')
for line in test_reader:
    print(line)
    # whatever you need to test to make sure the csv reader works correctly

Output:

['col1', 'col2', 'col3']
['1', '3', 'foo']
['2', '5', 'bar']
['-1', '7', 'baz']
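To turn the loop above into an actual unit test, one option is to assert on the parsed rows instead of printing them. The sketch below assumes your parser accepts any file-like object; `parse_csv` is a hypothetical stand-in for it (built on `csv.DictReader` purely for illustration), and the test function is written so that pytest or a similar runner would pick it up.

```python
import csv
from io import StringIO


def parse_csv(file_obj):
    """Hypothetical stand-in for the parser under test; takes a file-like object."""
    return list(csv.DictReader(file_obj))


def test_parse_csv_extracts_rows():
    # The "file" lives entirely in memory, so the test has no external dependency.
    in_mem_csv = StringIO(
        "col1,col2,col3\n"
        "1,3,foo\n"
        "2,5,bar\n"
        "-1,7,baz\n"
    )
    rows = parse_csv(in_mem_csv)

    assert len(rows) == 3
    assert rows[0] == {"col1": "1", "col2": "3", "col3": "foo"}
    assert rows[2]["col3"] == "baz"
```

If your real parser only accepts a file name, it may be worth giving it a second entry point that takes an open file object, since that is what makes this kind of test possible.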

Alternative string formatting

I personally prefer triple-quoted strings to represent files, but normal strings might be better in your case. The example below relies on Python's implicit concatenation of adjacent string literals to break lines conveniently without changing the string's value.

in_mem_csv = StringIO(
    "col1,col2,col3\n"
    "1,3,foo\n"
    "2,5,bar\n"
    "-1,7,baz\n"
)
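If the parser insists on opening the file itself from a path, StringIO alone is not enough, and this is where the mock module can help. The following is only a sketch under assumptions: `load_rows` is a hypothetical parser that calls the built-in `open`, the file name is made up, and iterating over the mocked file handle relies on `mock_open` behavior available in Python 3.8 and later.

```python
import csv
from unittest import mock


def load_rows(path):
    """Hypothetical parser that opens the file by path on its own."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


FAKE_CSV = "col1,col2,col3\n1,3,foo\n2,5,bar\n"

# Replace the built-in open() so that no real file is touched.
with mock.patch("builtins.open", mock.mock_open(read_data=FAKE_CSV)) as fake_open:
    rows = load_rows("does_not_exist.csv")  # made-up path, never read from disk

# The parser asked for the expected file, and parsed the fake content.
fake_open.assert_called_once_with("does_not_exist.csv", newline="")
assert rows[1]["col3"] == "bar"
```

Patching `builtins.open` affects every `open` call made inside the `with` block, so it is best to keep that block as small as possible.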
Arne answered Oct 03 '22 20:10


Below is an example of creating a mock CSV file using pandas:

import pandas as pd

rows = []  # avoid naming this "list", which would shadow the built-in
# Create 100k records
for i in range(100000):
    email = 'tester{i}@aeturnum.com'.replace("{i}", str(i))

    # Overwrite the leading zeros with the digits of i, keeping 10 characters
    phone = "0000000000"
    phone = str(i) + phone[len(str(i)):]

    fname = "test" + str(i)
    lname = "test" + str(i)

    # {a} is the number of digits in i (1 to 5), e.g. "1993-3-03"
    dob = "199{a}-{a}-0{a}".replace("{a}", str(len(str(i))))

    rows.append((fname, lname, email, phone, dob, str(i)))

columns = ['First Name', 'Last Name', 'Email Address', 'Phone Number', 'Date Of Birth', 'Current Loyalty Point Total']

df = pd.DataFrame(rows, columns=columns)

print(df)

# Write the generated records to disk, without the DataFrame index column
df.to_csv('user_data_100k.csv', index=False)
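
Since the question is about avoiding a dependency on a file on disk, the same kind of generated data can also be written to an in-memory buffer. The sketch below uses only the standard library; `make_fake_csv` is a hypothetical helper, and the three columns are a shortened version of the ones above.

```python
import csv
from io import StringIO


def make_fake_csv(n_rows):
    """Hypothetical helper: build an in-memory CSV file with n_rows synthetic records."""
    buf = StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["First Name", "Last Name", "Email Address"])
    for i in range(n_rows):
        writer.writerow([f"test{i}", f"test{i}", f"tester{i}@aeturnum.com"])
    buf.seek(0)  # rewind so the parser under test reads from the start
    return buf


fake_file = make_fake_csv(3)
rows = list(csv.reader(fake_file))  # stand-in for passing fake_file to your parser
```

The buffer can be handed to any parser that accepts a file-like object, and nothing needs to be cleaned up after the test.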
Nazar Ahmed Amjad answered Oct 03 '22 19:10