I'm trying to figure out what the best approach would be to parse a csv file in Java. Now each line will have an X amount of information. For example, the first line can have up to 5 string words (with commas separating them) while the next few lines can have maybe 3 or 6 or what ever.
My problem isn't reading the strings from the file. Just to be clear. My problem is what data structure would be best to hold each line and also each word in that line?
At first I thought about using a 2D array, but the problem with that is that array sizes must be static (the 2nd index size would hold how many words there are in each line, which can be different from line to line).
Here's the first few lines of the CSV file:
0,MONEY
1,SELLING
2,DESIGNING
3,MAKING
DIRECTOR,3DENT95VGY,EBAD,SAGHAR,MALE,05/31/2011,null,0,10000,07/24/2011
3KEET95TGY,05/31/2011,04/17/2012,120050
3LERT9RVGY,04/17/2012,03/05/2013,132500
3MEFT95VGY,03/05/2013,null,145205
DIRECTOR,XKQ84P6CDW,AGHA,ZAIN,FEMALE,06/06/2011,null,1,1000,01/25/2012
XK4P6CDW,06/06/2011,09/28/2012,105000
XKQ8P6CW,09/28/2012,null,130900
DIRECTOR,YGUSBQK377,AYOUB,GRAMPS,FEMALE,10/02/2001,12/17/2007,2,12000,01/15/2002
You could use a Map<Integer, List<String>>
. The keys being the line numbers in the csv file, and the List being the words in each line.
An additional point: you will probably end up using List#get(int)
method quite often. Do not use a linked list if this is the case. This is because get(int)
for linked list is O(n). I think an ArrayList
is your best option here.
Edit (based on AlexWien's observation):
In this particular case, since the keys are line numbers, thus yielding a contiguous set of integers, an even better data structure could be ArrayList<ArrayList<String>>
. This will lead to faster key retrievals.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With