I am trying to read a CSV file into my VB.NET application using the following code:
While Not EOF(1)
    Input(1, dummy)
    Input(1, phone_number)
    Input(1, username)
    Input(1, product_name)
    Input(1, wholesale_cost)
    Input(1, dummy)
    Input(1, dummy)
End While
My CSV file (as text) looks like this:
Customer Name,Phone Number,Username,Product,Wholesale Cost,Sales Price,Gross Profit, Customer Reference
,00000000000,00000000000,Product Name,25.00,35.00,10.00,
,00000000000,00000000000,Product Name,1.00,1.40,0.40,
As you can see, not every field is present on every line, so an error is displayed when reading the file because it cannot reach the end of the line.
How can I handle this type of file?
Some lines contain these fields and others do not.
UPDATE
I have tried the answer that Zenacity provided, but when trying to read using sArray(1) inside the loop it returns:
Index was outside the bounds of the array
One thing that you should come to grips with is that those Filexxxx methods are all but officially and formally deprecated. When using them, Intellisense pops up with:
...The My feature gives you better productivity and performance in file I/O operations than FileOpen. For more information, see Microsoft.VisualBasic.FileIO.FileSystem.
They are talking about My.Computer.FileSystem, but there are some even more useful .NET methods.
The post doesn't reveal how the data will be stored, but if it is an array of any sort and/or a structure, those are at least suboptimal if not also outdated. This will store it in a class so that the numeric data can be stored as numbers, and a List will be used in place of an array.
I made a quick file similar to yours with some random data, using the columns {"CustName", "Phone", "UserName", "Product", "Cost", "Price", "Profit", "SaleDate", "RefCode"}:
Ziggy Aurantium,132-5562,,Cat Food,8.26,9.95,1.69,08/04/2016,
Catrina Caison,899-8599,,Knife Sharpener,4.95,6.68,1.73,10/12/2016,X-873-W3
,784-4182,,Vapor Compressor,11.02,12.53,1.51,09/12/2016,
Note: this is a bad way to parse a CSV. Lots of problems can arise doing it this way, and it takes more code. It is presented because it is a simple way to avoid dealing with the missing fields. See The Right Way.
' form/class level var:
Private SalesItems As List(Of SaleItem)
SaleItem is a simple class to store the elements you care about. SalesItems is a collection which can store only SaleItem objects. The properties in that class allow Price and Cost to be stored as Decimal and the date as a DateTime.
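The class itself is not shown. A minimal sketch of what it might look like follows, assuming auto-implemented properties and the property names used by the loader code; the exact members are an assumption, not the original definition.

```vbnet
Imports System

' Hypothetical sketch of SaleItem: one property per value the loader keeps.
Public Class SaleItem
    ' Text columns are stored as-is; a missing value is just an empty string.
    Public Property CustName As String
    Public Property Phone As String
    Public Property Product As String

    ' Money is stored as Decimal so it can be used in arithmetic.
    Public Property Cost As Decimal
    Public Property Price As Decimal

    ' The sale date is stored as a real date rather than text.
    Public Property SaleDate As DateTime

    ' Profit is not stored here; it is the calculated property shown under Notes.
End Class
```

Because these are ordinary public properties, a DataGridView bound to a List(Of SaleItem) can generate one column per property.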
' requires Imports System.IO at the top of the file
' temp var
Dim item As SaleItem
' create the collection
SalesItems = New List(Of SaleItem)
' load the data....all of it
Dim data = File.ReadAllLines("C:\Temp\custdata.csv")
' parse data lines
' start at 1 to skip the header line
For n As Int32 = 1 To data.Length - 1
    Dim split = data(n).Split(","c)
    ' check if it is a good line
    If split.Length = 9 Then
        ' create a new item
        item = New SaleItem
        ' store SOME data to it
        item.CustName = split(0)
        item.Phone = split(1)
        ' don't care about user name (2)
        item.Product = split(3)
        ' convert numbers: Cost is column 4, Price is column 5
        item.Cost = Convert.ToDecimal(split(4))
        item.Price = Convert.ToDecimal(split(5))
        ' don't use the PROFIT, calculate it in the class (6)
        ' convert date
        item.SaleDate = Convert.ToDateTime(split(7))
        ' ignore nonexistent RefCode (8)
        ' add new item to collection
        ' a List sizes itself as needed!
        SalesItems.Add(item)
    Else
        ' To Do: make note of a bad line format
    End If
Next
' show in DGV for approval/debugging
dgvMem.DataSource = SalesItems
Notes
It is generally a bad idea to store something which can be simply calculated. So the Profit property is:
Public ReadOnly Property Profit As Decimal
    Get
        Return (Price - Cost)
    End Get
End Property
It can never be "stale" if the cost or price is updated.
As shown, the resulting collection can be displayed to the user very easily. Given a DataSource, the DataGridView will create the columns and populate the rows.
The Right Way
String.Split(c) is a very bad idea because if the product is "Hose, Small Green", it will chop that up and treat it as 2 fields. There are a number of tools which will do nearly all the work for you.
Aside from the class, all the above could be done in just a few lines using CSVHelper:
Private CustData As List(Of SaleItem)
...
Using sr As New StreamReader("C:\Temp\custdata.csv", False),
      csv = New CsvReader(sr)
    csv.Configuration.HasHeaderRecord = True
    CustData = csv.GetRecords(Of SaleItem)().ToList()
End Using
Two or three lines of code to read, parse, and create a collection of 250 items.
Even if you want to do it manually for some reason, CSVHelper can help. Rather than having it create a List(Of SaleItem) for you, you can use it just to read and parse the data:
... like above
csv.Configuration.HasHeaderRecord = True
Do Until csv.Read() = False
    For n As Int32 = 0 To csv.Parser.FieldCount - 1
        DoSomethingWith(csv.GetField(n))
    Next
Loop
This will return the fields to you one by one. It won't convert any dates or prices, but it won't choke on missing data elements either.
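If adding a package is not an option, the TextFieldParser class in Microsoft.VisualBasic.FileIO (the namespace the Intellisense note points to) also copes with quoted commas and empty fields. The following is a minimal sketch rather than a drop-in solution: the module name CsvSketch and the helpers ReadRows and ParseMoney are hypothetical names chosen for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper module; the names are illustrative only.
Module CsvSketch
    ' Reads a comma-delimited file and returns each data row as a String array.
    ' A quoted field such as "Hose, Small Green" stays in one piece, and an
    ' empty field comes back as an empty string in its own position.
    Function ReadRows(path As String, skipHeader As Boolean) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(path)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True

            If skipHeader AndAlso Not parser.EndOfData Then
                parser.ReadLine()   ' discard the header line
            End If

            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows
    End Function

    ' Parses a money field; returns Nothing when the field is empty or invalid.
    ' Assumes the file uses "." as the decimal separator.
    Function ParseMoney(text As String) As Decimal?
        Dim value As Decimal
        If Decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, value) Then
            Return value
        End If
        Return Nothing
    End Function
End Module
```

Each returned array can be indexed by column after checking its Length, which avoids the "Index was outside the bounds of the array" error from the question when a line is shorter than expected. For the question's file, fields(1) would be the phone number and ParseMoney(fields(4)) the wholesale cost.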