
Can't move to the next line while reading a CSV file

Tags: java, csv, scala

I have a class that reads a CSV file, and another class that creates an object from each line of the CSV, so that I can run some actions on each line separately. I am using this for automation.

For some reason my program stops after one line. It worked before, so I don't know what is wrong.

This is my CSV reader class:

import java.io.File
import com.github.tototoshi.csv.CSVReader
import jxl.{Cell, Workbook}

import scala.collection.mutable

trait DataSource {

  def read (fileName: String): Seq[Map[String, String]]
}

object CsvDataSource extends DataSource {
  import com.github.tototoshi.csv.CSVFormat
  import com.github.tototoshi.csv.Quoting
  import com.github.tototoshi.csv.QUOTE_MINIMAL

  implicit object MyFormat extends CSVFormat {
    val delimiter: Char = '\t'
    val quoteChar: Char = '"'
    val escapeChar: Char = '"'
    val lineTerminator: String = "\r\n"
    val quoting: Quoting = QUOTE_MINIMAL
    val treatEmptyLineAsNil: Boolean = false
  }

  override def read(file: String): Seq[Map[String, String]] = {
    val reader = CSVReader.open(file, "UTF-16")(MyFormat)
    reader.iteratorWithHeaders.toSeq
  }
}

This is the PurchaseInfo class, which creates an object from each line of the CSV:

case class PurchaseInfo(
                         something1: String,
                         something2: String,
                         something3: String,
                         something4: String) {
}


object PurchaseInfo {

    // Converts a date from "MMM dd, yyyy" format to "dd/MM/yyyy" format.
    private def changeDateFormat(dateInString: String): String = {
      import java.text.SimpleDateFormat
      val formatter = new SimpleDateFormat("MMM dd, yyyy")
      val formatter2 = new SimpleDateFormat("dd/MM/yyyy")
      formatter2.format(formatter.parse(dateInString))
    }

    def fromDataSource (ds: DataSource)(fileName: String): Seq[PurchaseInfo] = {

      ds.read(fileName).map { c =>
        PurchaseInfo(
          something1 = c("Supplier Address Street Number"),
          something2 = c("Supplier Address Route"),
          something3 = c("Supplier Address Locality"),
          something4 = c("Supplier Address Postal Code")
        )
      }
    }
}

Now, in the class where I perform all the actions, there is a method called insertData that gets a sequence of PurchaseInfo objects and calls another method with each PurchaseInfo in that sequence:

def insertData (purchaseInfos: Seq[PurchaseInfo]) = {

    //logging in and then getting directed to the right path (where we start the invoices automation)
    login()

    val res = purchaseInfos.map { case purchaseInfo =>
      println(purchaseInfo.invoiceNumber)
      (purchaseInfo, Try(addInvoiceFlow(purchaseInfo)))
    }
    res
  }

The problem is that insertData calls addInvoiceFlow only once, with the first PurchaseInfo, and then stops. Why? I checked and there are 34 lines, so there is no problem with the CSV file.

This is written in Scala, but Java answers can help too :)

nick shmick asked Sep 07 '15 11:09



2 Answers

You have a chain of Stream.map calls, because the first iterator.toSeq is just toStream, so read returns a lazy Stream:

iteratorWithHeaders.toSeq map PurchaseInfo.apply map addInvoiceFlow

insertData does not eagerly evaluate every invocation of addInvoiceFlow; it evaluates only the one for the head element:

scala> (1 to 10).toStream map { i => println(s"Hi, $i") ; i + 1}
Hi, 1
res0: scala.collection.immutable.Stream[Int] = Stream(2, ?)

So insertData is returning this partly evaluated stream.

You can force the evaluation:

scala> res0.force
Hi, 2
Hi, 3
Hi, 4
Hi, 5
Hi, 6
Hi, 7
Hi, 8
Hi, 9
Hi, 10
res1: scala.collection.immutable.Stream[Int] = Stream(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
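Applied to the question's insertData, the same effect can be sketched roughly as follows. This is a minimal illustration and not the asker's actual code: addInvoiceFlow here is a hypothetical stand-in that only records which ids it was called with, and the purchase infos are plain Ints. Stream is used explicitly to reproduce the lazy Seq from the question (in Scala 2.13 Stream is deprecated in favour of LazyList, so expect deprecation warnings there).

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

object LazyMapDemo {
  // Hypothetical stand-in for addInvoiceFlow: it only records the ids it was called with.
  val processed: ListBuffer[Int] = ListBuffer.empty[Int]
  def addInvoiceFlow(id: Int): Unit = { processed += id; () }

  // Returns (calls made right after map, calls made after materializing the result).
  def run(): (Int, Int) = {
    processed.clear()
    // A Stream statically typed as Seq, like the value returned by the question's read method.
    val purchaseInfos: Seq[Int] = Stream(1, 2, 3)
    // map on a Stream applies the function to the head now and defers the tail.
    val res = purchaseInfos.map(id => (id, Try(addInvoiceFlow(id))))
    val afterMap = processed.size
    // toList walks the whole stream, so every deferred call runs here.
    res.toList
    (afterMap, processed.size)
  }
}
```

Because the static type is Seq rather than Stream, force is not available on res; calling toList (or toVector) on the result, or on the input before mapping, is one way to make all the calls happen.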

There's also this issue in case you have a parse error. See this comment.

som-snytt answered Oct 24 '22 18:10


I suspect that you somehow close the input file before you finish reading it. I can't tell for sure because you didn't provide the code that calls insertData. To test this hypothesis, try materializing the content of your file in the read method by changing

reader.iteratorWithHeaders.toSeq

to

reader.iteratorWithHeaders.toList

If it works after that, it means that you close the CSVReader before you consume your data.


Update: in my original answer I was right about the fix, but wrong in my explanation. As @som-snytt correctly pointed out in his answer, Stream.map does not realize the stream; it merely defines an additional element transformation to apply when the stream is actually realized. Therefore, in some cases it may be useful not to realize the stream at the point of reading (which creates intermediate Maps that are carried around), but to do it after map, when realization gives you PurchaseInfos directly, i.e.

ds.read(fileName).map { c => PurchaseInfo(...)}.force
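A self-contained sketch of this map-then-force pattern is shown below. Everything in it is illustrative: the two-field PurchaseInfo, the header names, and the in-memory read are assumptions standing in for the question's case class and ds.read, so that the snippet runs without the CSV library.

```scala
object ForceAfterMapDemo {
  // Hypothetical, simplified version of the question's PurchaseInfo.
  final case class PurchaseInfo(street: String, postalCode: String)

  // Stand-in for ds.read: a lazy Stream of header -> value maps.
  def read(): Stream[Map[String, String]] =
    Stream(
      Map("Street" -> "Main St", "Postal Code" -> "12345"),
      Map("Street" -> "High St", "Postal Code" -> "67890")
    )

  // Map first, then force: the stream is realized as PurchaseInfo values,
  // and the result is still a Stream, only fully evaluated.
  def load(): Stream[PurchaseInfo] =
    read().map(c => PurchaseInfo(c("Street"), c("Postal Code"))).force
}
```

Note that force is defined on Stream, so the value must be statically typed as a Stream for this to compile; with a plain Seq, toList is an alternative that gives a strict collection.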
Tim answered Oct 24 '22 18:10