I am using Scala 2.7.7 and want to parse a CSV file and store the data in an SQLite database.
I ended up using the OpenCSV Java library to parse the CSV file, together with the sqlitejdbc library.
Using these Java libraries makes my Scala code look almost identical to the equivalent Java code (minus the semicolons, and with val/var).
Because I am dealing with Java objects, I can't use Scala's List, Map, etc. unless I do a scala2java conversion or upgrade to Scala 2.8.
Is there a way to simplify my code further using Scala features that I don't know about?
import java.io.FileReader
import au.com.bytecode.opencsv.CSVReader // OpenCSV 2.x package; newer releases use com.opencsv.CSVReader

// conn (java.sql.Connection) and stat (java.sql.Statement) are created elsewhere
val filename = "file.csv"
val reader = new CSVReader(new FileReader(filename))
var lastSymbol = ""
// In Scala an assignment evaluates to Unit, so the Java idiom
// while ((aLine = reader.readNext()) != null) never sees null; read explicitly instead
var aLine = reader.readNext()
while (aLine != null) {
  val symbol = aLine(0)
  if (symbol != lastSymbol) {
    try {
      // create the per-symbol table the first time the symbol is seen
      val rs = stat.executeQuery("select name from sqlite_master where name='" + symbol + "';")
      if (!rs.next()) {
        stat.executeUpdate("drop table if exists '" + symbol + "';")
        stat.executeUpdate("create table '" + symbol + "' (symbol,date,open,high,low,close,vol);")
      }
    } catch {
      case sqle: java.sql.SQLException =>
        println(sqle)
    }
    lastSymbol = symbol
  }
  val prep = conn.prepareStatement("insert into '" + symbol + "' values (?,?,?,?,?,?,?);")
  prep.setString(1, aLine(0)) // symbol
  prep.setString(2, aLine(1)) // date
  prep.setString(3, aLine(2)) // open
  prep.setString(4, aLine(3)) // high
  prep.setString(5, aLine(4)) // low
  prep.setString(6, aLine(5)) // close
  prep.setString(7, aLine(6)) // vol
  prep.addBatch()
  prep.executeBatch()
  aLine = reader.readNext()
}
conn.close()
If your CSV file is simple (no quoted fields or embedded commas), an alternative is to skip the CSV library altogether and parse it directly in Scala, for example:
case class Stock(line: String) {
  val data = line.split(",")
  val date = data(0)
  val open = data(1).toDouble
  val high = data(2).toDouble
  val low = data(3).toDouble
  val close = data(4).toDouble
  val volume = data(5).toDouble
  val adjClose = data(6).toDouble
  def price: Double = low
}
scala> import scala.io._
scala> Source.fromFile("stock.csv") getLines() map (l => Stock(l))
res0: Iterator[Stock] = non-empty iterator
scala> res0.toSeq
res1: Seq[Stock] = List(Stock(2010-03-15,37.90,38.04,37.42,37.64,941500,37.64), Stock(2010-03-12,38.00,38.08,37.66,37.89,834800,37.89) //etc...
This has the advantage that you can use the full Scala collections API on the result.
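As a rough sketch of what that buys you, the snippet below parses two rows held in memory (the sample values from the REPL output above) and queries them with ordinary collection methods. It assumes Scala 2.8 or later and repeats a trimmed copy of `Stock` so that it runs on its own; the `rows` list and the result names are made up for illustration.

```scala
// Trimmed copy of the Stock class above so this snippet is self-contained.
case class Stock(line: String) {
  private val data = line.split(",")
  val date = data(0)
  val high = data(2).toDouble
  val low = data(3).toDouble
  val close = data(4).toDouble
  val volume = data(5).toDouble
}

// Hypothetical in-memory input, standing in for Source.fromFile("stock.csv").getLines()
val rows = List(
  "2010-03-15,37.90,38.04,37.42,37.64,941500,37.64",
  "2010-03-12,38.00,38.08,37.66,37.89,834800,37.89"
)

val stocks = rows.map(Stock(_))

// Ordinary collection operations now work on the parsed data.
val totalVolume = stocks.map(_.volume).sum                             // sum of the volume column
val highestClose = stocks.map(_.close).max                             // largest closing price
val wideRange = stocks.filter(s => s.high - s.low > 0.5).map(_.date)   // dates with a wide high/low spread
```

None of this needs a conversion layer, because `stocks` is a plain Scala `List` from the start.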
If you prefer to use parser combinators, there's also an example of a CSV parser combinator on GitHub.
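If the file is not simple enough for `split` and you stay with a Java reader such as OpenCSV's `CSVReader`, one way to get back to Scala collections is to wrap its null-terminated read call in an `Iterator`. The sketch below, assuming Scala 2.8 or later, shows the idea with a `java.io.BufferedReader` over an in-memory string so that it runs without extra jars; with OpenCSV, `reader.readNext()` would take the place of `in.readLine()` and the `split` step would go away. The sample data and value names are illustrative only.

```scala
import java.io.{BufferedReader, StringReader}

// Illustrative in-memory data; with OpenCSV this would be new CSVReader(new FileReader(filename)).
val in = new BufferedReader(new StringReader(
  "AAA,2010-03-15,37.90\nAAA,2010-03-12,38.00\nBBB,2010-03-15,12.10\n"))

// Call readLine() repeatedly and stop at the first null, as the Java while-loop would.
val lines: List[Array[String]] =
  Iterator.continually(in.readLine()).takeWhile(_ != null).map(_.split(",")).toList
in.close()

// Group rows by their first column instead of tracking lastSymbol by hand.
val bySymbol: Map[String, List[Array[String]]] = lines.groupBy(_(0))
```

Each entry of `bySymbol` then holds all rows for one symbol, so table creation can happen once per key and the inserts once per row.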