Extract folder name and filename from a file path using Scala

I have streams of files being read from a directory, and the file tree looks like this:

/repository/resources/2016-03-04/file.csv
/repository/resources/2016-03-04/file2.csv
/repository/resources/2016-03-05/file3.csv
/repository/resources/2016-03-05/file4.csv

I need help using Scala to extract the name of the date folder and the name of each .csv file, in the form:

2016-03-04 file.csv
2016-03-04 file2.csv
2016-03-05 file3.csv
2016-03-05 file4.csv
Taiwotman asked Mar 13 '23 16:03



2 Answers

As a supplement to what @PavelOliynyk suggested, here's what you can do:

val list = List(
  "/repository/resources/2016-03-04/file.csv",
  "/repository/resources/2016-03-04/file2.csv",
  "/repository/resources/2016-03-05/file3.csv",
  "/repository/resources/2016-03-05/file4.csv")

val datesAndFiles = list.map(_.split("/").takeRight(2).toList)

This presumes that the last two items in every string are the date and the filename. I converted each result to a list so that you can easily pattern-match if you need to process it further. For example, this is how you would get a tuple for each row:

val datesAndFileTuples = datesAndFiles.map({
  case date :: file :: Nil => (date, file)
})

That gives you a tuple for each date-file pair. If you'd rather separate them into dates and files (each in their own list), you can do this:

val (dates :: files :: Nil) = datesAndFiles.transpose

which gives you back two lists, one with dates and one with file names.
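
To get the exact "date file" lines from the question, the same split-and-take-last-two idea can be wrapped in a small helper. This is only a sketch: the names `DateAndFile`, `parse` and `format` are made up for illustration, and it assumes the date folder is always the direct parent of the file. Unlike the pattern match above, it returns `None` for paths with fewer than two segments instead of throwing a `MatchError`.

```scala
object DateAndFile {
  // Hypothetical helper: returns (dateFolder, fileName) when the path has
  // at least two non-empty segments, otherwise None.
  def parse(path: String): Option[(String, String)] =
    path.split("/").filter(_.nonEmpty).takeRight(2) match {
      case Array(date, file) => Some((date, file))
      case _                 => None
    }

  // Formats every parsable path as "<date> <file>" and skips the rest.
  def format(paths: List[String]): List[String] =
    paths.flatMap(parse).map { case (date, file) => s"$date $file" }
}
```

With the `list` defined above, `DateAndFile.format(list).foreach(println)` prints one "date file" line per path.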

slouc answered Mar 23 '23 01:03



You can try this solution, but I would advise experimenting with regular expressions to extract the folder name. That would also add validation to your code.

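val fileName : String = "/repository/resources/2016-03-05/file4.csv"
// The leading "/" produces an empty first element:
// Array("", "repository", "resources", "2016-03-05", "file4.csv")
val result = fileName.split("/")
println( result(3) ) // date folder: 2016-03-05
println( result(4) ) // file name: file4.csv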

A regex solution would look like this:

val fileName : String = "/repository/resources/2016-03-05/file5.csv"

val Pattern = "/([a-z]+)/([a-z]+)/([-0-9]+)/([a-z0-9.]+)".r
val Pattern(partA, partB, partC, partD) = fileName
println( partA )
println( partB )
println( partC )
println( partD )
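
The extractor above throws a `MatchError` when the path does not fit the pattern. To actually use the regex for validation, one option is to match inside a `match` expression and return an `Option`. The sketch below is an assumption-laden variant, not the pattern from this answer: `PathPattern` and `extract` are hypothetical names, and the stricter regex assumes a `yyyy-MM-dd` folder and a `.csv` extension.

```scala
object PathPattern {
  // Assumed layout: /<dir>/<dir>/<yyyy-MM-dd>/<name>.csv
  // Only the date folder and the file name are captured.
  private val Pattern =
    """/[a-z]+/[a-z]+/(\d{4}-\d{2}-\d{2})/([A-Za-z0-9_.-]+\.csv)""".r

  // Returns Some((date, file)) when the whole path matches, None otherwise.
  def extract(path: String): Option[(String, String)] = path match {
    case Pattern(date, file) => Some((date, file))
    case _                   => None
  }
}
```

A regex used as a pattern has to match the entire string, so paths with an unexpected folder name or extension fall through to `None` rather than raising an exception.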
Pavel answered Mar 23 '23 02:03
