
Read CSV into array

Tags: csv, julia

In Julia, using CSV.jl, it is possible to read a DataFrame from a .csv file:

using CSV, DataFrames

df = CSV.read("data.csv", DataFrame; delim=",")

However, how can I instead read a CSV file into a Vector{Float64}?

Seanny123 asked Jan 28 '19 20:01


2 Answers

You can use the DelimitedFiles module from stdlib:

julia> using DelimitedFiles

julia> s = """
       1,2,3
       4,5,6
       7,8,9"""
"1,2,3\n4,5,6\n7,8,9"

julia> b = IOBuffer(s)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=17, maxsize=Inf, ptr=1, mark=-1)

julia> readdlm(b, ',', Float64)
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0

The example reads from an IOBuffer so that it is fully reproducible, but readdlm also accepts a file name. The readdlm docstring describes the available options in more detail.
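As a minimal sketch of the file-based variant, the snippet below writes a small CSV to a temporary directory and reads it back. The file names and contents are hypothetical, chosen only so the example runs on its own. The second part uses the header keyword of readdlm, which returns the data together with the header row.

```julia
using DelimitedFiles

# Hypothetical file: create a small CSV in a temporary directory so the
# sketch does not depend on any pre-existing data.
path = joinpath(mktempdir(), "data.csv")
write(path, "1,2,3\n4,5,6\n7,8,9\n")

# Passing a file name instead of an IO object reads the file directly.
m = readdlm(path, ',', Float64)

# If the first row holds column names, header=true returns a tuple of
# (data, header) instead of a single matrix.
hpath = joinpath(mktempdir(), "with_header.csv")
write(hpath, "a,b\n1,2\n3,4\n")
data, header = readdlm(hpath, ',', Float64; header=true)
```

Here m is a 3×3 Matrix{Float64}, and data is a 2×2 Matrix{Float64} with the names "a" and "b" in header.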

Note that this returns a Matrix{Float64}, not a Vector{Float64}, which I assume is what you actually want. If not, call vec on the result after reading the data to convert the matrix into a vector.
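A short sketch of the vec step, using a hypothetical 2×3 input. Julia arrays are column-major, so vec walks down the columns. If the vector should follow the order in which the numbers appear in the file (row by row), transposing first with permutedims is one way to get that.

```julia
using DelimitedFiles

# Hypothetical 2×3 input, read from an IOBuffer to stay reproducible.
m = readdlm(IOBuffer("1,2,3\n4,5,6"), ',', Float64)

# vec follows the column-major memory layout of the matrix.
v_col = vec(m)                 # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]

# Transposing first keeps the row-by-row order of the file.
v_row = vec(permutedims(m))    # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Both results are of type Vector{Float64}. For a file with a single column or a single row, the two orders coincide.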

EDIT

This is how you can read back a Matrix using CSV.jl:

julia> using CSV, DataFrames, Tables

julia> df = DataFrame(rand(2,3), :auto)
2×3 DataFrame
│ Row │ x1        │ x2       │ x3       │
│     │ Float64   │ Float64  │ Float64  │
├─────┼───────────┼──────────┼──────────┤
│ 1   │ 0.0444818 │ 0.570981 │ 0.608709 │
│ 2   │ 0.47577   │ 0.675344 │ 0.500577 │

julia> CSV.write("test.csv", df)
"test.csv"

julia> CSV.File("test.csv") |> Tables.matrix
2×3 Array{Float64,2}:
 0.0444818  0.570981  0.608709
 0.47577    0.675344  0.500577

Bogumił Kamiński answered Nov 23 '22 16:11


You can convert your DataFrame to a Matrix with a given element type. This works as long as there is no missing data. If there is missing data, omit the element type (Matrix(df)) so that it is inferred from the columns.

arr = Matrix{Float64}(df)

Call vec on the result to get a vector, if that is really what you want.
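A minimal sketch of the conversion, assuming a recent DataFrames.jl release in which the Matrix constructor performs it. The two small data frames are hypothetical stand-ins for data read from a CSV file.

```julia
using DataFrames

# Hypothetical data standing in for the DataFrame read from the CSV file.
df = DataFrame(x1 = [1.0, 2.0], x2 = [3.0, 4.0])

arr = Matrix{Float64}(df)   # 2×2 matrix with element type Float64
v = vec(arr)                # column-major order: [1.0, 2.0, 3.0, 4.0]

# With missing values, leaving out the element type lets it be inferred.
dfm = DataFrame(x1 = [1.0, missing], x2 = [3.0, 4.0])
arrm = Matrix(dfm)          # element type Union{Missing, Float64}
```

Requesting Matrix{Float64} for dfm would fail, because missing cannot be converted to Float64. That is why the element type is left out in the second case.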

Depending on the file, I would use readdlm instead, as suggested in the previous answer.

hckr answered Nov 23 '22 18:11