Julia: How to convert a vector of strings to a numeric type (Float64)

In Julia 1.1 I want to convert a vector of strings to a numeric type (Float64). Here is the vector:

string = ["2.2", "3,3", "4.4"];

I tried the following lines without success:

x = convert(Float64, string)
x = convert(DataVector{Float64}, string)
x = map(x->parse(Float64,x),string)
x = parse(Float64,string)
x = Float64(string)
ecjb asked Mar 05 '23 15:03


1 Answer

The simplest is:

julia> s = ["2.2", "3.3", "4.4"];

julia> parse.(Float64, s)
3-element Array{Float64,1}:
 2.2
 3.3
 4.4

but map will also work:

julia> map(x->parse(Float64,x), s)
3-element Array{Float64,1}:
 2.2
 3.3
 4.4

The problem in your original example is twofold:

  • the second string "3,3" is not a valid Float64 number (it uses a comma instead of a period as the decimal delimiter);
  • although it is valid, I would recommend not using string as a variable name, because it shadows the string function from Base.
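To locate entries like "3,3" before converting, one option is tryparse, which returns nothing instead of throwing when a string is not a valid number. A minimal sketch (the names s and bad are only illustrative):

```julia
# The vector from the question, including the malformed entry "3,3"
s = ["2.2", "3,3", "4.4"]

# Keep only the strings that tryparse cannot read as a Float64
bad = filter(x -> tryparse(Float64, x) === nothing, s)

println(bad)
```

Here bad contains only "3,3", which is the entry that makes parse fail.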

Additionally, if your original strings use a comma as the decimal delimiter, you can run replace on them first. Here I broadcast it over a vector:

julia> s = ["2.2", "3,3", "4,4"];

julia> replace.(s, [','=>'.'])
3-element Array{String,1}:
 "2.2"
 "3.3"
 "4.4"
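The replacement and the parsing can be chained into one expression. A short sketch, assuming a comma only ever appears as the decimal delimiter (never as a thousands separator):

```julia
s = ["2.2", "3,3", "4,4"]

# First normalize the delimiter, then parse; both calls are broadcast over s
x = parse.(Float64, replace.(s, [',' => '.']))

println(x)
```

The result x is a Vector{Float64} holding 2.2, 3.3 and 4.4.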

EDIT: as indicated by DNF, it is actually a bit faster to write either:

replace.(s, (','=>'.',))

or

replace.(s, Ref(','=>'.'))

The general rule is that, because you have used a . to broadcast, all your arguments should be broadcastable. Because a Pair (in our case ','=>'.') is not treated as broadcastable, we have to wrap it in a one-element container that is.

The first approach was to wrap it in a one-element array using [ and ], which is a bit inefficient because it allocates a new temporary array.

You can instead use a one-element tuple by wrapping the pair in ( and ,). Note the comma before the closing parenthesis: without it the expression is just the pair in parentheses, not a tuple, and the pattern will not work correctly. This approach does not allocate memory.

Finally, you can use the built-in Ref function, which in this case creates an object of type Base.RefValue{Pair{Char,Char}} that Julia sees as a 0-dimensional, one-element container (this is a somewhat more advanced topic that you can start exploring in the Julia manual). This approach also does not allocate memory.

What you can broadcast over is described in the Julia manual.
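As a quick check, the three ways of wrapping the pair give identical results; they differ only in what the wrapper costs. A small sketch (the variable names p, a, b and c are just for illustration):

```julia
s = ["2.2", "3,3", "4,4"]
p = ',' => '.'              # the Pair that has to be wrapped

a = replace.(s, [p])        # one-element array (allocates a temporary array)
b = replace.(s, (p,))       # one-element tuple
c = replace.(s, Ref(p))     # Ref wrapper

# Ref(p)[] unwraps the stored value again
println(a == b == c, " ", Ref(p)[] == p)
```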

Additional cases:

Array of strings and missings

For this you need the latest Missings.jl (run the up command in the package manager) and have to load it first:

julia> using Missings

julia> s = ["2.2", "3.3", "4.4", missing]
4-element Array{Union{Missing, String},1}:
 "2.2"
 "3.3"
 "4.4"
 missing

julia> passmissing(parse).(Float64, s)
4-element Array{Union{Missing, Float64},1}:
 2.2
 3.3
 4.4
  missing
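If you would rather not depend on a package, the same result can be sketched in Base Julia with a comprehension that passes missing through unchanged. This is an alternative to passmissing, not what Missings.jl does internally:

```julia
s = ["2.2", "3.3", "4.4", missing]

# Leave missing entries alone and parse everything else
x = [ismissing(v) ? missing : parse(Float64, v) for v in s]

println(x)
```

Because missing == missing evaluates to missing, compare such vectors with isequal rather than ==.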

Array of strings and NaN

This should not happen in practice, as you are mixing strings and floats in one vector, but you can handle it like this (I have added 5.5 to the vector to show that the solution is not NaN-specific but can ingest any string or any Float64):

julia> s = ["2.2", "3.3", "4.4", NaN, 5.5]
5-element Array{Any,1}:
    "2.2"
    "3.3"
    "4.4"
 NaN
   5.5

julia> [v isa Float64 ? v : parse(Float64, v) for v in s]
5-element Array{Float64,1}:
   2.2
   3.3
   4.4
 NaN
   5.5
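The same idea can also be written with multiple dispatch, which extends the conversion to other numeric types as well. A sketch in which tofloat is a hypothetical helper name, not a Base function:

```julia
# Hypothetical helper: one method per kind of input
tofloat(v::AbstractString) = parse(Float64, v)   # strings are parsed
tofloat(v::Real) = Float64(v)                    # numbers are converted

s = ["2.2", "3.3", "4.4", NaN, 5.5]
x = tofloat.(s)

println(x)
```

Unlike the isa check above, this version also accepts integers such as 5 and turns them into 5.0.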
Bogumił Kamiński answered Apr 07 '23 15:04