I am trying to move from R to Julia.
I have a dataset with two price columns and two conditional columns that flag each price as "cheap" or "expensive".
I want to count how many "cheap" and "expensive" entries there are.
Using the DataStructures package, I got this:
using DataStructures
counter(df.p_orellana)
Accumulator{Union{Missing, String}, Int64} with 3 entries:
"expensive" => 18
missing => 2
"cheap" => 22
This is the equivalent of the table() function in R.
Is there any way to turn these counts into proportions?
In R I would use the prop.table() function, but I am not sure how to do it in Julia.
I would like to have:
Accumulator{Union{Missing, String}, Int64} with 3 entries:
"expensive" => 0.4285
missing => 0.0476
"cheap" => 0.5238
Thanks in advance!
Use the FreqTables.jl package.
Here is an example:
julia> using FreqTables
julia> data = [fill("expensive", 18); fill(missing, 2); fill("cheap", 22)];
julia> freqtable(data)
3-element Named Vector{Int64}
Dim1      │
──────────┼───
cheap     │ 22
expensive │ 18
missing   │  2
julia> proptable(data)
3-element Named Vector{Float64}
Dim1      │
──────────┼─────────
cheap     │  0.52381
expensive │ 0.428571
missing   │ 0.047619
The results are shown in sorted order. If you would like a different order, additionally use the CategoricalArrays.jl package and set the desired ordering of levels:
julia> using CategoricalArrays
julia> cat_data = categorical(data, levels=["expensive", "cheap"]);
julia> freqtable(cat_data)
3-element Named Vector{Int64}
Dim1        │
────────────┼───
"expensive" │ 18
"cheap"     │ 22
missing     │  2
julia> proptable(cat_data)
3-element Named Vector{Float64}
Dim1        │
────────────┼─────────
"expensive" │ 0.428571
"cheap"     │  0.52381
missing     │ 0.047619
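For reference, each proportion above is just a count divided by the grand total. Below is a minimal base-Julia sketch of that arithmetic, assuming a plain `Dict` of counts stands in for the `Accumulator` from the question. The helper name `proportions` is hypothetical and not part of any package.

```julia
# Counts copied from the question; a plain Dict stands in for the Accumulator.
counts = Dict{Union{Missing, String}, Int}(
    "expensive" => 18,
    missing => 2,
    "cheap" => 22,
)

# Hypothetical helper: divide every count by the grand total.
function proportions(counts::AbstractDict)
    total = sum(values(counts))
    return Dict(k => v / total for (k, v) in counts)
end

props = proportions(counts)
```

Unlike `proptable`, a `Dict` has no guaranteed display order, so this is mainly useful when you only need to look up individual proportions, e.g. `props["cheap"]`.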
Here is an approach that needs only base Julia (DataFrames is optional and used just for pretty printing).
The function tableprop can be put into ~/.julia/config/startup.jl so that it loads automatically.
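function tableprop(data::Vector)
    uniq = unique(data)  # distinct values, in order of first appearance
    # === also matches `missing`, which == would not
    res = [sum(data .=== i) for i in uniq]
    try
        # pretty output when DataFrames is loaded
        DataFrame(data=uniq, count=res, prop=res/sum(res))
    catch
        # fallback: plain matrix when DataFrame is not defined
        hcat(uniq, res, res/sum(res))
    end
end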
julia> using DataFrames # just for pretty print
julia> df = ["cheap","expensive",missing,"cheap","cheap",
             "expensive","expensive","cheap",missing,"cheap"]
10-element Vector{Union{Missing, String}}:
 "cheap"
 "expensive"
 missing
 "cheap"
 "cheap"
 "expensive"
 "expensive"
 "cheap"
 missing
 "cheap"
julia> tableprop(df)
3×3 DataFrame
 Row │ data       count  prop
     │ String?    Int64  Float64
─────┼───────────────────────────
   1 │ cheap          5      0.5
   2 │ expensive      3      0.3
   3 │ missing        2      0.2
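The `sum(data .=== i)` comprehension in `tableprop` rescans the whole vector once per unique value. If that matters for larger inputs, the same table can be built in a single pass with a `Dict`. The following is only a sketch of that idea, not the answer's original code. The name `tableprop_onepass` and the choice to return a vector of named tuples are assumptions made for illustration.

```julia
# Hypothetical single-pass variant of tableprop: counts with a Dict,
# keeps values in order of first appearance, and needs no packages.
function tableprop_onepass(data::AbstractVector)
    counts = Dict{eltype(data), Int}()
    order = eltype(data)[]
    for x in data
        if !haskey(counts, x)
            push!(order, x)   # remember first-appearance order
        end
        counts[x] = get(counts, x, 0) + 1
    end
    n = length(data)
    # One named tuple per distinct value: the value, its count, its share.
    return [(value = x, count = counts[x], prop = counts[x] / n) for x in order]
end

data = ["cheap", "expensive", missing, "cheap", "cheap",
        "expensive", "expensive", "cheap", missing, "cheap"]
rows = tableprop_onepass(data)
```

`Dict` lookups compare keys with `isequal`, so `missing` is counted like any other value, just as with `===` in the original function.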