#Let the CSV contain the two columns "Age" and "Gender", where:
Age = [30, 24, 55, 61, 70, 21]
Gender = [Male, Female, Male, Male, Male, Female]
#I want to show all the values of Age (and how many there are) that correspond to Gender="Male", and the same for "Female"
julia> using CSV, DataFrames
#So this is what I tried
julia> df = CSV.read(raw"Clocation)", DataFrame)
julia> df.Age
6-element Vector{Int64}:
30
24
55
61
70
21
#Adjusted for the example
julia> df.Age, Gender
ERROR: UndefVarError: Gender not defined
Stacktrace:
[1] top-level scope
@ REPL[26]:1
#What I want is 'df.Age, Gender=Male', but this doesn't work either and I'm really stuck :(
#Source: https://testdataframesjl.readthedocs.io/en/readthedocs/subsets/
#Any advice? Thank you in advance!
#Edit: So then I tried
julia> combine(groupby(df, :Age), :Gender=>"Male")
200×2 DataFrame
Row │ Age Male
│ Int64 String7
─────┼────────────────
1 │ 18 Male
2 │ 18 Male
3 │ 18 Male
4 │ 18 Female
5 │ 19 Male
6 │ 19 Male
7 │ 19 Male
8 │ 19 Female
9 │ 19 Male
10 │ 19 Female
11 │ 19 Male
12 │ 19 Male
13 │ 20 Female
14 │ 20 Male
15 │ 20 Female
16 │ 20 Male
17 │ 20 Male
18 │ 21 Male
19 │ 21 Female
20 │ 21 Female
21 │ 21 Female
22 │ 21 Female
23 │ 22 Female
24 │ 22 Male
25 │ 22 Female
26 │ 23 Female
27 │ 23 Female
28 │ 23 Female
⋮ │ ⋮ ⋮
173 │ 57 Male
174 │ 57 Female
175 │ 58 Female
176 │ 58 Male
177 │ 59 Male
178 │ 59 Male
179 │ 59 Male
180 │ 59 Male
181 │ 60 Male
182 │ 60 Female
183 │ 60 Female
184 │ 63 Male
185 │ 63 Female
186 │ 64 Male
187 │ 65 Female
188 │ 65 Male
189 │ 66 Female
190 │ 66 Male
191 │ 67 Male
192 │ 67 Female
193 │ 67 Male
194 │ 67 Male
195 │ 68 Female
196 │ 68 Female
197 │ 68 Male
198 │ 69 Male
199 │ 70 Male
200 │ 70 Male
144 rows omitted
#And now I'm just confused
#Source: https://discourse.julialang.org/t/how-to-count-the-number-of-categories-present-in-a-column-of-a-dataframe/33244/3
julia> df.Age, Gender
Where did you see this syntax? It builds a tuple of df.Age and a variable named Gender, which is not defined, hence the UndefVarError.
Is this what you want?
julia> df = DataFrame(Age = [30, 24, 55, 61, 70, 21], Gender = ["Male", "Female", "Male", "Male", "Male", "Female"]);
julia> df[df.Gender .== "Male", :]
4×2 DataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
julia> df.Age[df.Gender .== "Male"]
4-element Vector{Int64}:
30
55
61
70
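The question also asks for the number of values per gender. A minimal sketch of one way to get it with the same Boolean-mask indexing, assuming the example data frame above (the variable names `male_ages`, `female_ages`, `n_male`, and `n_female` are only illustrative):

```julia
using DataFrames

# Same example data as in the answer above
df = DataFrame(Age = [30, 24, 55, 61, 70, 21],
               Gender = ["Male", "Female", "Male", "Male", "Male", "Female"])

# Ages for each gender, selected with a broadcasted comparison
male_ages = df.Age[df.Gender .== "Male"]
female_ages = df.Age[df.Gender .== "Female"]

# Number of values: either take the length of the selection ...
n_male = length(male_ages)
# ... or count the matching rows directly without building the subset
n_female = count(==("Female"), df.Gender)

println("Male: ", male_ages, " (", n_male, " values)")
println("Female: ", female_ages, " (", n_female, " values)")
```

`count` with a predicate avoids allocating the intermediate vector, which may matter for large columns.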
Apart from the answer by jling, which is the simplest one, here are some alternatives.
Using groupby you can partition the rows of the data frame by the values of the grouping column:
julia> gdf = groupby(df, :Gender)
GroupedDataFrame with 2 groups based on key: Gender
First Group (4 rows): Gender = "Male"
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
⋮
Last Group (2 rows): Gender = "Female"
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 24 Female
2 │ 21 Female
julia> gdf[("Male",)]
4×2 SubDataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
julia> gdf[("Female",)]
2×2 SubDataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 24 Female
2 │ 21 Female
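To get the count and the values for every group in one step, the grouped data frame can be passed to combine. As far as I can tell, the attempt in the question, `:Gender => "Male"`, is read by combine as "rename column Gender to Male", and it groups by Age instead of Gender, which would explain the confusing output. The sketch below assumes the same example data; the output column names `Count` and `Ages` and the variable `per_gender` are illustrative choices:

```julia
using DataFrames

df = DataFrame(Age = [30, 24, 55, 61, 70, 21],
               Gender = ["Male", "Female", "Male", "Male", "Male", "Female"])

# Group by Gender (not by Age), then compute one summary row per group
per_gender = combine(groupby(df, :Gender),
                     # nrow gives the number of rows in each group
                     nrow => :Count,
                     # wrapping the vector in Ref keeps it as a single cell
                     # instead of expanding it into several rows
                     :Age => (a -> Ref(collect(a))) => :Ages)

println(per_gender)
```

Each row of `per_gender` then holds a gender, how many ages belong to it, and the vector of those ages.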
If you want only one subset, you can also use filter or subset (they do a similar thing, but with different syntax):
julia> filter(:Gender => ==("Male"), df)
4×2 DataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
julia> subset(df, :Gender => ByRow(==("Male")))
4×2 DataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
Finally, you can consider using DataFramesMeta.jl, which is probably a bit easier to understand:
julia> using DataFramesMeta
julia> @subset(df, :Gender .== "Male")
4×2 DataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
julia> @rsubset(df, :Gender == "Male") # "r" prefix stands for "row" so you do not need to broadcast the operation
4×2 DataFrame
Row │ Age Gender
│ Int64 String
─────┼───────────────
1 │ 30 Male
2 │ 55 Male
3 │ 61 Male
4 │ 70 Male
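DataFramesMeta.jl can also produce the per-gender counts asked for in the question. A small sketch using its `@by` macro, assuming the same example data and a reasonably recent DataFramesMeta.jl version (the column name `Count` and the variable `counts` are illustrative):

```julia
using DataFrames, DataFramesMeta

df = DataFrame(Age = [30, 24, 55, 61, 70, 21],
               Gender = ["Male", "Female", "Male", "Male", "Male", "Female"])

# Group by Gender and compute the number of Age values in each group
counts = @by(df, :Gender, :Count = length(:Age))

println(counts)
```

This is a shorthand for grouping with groupby and then summarizing with combine.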