I'm trying to draw histograms of a DV (minutes) for each level of the IV (group) using the histBy function in the {psych} package.
My dataset contains 3 variables, all of which are numeric. There are no missing data in any case on any variable.
Here is my dataset:
structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,45,
46, 47, 48, 49, 50),
group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
minutes = c(3, 3.5, 5, 2.5, 4, 4, 6, 2.5, 7, 3.5, 3, 2.5, 6, 4,
4, 3, 4.5, 6, 3.5, 5, 7, 5.5, 6, 4.5, 8, 6, 5.5, 8,
4, 5.5, 9, 12, 6.5, 8, 11.5, 5, 7, 9.5, 14, 6.5, 16,
18.5, 16, 10.5, 18.5, 15, 19.5, 11.5, 21, 18)),
row.names = c(NA, -50L),
spec = structure(
list(cols = list(ID = structure(
list(), class = c("collector_double","collector")),
group = structure(
list(), class = c("collector_double", "collector")),
minutes = structure(
list(), class = c("collector_double", "collector"))),
default = structure(
list(), class = c("collector_guess", "collector")),
delim = ","),
class = "col_spec"),
class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame")) -> anova1
When I run:
hist(anova1$minutes)
It works, and I get one histogram of the DV minutes with all groups pooled together, so the variable is treated as numeric there. I also plotted a histogram of the grouping variable, just to confirm that it is numeric, and that worked too.
But when I run:
histBy(minutes ~ group, data = anova1)
I get the error message "'x' must be numeric".
> histBy(minutes ~ group, data = anova1)
Error in hist.default(x[, var], breaks = breaks, plot = FALSE) : 'x' must be numeric
I have tried running as.numeric on both minutes and group just to make sure they're really numeric (even though the dataframe already showed they were), but the error persisted. I even tried making "group" a factor, not numeric, but I just got the same error.
I have tried rearranging the contents of the histBy() command as per the help file examples, entering (anova1, "minutes", "group") (and again without the quotation marks), getting the same error.
I've also tried being explicit about the variable names, despite already naming the dataframe, e.g. (anova1, anova1$minutes, anova1$group) or just (anova1$minutes, anova1$group), and got an even more inscrutable (to me) error.
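psych::histBy() does not work with tibbles. As the error message shows, it passes x[, var] to hist(), and single-column indexing of a tibble returns a one-column tibble rather than a numeric vector. Converting to a plain data frame first should work: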
psych::histBy(minutes ~ group, data = as.data.frame(anova1))
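
To see the indexing difference on its own, here is a minimal sketch. The objects `df` and `tbl` are made-up stand-ins for anova1 (not the real data), and the snippet assumes the tibble package is installed:

```r
# Hypothetical stand-in for anova1 with the same column types
df  <- data.frame(group = c(1, 1, 2, 2), minutes = c(3, 3.5, 6, 4))
tbl <- tibble::as_tibble(df)

# On a plain data.frame, single-column `[` drops to a numeric vector
is.numeric(df[, "minutes"])    # TRUE

# On a tibble, the same call keeps a one-column tibble,
# which hist() rejects with "'x' must be numeric"
is.numeric(tbl[, "minutes"])   # FALSE
class(tbl[, "minutes"])
```

Extracting the column with `tbl$minutes` or `tbl[["minutes"]]` gives a numeric vector in both cases, which is why hist(anova1$minutes) worked.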

Created on 2025-07-29 with reprex v2.1.1