Perhaps it is just me, but I have always found str
unsatisfactory. It is frequently too verbose, yet not very informative in many occasions.
I actually really like the description of the function (?str
):
Compactly display the internal structure of an R object
and this bit in particular
Ideally, only one line for each ‘basic’ structure is displayed.
Only that, in many cases, the default str
implementation simply does not do justice to such description.
Ok, let's say it works partially good for data.frame
s.
library(ggplot2)
str(mpg)
> str(mpg)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 234 obs. of 11 variables:
$ manufacturer: chr "audi" "audi" "audi" "audi" ...
$ model : chr "a4" "a4" "a4" "a4" ...
$ displ : num 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
$ year : int 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
$ cyl : int 4 4 4 4 6 6 6 4 4 4 ...
$ trans : chr "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
$ drv : chr "f" "f" "f" "f" ...
$ cty : int 18 21 20 21 16 18 18 18 16 20 ...
$ hwy : int 29 29 31 30 26 26 27 26 25 28 ...
$ fl : chr "p" "p" "p" "p" ...
$ class : chr "compact" "compact" "compact" "compact" ...
Yet, for a data.frame
it's not as informative as I would like. In addition to class, it would be very useful that it shows number of NA values, and number of unique values, for example.
But for other objects, it quickly becomes unmanageable. For example:
gp <- ggplot(mpg, aes(x = displ, y = hwy)) +
geom_point()
str(gp)
> str(gp)
List of 9
$ data :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 234 obs. of 11 variables:
..$ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
..$ model : chr [1:234] "a4" "a4" "a4" "a4" ...
..$ displ : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
..$ year : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
..$ cyl : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
..$ trans : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
..$ drv : chr [1:234] "f" "f" "f" "f" ...
..$ cty : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
..$ hwy : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
..$ fl : chr [1:234] "p" "p" "p" "p" ...
..$ class : chr [1:234] "compact" "compact" "compact" "compact" ...
$ layers :List of 1
..$ :Classes 'LayerInstance', 'Layer', 'ggproto' <ggproto object: Class LayerInstance, Layer>
aes_params: list
compute_aesthetics: function
compute_geom_1: function
compute_geom_2: function
compute_position: function
compute_statistic: function
data: waiver
draw_geom: function
geom: <ggproto object: Class GeomPoint, Geom>
aesthetics: function
default_aes: uneval
draw_group: function
draw_key: function
draw_layer: function
draw_panel: function
extra_params: na.rm
handle_na: function
non_missing_aes: size shape
parameters: function
required_aes: x y
setup_data: function
use_defaults: function
super: <ggproto object: Class Geom>
geom_params: list
inherit.aes: TRUE
layer_data: function
map_statistic: function
mapping: NULL
position: <ggproto object: Class PositionIdentity, Position>
compute_layer: function
compute_panel: function
required_aes:
setup_data: function
setup_params: function
super: <ggproto object: Class Position>
print: function
show.legend: NA
stat: <ggproto object: Class StatIdentity, Stat>
compute_group: function
compute_layer: function
compute_panel: function
default_aes: uneval
extra_params: na.rm
non_missing_aes:
parameters: function
required_aes:
retransform: TRUE
setup_data: function
setup_params: function
super: <ggproto object: Class Stat>
stat_params: list
subset: NULL
super: <ggproto object: Class Layer>
$ scales :Classes 'ScalesList', 'ggproto' <ggproto object: Class ScalesList>
add: function
clone: function
find: function
get_scales: function
has_scale: function
input: function
n: function
non_position_scales: function
scales: list
super: <ggproto object: Class ScalesList>
$ mapping :List of 2
..$ x: symbol displ
..$ y: symbol hwy
$ theme : list()
$ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto' <ggproto object: Class CoordCartesian, Coord>
aspect: function
distance: function
expand: TRUE
is_linear: function
labels: function
limits: list
range: function
render_axis_h: function
render_axis_v: function
render_bg: function
render_fg: function
train: function
transform: function
super: <ggproto object: Class CoordCartesian, Coord>
$ facet :List of 1
..$ shrink: logi TRUE
..- attr(*, "class")= chr [1:2] "null" "facet"
$ plot_env :<environment: R_GlobalEnv>
$ labels :List of 2
..$ x: chr "displ"
..$ y: chr "hwy"
- attr(*, "class")= chr [1:2] "gg" "ggplot"
Whaaattttt???, what happened to "Compactly display". That's not compact!
And it can be worse, crazy scary, for example, for S4 objects. If you want try this:
library(rworldmap)
newmap <- getMap(resolution = "coarse")
str(newmap)
I do not post the output here because it is too much. It does not even fit in the console buffer!
How can you possibly understand the internal structure of the object with such a NON-compact display? It's just too many details and you easily get lost. Or at least I do.
Well, all right. Before someone tells me, hey checkout ?str
and tweak the arguments, that's what I did. Of course it can get better, but I am still kind of disappointed with str
.
The best solution I've got is to create a function that do this
if(isS4(obj)){
str(obj, max.level = 2, give.attr = FALSE, give.head = FALSE)
} else {
str(obj, max.level = 1, give.attr = FALSE, give.head = FALSE)
}
This displays compactly the top level structures of the object. The output for the sp object above (S4 object) becomes much more insightful
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 243 obs. of 49 variables:
..@ polygons :List of 243
.. .. [list output truncated]
..@ plotOrder :7 135 28 167 31 23 9 66 84 5 ...
..@ bbox :-180 -90 180 83.6
..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
So now you can see there are 5 top level structures, and you can investigate them further individually.
Similar for the ggplot object above, now you can see
List of 9
$ data :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 234 obs. of 11 variables:
$ layers :List of 1
$ scales :Classes 'ScalesList', 'ggproto'
$ mapping :List of 2
$ theme : list()
$ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto'
$ facet :List of 1
$ plot_env :
$ labels :List of 2
Although this is much better, I still feel it could be much more insightful. So, perhaps someone has felt the same way and created a nice function that is more informative and still compactly displays the information. Anyone?
In such situation I use glimpse from the tibble package which is less verbose and briefly descriptive of the data structure.
library(tibble)
glimpse(gp)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With