I'm using dplyr and I have a grouped data.frame. I tried to drop a column from this
grouped_df with the select function, but got the following error message:
> tbl %>% select(-names)
Error: corrupt 'grouped_df', contains 42 rows, and 965 rows in groups
My data is below.
> print(tbl_df(tbl), n = 1000)
Source: local data frame [42 x 15]
household names x2003 x2004 x2005 x2006 x2007 x2008 x2009 x2012 last.avail last.avail.year absChange.last annChange.last translation
(chr) (fctr) (int) (int) (int) (int) (int) (int) (int) (int) (int) (dbl) (int) (dbl) (fctr)
1 all households bostad 59280 61850 62760 63210 66950 73340 72350 77750 77750 2012 18470 0.030594980 Accomodation
2 all households fritid och kultur 45140 46140 49260 48640 49720 55120 53970 61170 61170 2012 16030 0.034341864 Leisure and culture
3 all households transport 41930 40430 45870 48850 47280 50250 42650 49940 49940 2012 8010 0.019614408 Transportation
4 all households köpta livsmedel 28420 30000 29130 30420 30750 34130 34780 34570 34570 2012 6150 0.022004509 Bought Groceries
5 all households hyra/avgift för hyres-/borätt (inkl garage) 27310 27720 28860 30000 28990 29660 30740 NA 30740 2009 3430 0.019914330 Rent for accomodation
6 all households hushållstjänster 11360 12030 13200 12390 8520 10250 13530 22900 22900 2012 11540 0.081007165 Household services
7 cohabit with child bostad 78240 83040 81390 79180 90490 95630 100060 100980 100980 2012 22740 0.028754709 Accomodation
8 cohabit with child fritid och kultur 67110 67640 67290 64600 74290 71890 77200 81180 81180 2012 14070 0.021373640 Leisure and culture
9 cohabit with child transport 58350 62440 70010 69560 68730 75290 65510 71340 71340 2012 12990 0.022584342 Transportation
10 cohabit with child köpta livsmedel 45190 45660 45720 44980 48250 52880 52770 52710 52710 2012 7520 0.017250361 Bought Groceries
11 cohabit with child hushållstjänster 19840 21380 25690 21430 17190 19060 24730 37440 37440 2012 17600 0.073108900 Household services
12 cohabit with child räntor (brutto) 27090 25230 24390 24500 28510 36030 33080 NA 33080 2009 5990 0.033854485 Rents (net)
13 cohabit without child bostad 60340 63230 63560 61760 67100 74160 70440 78510 78510 2012 18170 0.029679783 Accomodation
14 cohabit without child fritid och kultur 51120 48780 57700 57320 57620 67220 62460 68400 68400 2012 17280 0.032884345 Leisure and culture
15 cohabit without child transport 49740 46310 55580 57730 56770 54910 52720 59360 59360 2012 9620 0.019839931 Transportation
16 cohabit without child köpta livsmedel 31130 33700 31900 33000 33990 37330 37980 37090 37090 2012 5960 0.019654591 Bought Groceries
17 cohabit without child drift av bil 24370 21790 25170 27530 25140 28180 26650 NA 26650 2009 2280 0.015017696 Car expenses
18 cohabit without child hushållstjänster 11650 12400 12260 12310 8580 11920 13950 26370 26370 2012 14720 0.095016005 Household services
19 other cohabit with child fritid och kultur 67680 75550 78020 75800 88870 80070 84490 116020 116020 2012 48340 0.061715253 Leisure and culture
20 other cohabit with child bostad 73850 68740 84800 86510 89290 106540 89650 100580 100580 2012 26730 0.034920030 Accomodation
21 other cohabit with child transport 66950 79620 75730 77800 81010 93790 77960 98660 98660 2012 31710 0.044022982 Transportation
22 other cohabit with child köpta livsmedel 54070 53790 50680 51440 53720 64170 62050 63690 63690 2012 9620 0.018360752 Bought Groceries
23 other cohabit with child drift av bil 32690 34180 37530 36200 38280 38990 36390 NA 36390 2009 3700 0.018031437 Car expenses
24 other cohabit with child hushållstjänster 15690 21000 20810 20370 9990 11880 19710 32460 32460 2012 16770 0.084128145 Household services
25 other households bostad 62860 68680 69950 72840 70700 91510 84480 86020 86020 2012 23160 0.035466655 Accomodation
26 other households fritid och kultur 49940 48530 55280 57970 54470 61130 65280 67920 67920 2012 17980 0.034758001 Leisure and culture
27 other households transport 50590 41980 57370 64960 52780 61460 59770 59630 59630 2012 9040 0.018435074 Transportation
28 other households köpta livsmedel 35370 35210 35360 41560 35040 43770 45940 43270 43270 2012 7900 0.022652258 Bought Groceries
29 other households drift av bil 21440 21580 25640 30070 28260 30070 32010 NA 32010 2009 10570 0.069079862 Car expenses
30 other households hyra/avgift för hyres-/borätt (inkl garage) 29550 32320 25170 24600 29480 35290 25920 NA 25920 2009 -3630 -0.021607942 Rent for accomodation
31 single parent bostad 67890 67250 71200 75210 71000 73490 74710 81820 81820 2012 13930 0.020953501 Accomodation
32 single parent fritid och kultur 34900 35860 43600 46770 43540 46160 45840 51000 51000 2012 16100 0.043049627 Leisure and culture
33 single parent hyra/avgift för hyres-/borätt (inkl garage) 43360 44020 45160 49430 45370 44090 48740 NA 48740 2009 5380 0.019685026 Rent for accomodation
34 single parent transport 27230 30810 28810 28410 30500 30390 29360 34890 34890 2012 7660 0.027925124 Transportation
35 single parent köpta livsmedel 26420 27910 28160 29100 28310 33020 35910 33740 33740 2012 7320 0.027546212 Bought Groceries
36 single parent hushållstjänster 9490 11690 13770 8650 7250 10390 11490 17140 17140 2012 7650 0.067891620 Household services
37 single parent without child bostad 45660 47110 48750 50850 51610 55720 56020 61090 61090 2012 15430 0.032876143 Accomodation
38 single parent without child fritid och kultur 28270 31890 31140 30210 28480 35650 32840 41770 41770 2012 13500 0.044329701 Leisure and culture
39 single parent without child hyra/avgift för hyres-/borätt (inkl garage) 31900 32160 33010 36300 34300 35330 37800 NA 37800 2009 5900 0.028687635 Rent for accomodation
40 single parent without child transport 26730 22980 24530 29310 28440 31680 20150 28800 28800 2012 2070 0.008322088 Transportation
41 single parent without child köpta livsmedel 15330 16930 16150 17630 17280 18390 19370 19580 19580 2012 4250 0.027561531 Bought Groceries
42 single parent without child hushållstjänster 6570 6590 6840 7080 3780 4300 7000 12310 12310 2012 5740 0.072257733 Household services
What is the issue and how can this be resolved?
To drop columns whose names end with a certain label, use select() together with ends_with(), passing the label to ends_with() and negating the selection. For example, select(df, -ends_with("cyl")) drops every column whose name ends with "cyl".
Deleting a column with dplyr is easy using the select() function and the - sign. For example, to remove the columns "X" and "Y": select(Your_Dataframe, -c(X, Y)).
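As a minimal sketch of both approaches, assuming a small made-up data frame (the data frame `df` and its columns `X`, `Y`, `Z` and `n_cyl` are hypothetical and not from the question):

```r
library(dplyr)

# Hypothetical example data; column names are assumptions for illustration
df <- data.frame(X = 1:3, Y = 4:6, Z = 7:9, n_cyl = c(4, 6, 8))

# Drop columns by name with the - sign
no_xy <- select(df, -c(X, Y))

# Drop every column whose name ends with "cyl"
no_cyl <- select(df, -ends_with("cyl"))

names(no_xy)
names(no_cyl)
```

Neither call modifies `df`; each returns a new data frame without the dropped columns.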
%>% is the forward pipe operator in R. It chains commands by forwarding a value, or the result of an expression, into the next function call or expression. It is defined by the magrittr package (CRAN) and is heavily used by dplyr (CRAN).
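A short sketch of what the pipe does, using only base functions and a made-up vector: `x %>% f()` is equivalent to `f(x)`, so a chain reads left to right instead of inside out.

```r
library(magrittr)

values <- c(1, 4, 9)  # illustrative input, not from the question

# Nested form: read from the inside out
nested <- sum(sqrt(values))

# Piped form: the same computation, read left to right
piped <- values %>% sqrt() %>% sum()

identical(nested, piped)
```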
To drop multiple columns using a function in dplyr, we again negate the selection, here with the ! operator. In the example, a simple custom function picks out all columns that exceed a threshold of 10; the code drops these and returns the remaining columns.
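The example this paragraph refers to is not shown, so the following is only a sketch under assumptions: the "more than 10" criterion is taken to mean "numeric column whose mean is above 10", the helper `mean_above_10` and the data frame are hypothetical, and a dplyr version that supports `where()` and `!` in `select()` (1.0.0 or later) is assumed.

```r
library(dplyr)

# Hypothetical data for illustration
df <- data.frame(
  a = c(1, 2, 3),
  b = c(20, 30, 40),
  c = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Assumed criterion: numeric column with a mean greater than 10
mean_above_10 <- function(x) is.numeric(x) && mean(x) > 10

# where() selects the columns for which the function returns TRUE;
# ! negates the selection, so those columns are dropped
remaining <- select(df, !where(mean_above_10))

names(remaining)
```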
If the variable to drop is used as a grouping variable, we need to ungroup before using that variable in select. In the current dplyr version (dplyr_0.4.3) this is the case, but it may or may not change in future dplyr versions:
tbl %>%
ungroup() %>%
select(-names)
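Since the question does not show how 'tbl' was built, here is a self-contained sketch of the same pattern on a small made-up data frame (the data and the name `grouped` are assumptions for illustration); it only demonstrates ungrouping before dropping the grouping column, not the corrupt state itself:

```r
library(dplyr)

# Hypothetical stand-in for the OP's table
df <- data.frame(
  household = c("all households", "all households", "single parent"),
  names = c("bostad", "transport", "bostad"),
  x2003 = c(59280, 41930, 67890),
  stringsAsFactors = FALSE
)

grouped <- df %>% group_by(names)

# Remove the grouping first, then drop the former grouping column
result <- grouped %>%
  ungroup() %>%
  select(-names)

names(result)
```

The result is an ungrouped table, so any later grouped operation needs a fresh group_by().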
As an example of corrupted grouped data, suppose we try to remove column 'y' from 'dat3':
dat3 %>%
select(-y)
#Error: corrupt 'grouped_df', contains 1100 rows, and 1000 rows in groups
By checking str(dat3)
str(dat3)
#Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 1100 obs. of 2 variables:
# $ group: Factor w/ 3 levels "A","B","C": 2 3 2 2 2 2 1 2 2 1 ...
# $ y : num 1.396 -0.892 1.065 0.801 -0.368 ...
# - attr(*, "vars")=List of 1
# ..$ : symbol group
# - attr(*, "drop")= logi TRUE
# - attr(*, "indices")=List of 3
# ..$ : int 6 9 12 13 14 16 18 21 25 27 ...
# ..$ : int 0 2 3 4 5 7 8 10 11 15 ...
# ..$ : int 1 17 24 28 35 37 39 43 47 49 ...
# - attr(*, "group_sizes")= int 323 365 312
# - attr(*, "biggest_group_size")= int 365
# - attr(*, "labels")='data.frame': 3 obs. of 1 variable:
# ..$ group: Factor w/ 3 levels "A","B","C": 1 2 3
# ..- attr(*, "vars")=List of 1
# .. ..$ : symbol group
# ..- attr(*, "drop")= logi TRUE
we find that grouping attributes are attached by rbind-ing (note that the group sizes sum to 1000, while the data has 1100 rows). If we use bind_rows instead:
dat4 <- bind_rows(dat1, dat2)
str(dat4)
#Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1100 obs. of 2 variables:
# $ group: chr "B" "C" "B" "B" ...
# $ y : num 1.396 -0.892 1.065 0.801 -0.368 ...
Now we can remove the 'y' column from 'dat4':
dat4 %>%
select(-y)
As the OP didn't show how 'tbl' was created, we can only assume that it was created by some method that corrupted the dataset by adding attributes.