
How to remove empty tibbles from a list of tibbles returned by get_friends function?

I am using the get_friends function from the rtweet package to get the user_ids of the friends of a set of focal users, who are sampled from participants in a Twitter discourse. The function returns a list of tibbles.

Each tibble has two columns: one with the focal user's user_id and one with the user_ids of that user's friends. Since every user has a different number of friends, the number of rows differs from tibble to tibble.

My problem: The accounts of some of the focal users are now non-existent due to reasons unknown. Because of this the list has empty tibbles which look like this:

> userFriends[[88]]
# A tibble: 0 x 0

A non-empty tibble looks like this:

> userFriends[2]
[[1]]
# A tibble: 32 x 2
                 user            user_id
                <chr>              <chr>
 1 777937999917096960           49510236
 2 777937999917096960           60489018
 3 777937999917096960         3190203961
 4 777937999917096960          118756393
 5 777937999917096960         2338104343
 6 777937999917096960          122453931
 7 777937999917096960          452830010
 8 777937999917096960           60937837
 9 777937999917096960 923106269761851392
10 777937999917096960          416882361
# ... with 22 more rows

I want my code to identify these empty tibbles and subset the list without these tibbles.

I used the nrow function on these tibbles to find the number of friends each focal user had.

nFriends <- as.numeric(lapply(userFriends, nrow))

I treated the elements where this value is zero as the empty tibbles and removed them by subsetting the list as follows:

nullIndex <- nFriends!=0
userFriendsFinal <- userFriends[nullIndex]

This seems to work for now. But this way I am also removing users with zero friends (although that is very unlikely) along with users who no longer exist or are no longer accessible through the API. I want to make sure that I am removing only those who are not accessible or do not exist. Please help.

Sunil Reddy asked Nov 05 '25 04:11



2 Answers

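You can use the discard function from the purrr package. Here is a small example:

library(purrr)
library(tibble)  # provides tibble(); purrr does not export it
mylist <- list(a = tibble(n = numeric()),
               b = tibble(n = 1:4))
# drop every element whose tibble has zero rows
discard(mylist, function(z) nrow(z) == 0)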
$b
# A tibble: 4 x 1
      n
  <int>
1     1
2     2
3     3
4     4
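
The question also asks how to avoid dropping users who exist but have zero friends. A minimal sketch, under the assumption (not confirmed here, so worth checking against real rtweet output) that such a user would come back as a zero-row tibble that still has its two columns, while an inaccessible user comes back as the 0 x 0 tibble shown in the question. The toy userFriends list below is made up for illustration:

```r
library(purrr)
library(tibble)

# Toy stand-in for the list returned by get_friends (hypothetical data).
userFriends <- list(
  tibble(user = c("1", "1"), user_id = c("10", "11")),  # user with two friends
  tibble(),                                             # 0 x 0: nothing came back
  tibble(user = character(), user_id = character())     # 0 x 2: ASSUMED shape for zero friends
)

# Discard only the tibbles that have neither rows nor columns.
userFriendsFinal <- discard(userFriends, function(z) nrow(z) == 0 && ncol(z) == 0)

length(userFriendsFinal)
```

If zero-friend users turn out to be returned as 0 x 0 tibbles as well, the two cases cannot be told apart from the tibble alone and this predicate behaves exactly like the nrow check.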
Cettt answered Nov 08 '25 02:11



We can use Filter with nrow. Filter coerces each row count to logical, so every element with zero rows is dropped:

Filter(nrow, userFriends)
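
Since Filter only returns the survivors, it can help to record which positions were dropped so they can be traced back to the focal users. A small base R sketch; the toy list and the name droppedIndex are made up for illustration, and plain data frames stand in for tibbles because nrow treats them the same way:

```r
# Toy stand-in for the list returned by get_friends (hypothetical data).
userFriends <- list(
  data.frame(user = "1", user_id = c("10", "11")),
  data.frame(),                                   # empty element
  data.frame(user = "2", user_id = "20")
)

# Type-stable row counts: one integer per list element.
nFriends <- vapply(userFriends, nrow, integer(1))

# Positions of the empty elements, kept for later inspection.
droppedIndex <- which(nFriends == 0L)

# Same result as Filter(nrow, userFriends), with the predicate spelled out.
userFriendsFinal <- Filter(function(z) nrow(z) > 0, userFriends)
```

droppedIndex can then be matched against the vector of focal user ids that was passed to get_friends, assuming the returned list keeps the same order.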
Sotos answered Nov 08 '25 02:11
