I have a big Excel table (18k rows and 400 cols) which has some rows highlighted using different colors. Is there a way to filter the rows by colors using openxlsx
?
I first loaded the workbook
wb <- loadWorkbook(file = "Items Comparison.xlsx")
getStyles(wb)
df <- read.xlsx(wb, sheet = 1)
I see the styles used in the workbook using getStyles(wb)
, but not sure how to use that information to filter all cells for each column by colors.
[[1]]
A custom cell style.
Cell formatting: GENERAL
Font name: Tahoma
Font size: 9
Font colour: #FFFFFF
Font decoration: BOLD
Cell borders: Top: thin, Bottom: thin, Left: thin, Right: thin
Cell border colours: #4E648A, #4E648A, #4E648A, #4E648A
Cell vert. align: top
Cell fill foreground: rgb: #384C70
Cell fill background: rgb: #384C70
wraptext: TRUE
[[2]]
A custom cell style.
Cell formatting: GENERAL
Font name: Tahoma
Font size: 9
Font colour: #FFFFFF
Font decoration: BOLD
Cell borders: Top: thin, Bottom: thin, Left: thin, Right: thin
Cell border colours: #4E648A, #4E648A, #4E648A, #4E648A
Cell vert. align: top
Cell fill foreground: rgb: #384C70
Cell fill background: rgb: #384C70
wraptext: TRUE
What can I do to filter data by fill colors?
UPDATE
Based on @Henrik solution, I tried to use his code but I kept getting error. So, to understand what was going on, I printed the output of x$style$fill$fillFg
rgb
"FF384C70"
rgb
"FF384C70"
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
NULL
rgb
"FF384C70"
NULL
NULL
NULL
rgb
"FFFFFF00"
rgb
"FFFFFF00"
theme
"0"
theme
"0"
rgb
"FFFFFF00"
NULL
theme
"2"
theme tint
"4" "0.79998168889431442"
theme
"8"
theme
"8"
rgb
"FFFFC000"
rgb
"FFFFC000"
theme tint
"5" "0.39997558519241921"
theme tint
"5" "0.39997558519241921"
theme tint
"9" "0.39997558519241921"
theme tint
"5" "0.79998168889431442"
rgb
"FFFFFF00"
rgb
"FF384C70"
NULL
NULL
NULL
rgb
"FF384C70"
rgb
"FF384C70"
[[1]]
rgb
"FF384C70"
[[2]]
rgb
"FF384C70"
[[3]]
NULL
[[4]]
NULL
[[5]]
NULL
[[6]]
NULL
[[7]]
NULL
[[8]]
NULL
[[9]]
NULL
[[10]]
NULL
[[11]]
NULL
[[12]]
NULL
[[13]]
rgb
"FF384C70"
[[14]]
NULL
[[15]]
NULL
[[16]]
NULL
[[17]]
rgb
"FFFFFF00"
[[18]]
rgb
"FFFFFF00"
[[19]]
theme
"0"
[[20]]
theme
"0"
[[21]]
rgb
"FFFFFF00"
[[22]]
NULL
[[23]]
theme
"2"
[[24]]
theme tint
"4" "0.79998168889431442"
[[25]]
theme
"8"
[[26]]
theme
"8"
[[27]]
rgb
"FFFFC000"
[[28]]
rgb
"FFFFC000"
[[29]]
theme tint
"5" "0.39997558519241921"
[[30]]
theme tint
"5" "0.39997558519241921"
[[31]]
theme tint
"9" "0.39997558519241921"
[[32]]
theme tint
"5" "0.79998168889431442"
[[33]]
rgb
"FFFFFF00"
[[34]]
rgb
"FF384C70"
[[35]]
NULL
[[36]]
NULL
[[37]]
NULL
[[38]]
rgb
"FF384C70"
[[39]]
rgb
"FF384C70"
I'm still confused why there're only 39 items. total number of rows is variable but not 39. I'm also not understanding the operation - is it rowwise or columnwise?
library(tidyxl)
formats <- xlsx_formats( "./temp/test_file.xlsx" )
cells <- xlsx_cells( "./temp/test_file.xlsx" )
#what colors are used?
formats$local$fill$patternFill$fgColor$rgb
# [1] NA "FFC00000" "FF00B0F0" NA
#find rows fo cells with red background
cells[ cells$local_format_id %in%
which( formats$local$fill$patternFill$fgColor$rgb == "FFC00000"),
"row" ]
# [1] 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With