
Separate a column into new columns based on the number of leading spaces

These reports come from QuickBooks, downloaded as Excel files. Notice that the left column encodes a nested hierarchy through its indentation.

I need to split the Description column into separate columns based on the number of leading spaces.

As I've been working with financial reports recently, these are super common and extremely difficult to work with. Is there a package or function for importing this type of data?


Here is a reproducible example input data frame:

df1 <- structure(list(Description = c("asset", " current asset", "   bank acc", 
                                      "    banner", "    clearing",
                                      "   total bank accounts",
                                      " total current assets"),
                 Total = c(NA, NA, NA, 10L, 20L, 30L, 30L)),
            .Names = c("Description", "Total"), 
            class = "data.frame", 
            row.names = c(NA, -7L))
Super_John asked Aug 09 '18 01:08



People also ask

How do you split a column based on?

Select the column you want to split. Ensure the column is a text data type. Select Home > Split Column > By Number of Characters. The Split a column by Number of Characters dialog box appears.

How do you separate Excel Data by spaces?

Click the “Data” tab in the ribbon, then look in the "Data Tools" group and click "Text to Columns." The "Convert Text to Columns Wizard" will appear. In step 1 of the wizard, choose “Delimited” > Click [Next]. A delimiter is the symbol or space which separates the data you wish to split.

How do I separate Data from one column into separate columns in R?

To split a column into multiple columns in R, use the separate() function from the tidyr package. separate() splits a character column into multiple columns using a regular expression or numeric positions.


1 Answer

You can try tidyxl and unpivotr for these Excel wrangling tasks. Here are the docs:

  • unpivotr: https://github.com/nacnudus/unpivotr
  • tidyxl: https://nacnudus.github.io/tidyxl/

Here's a nice tutorial: https://blog.davisvaughan.com/2018/02/16/tidying-excel-cash-flow-spreadsheets-in-r/
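
If the indentation has already been imported as literal leading spaces, as in the question's df1, a base R approach may be enough. The following is only a minimal sketch, not the tidyxl/unpivotr workflow from the links above. It assumes each distinct indent width corresponds to one hierarchy level; the names indent, level, label, levels_mat, out and the Level1..Level4 column names are made up for illustration.

```r
# Example data from the question
df1 <- data.frame(
  Description = c("asset", " current asset", "   bank acc",
                  "    banner", "    clearing",
                  "   total bank accounts",
                  " total current assets"),
  Total = c(NA, NA, NA, 10L, 20L, 30L, 30L),
  stringsAsFactors = FALSE
)

# Count leading spaces: full length minus length after stripping them
indent <- nchar(df1$Description) - nchar(sub("^ +", "", df1$Description))

# Assumption: each distinct indent width is one hierarchy level,
# so rank the widths (0, 1, 3, 4 become levels 1, 2, 3, 4)
level <- match(indent, sort(unique(indent)))

# Labels without the surrounding whitespace
label <- trimws(df1$Description)

# One column per level; each row's label goes into the column for its level
levels_mat <- matrix(
  NA_character_,
  nrow = nrow(df1), ncol = max(level),
  dimnames = list(NULL, paste0("Level", seq_len(max(level))))
)
levels_mat[cbind(seq_len(nrow(df1)), level)] <- label

out <- data.frame(levels_mat, Total = df1$Total, stringsAsFactors = FALSE)
print(out)
```

Each row ends up with its label in exactly one LevelN column and NA in the others. If the indent lives in Excel cell formatting rather than in the text itself, this will not see it, and reading the formatting with tidyxl is the better route.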

Matt Dancho answered Nov 14 '22 21:11
