Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split multi valued cells in more than one column into rows (Open Refine)

Tags:

openrefine

I have been cleaning a table on Open Refine. I now have it like this:

REF                 Handle      Size        Price
2002, 2003          t-shirt1    M, L        23
3001, 3002, 3003    t-shirt2    S, M, L     24

I need to split those multivalued cells in REF and Size so that I get:

REF                 Handle      Size        Price
2002                t-shirt1    M           23
2003                t-shirt1    L           23  
3001                t-shirt2    S           24  
3002                t-shirt2    M           24
3003                t-shirt2    L           24

Is it possible to do this in Open Refine? The "Split multi-valued cells..." command only takes care of one column. Thank you, Ana Rita

like image 605
AnaRita Avatar asked Sep 28 '15 20:09

AnaRita


People also ask

How do you split a cell into 5 rows?

Split cells In the table, click the cell that you want to split. Click the Layout tab. In the Merge group, click Split Cells. In the Split Cells dialog, select the number of columns and rows that you want and then click OK.


1 Answers

Yes, it's possible :

  • Split the 1st column using ", " as separator.
  • Move column 2 at position one
  • display your project as record (not row)
  • Split column 3 using ", " as separator
  • Fill down columns 4 and 2
  • reorder columns

Here's my recipe in GREL :

[
  {
    "op": "core/row-removal",
    "description": "Remove rows",
    "engineConfig": {
      "facets": [
        {
          "invert": false,
          "expression": "row.starred",
          "selectError": false,
          "omitError": false,
          "selectBlank": false,
          "name": "Starred Rows",
          "omitBlank": false,
          "columnName": "",
          "type": "list",
          "selection": [
            {
              "v": {
                "v": true,
                "l": "true"
              }
            }
          ]
        }
      ],
      "mode": "row-based"
    }
  },
  {
    "op": "core/multivalued-cell-split",
    "description": "Split multi-valued cells in column Column 1",
    "columnName": "Column 1",
    "keyColumnName": "Column 1",
    "separator": ", ",
    "mode": "plain"
  },
  {
    "op": "core/column-move",
    "description": "Move column Column 2 to position 0",
    "columnName": "Column 2",
    "index": 0
  },
  {
    "op": "core/multivalued-cell-split",
    "description": "Split multi-valued cells in column Column 3",
    "columnName": "Column 3",
    "keyColumnName": "Column 2",
    "separator": ", ",
    "mode": "plain"
  },
  {
    "op": "core/fill-down",
    "description": "Fill down cells in column Column 4",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Column 4"
  },
  {
    "op": "core/fill-down",
    "description": "Fill down cells in column Column 2",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Column 2"
  },
  {
    "op": "core/column-reorder",
    "description": "Reorder columns",
    "columnNames": [
      "Column 1",
      "Column 2",
      "Column 3",
      "Column 4"
    ]
  }
]

Hervé

like image 122
hpiedcoq Avatar answered Sep 21 '22 12:09

hpiedcoq