I have a problem when adding row numbers using Apache Pig. The problem is that I have a STR_ID column and I want to add a ROW_NUM column for the data in STR_ID, which is the row number of the STR_ID.
For example, here is the input:
STR_ID
------------
3D64B18BC842
BAECEFA8EFB6
346B13E4E240
6D8A9D0249B4
9FD024AA52BA
How do I get the output like:
STR_ID | ROW_NUM
----------------------------
3D64B18BC842 | 1
BAECEFA8EFB6 | 2
346B13E4E240 | 3
6D8A9D0249B4 | 4
9FD024AA52BA | 5
Answers using Pig or Hive are acceptable. Thank you.
In Hive:
Query
select str_id,row_number() over() from tabledata;
Output
3D64B18BC842 1
BAECEFA8EFB6 2
346B13E4E240 3
6D8A9D0249B4 4
9FD024AA52BA 5
Facebook posted a number of hive UDFs including NumberRows. Depending on your hive version (I believe 0.8) you may need to add an attribute to the class (stateful=true).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With