
How to change case of whole column to lowercase?

I want to convert every value in a column of a Spark Dataset to lowercase.

        Input
        +------+--------------------+
        |ItemID|       Category name|
        +------+--------------------+
        |   ABC|BRUSH & BROOM HAN...|
        |   XYZ|WHEEL BRUSH PARTS...|
        +------+--------------------+

        Desired Output
        +------+--------------------+
        |ItemID|       Category name|
        +------+--------------------+
        |   ABC|brush & broom han...|
        |   XYZ|wheel brush parts...|
        +------+--------------------+

I tried collectAsList() and toString(), but that is slow and complicated for a very large dataset.

I also found a method called 'lower', but I could not work out how to use it on a Dataset. Please suggest a simple or efficient way to do this. Thanks in advance.

Shreeharsha asked Apr 19 '17 16:04


3 Answers

I got it: use functions#lower (see the Javadoc for org.apache.spark.sql.functions).

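        import static org.apache.spark.sql.functions.col;
        import static org.apache.spark.sql.functions.lower;

        // Overwrite the column with its lowercased values.
        String columnName = "Category name";
        src = src.withColumn(columnName, lower(col(columnName)));
        src.show();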

This replaces the old column with the lowercased one and leaves the rest of the Dataset unchanged.

        +------+--------------------+
        |ItemID|       Category name|
        +------+--------------------+
        |   ABC|brush & broom han...|
        |   XYZ|wheel brush parts...|
        +------+--------------------+
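
For readers who want to try this end to end, the same call can be sketched as a small standalone Java program. This is only an illustration under stated assumptions: the class name, the helper methods, the `local[*]` master, and the sample category values are hypothetical (the real category names are truncated in the question), and Spark SQL must be on the classpath.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lower;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class LowercaseColumnExample {

    // Hypothetical sample rows; the values are placeholders, not the asker's data.
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType()
                .add("ItemID", DataTypes.StringType)
                .add("Category name", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("ABC", "SAMPLE CATEGORY A"),
                RowFactory.create("XYZ", "SAMPLE CATEGORY B"));
        return spark.createDataFrame(rows, schema);
    }

    // Reusing the existing column name makes withColumn replace that column.
    static Dataset<Row> lowercaseColumn(Dataset<Row> ds, String columnName) {
        return ds.withColumn(columnName, lower(col(columnName)));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LowercaseColumnExample")
                .master("local[*]") // assumption: local run for demonstration only
                .getOrCreate();
        lowercaseColumn(sampleData(spark), "Category name").show();
        spark.stop();
    }
}
```

Unlike the collectAsList() approach from the question, which pulls every row to the driver, lower is a column expression that Spark evaluates as part of the query, so the data stays distributed.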
Shreeharsha answered Oct 18 '22 18:10


Use the lower function from org.apache.spark.sql.functions.

For instance:

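import org.apache.spark.sql.functions.lower  // outside spark-shell, $"..." also needs import spark.implicits._
df.select($"q1Content", lower($"q1Content")).show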

Output:

+--------------------+--------------------+
|           q1Content|    lower(q1Content)|
+--------------------+--------------------+
|What is the step ...|what is the step ...|
|What is the story...|what is the story...|
|How can I increas...|how can i increas...|
|Why am I mentally...|why am i mentally...|
|Which one dissolv...|which one dissolv...|
|Astrology: I am a...|astrology: i am a...|
| Should I buy tiago?| should i buy tiago?|
|How can I be a go...|how can i be a go...|
|When do you use  ...|when do you use  ...|
|Motorola (company...|motorola (company...|
|Method to find se...|method to find se...|
|How do I read and...|how do i read and...|
|What can make Phy...|what can make phy...|
|What was your fir...|what was your fir...|
|What are the laws...|what are the laws...|
|What would a Trum...|what would a trum...|
|What does manipul...|what does manipul...|
|Why do girls want...|why do girls want...|
|Why are so many Q...|why are so many q...|
|Which is the best...|which is the best...|
+--------------------+--------------------+
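
In the table above, the derived column gets the generated name lower(q1Content). A rough Java equivalent that keeps the original column and gives the lowercased copy an explicit name might look like the sketch below. The class name, the method name, the target column name `q1ContentLower`, and the sample sentences are all hypothetical; only `select`, `col`, `lower`, and `alias` are Spark API calls.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lower;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class LowercaseSelectExample {

    // Hypothetical sample rows standing in for the q1Content column.
    static Dataset<Row> sampleData(SparkSession spark) {
        StructType schema = new StructType().add("q1Content", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("What Is Spark?"),
                RowFactory.create("How Do I Use LOWER?"));
        return spark.createDataFrame(rows, schema);
    }

    // Selects the source column plus a lowercased copy under the given alias.
    static Dataset<Row> withLowercasedCopy(Dataset<Row> df, String source, String target) {
        return df.select(col(source), lower(col(source)).alias(target));
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("LowercaseSelectExample")
                .master("local[*]") // assumption: local run for demonstration only
                .getOrCreate();
        withLowercasedCopy(sampleData(spark), "q1Content", "q1ContentLower").show(false);
        spark.stop();
    }
}
```

Note that select returns only the columns listed, so any other columns in the Dataset would be dropped; withColumn is the better fit when the rest of the Dataset must be retained.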
Alberto Bonsanto answered Oct 18 '22 18:10


You can do it like this in Scala:

import org.apache.spark.sql.functions._

val dfAfterLowerCase = dfInitial.withColumn("column_name", lower(col("column_name")))
dfAfterLowerCase.show()
Yauheni Leaniuk answered Oct 18 '22 17:10