Split a string in pandas row and insert new rows by enlarging the dataframe

I have the following DataFrame:

   no      word    status
0   0       one  to_check
1   1       two  to_check
2   2        :)  emoticon
3   3       dr.  to_check
4   4  "future"  to_check
5   5        to  to_check
6   6        be  to_check

I want to iterate through each row, find quotes at word-initial and word-final positions, and create a DataFrame like this:

   no    word    status
0   0     one  to_check
1   1     two  to_check
2   2      :)  emoticon
3   3     dr.  to_check
4   4       "    quotes
5   4  future      word
6   4       "    quotes
7   5      to  to_check
8   6      be  to_check

I can strip the quotes and split the word into three pieces, but I get the following DataFrame, in which the new rows overwrite the last two rows:

   no    word    status
0   0     one  to_check
1   1     two  to_check
2   2      :)  emoticon
3   3     dr.  to_check
4   4       "    quotes
5   4  future      word
6   4       "    quotes

I tried df.loc[index], df.iloc[index], and df.at[index], but none of them let me extend the number of rows in the DataFrame.

Is it possible to add new rows at a specific index without overwriting the last two rows?

Taner Sezer asked Sep 14 '21 01:09



1 Answer

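In your case, you can split each word on the quote character and then explode the resulting lists into separate rows:

# The capture group keeps each quote as its own piece; explode gives one row per piece.
out = (df.assign(word=df['word'].str.split(r'(")'))
         .explode('word')
         .loc[lambda x: x['word'] != ''])  # drop the empty strings left by the split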
   no    word    status
0   0     one  to_check
1   1     two  to_check
2   2      :)  emoticon
3   3     dr.  to_check
4   4       "  to_check
4   4  future  to_check
4   4       "  to_check
5   5      to  to_check
6   6      be  to_check

To change the status of the quote rows:

import numpy as np
out['status'] = np.where(out['word'].eq('"'), 'quotes', out['status'])
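
The desired output in the question also labels the unquoted word as word and uses a fresh 0..8 index, which the split-and-explode result above does not yet do. A minimal end-to-end sketch of one way to get there, assuming the sample data from the question: was_split is a hypothetical helper column introduced here for illustration, not part of the original answer, and the pattern splits on every double quote, not only those at the start or end of a word.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "no": [0, 1, 2, 3, 4, 5, 6],
    "word": ["one", "two", ":)", "dr.", '"future"', "to", "be"],
    "status": ["to_check", "to_check", "emoticon", "to_check",
               "to_check", "to_check", "to_check"],
})

# Split on double quotes; the capture group keeps each quote as its own piece.
parts = df["word"].str.split(r'(")')

out = (
    # was_split (hypothetical helper) marks rows whose word contained a quote.
    df.assign(word=parts, was_split=parts.str.len() > 1)
      .explode("word")                 # one row per piece
      .loc[lambda x: x["word"] != ""]  # drop the empty strings left by the split
      .reset_index(drop=True)          # fresh 0..n-1 index
)

# Pieces of a split word become "word"; the quote pieces then override that.
out.loc[out["was_split"], "status"] = "word"
out.loc[out["word"] == '"', "status"] = "quotes"
out = out.drop(columns="was_split")

print(out)
```

Because the index is reset after exploding, the new rows are inserted in place (rows 4 to 6) and the original last two rows shift down to positions 7 and 8 instead of being overwritten.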
BENY answered Oct 19 '22 16:10