To add a prefix or suffix to every value in a DataFrame, I usually do the following.
For instance, to add the suffix '@':
df = df.astype(str) + '@'
This appends '@' to every cell value.
I would like to know how to remove this suffix. Is there a method available directly on the pandas.DataFrame class that removes a particular prefix or suffix character from the entire DataFrame?
I've tried iterating through the rows (as Series) and calling rstrip('@') on each one:
for index in range(df.shape[0]):
    row = df.iloc[index]
    row = row.str.rstrip('@')
Now, in order to make a DataFrame out of this Series:
new_df = pd.DataFrame(columns=list(df))
new_df = new_df.append(row)
However, this doesn't work; it gives an empty DataFrame.
Is there something really basic that I am missing?
You could use applymap to apply your string method to each element:
df = df.applymap(lambda x: str(x).rstrip('@'))
Note: I wouldn't expect this to be as fast as the vectorized approach, pd.Series.str.rstrip, i.e. transforming each column separately. Also, in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.
You can use apply and the str.strip method of pd.Series:
In [13]: df
Out[13]:
       a       b      c
0    dog   quick    the
1   lazy    lazy    fox
2  brown   quick    dog
3  quick     the   over
4  brown    over   lazy
5    fox   brown  quick
6  quick     fox    the
7    dog  jumped    the
8   lazy   brown    the
9    dog    lazy    the
In [14]: df = df + "@"
In [15]: df
Out[15]:
        a        b       c
0    dog@   quick@    the@
1   lazy@    lazy@    fox@
2  brown@   quick@    dog@
3  quick@     the@   over@
4  brown@    over@   lazy@
5    fox@   brown@  quick@
6  quick@     fox@    the@
7    dog@  jumped@    the@
8   lazy@   brown@    the@
9    dog@    lazy@    the@
In [16]: df = df.apply(lambda S:S.str.strip('@'))
In [17]: df
Out[17]:
       a       b      c
0    dog   quick    the
1   lazy    lazy    fox
2  brown   quick    dog
3  quick     the   over
4  brown    over   lazy
5    fox   brown  quick
6  quick     fox    the
7    dog  jumped    the
8   lazy   brown    the
9    dog    lazy    the
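One detail worth checking in the session above: str.strip removes the character from both ends of each value, whereas the question asks only about a suffix. A minimal sketch of the difference, using a small made-up DataFrame (the values are illustrative, not from the original data):

```python
import pandas as pd

# Hypothetical data with '@' at the start of some values as well as the end
df = pd.DataFrame({"a": ["@dog@", "lazy@"], "b": ["quick@", "@@the@@"]})

# rstrip removes '@' only from the right-hand end of each value
right_only = df.apply(lambda s: s.str.rstrip("@"))

# strip removes '@' from both ends
both_ends = df.apply(lambda s: s.str.strip("@"))

print(right_only)
print(both_ends)
```

If the values can legitimately start with '@', rstrip is probably the safer choice.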
Note, your approach doesn't work because when you do the following assignment in your for-loop:
row = row.str.rstrip('@')
This merely assigns the result of row.str.rstrip to the name row without mutating the DataFrame. All Python objects behave this way under simple name assignment:
In [18]: rows = [[1,2,3],[4,5,6],[7,8,9]]
In [19]: print(rows)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In [20]: for row in rows:
    ...:     row = ['look','at','me']
    ...:
In [21]: print(rows)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
To actually change the underlying data structure you need to use a mutator method:
In [22]: rows
Out[22]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In [23]: for row in rows:
    ...:     row.append("LOOKATME")
    ...:
In [24]: rows
Out[24]: [[1, 2, 3, 'LOOKATME'], [4, 5, 6, 'LOOKATME'], [7, 8, 9, 'LOOKATME']]
Note that slice-assignment is just syntactic sugar for a mutator method:
In [26]: rows
Out[26]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In [27]: for row in rows:
    ...:     row[:] = ['look','at','me']
    ...:
    ...:
In [28]: rows
Out[28]: [['look', 'at', 'me'], ['look', 'at', 'me'], ['look', 'at', 'me']]
This is analogous to pandas loc or iloc based assignment.
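To make the analogy concrete, here is a sketch of the original loop rewritten so that each stripped row is written back through iloc. The two-row DataFrame is made up for illustration, and the column-wise apply shown earlier is still likely to be the faster option:

```python
import pandas as pd

# Hypothetical data with a trailing '@' on every value
df = pd.DataFrame({"a": ["dog@", "lazy@"], "b": ["quick@", "the@"]})

for index in range(df.shape[0]):
    # Assigning through iloc writes the stripped row back into df,
    # instead of only rebinding a local name
    df.iloc[index] = df.iloc[index].str.rstrip("@")

print(df)
```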
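You could make this really easy and just use the pandas.DataFrame.replace() method to replace every "@" with "". Pass regex=True so that substrings are matched; without it, only cells whose entire value is "@" are replaced. Note that replace returns a new DataFrame, so assign the result:
df = df.replace("@", "", regex=True)
If you are worried about "@" being replaced anywhere other than at the end of your values, anchor the pattern with $:
df = df.replace("@$", "", regex=True)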
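The behavior of replace depends on the regex flag, which a small comparison can show. This is a sketch on made-up data; the last variant uses Series.str.removesuffix, which assumes pandas 1.4 or later:

```python
import pandas as pd

# Hypothetical data: one cell is exactly "@", one has an "@" in the middle
df = pd.DataFrame({"a": ["dog@", "@"], "b": ["a@b@", "fox@"]})

# Without regex=True, only cells that are exactly "@" are replaced
whole_cell = df.replace("@", "")

# With regex=True and the "$" anchor, only a trailing "@" is removed
trailing = df.replace("@$", "", regex=True)

# removesuffix drops one literal trailing "@" per value, with no regex involved
suffix = df.apply(lambda s: s.str.removesuffix("@"))

print(whole_cell)
print(trailing)
print(suffix)
```

Unlike rstrip, both the anchored regex and removesuffix remove at most one trailing "@" per value.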