I have a column FileName in a pandas DataFrame that holds filenames as strings. A filename can itself contain dots ('.'). For example, a.b.c.d.txt is a txt file. I just want another column, FileType, containing only the file extensions.
Sample DataFrame:
FileName
a.b.c.d.txt
j.k.l.exe
After processing:
FileName FileType
a.b.c.d.txt txt
j.k.l.exe exe
I tried the following:
X['FileType'] = X.FileName.str.split(pat='.')
This helps me split the string on '.', but how do I get the last element, i.e. the file extension?
Something like
X['FileType'] = X.FileName.str.split(pat='.')[-1]
X['FileType'] = X.FileName.str.split(pat='.').pop(-1)
did not give the desired output.
We can use the splitext() function from Python's os.path module to get the file extension. It splits a file path into a tuple of two values: the root and the extension.
Option 1
Use apply
df['FileType'] = df.FileName.apply(lambda x: x.split('.')[-1])
Option 2
Use str twice
df['FileType'] = df.FileName.str.split('.').str[-1]
Option 2b
Use rsplit (thanks @cᴏʟᴅsᴘᴇᴇᴅ)
# Pass n by keyword; newer pandas versions no longer accept it positionally.
df['FileType'] = df.FileName.str.rsplit('.', n=1).str[-1]
All result in:
FileName FileType
0 a.b.c.d.txt txt
1 j.k.l.exe exe
Python 3.6.4, Pandas 0.22.0
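To see why the attempts in the question fail while Option 2 works, here is a minimal sketch. It assumes the default integer index (0, 1) from the sample DataFrame; the variable names parts, label_lookup_failed and extensions are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'FileName': ['a.b.c.d.txt', 'j.k.l.exe']})

# str.split returns a Series whose elements are Python lists.
parts = df['FileName'].str.split('.')

# Indexing the Series itself with [-1] is a label lookup on the index (0, 1),
# not "last item of each list", so it raises KeyError with this index.
try:
    parts[-1]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# The .str accessor applies the index to each list element-wise.
extensions = parts.str[-1].tolist()

print(label_lookup_failed)
print(extensions)
```

In short, [-1] on the Series addresses rows, while .str[-1] addresses the items inside each row.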
If you only need the extension and do not need the rest of the name as a separate column, I would recommend a list comprehension with str.rsplit:
df['FileType'] = [f.rsplit('.', 1)[-1] for f in df.FileName.tolist()]
df
FileName FileType
0 a.b.c.d.txt txt
1 j.k.l.exe exe
If you want to split each filename into its name and its extension, there are a couple of options.
os.path.splitext
import os
import pandas as pd
pd.DataFrame(
[os.path.splitext(f) for f in df.FileName],
columns=['Name', 'Type']
)
Name Type
0 a.b.c.d .txt
1 j.k.l .exe
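It may help to know how os.path.splitext treats less regular names before relying on it. The sketch below uses a few made-up filenames (archive.tar.gz, README and .bashrc are not from the question) and also shows one way to drop the leading dot.

```python
import os

# Hypothetical names chosen to cover the edge cases.
names = ['a.b.c.d.txt', 'archive.tar.gz', 'README', '.bashrc']

# splitext splits at the last dot only; a leading dot is not
# treated as the start of an extension.
pairs = [os.path.splitext(name) for name in names]

# Slice off the leading dot; an empty extension stays empty.
extensions = [ext[1:] for _, ext in pairs]

for name, (root, ext) in zip(names, pairs):
    print(name, '->', repr(root), repr(ext))
```

Note that only the final suffix is returned for archive.tar.gz, and that names without an extension give an empty string rather than an error.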
str.extract
df.FileName.str.extract(r'(?P<FileName>.*)(?P<FileType>\..*)', expand=True)
FileName FileType
0 a.b.c.d .txt
1 j.k.l .exe
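If the goal is the FileType column from the question, without the leading dot, a variation on str.extract could look like the sketch below. The anchored pattern and the extra README row are my own assumptions, added to show what happens when a name has no dot: the result is missing (NaN) there, whereas rsplit('.', 1)[-1] would return the whole name.

```python
import pandas as pd

# 'README' is a hypothetical row with no extension.
df = pd.DataFrame({'FileName': ['a.b.c.d.txt', 'j.k.l.exe', 'README']})

# Capture everything after the last dot; expand=False returns a Series
# because the pattern has a single capture group.
df['FileType'] = df['FileName'].str.extract(r'\.([^.]+)$', expand=False)

print(df)
```

Whether a missing value or the full name is the better fallback depends on how the column is used afterwards.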