pd.DataFrame.apply(): create multiple new columns

Question

I have a bunch of files. For each one, I want to open it, read the first line, parse it into several expected pieces of information, and then put the filename and those data as a row in a dataframe. My question concerns the recommended syntax for building the dataframe in a pandanic/pythonic way (I already have the file opening and parsing figured out).

For a dumbed-down example, the following seems to be the recommended thing to do when you want to create one new column:

df = pd.DataFrame(files, columns=['filename'])
df['first_letter'] = df.apply(lambda x: x['filename'][:1], axis=1)

but I can't, say, do this:

df['first_letter'], df['second_letter'] = df.apply(lambda x: (x['filename'][:1], x['filename'][1:2]), axis=1)

because apply returns a single Series whose values are tuples, not two separate columns.
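
A minimal sketch of that behaviour, assuming a recent pandas version and using made-up filenames in place of the real list:

```python
import pandas as pd

# Hypothetical filenames, used only for illustration.
df = pd.DataFrame(['Aa', 'Bb', 'Cc'], columns=['filename'])

# Returning a tuple per row yields one Series whose values are tuples.
result = df.apply(lambda x: (x['filename'][:1], x['filename'][1:2]), axis=1)

print(type(result).__name__)
print(result.tolist())
```

Since there is only one Series to unpack, it cannot be spread over two column assignments the way the line above attempts.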

Keep in mind that in place of the lambda I will use a function that opens the file, then reads and parses its first line.

joris · Accepted Answer

You can return the two values as a Series from the applied function. apply then returns a dataframe in which each returned Series becomes one row. With a dummy example:

In [29]: df = pd.DataFrame(['Aa', 'Bb', 'Cc'], columns=['filenames'])

In [30]: df
Out[30]:
  filenames
0        Aa
1        Bb
2        Cc

In [31]: df['filenames'].apply(lambda x : pd.Series([x[0], x[1]]))
Out[31]:
   0  1
0  A  a
1  B  b
2  C  c

This you can then assign to two new columns:

In [33]: df[['first', 'second']] = df['filenames'].apply(lambda x : pd.Series([x[0], x[1]]))

In [34]: df
Out[34]:
  filenames first second
0        Aa     A      a
1        Bb     B      b
2        Cc     C      c
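
For the file-parsing case described in the question, the same idea might look like the sketch below. Everything in it is an assumption made for illustration: the function name parse_first_line, the comma-separated first line, the column names name and value, and the throwaway files created so that the code runs as-is. It uses DataFrame.apply with axis=1 rather than Series.apply, and returns a Series with a named index so that the labels line up with the new columns.

```python
import os
import tempfile

import pandas as pd


def parse_first_line(path):
    """Hypothetical parser: read the first line and split it on a comma."""
    with open(path) as fh:
        first_line = fh.readline().strip()
    name, value = first_line.split(',')
    # The index labels of the returned Series become column names.
    return pd.Series({'name': name, 'value': value})


# Create two throwaway files so the sketch is self-contained.
tmpdir = tempfile.mkdtemp()
files = []
for fname, header in [('a.txt', 'alpha,1'), ('b.txt', 'beta,2')]:
    path = os.path.join(tmpdir, fname)
    with open(path, 'w') as fh:
        fh.write(header + '\nrest of the file\n')
    files.append(path)

df = pd.DataFrame(files, columns=['filename'])

# Each row's returned Series becomes one row of the resulting dataframe.
parsed = df.apply(lambda row: parse_first_line(row['filename']), axis=1)
df[['name', 'value']] = parsed

print(df[['name', 'value']])
```

The parsed values stay as strings here; any type conversion would belong inside the parsing function.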

Tags:

pandas

Nathan Lloyd
