
Pandas selecting discontinuous columns from a dataframe

Tags: python, pandas

I am using the following to select specific columns from the dataframe comb, which I would like to bring into a new dataframe. The individual selections work fine, e.g. comb.ix[:,0:1], but when I try to combine them with + I get a bad result: the first selection ([:,0:1]) gets stuck on the end of the dataframe, and the values originally in column 1 are wiped out while appearing at the end of the row. What is the right way to get just the columns I want? (I'd include sample data, but as you can see there are too many columns, which is why I'm trying to do it this way.)

comb.ix[:,0:1]+comb.ix[:,17:342]
dartdog asked Mar 23 '15 22:03



2 Answers

NumPy has a handy indexing helper, np.r_, which concatenates slices and scalars into a single index array. That lets you solve this with the modern DataFrame selection interface, iloc:

df.iloc[:, np.r_[0:1, 17:342]]

I believe this is a more elegant solution.

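It even supports more complex selections. np.r_ does not know the frame's width, so give every slice an explicit stop; -5:0 covers the last five columns:

df.iloc[:, np.r_[0:1, 5, 16, 17:342:2, -5:0]]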
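To make the behaviour concrete, here is a minimal sketch on a hypothetical 10-column frame (the frame, its column names, and the chosen positions are assumptions for illustration, not the asker's data). It also shows what is probably going on in the question: + between two DataFrames is element-wise addition aligned on column labels, not concatenation, so selections with no columns in common come back as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `comb`: 3 rows, columns c0..c9.
df = pd.DataFrame(np.arange(30).reshape(3, 10),
                  columns=[f"c{i}" for i in range(10)])

# np.r_ expands each slice and scalar into one flat integer array.
cols = np.r_[0:1, 3:6, -2:0]

# iloc accepts that array as a list of column positions
# (negative positions count from the end).
selected = df.iloc[:, cols]

# By contrast, `+` adds the two frames after aligning column labels.
# These selections share no columns, so every cell is NaN.
added = df.iloc[:, 0:1] + df.iloc[:, 3:6]

print(selected.columns.tolist())
print(added.isna().all().all())
```

If the goal is simply to place two selections side by side, pd.concat([a, b], axis=1) is the concatenating counterpart to the + used in the question.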
neves answered Sep 25 '22 07:09


I recently solved it by just concatenating ranges (Series.append was removed in pandas 2.0, so pd.concat is used here):

r1 = pd.Series(range(5))
r2 = pd.Series([10, 15, 20])
final_range = pd.concat([r1, r2])
df.iloc[:, final_range]

Then you will get columns 0 through 4 plus columns 10, 15, and 20.
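The same positions can also be built as a plain Python list, with no intermediate Series. This is a minimal sketch, assuming a hypothetical 25-column frame with made-up column names:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with columns c0..c24, for illustration only.
df = pd.DataFrame(np.zeros((2, 25)),
                  columns=[f"c{i}" for i in range(25)])

# Positions 0-4 plus 10, 15 and 20, joined by list concatenation.
positions = list(range(5)) + [10, 15, 20]

subset = df.iloc[:, positions]
print(subset.columns.tolist())
```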

David Hernandez Mendez answered Sep 26 '22 07:09