I am using the following to select specific columns from the dataframe comb, which I would like to bring into a new dataframe. The individual selections work fine (e.g. comb.ix[:,0:1]), but when I try to combine them with +, I get a bad result: the first selection ([:,0:1]) ends up stuck on the end of the dataframe, and the values from the original column 1 are wiped out while appearing at the end of the row. What is the right way to get just the columns I want? (I'd include sample data, but as you can see there are too many columns, which is why I'm trying to do it this way.)
comb.ix[:,0:1]+comb.ix[:,17:342]
If you have a DataFrame and would like to select a few specific rows or columns from it, you can use square brackets or more advanced indexers such as loc and iloc.
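As a rough illustration of why the + attempt in the question misbehaves, here is a minimal sketch on a small, made-up frame (the six-column comb below is a stand-in, not the asker's data). It assumes a current pandas, where .ix is no longer available and iloc is used for positional selection. The + operator performs arithmetic with label alignment, while pd.concat with axis=1 places the selections side by side.

```python
import pandas as pd

# Hypothetical 6-column frame standing in for `comb`
comb = pd.DataFrame([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]],
                    columns=list("abcdef"))

first = comb.iloc[:, 0:1]   # column "a"
rest = comb.iloc[:, 3:6]    # columns "d", "e", "f"

# `+` is element-wise addition: column labels are aligned first, and any
# column present in only one operand comes out as NaN
added = first + rest

# pd.concat with axis=1 puts the two selections next to each other instead
wanted = pd.concat([first, rest], axis=1)
print(wanted)
```

Since the two selections share no column labels here, every value in added is NaN, which matches the "wiped out" values described in the question.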
Selecting rows and columns from a pandas DataFrame
If we know which columns we want before we read the data from the file, we can tell read_csv() to import only those columns by passing them as a list to the usecols parameter, either by index number (starting at 0) or by column name.
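A short sketch of the usecols approach follows. The CSV text and column names are invented for illustration, and io.StringIO is used only so the example runs without a file on disk; a real call would pass a path.

```python
import io
import pandas as pd

# Hypothetical CSV contents; a real call would pass a file path instead
csv_text = "id,name,score,grade\n1,Ann,90,A\n2,Bob,85,B\n"

# Select columns by position (0-based)
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

# Select the same columns by name
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["id", "score"])

print(by_position)
```

Both calls load only the id and score columns, so the unwanted columns never enter memory.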
NumPy has a handy indexing helper, np.r_, which lets you solve this with the modern DataFrame selection interface, iloc:
df.iloc[:, np.r_[0:1, 17:342]]
I believe this is a more elegant solution.
It even supports more complex selections:
df.iloc[:, np.r_[0:1, 5, 16, 17:342:2, -5:0]]  # -5:0 picks the last five columns; an open-ended -5: would expand to nothing in np.r_
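To see what np.r_ actually produces, here is a small sketch on a made-up ten-column frame (the column names c0 to c9 are hypothetical). np.r_ expands each slice into explicit integers and concatenates them with the scalars, and iloc then takes that integer array as a list of positions.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-column frame with columns named c0..c9
df = pd.DataFrame(np.arange(20).reshape(2, 10),
                  columns=[f"c{i}" for i in range(10)])

# Slices are expanded and joined with the single index 3
positions = np.r_[0:1, 3, 5:8]
subset = df.iloc[:, positions]

# np.r_ does not know how wide the frame is, so a slice counting from the
# end needs an explicit stop of 0
tail = df.iloc[:, np.r_[-2:0]]

print(positions)
print(list(subset.columns))
print(list(tail.columns))
```

The same pattern should scale to the 0:1 and 17:342 ranges from the question, provided the frame has at least 342 columns.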
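I recently solved it by just appending ranges:
import pandas as pd
r1 = pd.Series(range(5))
r2 = pd.Series([10, 15, 20])
# Series.append was removed in pandas 2.0; pd.concat joins the two Series instead
final_range = pd.concat([r1, r2])
df.iloc[:, final_range]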
Then you will get columns 0 through 4, plus columns 10, 15, and 20.
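If you would rather avoid building Series just to hold positions, the same selection can be sketched with plain Python lists. The one-row, 25-column frame below is made up for illustration.

```python
import pandas as pd

# Hypothetical frame with 25 columns named c0..c24
df = pd.DataFrame([list(range(25))], columns=[f"c{i}" for i in range(25)])

# Concatenate ordinary lists of positions: 0-4, then 10, 15, 20
positions = list(range(5)) + [10, 15, 20]
subset = df.iloc[:, positions]

print(list(subset.columns))
```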