I would like to slice two columns in my data frame.
This is my code for doing this:
import pandas as pd
df = pd.read_csv('source.txt',header=0)
cidf=df.loc[:,['vocab','sumCI']]
print(cidf)
This is a sample of data:
ID vocab sumCI sumnextCI new_diff
450 statu 3.0 0.0 3.0
391 provid 4.0 1.0 3.0
382 prescript 3.0 0.0 3.0
300 lymphoma 2.0 0.0 2.0
405 renew 2.0 0.0 2.0
First, I got this error:
KeyError: "None of [['', '']] are in the [columns]"
What I have tried:
I passed header with index 0 while reading the file. I also tried to rename the columns with this code:
df.rename(columns=df.iloc[0],inplace=True)
I also tried this:
df.columns = df.iloc[1]
df=df.reindex(df.index.drop(0))
I also tried the suggestions from the comments in this link.
None of the above resolved the issue.
Judging by the sample you posted, your file uses whitespace as the delimiter. pd.read_csv uses , as the default separator, so you have to state the delimiter explicitly:
df = pd.read_csv('source.txt', header=0, delim_whitespace=True)
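To see why the separator matters, here is a minimal sketch that reads the sample rows from the question twice: once with the default comma separator and once with a whitespace separator. It assumes the file looks exactly like the posted sample; io.StringIO stands in for 'source.txt'. It uses sep=r"\s+" rather than delim_whitespace=True because recent pandas releases (2.2 and later) deprecate delim_whitespace in favour of that spelling.

```python
import io
import pandas as pd

# Sample rows copied from the question; StringIO stands in for 'source.txt'.
sample = """ID vocab sumCI sumnextCI new_diff
450 statu 3.0 0.0 3.0
391 provid 4.0 1.0 3.0
382 prescript 3.0 0.0 3.0
300 lymphoma 2.0 0.0 2.0
405 renew 2.0 0.0 2.0
"""

# Default comma separator: the whole header line becomes a single column name,
# so 'vocab' and 'sumCI' do not exist as columns.
bad = pd.read_csv(io.StringIO(sample), header=0)
print(bad.columns.tolist())  # ['ID vocab sumCI sumnextCI new_diff']

# Whitespace separator: any run of spaces or tabs splits the fields.
df = pd.read_csv(io.StringIO(sample), header=0, sep=r"\s+")
cidf = df.loc[:, ["vocab", "sumCI"]]
print(cidf)
```

Printing df.columns.tolist() right after reading is a quick way to check whether the names you are selecting actually exist.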
Simply write code that creates a new CSV file, then use the new file:
import pandas as pd
# Read the whitespace-delimited file and keep the result in df
df = pd.read_csv('source.txt', header=0, delim_whitespace=True)
# Set the column names explicitly
headers = ['ID','vocab','sumCI','sumnextCI','new_diff']
df.columns = headers
# Write a comma-separated copy of the data
df.to_csv('newsource.txt')
Maybe you have whitespace around your column names. Double-check your CSV file.
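If padded column names turn out to be the cause, one way to handle it is to strip the names after reading. The sketch below uses a hypothetical comma-separated sample with spaces around the header names (not the asker's actual file) to show the effect.

```python
import io
import pandas as pd

# Hypothetical comma-separated sample whose header names are padded with spaces.
sample = "ID , vocab , sumCI\n450,statu,3.0\n391,provid,4.0\n"

df = pd.read_csv(io.StringIO(sample), header=0)
raw_cols = df.columns.tolist()
print(raw_cols)  # the names keep their surrounding spaces

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()
cidf = df.loc[:, ["vocab", "sumCI"]]
print(cidf)
```

Without the strip step, selecting 'vocab' fails because the actual column name is ' vocab '.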
You can try doing this:
df = pd.read_csv('source.txt', header=0, delim_whitespace=True)
If the fields in your file are separated by whitespace, the default comma separator will not split them, and selecting columns by name raises an error. delim_whitespace=True tells pandas to treat any run of whitespace as the delimiter.