
How to drop first row using pandas?

Tags: python, pandas

I've searched other questions related to dropping rows but could not find an answer that worked.

I have a CSV file exported from the tool Screaming Frog that looks like this:

Internal - HTML |               |             |
----------------|---------------|-------------|
Address         | Content       | Status Code |
----------------|---------------|-------------|
www.example.com | text/html     | 200         |

I want to remove the first row, which contains 'Internal - HTML'. When I inspect the DataFrame with df.keys(), I get: Index(['Internal - HTML'], dtype='object').

I want to use the second row, which contains the correct column labels, as the column header.

When I use the code:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.drop('Internal - HTML')
a.head(3)

I get this error: KeyError: 'Internal - HTML'

I also tried what was suggested in "Remove index name in pandas", and I tried resetting the index:

a = pandas.read_csv("internal_html.csv", encoding="utf-8")
a.reset_index(level=0, drop=True)
a.head(3)

None of the options above worked.

Asked by Robert Padgett on Jul 08 '17 at 14:07


People also ask

How do I delete first two rows of DataFrame pandas?

Using iloc[] to drop the first N rows of a DataFrame: use DataFrame.iloc[] with the slice [n:], where n is an integer, to skip the first n rows of a pandas DataFrame. For example, df.iloc[n:] returns every row after the first n; substitute n with the number of rows you want to drop.
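A minimal sketch of the slicing approach. The DataFrame below is made-up sample data (the column names and values are assumptions for illustration, not the contents of the question's file):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "Address": ["a.com", "b.com", "c.com", "d.com"],
    "Status Code": [200, 301, 404, 200],
})

n = 2  # number of leading rows to drop

# iloc[n:] keeps every row from position n onward, so the first n rows are skipped.
trimmed = df.iloc[n:]

# The original row labels are kept; reset_index(drop=True) renumbers them from 0.
trimmed_reset = trimmed.reset_index(drop=True)
```

Note that iloc works by position, so this only helps when the unwanted line has actually been read in as a data row rather than as the header.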

How do you remove a row in pandas?

Python pandas drop rows by index: to remove rows by index, pass a single index label or a list of index labels to drop(), for example df.drop(index). Here df is the DataFrame you are working on, and index is the row label (a number or a name) or list of labels to remove.
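As a rough illustration of dropping by label, the sketch below uses an invented DataFrame whose row labels are strings. It also shows that drop() looks labels up in the row index by default, which is one likely reason a call such as a.drop('Internal - HTML') raises KeyError when that string is a column name rather than a row label:

```python
import pandas as pd

# Hypothetical sample data with string row labels.
df = pd.DataFrame(
    {"Status Code": [200, 301, 404]},
    index=["a.com", "b.com", "c.com"],
)

# drop() returns a new DataFrame; df itself is unchanged unless you reassign it.
without_b = df.drop("b.com")
without_a_c = df.drop(["a.com", "c.com"])

# A label that is not present in the row index raises KeyError.
try:
    df.drop("Internal - HTML")
    raised = False
except KeyError:
    raised = True
```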

How do you delete the first few rows in Python?

Use drop() to remove the first N rows of a pandas DataFrame: to make sure it removes rows only, pass axis=0, and to apply the change in place (that is, to the calling DataFrame object), pass inplace=True. The row names of the DataFrame are fetched as a sequence, and the first N row names are passed ( df.
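The snippet above is cut off, so the following is only one plausible reading of it: it assumes the first N row names are taken from df.index[:n] and passed to drop(). The data is invented for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"Status Code": [200, 301, 404, 200]},
    index=["a.com", "b.com", "c.com", "d.com"],
)

n = 2  # number of leading rows to remove

# df.index[:n] holds the labels of the first n rows (assumed reading of the snippet).
# axis=0 targets rows; inplace=True modifies df directly and returns None.
df.drop(df.index[:n], axis=0, inplace=True)
```

Because drop() matches by label, this removes every row carrying one of those labels, so it behaves like positional slicing only when the row labels are unique.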


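1 Answer

You can pass header as a parameter in the read_csv call to tell pandas which row holds the column names and where the data starts. With header=1, the second line of the file (rows are zero-indexed) becomes the column labels and the line before it is skipped: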

a = pandas.read_csv("internal_html.csv", encoding="utf-8", header=1)
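To see the effect without the original file, here is a small self-contained sketch. The in-memory text is an assumed stand-in for internal_html.csv, built from the sample table in the question (comma-separated, with a one-field title line on top). skiprows=1 is shown as an alternative that should give the same result here.

```python
import io
import pandas as pd

# Assumed stand-in for internal_html.csv: a title line, then the real header row.
csv_text = (
    "Internal - HTML\n"
    "Address,Content,Status Code\n"
    "www.example.com,text/html,200\n"
)

# header=1: use the second line (zero-indexed row 1) as the column names.
a = pd.read_csv(io.StringIO(csv_text), header=1)

# Alternative: skip the first line, then let pandas infer the header as usual.
b = pd.read_csv(io.StringIO(csv_text), skiprows=1)

print(a.columns.tolist())
```

Either way, the unwanted title line is handled while reading, so nothing needs to be dropped afterwards.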
Answered by PRMoureu on Oct 17 '22 at 06:10