I've got a large file with login information for a list of users. The problem is that the file includes other information in the Date column. I would like to remove all rows that are not of type datetime in the Date column. My data resembles df:
Name | Date |
---|---|
name_1 | 2012-07-12 22:20:00 |
name_1 | 2012-07-16 22:19:00 |
name_1 | 2013-12-16 17:50:00 |
name_1 | 4345 # type = 'int' |
... | # type = 'float' |
name_2 | 2010-01-11 19:54:00 |
name_2 | 2010-02-06 12:10:00 |
... | |
name_2 | 2012-07-18 22:12:00 |
name_2 | 4521 |
... | |
name_5423 | 2013-11-23 10:21:00 |
... | |
name_5423 | 7532 |
I've tried modifying the solutions to "Finding non-numeric rows in dataframe in pandas?", "Remove rows where column value type is string Pandas" and "How-should-I-delete-rows-from-a-DataFrame-in-Python-Pandas" to fit my needs.
The problem is that whenever I attempt the change, I either get an error or the entire dataframe gets deleted.
Use the pandas.DataFrame.drop() method to delete rows that meet one or more conditions.
You can delete a list of rows by passing their index labels to drop(). For example, passing [5, 6] removes the rows labelled 5 and 6. axis=0 denotes that rows should be deleted from the dataframe.
drop() deletes both rows and columns from a DataFrame. To delete one or more columns, pass the column name(s) and specify axis=1. Alternatively, pass the names through the columns parameter, which removes the need for axis.
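As a rough illustration of the drop() variants described above, here is a minimal sketch. The four-row frame, its values, and the names by_label, by_condition and no_name are made up for the example. The isinstance check assumes the unwanted values are stored as non-string objects, which may not hold if the file was read entirely as text.

```python
import pandas as pd

# Hypothetical sample data: two valid date strings and two stray integers.
df = pd.DataFrame({
    "Name": ["name_1", "name_1", "name_2", "name_2"],
    "Date": ["2012-07-12 22:20:00", 4345, "2010-01-11 19:54:00", 4521],
})

# Drop rows by index label; axis=0 means "rows".
by_label = df.drop([1, 3], axis=0)

# Drop rows by condition: collect the labels of rows whose Date is not a string.
is_text = df["Date"].map(lambda v: isinstance(v, str))
by_condition = df.drop(df[~is_text].index)

# Drop a column by name; the columns parameter replaces axis=1.
no_name = df.drop(columns=["Name"])

print(by_condition)
```

None of these calls modify df in place; each returns a new DataFrame.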
Use pd.to_datetime with parameter errors='coerce' to make non-dates into NaT null values. Then you can drop those rows:
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df = df.dropna(subset=['Date'])
df
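For a self-contained version of the same idea, the sketch below builds a small made-up frame and adds two precautions that are assumptions rather than part of the answer: the column is cast to strings first, and an explicit format is passed. This matters because pd.to_datetime can interpret numeric input as an offset from the Unix epoch, so a bare integer is not guaranteed to become NaT in every situation. The names parsed and clean are illustrative.

```python
import pandas as pd

# Hypothetical sample data mixing date strings, an int and a float.
df = pd.DataFrame({
    "Name": ["name_1", "name_1", "name_1", "name_2", "name_2"],
    "Date": ["2012-07-12 22:20:00", "2012-07-16 22:19:00", 4345,
             "2010-01-11 19:54:00", 45.21],
})

# Cast to str so numbers are matched against the format instead of being
# read as epoch offsets; anything that does not match becomes NaT.
parsed = pd.to_datetime(
    df["Date"].astype(str),
    format="%Y-%m-%d %H:%M:%S",
    errors="coerce",
)

# Replace the column with the parsed values and drop the NaT rows.
clean = df.assign(Date=parsed).dropna(subset=["Date"]).reset_index(drop=True)

print(clean)
```

The explicit format assumes every valid entry looks like the timestamps in the question. If the real file mixes several date layouts, the format argument would need adjusting or removing.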