Strip time from an object date in pandas

Tags: python, pandas

I am having trouble with some dates from zipped xlsx files. These files are loaded into a SQLite database and then exported as .csv; each file holds about 40,000 rows per day. The issue I run into is that pd.to_datetime does not seem to work on these values (I think the Excel date format is causing the issue, since pure .csv files work fine with this command). That is actually fine: I do not need them to be in datetime format.

What I am trying to achieve is a column called ShortDate formatted as %m/%d/%Y. How can I do this on a date stored in an object (string) column (the format from Excel is mm/dd/yyyy hh:mm:ss)? I will then create a new column called RosterID which combines the EmployeeID and ShortDate fields into a unique ID.

I am very new to pandas and I am currently only using it to process .csv files (rename and select certain columns, create unique IDs to use in filters in Tableau, etc).

rep = pd.read_csv(r'C:\Users\Desktop\test.csv.gz', dtype = 'str', compression = 'gzip', usecols = ['etc','etc2'])
print('Read successfully.')
rep['Total']=1
rep['UniqueID']= rep['EmployeeID'] + rep['InteractionID']
rep['ShortDate'] = ??? #what do I do here to get what I am looking for?
rep['RosterID']= rep['EmployeeID'] + rep['ShortDate'] # this is my goal
print('Modified successfully.')

Here is some of the raw data from the .csv. The first row holds the column names:

InteractionID, Created Date, EmployeeID, Repeat Date
07927,04/01/2014 14:05:10,912a,04/01/2014 14:50:03
02158,04/01/2014 13:44:05,172r,04/04/2014 17:47:29
44279,04/01/2014 17:28:36,217y,04/07/2014 22:06:19

asked Oct 15 '14 16:10

trench

2 Answers

Create a new column, then just apply simple datetime functions using lambda and apply.

In [14]: df['Short Date']= pd.to_datetime(df['Created Date'])

In [15]: df
Out[15]: 
   InteractionID    Created Date EmployeeID     Repeat Date  \
0           7927  4/1/2014 14:05       912a  4/1/2014 14:50   
1           2158  4/1/2014 13:44       172r  4/4/2014 17:47   
2          44279  4/1/2014 17:28       217y  4/7/2014 22:06   

           Short Date  
0 2014-04-01 14:05:00  
1 2014-04-01 13:44:00  
2 2014-04-01 17:28:00  

In [16]: df['Short Date'] = df['Short Date'].apply(lambda x:x.date().strftime('%m%d%y'))

In [17]: df
Out[17]: 
   InteractionID    Created Date EmployeeID     Repeat Date Short Date  
0           7927  4/1/2014 14:05       912a  4/1/2014 14:50     040114   
1           2158  4/1/2014 13:44       172r  4/4/2014 17:47     040114   
2          44279  4/1/2014 17:28       217y  4/7/2014 22:06     040114

Then just concatenate the two columns. strftime already returns strings, so the map(str) below is only a safeguard against errors from concatenating strings with non-string values.

In [32]: df['Roster ID'] = df['EmployeeID'] + df['Short Date'].map(str)

In [33]: df
Out[33]: 
   InteractionID    Created Date EmployeeID     Repeat Date Short Date  \
0           7927  4/1/2014 14:05       912a  4/1/2014 14:50     040114   
1           2158  4/1/2014 13:44       172r  4/4/2014 17:47     040114   
2          44279  4/1/2014 17:28       217y  4/7/2014 22:06     040114   

    Roster ID  
0  912a040114  
1  172r040114  
2  217y040114
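The question asks for %m/%d/%Y rather than the %m%d%y used above. A minimal sketch of the same steps with that format, assuming the values really follow mm/dd/yyyy hh:mm:ss as in the sample rows (the explicit format argument and the small hand-built frame are assumptions for illustration, not part of the original answer):

```python
import pandas as pd

# Sample rows from the question, typed in by hand for illustration.
df = pd.DataFrame({
    "InteractionID": ["07927", "02158", "44279"],
    "Created Date": ["04/01/2014 14:05:10",
                     "04/01/2014 13:44:05",
                     "04/01/2014 17:28:36"],
    "EmployeeID": ["912a", "172r", "217y"],
})

# Parse with an explicit format so month/day order is never guessed.
created = pd.to_datetime(df["Created Date"], format="%m/%d/%Y %H:%M:%S")

# Same apply/lambda idea as above, but with the format the question asks for.
df["ShortDate"] = created.apply(lambda ts: ts.strftime("%m/%d/%Y"))

# Both columns are strings, so plain + concatenates them.
df["RosterID"] = df["EmployeeID"] + df["ShortDate"]
print(df[["EmployeeID", "ShortDate", "RosterID"]])
```

If the slashes are unwanted inside the ID, a different format string can be used for the ID while ShortDate stays readable.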

answered Sep 22 '22 22:09

NullDev

You can apply a post-processing step that first converts the string to a datetime and then applies a lambda to keep just the date portion:

In [29]:

df['Created Date'] = pd.to_datetime(df['Created Date']).apply(lambda x: x.date())
df['Repeat Date'] = pd.to_datetime(df['Repeat Date']).apply(lambda x: x.date())
df


Out[29]:
   InteractionID Created Date EmployeeID Repeat Date
0           7927   2014-04-01       912a  2014-04-01
1           2158   2014-04-01       172r  2014-04-04
2          44279   2014-04-01       217y  2014-04-07

EDIT

After looking at this again, you can access just the date component using dt.date if your version of pandas is 0.15.0 or later and the column has a datetime64 dtype:

In [18]:
df['just_date'] = df['Repeat Date'].dt.date
df

Out[18]:
   InteractionID        Created Date EmployeeID         Repeat Date  \
0           7927 2014-04-01 14:05:10       912a 2014-04-01 14:50:03   
1           2158 2014-04-01 13:44:05       172r 2014-04-04 17:47:29   
2          44279 2014-04-01 17:28:36       217y 2014-04-07 22:06:19   

    just_date  
0  2014-04-01  
1  2014-04-04  
2  2014-04-07
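One thing worth knowing about dt.date is that it returns plain Python date objects, so the resulting column has object dtype. If the column should stay datetime64 with the time set to midnight, dt.normalize is an alternative. A small sketch, using two of the Repeat Date values from the question (the variable names are hypothetical):

```python
import pandas as pd

repeat = pd.to_datetime(
    pd.Series(["04/01/2014 14:50:03", "04/04/2014 17:47:29"]),
    format="%m/%d/%Y %H:%M:%S",
)

# dt.date gives datetime.date objects; the Series becomes object dtype.
as_dates = repeat.dt.date

# dt.normalize keeps the datetime64 dtype and zeroes out the time part.
normalized = repeat.dt.normalize()

print(as_dates.dtype)
print(normalized.dtype)
```

Keeping datetime64 can be handy if the column is later used for date arithmetic or resampling; for building a string ID either form works.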

You can also use dt.strftime now, rather than apply, to achieve the result you want:

In [28]:
df['short_date'] = df['Repeat Date'].dt.strftime('%m%d%Y')
df

Out[28]:
   InteractionID        Created Date EmployeeID         Repeat Date  \
0           7927 2014-04-01 14:05:10       912a 2014-04-01 14:50:03   
1           2158 2014-04-01 13:44:05       172r 2014-04-04 17:47:29   
2          44279 2014-04-01 17:28:36       217y 2014-04-07 22:06:19   

    just_date short_date  
0  2014-04-01   04012014  
1  2014-04-04   04042014  
2  2014-04-07   04072014
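The question mentions that pd.to_datetime fails on some of the Excel-sourced values. One way to find the offending rows, sketched here under the assumption that the expected format is mm/dd/yyyy hh:mm:ss, is to parse with errors="coerce" so that unparseable values become NaT instead of raising. The "not a date" value below is made up for illustration:

```python
import pandas as pd

raw = pd.Series(["04/01/2014 14:50:03", "not a date"])

# Values that do not match the format become NaT instead of raising.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M:%S", errors="coerce")

# Rows that failed to parse can be inspected directly.
bad_rows = raw[parsed.isna()]
print(bad_rows)

# dt.strftime leaves missing values missing rather than formatting them.
short_date = parsed.dt.strftime("%m/%d/%Y")
print(short_date)
```

Any row that ends up missing here would also produce a missing Roster ID after concatenation, so it is worth checking bad_rows before building the IDs.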

So generating the Roster IDs is now a trivial exercise of concatenating EmployeeID with the new short_date column:

In [30]:
df['Roster ID'] = df['EmployeeID'] + df['short_date']
df

Out[30]:
   InteractionID        Created Date EmployeeID         Repeat Date  \
0           7927 2014-04-01 14:05:10       912a 2014-04-01 14:50:03   
1           2158 2014-04-01 13:44:05       172r 2014-04-04 17:47:29   
2          44279 2014-04-01 17:28:36       217y 2014-04-07 22:06:19   

    just_date short_date     Roster ID  
0  2014-04-01   04012014  912a04012014  
1  2014-04-04   04042014  172r04042014  
2  2014-04-07   04072014  217y04072014
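Since the question says a datetime dtype is not actually needed and the file is read with dtype='str', the date part can also be split off as text without parsing at all. This is not what either answer does; it is a sketch that assumes every value has the form mm/dd/yyyy hh:mm:ss with a single space before the time. The CSV text below is the question's sample with the spaces after the header commas removed, read from memory instead of the .csv.gz file:

```python
import io
import pandas as pd

csv_text = """InteractionID,Created Date,EmployeeID,Repeat Date
07927,04/01/2014 14:05:10,912a,04/01/2014 14:50:03
02158,04/01/2014 13:44:05,172r,04/04/2014 17:47:29
44279,04/01/2014 17:28:36,217y,04/07/2014 22:06:19
"""

# dtype=str keeps everything as text, including leading zeros in InteractionID.
rep = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Take everything before the first space: the mm/dd/yyyy part.
rep["ShortDate"] = rep["Created Date"].str.split(" ").str[0]

rep["RosterID"] = rep["EmployeeID"] + rep["ShortDate"]
print(rep[["EmployeeID", "ShortDate", "RosterID"]])
```

The trade-off is that nothing validates the dates: a malformed value passes through unchanged, whereas the to_datetime route would flag it.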

answered Sep 23 '22 22:09

EdChum
