
convert Int64Index to Int

Tags:

pandas

I'm iterating through a dataframe (called hdf) and applying changes on a row-by-row basis. hdf is sorted by group_id, and the rows within each group are ranked 1 through n (group_rank) according to some criteria.

# Groupby function creates subset dataframes (a dataframe per distinct group_id).
grouped = hdf.groupby('group_id')

# Iterate through each subdataframe. 
for name, group in grouped:

    # This grabs the top index for each subdataframe
    index1 = group[group['group_rank']==1].index

    # If criteria1 == 0, flag all rows for removal
    if(max(group['criteria1']) == 0):    
        for x in range(index1, index1 + max(group['group_rank'])):
            hdf.loc[x,'remove_row'] = 1

I'm getting the following error:

TypeError: int() argument must be a string or a number, not 'Int64Index'

I get the same error when I try to cast index1 explicitly:

index1 = int(group[group['group_rank']==1].index)

Can someone explain what is happening and provide an alternative?

asked Oct 13 '15 19:10 by Christopher Jenkins



1 Answer

The answer to your specific question is that index1 is an Int64Index (basically a list of row labels), even if it has only one element, so int() cannot convert it directly. To get that one element, you can use index1[0].
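
As a minimal sketch of that point, assuming a small made-up frame that reuses the column names from the question (the values are invented for illustration):

```python
import pandas as pd

# Toy data; the values are made up for illustration only.
hdf = pd.DataFrame({
    "group_id": ["a", "a", "b", "b"],
    "group_rank": [1, 2, 1, 2],
    "criteria1": [0, 0, 3, 0],
})

# One sub-dataframe, similar to what the groupby loop yields.
group = hdf[hdf["group_id"] == "b"]

# Boolean selection returns an Index object, even for a single match.
index1 = group[group["group_rank"] == 1].index

# Take the first (here: only) label and convert it to a plain Python int.
first_label = int(index1[0])
```

Here first_label is an ordinary int, so it can be passed to range() or used in arithmetic.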

But there are better ways of accomplishing your goal. If you want to remove all of the rows in the "bad" groups, you can use filter:

hdf = hdf.groupby('group_id').filter(lambda group: group['criteria1'].max() != 0)
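
To see what filter keeps, here is a small sketch on made-up data (the frame below is an assumption, not the asker's real data). Group "a" has a maximum criteria1 of 0 and is dropped entirely; group "b" is kept with all of its rows:

```python
import pandas as pd

# Toy data; the values are made up for illustration only.
hdf = pd.DataFrame({
    "group_id": ["a", "a", "b", "b"],
    "group_rank": [1, 2, 1, 2],
    "criteria1": [0, 0, 3, 0],
})

# Keep only the groups whose largest criteria1 value is non-zero.
kept = hdf.groupby("group_id").filter(lambda g: g["criteria1"].max() != 0)
```

The result keeps the original row labels, so kept contains the two rows of group "b" under their original index values.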

If you only want to remove certain rows within matching groups, you can write a function and then use apply:

def filter_group(group):
    if group['criteria1'].max() != 0:
        return group
    else:
        return group.loc[other criteria here]

hdf = hdf.groupby('group_id').apply(filter_group)

(If you really like your current way of doing things, you should know that loc will accept a whole index of labels, not just a single integer, so you could also do hdf.loc[group.index, 'remove_row'] = 1.)
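
A sketch of the loop from the question rewritten this way, again on made-up data; initialising remove_row to 0 beforehand is an assumption, since the question does not show how that column is created:

```python
import pandas as pd

# Toy data; the values are made up for illustration only.
hdf = pd.DataFrame({
    "group_id": ["a", "a", "b", "b"],
    "group_rank": [1, 2, 1, 2],
    "criteria1": [0, 0, 3, 0],
})
hdf["remove_row"] = 0  # assumed default flag value

for name, group in hdf.groupby("group_id"):
    # If every criteria1 value in the group is 0, flag all of its rows.
    if group["criteria1"].max() == 0:
        # group.index holds the labels of this group's rows in hdf,
        # so one assignment replaces the inner range() loop.
        hdf.loc[group.index, "remove_row"] = 1
```

This avoids converting the index to an int at all, and it does not depend on the row labels being consecutive integers.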

answered Sep 22 '22 09:09 by Evan Wright