
Pandas: get first 10 elements of a series

I have a data frame with a column tfidf_sorted as follows:

   tfidf_sorted

0  [(morrell, 45.9736796), (football, 25.58352014...
1  [(melatonin, 48.0010051405), (lewy, 27.5842077...
2  [(blues, 36.5746634797), (harpdog, 20.58669641...
3  [(lem, 35.1570832476), (rottensteiner, 30.8800...
4  [(genka, 51.4667410433), (legendaarne, 30.8800...

type(df.tfidf_sorted) returns pandas.core.series.Series.

This column was created as follows:

df['tfidf_sorted'] = df['tfidf'].apply(lambda y: sorted(y.items(), key=lambda x: x[1], reverse=True))

where each value in the tfidf column is a dictionary.

How do I get the first 10 key-value pairs from tfidf_sorted?

asked by chintan s, Oct 05 '16 06:10




1 Answer

If I understand correctly, you can flatten the lists and take the 10 highest-scoring pairs across all rows:

from itertools import chain 

# flatten the nested lists into one list of (term, score) tuples
a = list(chain.from_iterable(df['tfidf_sorted']))
# sort by score, highest first
a.sort(key=lambda x: x[1], reverse=True)
# take the top 10 overall
print(a[:10])
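As a variation on the flatten-and-sort approach above, here is a minimal sketch that uses heapq.nlargest so the full flattened list never has to be sorted. The toy values and the name top3 are made up for illustration, and the cutoff is 3 instead of 10 only to keep the example short.

```python
import heapq
from itertools import chain

import pandas as pd

# Toy data (hypothetical values) shaped like the tfidf_sorted column
df = pd.DataFrame({
    "tfidf_sorted": [
        [("morrell", 45.9), ("football", 25.5)],
        [("melatonin", 48.0), ("lewy", 27.5)],
        [("blues", 36.5), ("harpdog", 20.5)],
    ]
})

# Walk all (term, score) pairs lazily and keep the 3 highest scores
top3 = heapq.nlargest(
    3,
    chain.from_iterable(df["tfidf_sorted"]),
    key=lambda pair: pair[1],
)
print(top3)
```

For a small column the difference from a.sort(...) is negligible; nlargest mainly helps when the flattened list is large and only a few items are needed.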

Or, if you need the top 10 per row, add [:10] when building the column:

df['tfidf_sorted'] = df['tfidf'].apply(lambda y: (sorted(y.items(), key=lambda x: x[1], reverse=True))[:10])
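If the tfidf_sorted column already exists, the lists can also be sliced in place rather than re-sorted. The following is a minimal sketch under that assumption; the toy dictionaries, the column name top_n, and the cutoff n = 2 are hypothetical and chosen only to keep the output small.

```python
import pandas as pd

# Toy data (hypothetical values): one dict of term -> score per row
df = pd.DataFrame({
    "tfidf": [
        {"morrell": 45.9, "football": 25.5, "club": 3.1},
        {"melatonin": 48.0, "lewy": 27.5, "sleep": 9.2},
    ]
})

# Same construction as in the question: sort each dict by score, descending
df["tfidf_sorted"] = df["tfidf"].apply(
    lambda y: sorted(y.items(), key=lambda x: x[1], reverse=True)
)

n = 2  # use 10 for the case in the question

# Per-row top n: slice each already-sorted list
df["top_n"] = df["tfidf_sorted"].apply(lambda pairs: pairs[:n])

# Top n pairs for a single row (here the first row)
first_row_top = df["tfidf_sorted"].iloc[0][:n]
print(first_row_top)
```

Note that df['tfidf_sorted'].head(10) would return the first 10 rows of the Series, not the first 10 pairs inside each row, which is why the slice is applied to each list.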
answered by jezrael, Sep 19 '22 03:09