How do I add a single item to a pandas Series? I know it's not the most memory-efficient way, but I still need to do it.
Something along these lines:
>> x = Series()
>> N = 4
>> for i in xrange(N):
>>     x.some_appending_function(i**2)
>> print x
0 | 0
1 | 1
2 | 4
3 | 9
Also, how can I add a single row to a pandas DataFrame?
Use the Series.append() function to append the passed Series object to the end of this Series object; with ignore_index=True, the original indexes of the two Series are ignored. For example, sr1.append(sr2) appends sr2 to the end of sr1. (Series.append() was deprecated in pandas 1.4 and removed in 2.0; pd.concat([sr1, sr2]) is the replacement.)
The Series.iloc attribute enables purely integer-location-based indexing, that is, selection by position, over a given Series object.
Appending multiple rows: you can also pass a list of Series or a list of dictionaries to DataFrame.append() to append several rows to a DataFrame at once.
How to add a single item: this is not very efficient, but it follows what you are asking for:
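import pandas as pd

x = pd.Series(dtype="int64")
N = 4
for i in range(N):
    # set_value() was removed in pandas 1.0; assigning to a new label enlarges the Series
    x.loc[i] = i**2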
produces x:
0    0
1    1
2    4
3    9
Obviously there are better ways to generate this series in only one shot.
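As a minimal sketch of the one-shot approach (assuming pandas is imported as pd; the variable names are only illustrative), the values can be computed first and the Series created in a single call:

```python
import pandas as pd

N = 4
# Build all the values first, then create the Series once.
x = pd.Series([i**2 for i in range(N)])
print(x)
```

This allocates the index and the values array once instead of once per item.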
For your second question, see the answers and references in the SO question "add one row in a pandas.DataFrame".
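One common way to do this, shown here only as a sketch with a hypothetical two-column frame, is to assign to a label that does not exist yet through .loc. Using len(df) as the new label assumes the frame has the default 0..n-1 integer index:

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative.
df = pd.DataFrame({"n": [0, 1], "square": [0, 1]})

# Assigning to a label that does not exist yet enlarges the frame by one row.
# len(df) is only a safe new label when the index is the default 0..n-1 range.
df.loc[len(df)] = [2, 4]
print(df)
```

As with a Series, doing this in a loop reallocates the data on every step, so it is best kept for occasional single rows.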
TL;DR: do not append items to a Series one by one; it is better to extend it with an ordered collection.
I think the question in its current form is a bit tricky. And the accepted answer does answer the question. But the more I use pandas, the more I understand that it's a bad idea to append items to a Series one by one. I'll try to explain why for pandas beginners.
You might think that appending data to a given Series lets you reuse some resources, but in reality a Series is just a container that stores a relation between an index and a values array. Each is a NumPy array under the hood, and the index is immutable. When you add an item to a Series with a label that is missing from the index, a new index of size n+1 is created, along with a new values array of the same size. That means that when you append items one by one, you create two more arrays of size n+1 at each step.
By the way, you cannot append a new item by position (you will get an IndexError), and the labels in an index do not have to be unique: when you assign a value to a label, you assign it to all existing items with that label, and no new row is appended. This might lead to subtle bugs.
The moral of the story is that you should not append data one by one; it is better to extend with an ordered collection. The problem is that you cannot extend a Series in place. That is why it is better to organize your code so that you don't need to update a specific instance of a Series by reference.
If you create the labels yourself and they are increasing, the easiest way is to add new items to a dictionary, then create a new Series from the dictionary and append that Series to the old one. (Older pandas versions sorted the dictionary keys; since pandas 0.23 on Python 3.6+, the Series keeps the dictionary's insertion order.) If the keys are not increasing, then you will need to create two separate lists for the new labels and the new values.
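Both pitfalls can be reproduced with a small sketch (the Series and its labels are made up for illustration):

```python
import pandas as pd

# A Series whose index contains the label "a" twice.
s = pd.Series([1, 2, 3], index=["a", "b", "a"])

# Assigning by position past the end does not append; it raises IndexError.
try:
    s.iloc[len(s)] = 4
    appended_by_position = True
except IndexError:
    appended_by_position = False

# Assigning by a duplicated label overwrites every matching row and adds none.
s.loc["a"] = 0
print(appended_by_position)
print(s)
```

After this runs, the Series still has three rows, and both rows labelled "a" hold the new value.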
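The two-list variant for labels that are not increasing might look like the sketch below. The data is hypothetical, and pd.concat is used in place of Series.append(), which was removed in pandas 2.0:

```python
import pandas as pd

s = pd.Series([0, 1, 4], index=[0, 1, 2])

# Collect the new labels and values in two parallel lists (hypothetical data).
new_labels = [7, 5, 6]
new_values = [49, 25, 36]

# Build one Series from the whole batch and concatenate once,
# rather than enlarging s item by item.
s = pd.concat([s, pd.Series(new_values, index=new_labels)])
print(s)
```

The new rows keep the order of the lists, so the result ends with the labels 7, 5, 6.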
Below are some code samples:
In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: s = pd.Series(np.arange(4)**2, index=np.arange(4))

In [4]: s
Out[4]:
0    0
1    1
2    4
3    9
dtype: int64

In [6]: id(s.index), id(s.values)
Out[6]: (4470549648, 4470593296)
When we update an existing item, the index and the values array stay the same (as long as you do not change the type of the value):
In [7]: s[2] = 14

In [8]: id(s.index), id(s.values)
Out[8]: (4470549648, 4470593296)
But when you add a new item, a new index and a new values array are generated:
In [9]: s[4] = 16

In [10]: s
Out[10]:
0     0
1     1
2    14
3     9
4    16
dtype: int64

In [11]: id(s.index), id(s.values)
Out[11]: (4470548560, 4470595056)
So, if you are going to append several items, collect them in a dictionary, create a Series from it, append it to the old one, and save the result:
In [13]: new_items = {item: item**2 for item in range(5, 7)}

In [14]: s2 = pd.Series(new_items)

In [15]: s2  # insertion order is kept (older pandas versions sorted the keys)
Out[15]:
5    25
6    36
dtype: int64

In [16]: s = pd.concat([s, s2]); s  # Series.append() was removed in pandas 2.0
Out[16]:
0     0
1     1
2    14
3     9
4    16
5    25
6    36
dtype: int64