Pandas: append dataframe to another df

Tags:

python

pandas

I have a problem appending one DataFrame to another. I am trying to run this code:

    df_all = pd.read_csv('data.csv', error_bad_lines=False, chunksize=1000000)
    urls = pd.read_excel('url_june.xlsx')
    substr = urls.url.values.tolist()
    df_res = pd.DataFrame()
    for df in df_all:
        for i in substr:
            res = df[df['url'].str.contains(i)]
            df_res.append(res)

When I try to save df_res, I get an empty DataFrame. df_all looks like this:

    ID,"url","used_at","active_seconds"
    b20f9412f914ad83b6611d69dbe3b2b4,"mobiguru.ru/phones/apple/comp/32gb/apple_iphone_5s.html",2015-10-01 00:00:25,1
    b20f9412f914ad83b6611d69dbe3b2b4,"mobiguru.ru/phones/apple/comp/32gb/apple_iphone_5s.html",2015-10-01 00:00:31,30
    f85ce4b2f8787d48edc8612b2ccaca83,"4pda.ru/forum/index.php?showtopic=634566&view=getnewpost",2015-10-01 00:01:49,2
    d3b0ef7d85dbb4dbb75e8a5950bad225,"shop.mts.ru/smartfony/mts/smartfon-smart-sprint-4g-sim-lock-white.html?utm_source=admitad&utm_medium=cpa&utm_content=300&utm_campaign=gde_cpa&uid=3",2015-10-01 00:03:19,34
    078d388438ebf1d4142808f58fb66c87,"market.yandex.ru/product/12675734/spec?hid=91491&track=char",2015-10-01 00:03:48,2
    d3b0ef7d85dbb4dbb75e8a5950bad225,"avito.ru/yoshkar-ola/telefony/mts",2015-10-01 00:04:21,4
    d3b0ef7d85dbb4dbb75e8a5950bad225,"shoppingcart.aliexpress.com/order/confirm_order",2015-10-01 00:04:25,1
    d3b0ef7d85dbb4dbb75e8a5950bad225,"shoppingcart.aliexpress.com/order/confirm_order",2015-10-01 00:04:26,9

and urls looks like this:

    url
    shoppingcart.aliexpress.com/order/confirm_order
    ozon.ru/?context=order_done&number=
    lk.wildberries.ru/basket/orderconfirmed
    lamoda.ru/checkout/onepage/success/quick
    mvideo.ru/confirmation?_requestid=
    eldorado.ru/personal/order.php?step=confirm

When I print res inside the loop, it is not empty. But when I print df_res inside the loop after the append, it is an empty DataFrame. I can't find my error. How can I fix it?


asked Oct 02 '16 09:10

Petr Petrov

2 Answers

If you look at the documentation for pd.DataFrame.append, it says:

Append rows of other to the end of this frame, returning a new object. Columns not in this frame are added as new columns.

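(emphasis on "returning a new object" is mine). append does not modify df_res in place, so its return value has to be assigned back.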

Try

df_res = df_res.append(res)
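A minimal sketch of why the reassignment matters, using made-up data. It uses pd.concat rather than DataFrame.append, because append was deprecated in pandas 1.4 and removed in pandas 2.0; both return a new object and leave their inputs untouched. The column values here are illustrative assumptions, not the asker's data.

```python
import pandas as pd

df_res = pd.DataFrame()
res = pd.DataFrame({"url": ["a.com/x", "b.com/y"], "active_seconds": [1, 30]})

# Discarding the return value changes nothing: df_res is still empty.
pd.concat([df_res, res])
rows_without_assignment = len(df_res)

# Rebinding the name to the returned object keeps the new rows.
df_res = pd.concat([df_res, res])
rows_with_assignment = len(df_res)

print(rows_without_assignment, rows_with_assignment)
```

On pandas versions that still have DataFrame.append, `df_res = df_res.append(res)` follows the same pattern.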

Incidentally, note that building a DataFrame by successive concatenations is inefficient in pandas, because every call copies all the rows accumulated so far into a new object. You might try this instead:

    all_res = []
    for df in df_all:
        for i in substr:
            res = df[df['url'].str.contains(i)]
            all_res.append(res)

    df_res = pd.concat(all_res)

This first creates a list of all the parts, then creates a DataFrame from all of them once at the end.
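The list-then-concat pattern can be tried end to end with small in-memory frames standing in for the CSV chunks. This is only a sketch: the chunk contents, the shortened substr list, and the IDs are illustrative assumptions. It also passes regex=False to str.contains, which is a change from the code above: several of the URL fragments contain "?" and ".", and with the default regex matching a fragment such as "ozon.ru/?context=" would treat "/?" as an optional slash and fail to match the literal text.

```python
import pandas as pd

# Illustrative stand-ins for the chunks read from data.csv.
chunks = [
    pd.DataFrame({
        "ID": ["a1", "a2"],
        "url": ["shoppingcart.aliexpress.com/order/confirm_order",
                "avito.ru/yoshkar-ola/telefony/mts"],
    }),
    pd.DataFrame({
        "ID": ["a3", "a4"],
        "url": ["ozon.ru/?context=order_done&number=42",
                "market.yandex.ru/product/12675734/spec"],
    }),
]
# Shortened, illustrative version of the URL fragment list.
substr = [
    "shoppingcart.aliexpress.com/order/confirm_order",
    "ozon.ru/?context=order_done&number=",
]

all_res = []
for df in chunks:
    for s in substr:
        # regex=False matches the text literally, so "?" and "." in URLs
        # are not interpreted as regular-expression syntax.
        all_res.append(df[df["url"].str.contains(s, regex=False)])

# One concatenation at the end; ignore_index renumbers the rows from 0.
df_res = pd.concat(all_res, ignore_index=True)
print(df_res)
```

With a real file, `chunks` would be the iterator returned by pd.read_csv(..., chunksize=...), as in the question.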


answered Sep 30 '22 17:09

Ami Tavory

If we want to append rows selected by index:

    df_res = pd.DataFrame(data=None, columns=df.columns)
    all_res = []
    # take the 10 rows before the row labelled `index`;
    # .loc replaces the removed .ix indexer and includes both endpoints
    d1 = df.loc[index - 10:index - 1]
    all_res.append(d1)
    df_res = pd.concat(all_res)
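The .ix indexer was removed in pandas 1.0; its replacements are .loc (label-based, end inclusive) and .iloc (position-based, end exclusive). A minimal sketch of taking the 10 rows before a given row, assuming a default integer index; the frame and the value of `index` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})  # default RangeIndex 0..99
index = 50  # hypothetical row of interest

# Positional slice: the end is exclusive, so this selects rows 40..49.
before = df.iloc[index - 10:index]

# Label-based slice: both ends are inclusive, so the bounds differ by one.
same_rows = before.equals(df.loc[index - 10:index - 1])

all_res = [before]
df_res = pd.concat(all_res)
print(len(df_res), same_rows)
```

The two slices only coincide when the labels are the default 0, 1, 2, ... integers; with any other index, .iloc is the one that counts rows by position.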

answered Sep 30 '22 16:09

Siddharth Raj
