
How to create an array of dataframes in Python

I want to write code that creates several sets of DataFrames whose names follow the pattern word_0000, where the four digits are the month and year (MMYY). For example, I'd like to create the following DataFrames:

df_0115, df_0215, df_0315, ... , df_1215
stat_0115, stat_0215, stat_0315, ... , stat_1215
Ana asked Nov 25 '15 02:11



2 Answers

I suggest that you create a dictionary to hold the DataFrames. That way you will be able to index them with a month-day key:

import datetime as dt
import numpy as np
import pandas as pd

# Build month-day keys ("1101", "1102", "1103") from three November 2015 dates.
dates_list = [dt.datetime(2015, 11, i + 1) for i in range(3)]
month_day_list = [d.strftime("%m%d") for d in dates_list]

# Map each key to its own DataFrame.
dataframe_collection = {}

for month_day in month_day_list:
    new_data = np.random.rand(3, 3)  # random placeholder data
    dataframe_collection[month_day] = pd.DataFrame(new_data, columns=["one", "two", "three"])

# Print every DataFrame under its key.
for key, frame in dataframe_collection.items():
    print("\n" + "=" * 40)
    print(key)
    print("-" * 40)
    print(frame)

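The code above prints output like the following. The values are random, so they differ on every run, and on Python 3.7+ the keys print in insertion order (1101, 1102, 1103) rather than the order shown here: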

========================================
1102
----------------------------------------
        one       two     three
0  0.896120  0.742575  0.394026
1  0.414110  0.511570  0.268268
2  0.132031  0.142552  0.074510

========================================
1103
----------------------------------------
        one       two     three
0  0.558303  0.259172  0.373240
1  0.726139  0.283530  0.378284
2  0.776430  0.243089  0.283144

========================================
1101
----------------------------------------
        one       two     three
0  0.849145  0.198028  0.067342
1  0.620820  0.115759  0.809420
2  0.997878  0.884883  0.104158
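
For the month-year names in the question, the same dictionary approach might look like the sketch below. It rests on several assumptions that are not in the original answer: the container names df and stat simply mirror the question's prefixes, the year is fixed at 2015, the data is random placeholder data, and describe() is only a guess at what the "stat" tables should hold.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Month-year keys in the question's MMYY format: "0115", "0215", ..., "1215".
keys = [dt.date(2015, month, 1).strftime("%m%y") for month in range(1, 13)]

# Hypothetical containers named after the question's prefixes.
df = {}
stat = {}

rng = np.random.default_rng(0)  # placeholder data source (assumption)
for key in keys:
    df[key] = pd.DataFrame(rng.random((3, 3)), columns=["one", "two", "three"])
    # Assumption: "stat" means summary statistics of the matching DataFrame.
    stat[key] = df[key].describe()

print(list(df)[:3])  # ['0115', '0215', '0315']
```

With this layout, df["0115"] and stat["0115"] take the place of separate variables named df_0115 and stat_0115, and the keys can be generated or looped over without creating variable names dynamically.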
Pedro M Duarte answered Oct 21 '22 14:10


df is a list holding one DataFrame per CSV file in the current directory. Use df[0] to access the first one:

import glob
import pandas as pd

df = []  # one DataFrame per CSV file
files = glob.glob("*.csv")  # CSV files in the current directory
for a in files:
    df.append(pd.read_csv(a))
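
If the files should be looked up by name rather than by position, one possible variation is to key each DataFrame by its file name. This is a sketch under assumptions: the file names df_0115.csv and df_0215.csv and the column a are hypothetical, and the files are written to a temporary folder only so the example runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create two small CSV files in a temporary folder (hypothetical names and data).
workdir = Path(tempfile.mkdtemp())
pd.DataFrame({"a": [1, 2]}).to_csv(workdir / "df_0115.csv", index=False)
pd.DataFrame({"a": [3, 4]}).to_csv(workdir / "df_0215.csv", index=False)

# Key each DataFrame by its file name without the extension.
frames = {path.stem: pd.read_csv(path) for path in sorted(workdir.glob("*.csv"))}

print(list(frames))                     # ['df_0115', 'df_0215']
print(frames["df_0215"]["a"].tolist())  # [3, 4]
```

Sorting the paths makes the order predictable, since glob does not guarantee any particular order.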
Malik Mussabeheen Noor answered Oct 21 '22 13:10