
Creating a pandas data frame of a specific size

In R, I can do something like this:

myvec <- seq(from =  5, to = 10)^2
mydf <- data.frame(matrix(data = myvec, ncol = 3,byrow = TRUE))
> mydf
  X1 X2  X3
1 25 36  49
2 64 81 100

Notice that I can specify the shape of the data frame by passing in an ncol parameter. I can then fill it either by row or by column (in this case by row).

If I were to replicate this in Python/Pandas, it's easy enough to create the sequence:

myData = [x**2 for x in range(5,11) ]

However, how do I easily make a dataframe of the same shape? I can do something like:

myDF = pd.DataFrame(data = myData)

But what would be the parameters to specify the column/row dimensions?

user1357015 asked Aug 03 '17 01:08




2 Answers

One way to make a pandas dataframe of the size you wish is to provide index and column values on the creation of the dataframe.

df = pd.DataFrame(index=range(numRows),columns=range(numCols))

Here numRows and numCols are the desired dimensions. This creates a dataframe filled with NaN values, where every column has data type object.
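As a minimal sketch of the approach above: the sizes 2 and 3 and the names num_rows, num_cols, empty_df, filled_df and values are illustrative assumptions, not part of the original answer. It also shows that filling the frame with the question's values takes a separate step.

```python
import pandas as pd

# Hypothetical dimensions chosen for illustration; the answer leaves them unspecified.
num_rows, num_cols = 2, 3

# Empty frame of the requested shape: every cell is NaN until values are assigned.
empty_df = pd.DataFrame(index=range(num_rows), columns=range(num_cols))
print(empty_df.shape)

# Filling it with data is a second step, here from a nested list (one inner list per row).
values = [[25, 36, 49], [64, 81, 100]]
filled_df = pd.DataFrame(values, index=empty_df.index, columns=empty_df.columns)
print(filled_df)
```

This fixes the shape up front but does not by itself lay a flat sequence out by row or by column, which is what the question asks for.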

Kevinj22 answered Sep 19 '22 06:09


Use reshape to specify the number of columns (or rows):

import numpy as np
import pandas as pd

myvec = np.arange(5, 11)**2
mydf = pd.DataFrame(myvec.reshape(-1, 3))

yields

    0   1    2
0  25  36   49
1  64  81  100

When calling reshape, you may specify the length of one axis as -1; reshape replaces the -1 with whatever integer makes the total number of elements match. For example, if myvec.size is 6 and one axis has length 3, then the other axis must have length 6/3 = 2. So the -1 is replaced by 2, and myvec.reshape(-1, 3) returns an array of shape (2, 3): 2 rows and 3 columns.
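The question also mentions filling by column. A hedged sketch of how that could look with the same reshape call: NumPy's order="F" argument fills column by column, which should correspond to R's byrow = FALSE. The names by_row and by_col are illustrative only.

```python
import numpy as np
import pandas as pd

myvec = np.arange(5, 11) ** 2  # 25, 36, 49, 64, 81, 100

# Default order ("C") fills row by row, like byrow = TRUE in R.
by_row = pd.DataFrame(myvec.reshape((-1, 3)))

# order="F" fills column by column, like R's default byrow = FALSE.
by_col = pd.DataFrame(myvec.reshape((-1, 3), order="F"))

print(by_row)
print(by_col)
```

If the data starts as a plain Python list such as myData from the question, wrapping it in np.array(myData) first gives an array that can be reshaped the same way.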

unutbu answered Sep 20 '22 06:09