
Adding null values to a pandas dataframe

I have a pandas dataframe that is used to create a JSON which in turn is used to display a highcharts chart.

Pandas dataframe:

Date        colA    colB
12-Sep-14   20      40
13-Sep-14   50      10
14-Sep-14   12      -20
15-Sep-14   74      43

Is there a way to change some of the colA and colB values to null? I ask because I ultimately need a JSON that looks something like this:

[
    [12-Sep-14, 20, 40],
    [13-Sep-14, null, null],
    [14-Sep-14, 12, -20],
    [15-Sep-14, 74, 43]
]

The reason for this is that I require a highcharts chart where certain plot points are blank. To do this, you specify the date followed by null.

So I need to somehow update certain values in the pandas dataframe so that once I convert it to a JSON using .to_json() then the json will contain the specified null values as per the example above.

Thanks for any suggestions.

darkpool asked Nov 07 '14 16:11




3 Answers

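Try using NaN, which is the pandas missing-value marker:

import numpy as np
import pandas as pd

df = pd.read_clipboard()
# Cast to float first: recent pandas versions may warn or raise when NaN is assigned into an integer column
df['colA'] = df['colA'].astype('float64')
df.loc[1, 'colA'] = np.nan

Instead of np.nan you could also use None. Note that neither of these is entered with quotes.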

Then you can use to_json() to get your output:

df.to_json()
'{"Date":{"0":"12-Sep-14","1":"13-Sep-14","2":"14-Sep-14","3":"15-Sep-14"},"colA":{"0":20.0,"1":null,"2":12.0,"3":74.0},"colB":{"0":40,"1":10,"2":-20,"3":43}}'
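The output above uses the default column-oriented layout, while the question asks for one array per row. A minimal sketch of that variant follows, assuming the sample data is built inline rather than read from the clipboard; the names value_cols, rows_json and rows are only illustrative.

```python
import json

import numpy as np
import pandas as pd

# Sample data copied from the question (built inline instead of read_clipboard).
df = pd.DataFrame({
    "Date": ["12-Sep-14", "13-Sep-14", "14-Sep-14", "15-Sep-14"],
    "colA": [20, 50, 12, 74],
    "colB": [40, 10, -20, 43],
})

# NaN is a float, so make the value columns float before assigning it.
value_cols = ["colA", "colB"]  # illustrative name
df = df.astype({col: "float64" for col in value_cols})

# Blank out both value columns in the row labelled 1.
df.loc[1, value_cols] = np.nan

# orient="values" emits one JSON array per row; NaN is written as null.
rows_json = df.to_json(orient="values")

# Parsing the text back shows the nulls arrive as None on the Python side.
rows = json.loads(rows_json)
```

Because the dates stay quoted, the result is valid JSON that a charting library can parse directly.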
JD Long answered Sep 23 '22 01:09


Does this work?

import pandas as pd
# Read in data frame from clipboard
df = pd.read_clipboard()
# Replace the colA/colB values found in row 1 with the string 'null'
df = df.replace(df.iloc[1, 1:], 'null')

        Date  colA  colB
0  12-Sep-14    20    40
1  13-Sep-14  null  null
2  14-Sep-14    12   -20
3  15-Sep-14    74    43

Here, df.iloc[1] gives access to row 1 (the second row, since positions are zero-based).

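Finally,

df.to_json(orient='values').replace('"', '')

gives the JSON text with every double quote removed. Note that the dates are then unquoted too, so the result is no longer strictly valid JSON: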

[[12-Sep-14,20,40],[13-Sep-14,null,null],[14-Sep-14,12,-20],[15-Sep-14,74,43]]
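As a hedged alternative to the string replacement above, the sketch below stores real missing values and leaves the quotes in place, then checks both variants with json.loads. The blank_dates list and the other variable names are hypothetical, introduced only for illustration.

```python
import json

import numpy as np
import pandas as pd

# Sample data from the question, with float value columns so they can hold NaN.
df = pd.DataFrame({
    "Date": ["12-Sep-14", "13-Sep-14", "14-Sep-14", "15-Sep-14"],
    "colA": [20.0, 50.0, 12.0, 74.0],
    "colB": [40.0, 10.0, -20.0, 43.0],
})

# Hypothetical list of dates whose points should be blank in the chart.
blank_dates = ["13-Sep-14"]

# Select the matching rows with a boolean mask and blank both value columns.
df.loc[df["Date"].isin(blank_dates), ["colA", "colB"]] = np.nan

# With the quotes kept, the text parses as JSON.
payload = df.to_json(orient="values")
parsed = json.loads(payload)

# With the quotes stripped, the bare dates make the text unparseable.
stripped = payload.replace('"', "")
try:
    json.loads(stripped)
    stripped_is_valid = True
except json.JSONDecodeError:
    stripped_is_valid = False
```

Selecting rows by date rather than by position may be more convenient when the blank points are known by their dates.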
user308827 answered Sep 24 '22 01:09


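Code as below:

import numpy as np

# Cast to float first: NaN is a float, and recent pandas versions may warn or raise when it is assigned into an integer column
df = df.astype({'colA': 'float64', 'colB': 'float64'})
# Create null/NaN values with np.nan
df.loc[1, 'colA':'colB'] = np.nan

Here's the explanation:

  1. Locate the cells that need to be replaced: df.loc[1, 'colA':'colB'] selects row 1 and the columns from colA through colB (label slices include both endpoints).
  2. Assign np.nan to that selection.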
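The two steps above can be sketched end to end as follows, with a quick check that only the intended cells were blanked. The sample frame is rebuilt inline from the question, and missing_per_column is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with float value columns so they can hold NaN.
df = pd.DataFrame({
    "Date": ["12-Sep-14", "13-Sep-14", "14-Sep-14", "15-Sep-14"],
    "colA": [20.0, 50.0, 12.0, 74.0],
    "colB": [40.0, 10.0, -20.0, 43.0],
})

# Step 1 and 2: select row 1, columns colA through colB, and assign NaN.
# Column labels must be quoted strings; label slices include both endpoints.
df.loc[1, "colA":"colB"] = np.nan

# Count missing values per column to confirm what was changed.
missing_per_column = df.isna().sum()
```

Here missing_per_column should report one missing value each for colA and colB and none for Date.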
Damon Roux answered Sep 25 '22 01:09