Python: Converting excel file to JSON format

I am creating an ML model that will use a JSON file to learn the pattern and response format. As my data is in Excel format, I converted it to JSON in Python.

Here is the code:

import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

I am getting this output:

[{
    "pattern": "WALLSTENT NON COUVERTE ",
    "response": "ENDOPROTHESE STENT  VASCULAIRE "
}, {
    "pattern": "PRIMEADVANCED SURSCAN MRI ",
    "response": "NEUROSTIMULATEUR NERF VAGUE GAUCHE "
}, {
    "pattern": "AVASTIN  FLACON DE",
    "response": "BEVACIZUMAB"
}, {
    "pattern": "PERJETA SOLUTION A DILUER POUR PERFUSION",
    "response": "BRENTUXIMAB VEDOTIN"
}]

The desired output I am looking for is like this:

{
    "intents": [{
        "pattern": ["WALLSTENT, NON, COUVERTE "],
        "response": ["ENDOPROTHESE STENT  VASCULAIRE "]
    }, {
        "pattern": ["PRIMEADVANCED ,SURSCAN ,MRI"] ,
        "response": ["NEUROSTIMULATEUR NERF VAGUE GAUCHE "]
    }, {
        "pattern": ["AVASTIN , FLACON ,DE"],
        "response": ["BEVACIZUMAB"]
    }, {
        "pattern": ["PERJETA, SOLUTION, A, DILUER, POUR ,PERFUSION"],
        "response": ["BRENTUXIMAB VEDOTIN"]
    }]
}

What modification can I make to my code to get the output I am looking for?

Pavan Rajput asked Dec 27 '18 05:12


2 Answers

That should do it:

import xlrd
from collections import OrderedDict
import simplejson as json
# Open the workbook and select the first worksheet
wb = xlrd.open_workbook('D:\\android\\testdata2.xlsx')
sh = wb.sheet_by_index(0)
# List to hold dictionaries
data_list = []
# Iterate through each row in worksheet and fetch values into dict
for rownum in range(1, sh.nrows):
    data = OrderedDict()
    row_values = sh.row_values(rownum)
    data['pattern'] = row_values[0]
    data['response'] = row_values[1]
    data_list.append(data)
data_list = {'intents': data_list} # Added line
# Serialize the list of dicts to JSON
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)

Note the added data_list = {'intents': data_list}.
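
The desired output in the question also wraps each value in a list, which the added line alone does not do. A minimal sketch of one way to get there follows, assuming each row is a sequence whose first two items are the pattern and the response (for example, the result of sh.row_values(rownum)). The build_intents helper and its split_pattern flag are hypothetical names introduced for illustration, and stripping the surrounding whitespace is a choice made here, not something the question asks for. Because the question does not make clear whether the pattern should stay one string or become a list of words, both options are offered.

```python
import json


def build_intents(rows, split_pattern=False):
    """Build the {"intents": [...]} structure from (pattern, response) rows.

    `rows` is any iterable of sequences whose first two items are the
    pattern and the response (hypothetical helper, not part of xlrd).
    """
    intents = []
    for row in rows:
        pattern = str(row[0]).strip()
        response = str(row[1]).strip()
        # Assumption: the question is ambiguous about whether the pattern
        # should be one string or a list of words, so both are supported.
        patterns = pattern.split() if split_pattern else [pattern]
        intents.append({"pattern": patterns, "response": [response]})
    return {"intents": intents}


# Sample rows taken from the output shown in the question
rows = [
    ("WALLSTENT NON COUVERTE ", "ENDOPROTHESE STENT  VASCULAIRE "),
    ("AVASTIN  FLACON DE", "BEVACIZUMAB"),
]
result = build_intents(rows, split_pattern=True)
print(json.dumps(result, indent=4))
```

In the loop above, the rows could be collected with sh.row_values(rownum) and passed to this helper in place of filling data_list by hand.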

TheNavigat answered Sep 18 '22 03:09


Give the pyexcel_xlsx library a try. I have used it to convert xlsx files to JSON; it is simple, and in my experience fast compared with other Python libraries.

Sample code:

from pyexcel_xlsx import get_data
import json

# get_data returns a mapping of sheet name to a list of rows
data = get_data("D:\\android\\testdata2.xlsx")
sheet_name = "Table A"  # name of the worksheet to read

data_list = []
# Skip the header row (as the question's code does) and append each row to the list
for row in data[sheet_name][1:]:
    data_list.append({
        'pattern': row[0],
        'response': row[1]
    })
data_list = {'intents': data_list}  # Wrap in the required top-level object
j = json.dumps(data_list)
# Write to file
with open('data1.json', 'w') as f:
    f.write(j)
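
Since the data is French and may contain accented characters, the write step could also be done with json.dump, which writes straight to the file. The following is only a sketch under assumptions: the sample record is made up for illustration, and the temporary output path is hypothetical so that the example is self-contained. Passing ensure_ascii=False keeps accented letters readable instead of escaping them, and indent=4 pretty-prints the result.

```python
import json
import os
import tempfile

# Made-up sample record with an accented character, for illustration only
data = {"intents": [{"pattern": ["PROTHÈSE"], "response": ["STENT"]}]}

# Hypothetical output location; a temporary directory keeps the sketch self-contained
out_path = os.path.join(tempfile.mkdtemp(), "data1.json")

# Write pretty-printed JSON as UTF-8 without escaping non-ASCII characters
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)

# Read the file back to confirm it round-trips
with open(out_path, encoding="utf-8") as f:
    loaded = json.load(f)
```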

Yash Mochi answered Sep 20 '22 03:09