
Nested JSON from CSV

Tags: python, json, csv

I want to create a nested JSON from this CSV file (this is only a snippet):

    Datum,Position,Herkunft,Entscheidungen insgesamt,Insgesamt_monat,Asylberechtigt,Asylberechtigt monat,Asylberechtigt Prozent,Flüchtling,Flüchtling monat,Flüchting Prozent,Gewährung von subisdiärem Schutz,Gewährung monat,Prozent,Abschiebungsverbot,Abschiebungsverbot monat,Prozent,Unbegrenzte Ablehnungen,Unbegrenzte Ablehnungen monat,Prozent,Ablehnung,Ablehnung monat,Prozent,sonstige Verfahrenserledigungen,,Prozent
    2015-10-01,4,Afghanistan,4540,483,37,1,0.8,1188,139,26.2,234,33,5.2,516,61,11.4,538,63,11.9,29,3,0.6,1998,183,44
    2015-09-01,4,Afghanistan,4057,397,36,8,0.9,1049,127,25.9,201,29,5,455,46,11.2,475,22,11.7,26,3,0.6,1815,162,44.7
    2015-08-01,5,Afghanistan,3660,320,28,1,0.8,922,155,25.2,172,12,4.7,409,43,11.2,453,22,12.4,23,2,0.6,1653,85,45.2
    2015-07-01,6,Afghanistan,3340,429,27,4,0.8,767,84,23,160,28,4.8,366,53,11,431,54,12.9,21,2,0.6,1568,204,46.9
    2015-06-01,6,Afghanistan,2911,639,23,2,0.8,683,184,23.5,132,41,4.5,313,64,10.8,377,74,13,19,3,0.7,1364,271,46.9
    2015-05-01,6,Afghanistan,2272,434,21,0,0.9,499,115,22,91,16,4,249,47,11,303,42,13.3,16,1,0.7,1093,213,48.1
    2015-04-01,6,Afghanistan,1838,462,21,4,1.1,384,75,20.9,75,17,4.1,202,44,11,261,60,14.2,15,4,0.8,880,258,47.9
    2015-03-01,5,Afghanistan,1376,527,17,8,1.2,309,123,22.5,58,42,4.2,158,58,11.5,201,70,14.6,11,1,0.8,622,225,45.2
    2015-02-01,5,Afghanistan,849,431,9,9,1.1,186,81,21.9,16,12,1.9,100,42,11.8,131,65,15.4,10,4,1.2,397,218,46.8
    2015-01-01,5,Afghanistan,418,418,0,0,0,105,105,25.1,4,4,1,58,58,13.9,66,66,15.8,6,6,1.4,179,179,42.8
    2015-10-01,2,Albanien,28011,7107,0,0,0,7,4,0,23,7,0.1,18,1,0.1,864,164,3.1,24688,6250,88.1,2411,681,8.6
    2015-09-01,2,Albanien,20904,7326,0,0,0,3,0,0,16,3,0.1,17,6,0.1,700,153,3.3,18438,6657,88.2,1730,507,8.3
    2015-08-01,2,Albanien,13578,3955,0,0,0,3,0,0,13,0,0.1,11,0,0.1,547,124,4,11781,3630,86.8,1223,201,9
    2015-07-01,3,Albanien,9623,4673,0,0,0,3,0,0,13,2,0.1,11,4,0.1,423,164,4.4,8151,4275,84.7,1022,228,10.6
    2015-06-01,3,Albanien,4950,2099,0,0,0,3,0,0.1,11,8,0.2,7,0,0.1,259,75,5.2,3876,1807,78.3,794,209,16
    2015-05-01,3,Albanien,2851,1210,0,0,0,3,0,0.1,3,3,0.1,7,0,0.2,184,52,6.5,2069,1001,72.6,585,154,20.5
    2015-04-01,3,Albanien,1641,799,0,0,0,3,0,0.2,0,0,0,7,1,0.4,132,49,8,1068,581,65.1,431,168,26.3
    2015-03-01,3,Albanien,842,331,0,0,0,3,1,0.4,0,0,0,6,3,0.7,83,12,9.9,487,212,57.8,263,103,31.2
    2015-02-01,4,Albanien,511,233,0,0,0,2,2,0.4,0,0,0,3,3,0.6,71,13,13.9,275,127,53.8,160,88,31.3
    2015-01-01,4,Albanien,278,278,0,0,0,0,0,0,0,0,0,0,0,0,58,58,20.9,148,148,53.2,72,72,25.9
    2015-05-01,10,Bosnien und Herzegowina,1822,227,0,0,0,1,0,0.1,0,0,0,5,2,0.3,12,0,0.7,1538,165,84.4,266,60,14.6
    2015-04-01,9,Bosnien und Herzegowina,1595,206,0,0,0,1,0,0.1,0,0,0,3,0,0.2,12,1,0.8,1373,166,86.1,206,39,12.9
    2015-03-01,9,Bosnien und Herzegowina,1389,341,0,0,0,1,0,0.1,0,0,0,3,1,0.2,11,4,0.8,1207,276,86.9,167,60,12
    2015-02-01,10,Bosnien und Herzegowina,1048,1048,0,0,0,1,1,0.1,0,0,0,2,2,0.2,7,7,0.7,931,931,88.8,107,107,10.2
    2015-10-01,7,Eritrea,5031,1153,16,2,0.3,3979,1070,79.1,326,30,6.5,19,5,0.4,23,2,0.5,5,1,0.1,663,43,13.2
    2015-09-01,8,Eritrea,3878,702,14,1,0.4,2909,519,75,296,148,7.6,14,0,0.4,21,1,0.5,4,1,0.1,620,32,16
    2015-08-01,8,Eritrea,3176,527,13,1,0.4,2390,505,75.3,148,7,4.7,14,2,0.4,20,0,0.6,3,-1,0.1,588,13,18.5
    2015-07-01,8,Eritrea,2649,542,12,2,0.5,1885,492,71.2,141,10,5.3,12,2,0.5,20,5,0.8,4,0,0.2,575,31,21.7
    2015-10-01,10,Ungeklärt,2987,455,30,1,1,2249,441,75.3,2,0,0.1,2,0,0.1,27,0,0.9,268,33,9,409,-20,13.7
    2015-09-01,10,Ungeklärt,2532,2147,29,22,1.1,1808,1503,71.4,2,2,0.1,2,2,0.1,27,23,1.1,235,206,9.3,429,389,16.9
    2015-01-01,9,Ungeklärt,385,385,7,7,1.8,305,305,79.2,0,0,0,0,0,0,4,4,1,29,29,7.5,40,40,10.4

In this form

"Irak": {}, 
"Mazedonien": {}, 
"Serbien": {}, 
"Ungekl\u221a\u00a7rt": {
    "Insgesamt_monat": [
        "455", 
        "455", 
        "2147", 
        "385"
    ], 
    "Position": [
        "10", 
        "10", 
        "10", 
        "9"
    ], 
    "Entscheidungen insgesamt": [
        "2987", 
        "2987", 
        "2532", 
        "385"
    ], 
    "Datum": [
        "2015-10-01", 
        "2015-10-01", 
        "2015-09-01", 
        "2015-01-01"
    ], 
    "Asylberechtigt": [
        "30", 
        "30", 
        "29", 
        "7"
    ]
}, 
"Albanien": {}, 
"Afghanistan": {}, 
"Kosovo": {}, 
"Summe 1 bis 10": {}, 
"Syrien,Arabische Republik": {}, 
"Eritrea": {}, 
"Bosnien und Herzegowina": {}, 
"Summe gesamt": {}, 
"Pakistan": {}, 
"Nigeria": {}, 
"Somalia": {}

This is my code

import csv
import json

output = {}
country =  { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
lastCountry = ""

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):

        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])

        if output.has_key(row['Herkunft']):
            output[row['Herkunft']].update(country)
        else:
            country.clear()
            country = {"Datum": [row['Datum']], "Position": [row['Position']], "Entscheidungen insgesamt": [row['Entscheidungen insgesamt']], "Insgesamt_monat": [row['Insgesamt_monat']], "Asylberechtigt": [row['Asylberechtigt']] }
            output[row['Herkunft']] = country

    print(json.dumps(output, indent=4))
#    with open('data.txt', 'w') as outfile:

As you can see, every country except one ends up without its data from the CSV. Where is the mistake? And how can I export the JSON to a file? At the moment I copy the printed output into my text editor.

basedian asked Dec 15 '15 14:12


2 Answers

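In your code, the problem is in the else clause. What you did:

  1. Reset country with country.clear(). This removes the rows you just appended and, because output still holds a reference to the same dict, it also empties the entry already stored for the previous country.
  2. Then store a new country dict in output. Every earlier country has already been emptied by that point, so only the last one keeps its data.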
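A minimal sketch of why the earlier countries end up as {}, assuming the cause is that output and country refer to the same dict object. The two country entries and their dates are only illustrative, not the full data:

```python
output = {}
country = {"Datum": ["2015-10-01"]}

# Assignment stores a reference to the dict, not a copy of it.
output["Afghanistan"] = country

# clear() empties the very same object that output["Afghanistan"] refers to.
country.clear()

# Rebinding the name creates a new, independent dict for the next country.
country = {"Datum": ["2015-09-01"]}
output["Albanien"] = country

print(output)
```

The Afghanistan entry comes out as an empty dict, while Albanien keeps its list, which matches the pattern in the output posted in the question.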

What you need is to:

  1. Append country to output
  2. Reset country
  3. Then update country with the current row

The order is important.

Here is the code:

import csv
import json

output = {}
country = {}

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        # First row of a new country
        if row['Herkunft'] not in output:
            output[row['Herkunft']] = country
            country = {"Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }

        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])
        output[row['Herkunft']] = country

    output[row['Herkunft']] = country  # Catch the last country
    print(json.dumps(output, indent=4))
Hai Vu answered Sep 28 '22 16:09


Your indentation is wrong. As written, you open the outfile and write to it for every country, so each country overwrites the output of the previous one. [EDIT]: There are more problems. You use the country dict in a strange way. Here is a better version.

import csv
import json

output = {}

with open('test.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        if row['Herkunft'] in output:
            country = output[row['Herkunft']]
        else:
            country = { "Datum": [], "Position": [], "Entscheidungen insgesamt": [], "Insgesamt_monat": [], "Asylberechtigt": [] }
            output[row['Herkunft']] = country
        country['Datum'].append(row['Datum'])
        country['Position'].append(row['Position'])
        country['Entscheidungen insgesamt'].append(row['Entscheidungen insgesamt'])
        country['Insgesamt_monat'].append(row['Insgesamt_monat'])
        country['Asylberechtigt'].append(row['Asylberechtigt'])

print(json.dumps(output, indent=4))
with open('data.txt', 'w') as outfile:
    outfile.write(json.dumps(output, indent=4))
RickyA answered Sep 28 '22 18:09
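For the export part of the question, here is one possible variation on the same idea, written for Python 3. It is only a sketch under a few assumptions: the function name group_by_country, the FIELDS list and the output file name data.json are made up for illustration, and the in-memory sample is a cut-down, reordered version of three rows from the snippet so the example runs without test.csv. It uses collections.defaultdict so that rows of one country do not have to be adjacent, and json.dump to write directly to a file:

```python
import csv
import io
import json
from collections import defaultdict

# Columns to collect per country (same five as in the question).
FIELDS = ["Datum", "Position", "Entscheidungen insgesamt",
          "Insgesamt_monat", "Asylberechtigt"]


def group_by_country(csv_file, fields=FIELDS, key="Herkunft"):
    """Collect the chosen columns into one dict of lists per country."""
    # A country seen for the first time gets a fresh dict of empty lists.
    output = defaultdict(lambda: {field: [] for field in fields})
    for row in csv.DictReader(csv_file):
        country = output[row[key]]
        for field in fields:
            country[field].append(row[field])
    return dict(output)


# Hypothetical in-memory stand-in for test.csv, reduced to the needed columns.
sample = io.StringIO(
    "Datum,Position,Herkunft,Entscheidungen insgesamt,Insgesamt_monat,Asylberechtigt\n"
    "2015-10-01,10,Ungeklärt,2987,455,30\n"
    "2015-10-01,7,Eritrea,5031,1153,16\n"
    "2015-09-01,10,Ungeklärt,2532,2147,29\n"
)
result = group_by_country(sample)

# json.dump writes straight to the file; ensure_ascii=False keeps "ä" as is
# instead of escaping it as \u00e4.
with open("data.json", "w", encoding="utf-8") as outfile:
    json.dump(result, outfile, indent=4, ensure_ascii=False)
```

With a real file you would pass the object returned by open('test.csv', newline='', encoding='utf-8') instead of the StringIO sample; the encoding is an assumption about how the CSV was saved.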