Convert large CSV files to JSON [closed]

I don't mind if this is done with a separate program, with Excel, in NodeJS or in a web app.

It's exactly the same problem as described here:

Large CSV to JSON/Object in Node.js

It seems that the OP didn't get that answer to work (yet accepted it anyway?). I've tried working with it but can't seem to get it to work either.

In short: I'm working with a ~50,000-row CSV and I want to convert it to JSON. I've tried just about every online "CSV to JSON" web app out there; they all crash with a dataset this large.

I've tried many Node CSV to JSON modules but, again, they all crash. The csvtojson module seemed promising, but I got this error: FATAL ERROR: JS Allocation failed - process out of memory.

What on earth can I do to get this data into a usable format? As above, I don't mind if it's an application, something that works within Excel, a web app or a Node module, so long as I end up with either a .json file or an object that I can work with within Node.

Any ideas?

asked Sep 12 '13 by JVG



1 Answer

You mentioned the csvtojson module above; that is an open source project which I maintain.

I am sorry it did not work out for you. The crash was caused by a bug that was fixed several months ago, and I have also added some extra lines to the README for your scenario. Please check out "Process Big CSV File in Command Line".

Please make sure you have the latest csvtojson release (currently 0.2.2).

You can update it by running:

npm install -g csvtojson

After you've installed the latest csvtojson, you just need to run:

csvtojson [path to bigcsvdata] > converted.json

This streams the data from the CSV file. Or, if you want to pipe the data in from another process:

cat [path to bigcsvdata] | csvtojson > converted.json

They will output the same thing.
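If you would rather work with the rows inside Node instead of writing out a file, csvtojson can also be used programmatically. The exact API has changed across releases; the sketch below assumes a current release with the promise-based fromFile() helper, which is not the 0.2.x interface described in this answer, and the file name is only a placeholder:

// Hedged sketch: assumes a recent csvtojson release exposing the
// promise-based fromFile() helper (not the 0.2.x API above).
const csv = require('csvtojson');

csv()
  .fromFile('bigdata.csv')          // placeholder path to your CSV
  .then((rows) => {
    // rows is an array of plain objects, one per CSV record
    console.log('parsed %d rows', rows.length);
  });

Keep in mind that this buffers the whole result in memory, which is fine at ~50,000 rows; for files with millions of rows, prefer the streaming command line above.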

I have manually tested it with a CSV file of over 3 million records and it works without issue.
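Files that size go through because each record is converted and written out as it is read, so nothing large ever sits in memory. If you are curious, the same idea can be hand-rolled with nothing but Node core modules. The following is only a minimal sketch: it assumes a simple CSV with a header row, comma-separated values and no quoted fields containing commas or newlines (csvtojson handles the edge cases this skips), and the csv2json.js file name is just an example:

// Minimal streaming CSV-to-JSON sketch using only Node core modules.
// Assumes a simple CSV: the first line is the header row, values are
// comma-separated, and no field contains quoted commas or newlines.
const fs = require('fs');
const readline = require('readline');

const inputPath = process.argv[2];                 // e.g. bigdata.csv
const outputPath = process.argv[3] || 'converted.json';

const rl = readline.createInterface({
  input: fs.createReadStream(inputPath),
  crlfDelay: Infinity,                             // treat \r\n as one line break
});

const out = fs.createWriteStream(outputPath);
let headers = null;
let firstRow = true;

out.write('[\n');

rl.on('line', (line) => {
  if (!line.trim()) return;                        // skip blank lines
  const cells = line.split(',');
  if (!headers) {
    headers = cells;                               // first line holds the column names
    return;
  }
  const row = {};
  headers.forEach((name, i) => { row[name] = cells[i]; });
  out.write((firstRow ? '' : ',\n') + JSON.stringify(row));
  firstRow = false;
});

rl.on('close', () => {
  out.end('\n]\n');                                // close the JSON array
});

Run it with something like node csv2json.js bigdata.csv converted.json.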

I believe you just need a simple tool. The purpose of the lib is to relieve exactly this kind of pain. Please do let me know if you run into any problems next time so I can fix them promptly.

answered Sep 26 '22 by Keyang