I'm trying to read data from a .csv file in a Jupyter Notebook (Python).
The .csv file is 8.5 GB, with 70 million rows and 30 columns.
When I try to read the .csv, I get errors.
Below is my code:
import pandas as pd
log = pd.read_csv('log_20100424.csv', engine='python')
I also tried using pyarrow, but it doesn't work.
import pandas as pd
from pyarrow import csv
log = csv.read_csv('log_20100424.csv').to_pandas()
My questions are:
How can I read a huge (8.5 GB) .csv file in Jupyter Notebook?
Is there any other way to read a huge .csv file?
My laptop has 8 GB RAM, 64-bit Windows 10, and an i5-8265U @ 1.6 GHz.
Even though pandas can cope with large datasets, an 8.5 GB CSV will not fit into 8 GB of RAM once it is loaded as a single DataFrame, so the Jupyter kernel runs out of memory and dies. To read a file this large, you need to work in chunks. I faced a similar situation where the Jupyter Notebook kernel kept dying and I had to start over. Try something like this:
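Here is a minimal sketch using pandas' chunksize parameter. It reads the file 1,000,000 rows at a time and processes each chunk before moving on, so the full 8.5 GB never has to sit in memory at once. The per-chunk work shown (just counting rows) is a placeholder for whatever filtering or aggregation you actually need.

import pandas as pd

# Read the CSV in pieces of 1,000,000 rows instead of all at once
reader = pd.read_csv('log_20100424.csv', chunksize=1_000_000)

chunk_counts = []
for chunk in reader:
    # Reduce each chunk here (filter rows, aggregate, downcast dtypes)
    # rather than keeping the raw rows; this example only counts them.
    chunk_counts.append(len(chunk))

total_rows = sum(chunk_counts)
print(total_rows)

If you do need a full table in memory afterwards, load only the columns you use (usecols=...) and give each an explicit dtype to shrink memory; otherwise an out-of-core tool such as Dask can run pandas-style code over the chunks for you.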