
How to read only the first few lines of a file into a pandas DataFrame

Tags:

python

pandas

dataframe

csv

People also ask

How do you get the first few rows in a data frame?

You can use df.head(n) to get the first n rows of a pandas DataFrame (n defaults to 5). Alternatively, you can pass a negative number, as in df.head(-n), to get all the rows except the last n.
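As a small illustration of both forms of head(), here is a sketch on a made-up ten-row DataFrame. The column names and values are placeholders for the example, not data from the question.

```python
import pandas as pd

# Hypothetical ten-row frame, used only to illustrate head().
df = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})

first_three = df.head(3)        # the first 3 rows
all_but_last_two = df.head(-2)  # every row except the last 2

print(first_three)
print(len(all_but_last_two))
```

Note that head() works on a DataFrame that is already in memory, so it does not by itself avoid the cost of reading a large file.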

I think you can use the nrows parameter. From the docs:

nrows : int, default None

    Number of rows of file to read. Useful for reading pieces of large files

which seems to work. Using one of the standard large test files (988504479 bytes, 5344499 lines):

In [1]: import pandas as pd

In [2]: time z = pd.read_csv("P00000001-ALL.csv", nrows=20)
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.00 s

In [3]: len(z)
Out[3]: 20

In [4]: time z = pd.read_csv("P00000001-ALL.csv")
CPU times: user 27.63 s, sys: 1.92 s, total: 29.55 s
Wall time: 30.23 s
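For a version that can be run without the large test file, the sketch below builds a small CSV in memory and reads it with and without nrows. The column names and the 1000-row size are assumptions made for the example; with a real file you would pass its path instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(1000))

# Parse only the first 20 data rows (the header line is not counted).
preview = pd.read_csv(io.StringIO(csv_text), nrows=20)

# Parse everything, for comparison.
full = pd.read_csv(io.StringIO(csv_text))

print(len(preview), len(full))
```

Because nrows stops parsing early, it is usually the better choice when you only want a preview, whereas read_csv(...).head(20) still parses the whole file first.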
