How to read specific columns from a CSV file with Python?

How to read specific columns from a csv file with csv module with Python? To read specific columns from a csv file with csv module with Python, we can loop through the rows read from the csv and then put them in a dict.

What is the difference between file and column_list in CSV file?

where, file is the csv file to read data and column_list is the list of all column names. Let’s take a look at the below example : sample.csv contains the below data :

How to get the number of columns in a CSV file?

To fetch column name, instead of using readlines () better use readline () to avoid loop & reading the complete file & storing it in the array. with open (csv_file, 'rb') as csvfile: # get number of columns line = csvfile.readline () first_item = line.split (',')

Do people still use the CSV module?

AFAIK, the only people who still use the csv module are those who have not yet discovered better tools for working with tabular data (pandas, petl, etc.), which is fine, but if you plan to work with a lot of data in your career from various strange sources, learning something like petl is one of the best investments you can make.

Read specific columns from a csv file with csv module?

People also ask

How read a specific column in csv using pandas?

This can be done with the help of the pandas. read_csv() method. We will pass the first parameter as the CSV file and the second parameter the list of specific columns in the keyword usecols. It will return the data of the CSV file of specific columns.

How do I read a specific column in a DataFrame in python?

You can use the loc and iloc functions to access columns in a Pandas DataFrame. Let's see how. If we wanted to access a certain column in our DataFrame, for example the Grades column, we could simply use the loc function and specify the name of the column in order to retrieve it.

The only way you would be getting the last column from this code is if you don't include your print statement in your for loop.

This is most likely the end of your code:

for row in reader:
    content = list(row[i] for i in included_cols)
print content

You want it to be this:

for row in reader:
        content = list(row[i] for i in included_cols)
        print content

Now that we have covered your mistake, I would like to take this time to introduce you to the pandas module.

Pandas is spectacular for dealing with csv files, and the following code would be all you need to read a csv and save an entire column into a variable:

import pandas as pd
df = pd.read_csv(csv_file)
saved_column = df.column_name #you can also use df['column_name']

so if you wanted to save all of the info in your column Names into a variable, this is all you need to do:

names = df.Names

It's a great module and I suggest you look into it. If for some reason your print statement was in for loop and it was still only printing out the last column, which shouldn't happen, but let me know if my assumption was wrong. Your posted code has a lot of indentation errors so it was hard to know what was supposed to be where. Hope this was helpful!

import csv
from collections import defaultdict

columns = defaultdict(list) # each value in each column is appended to a list

with open('file.txt') as f:
    reader = csv.DictReader(f) # read rows into a dictionary format
    for row in reader: # read a row as {column1: value1, column2: value2,...}
        for (k,v) in row.items(): # go over each column name and value 
            columns[k].append(v) # append the value into the appropriate list
                                 # based on column name k

print(columns['name'])
print(columns['phone'])
print(columns['street'])

With a file like

name,phone,street
Bob,0893,32 Silly
James,000,400 McHilly
Smithers,4442,23 Looped St.

Will output

>>> 
['Bob', 'James', 'Smithers']
['0893', '000', '4442']
['32 Silly', '400 McHilly', '23 Looped St.']

Or alternatively if you want numerical indexing for the columns:

with open('file.txt') as f:
    reader = csv.reader(f)
    reader.next()
    for row in reader:
        for (i,v) in enumerate(row):
            columns[i].append(v)
print(columns[0])

>>> 
['Bob', 'James', 'Smithers']

To change the deliminator add delimiter=" " to the appropriate instantiation, i.e reader = csv.reader(f,delimiter=" ")

Use pandas:

import pandas as pd
my_csv = pd.read_csv(filename)
column = my_csv.column_name
# you can also use my_csv['column_name']

Discard unneeded columns at parse time:

my_filtered_csv = pd.read_csv(filename, usecols=['col1', 'col3', 'col7'])

P.S. I'm just aggregating what other's have said in a simple manner. Actual answers are taken from here and here.

You can use numpy.loadtext(filename). For example if this is your database .csv:

ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | Adam | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Carl | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Adolf | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Den | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |

And you want the Name column:

import numpy as np 
b=np.loadtxt(r'filepath\name.csv',dtype=str,delimiter='|',skiprows=1,usecols=(1,))

>>> b
array([' Adam ', ' Carl ', ' Adolf ', ' Den '], 
      dtype='|S7')

More easily you can use genfromtext:

b = np.genfromtxt(r'filepath\name.csv', delimiter='|', names=True,dtype=None)
>>> b['Name']
array([' Adam ', ' Carl ', ' Adolf ', ' Den '], 
      dtype='|S7')

With pandas you can use read_csv with usecols parameter:

df = pd.read_csv(filename, usecols=['col1', 'col3', 'col7'])

Example:

import pandas as pd
import io

s = '''
total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.5,Male,No,Sun,Dinner,3
'''

df = pd.read_csv(io.StringIO(s), usecols=['total_bill', 'day', 'size'])
print(df)

   total_bill  day  size
0       16.99  Sun     2
1       10.34  Sun     3
2       21.01  Sun     3

Related questions
                            
                                How to access the local Django webserver from outside world
                            
                                Setting different color for each series in scatter plot on matplotlib
                            
                                Iterating through directories with Python
                            
                                How can I one hot encode in Python?
                            
                                How to do multiple arguments to map function where one remains the same in python?
                            
                                Pandas dataframe get first row of each group
                            
                                Python - Get path of root project structure
                            
                                Python - Extracting and Saving Video Frames
                            
                                Print list without brackets in a single row
                            
                                How can I filter a date of a DateTimeField in Django?
                            
                                Get MD5 hash of big files in Python
                            
                                How do I get a list of all the duplicate items using pandas in python?
                            
                                psycopg2: insert multiple rows with one query
                            
                                How to split a dos path into its components in Python
                            
                                Python in Xcode 4+?
                            
                                RuntimeError on windows trying python multiprocessing
                            
                                Numpy: Get random set of rows from 2D array
                            
                                How to get the python.exe location programmatically? [duplicate]
                            
                                What is the inverse function of zip in python? [duplicate]
                            
                                How do I raise the same Exception with a custom message in Python?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Read specific columns from a csv file with csv module?

Tags:

python

csv

People also ask

Recent Activity

Donate For Us