
How to assign columns of data to variables

I'm writing a general program to read and plot large amounts of data from .txt files. Each file has a different number of columns. I do know that each file has 8 columns that I'm not interested in, so I can figure out the number of relevant columns that way. How can I read the data and sort each relevant column's data into a separate variable?

This is what I have so far:

import csv

datafile = 'plotspecies.txt'
with open(datafile) as file:
    reader = csv.reader(file, delimiter=' ', skipinitialspace=True)
    first_row = next(reader)
    num_cols = len(first_row)
    rows = csv.reader(file, delimiter = ' ', quotechar = '"')
    data = [data for data in rows]

num_species = num_cols - 8

I've seen people say that pandas is good for this sort of thing, but I can't seem to import it. I'd prefer a solution without it.


Pandas is in fact the right solution here. Robustly handling files whose structure you aren't certain of involves a lot of edge cases, and trying to shoehorn that into the csv module is a recipe for headaches (though it can be done).

As for why you can't import pandas: it doesn't come with Python by default. One of the most important things to consider when picking up a language is the ecosystem of packages it gives you access to. Python happens to be one of the best in this respect, so to ignore everything that's not part of standard Python is to ignore the best part of the language.

If you're on Windows, you should start by getting conda set up. This will allow you to explore many of the packages available to Python users, pandas included, with little overhead. See this link for more info on installing conda: http://conda.pydata.org/docs/install/quick.html

Once you've got pandas installed, it's as easy as this:

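import pandas

# Replace the placeholders with your own file name and column header.
# read_csv expects commas by default; for whitespace-delimited files,
# also pass sep=r"\s+".
test = pandas.read_csv("<your_file>")
your_variable = test["<column_header>"]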

Easy as that.

If you really, really don't want to use anything outside core Python, you can do it with something like the following, though you haven't given enough detail for an actual solution:

def col_var(input_file, delimiter):
    # read each line of the file into a list
    with open(input_file) as f:
        rows = f.read().splitlines()

    # split each row into entries
    split_rows = [row.split(delimiter) for row in rows]

    # re-orient the list of rows into a list of columns
    return list(zip(*split_rows))

The least intuitive piece of this is the last line, so here's a little example showing you how it works:

>>> test = [[1,2], [3,4]]
>>> list(zip(*test))
[(1, 3), (2, 4)]
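
Applied to the question, the same transpose trick might look like the sketch below. It assumes the file is whitespace-delimited, that the first row holds column headers, and that the 8 uninteresting columns come first. None of this is stated in the question, and `read_columns` and `skip` are names made up for illustration.

```python
import io

def read_columns(lines, skip=8):
    """Return {header: [values]} for every column after the first `skip`.

    Assumptions (not stated in the question): whitespace-delimited text,
    a header in the first row, and the unwanted columns coming first.
    """
    # str.split() with no argument splits on any run of whitespace
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    # zip(*body) turns the list of rows into a sequence of columns
    columns = list(zip(*body))
    return {
        name: [float(value) for value in column]
        for name, column in zip(header[skip:], columns[skip:])
    }

# Tiny in-memory stand-in for a data file: 2 unwanted columns, 2 species
sample = io.StringIO("t x A B\n0 1 0.5 1.5\n1 2 0.7 1.1\n")
species = read_columns(sample, skip=2)
```

With a real file you would pass the open file object instead of the `io.StringIO` stand-in. A dictionary keyed by header avoids having to create one named variable per column when the column count changes from file to file.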

Well, you can use the csv module, provided there is some kind of delimiter within the rows that sets the columns apart.

import csv

file_to_read_from = 'myFile.txt'

#initializing as many lists as the columns you want (not all)
col1, col2, col3 = [], [], []
with open(file_to_read_from, 'r') as file_in:
    reader = csv.reader(file_in, delimiter=';') #might as well be ',', '\t' etc
    for row in reader:
        col1.append(row[0]) # assuming col 1 in the file is one of the 3 you want
        col2.append(row[3]) # assuming col 4 in the file is one of the 3 you want
        col3.append(row[5]) # assuming col 6 in the file is one of the 3 you want
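
Since the number of columns differs from file to file, hard-coding `col1, col2, col3` may not scale. One possible variation creates the lists from the width of the first row instead. This is only a sketch: it assumes space-delimited numeric data with no header row and the 8 unwanted columns first, and `load_species` and `n_skip` are hypothetical names.

```python
import csv
import io

def load_species(file_obj, n_skip=8, delimiter=' '):
    """Collect each column after the first `n_skip` into its own list."""
    reader = csv.reader(file_obj, delimiter=delimiter, skipinitialspace=True)
    columns = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        wanted = row[n_skip:]
        if not columns:
            # first data row: create one empty list per relevant column
            columns = [[] for _ in wanted]
        for column, value in zip(columns, wanted):
            column.append(float(value))
    return columns

# In-memory stand-in for a file with 2 unwanted columns and 2 species
demo = io.StringIO("0 1 0.5 1.5\n1 2 0.7 1.1\n")
species_columns = load_species(demo, n_skip=2)
num_species = len(species_columns)
```

Each entry of `species_columns` is then one column of floats, ready to be passed to a plotting call in a loop.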

Tags:

python

variable-assignment

csv

evtoh

2 Answers

Slater Victoroff

Ma0
