Read specific columns with pandas or other python module

I have a csv file from this webpage. I want to read some of the columns in the downloaded file (the csv version can be downloaded in the upper right corner).

Let's say I want 2 columns:

59 which in the header is star_name
60 which in the header is ra.

However, for some reason the authors of the webpage sometimes decide to move the columns around.

In the end I want something like this, keeping in mind that values can be missing.

    data = # read data in a clever way
    names = data['star_name']
    ras = data['ra']

This will prevent my program from malfunctioning when the columns are moved again in the future, as long as the names stay the same.

So far I have tried various approaches using the csv module and, more recently, the pandas module, both without any luck.

EDIT (added two lines + the header of my datafile. Sorry, but it's extremely long.)

# name, mass, mass_error_min, mass_error_max, radius, radius_error_min, radius_error_max, orbital_period, orbital_period_err_min, orbital_period_err_max, semi_major_axis, semi_major_axis_error_min, semi_major_axis_error_max, eccentricity, eccentricity_error_min, eccentricity_error_max, angular_distance, inclination, inclination_error_min, inclination_error_max, tzero_tr, tzero_tr_error_min, tzero_tr_error_max, tzero_tr_sec, tzero_tr_sec_error_min, tzero_tr_sec_error_max, lambda_angle, lambda_angle_error_min, lambda_angle_error_max, impact_parameter, impact_parameter_error_min, impact_parameter_error_max, tzero_vr, tzero_vr_error_min, tzero_vr_error_max, K, K_error_min, K_error_max, temp_calculated, temp_measured, hot_point_lon, albedo, albedo_error_min, albedo_error_max, log_g, publication_status, discovered, updated, omega, omega_error_min, omega_error_max, tperi, tperi_error_min, tperi_error_max, detection_type, mass_detection_type, radius_detection_type, alternate_names, molecules, star_name, ra, dec, mag_v, mag_i, mag_j, mag_h, mag_k, star_distance, star_metallicity, star_mass, star_radius, star_sp_type, star_age, star_teff, star_detected_disc, star_magnetic_field
11 Com b,19.4,1.5,1.5,,,,326.03,0.32,0.32,1.29,0.05,0.05,0.231,0.005,0.005,0.011664,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,2008,2011-12-23,94.8,1.5,1.5,2452899.6,1.6,1.6,Radial Velocity,,,,,11 Com,185.1791667,17.7927778,4.74,,,,,110.6,-0.35,2.7,19.0,G8 III,,4742.0,,
11 UMi b,10.5,2.47,2.47,,,,516.22,3.25,3.25,1.54,0.07,0.07,0.08,0.03,0.03,0.012887,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,2009,2009-08-13,117.63,21.06,21.06,2452861.05,2.06,2.06,Radial Velocity,,,,,11 UMi,229.275,71.8238889,5.02,,,,,119.5,0.04,1.8,24.08,K4III,1.56,4340.0,,

asked Sep 26 '14 15:09

Daniel Thaagaard Andreasen

1 Answer

An easy way to do this is using the pandas library like this.

    import pandas as pd

    fields = ['star_name', 'ra']

    df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)

    # See the keys
    print(df.keys())
    # See content in 'star_name'
    print(df.star_name)

The key here is skipinitialspace=True, which skips the spaces that follow each delimiter. The header field ' star_name' therefore becomes 'star_name', and usecols can match it by name.
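To try this without downloading the file, here is a minimal sketch that reads a made-up three-column excerpt from memory. The sample text and the use of io.StringIO are assumptions for illustration only; the real file has many more columns. It shows that the columns are selected by name regardless of their position, and that a missing value ends up as NaN.

```python
import io

import pandas as pd

# Hypothetical excerpt in the same style as the real file: a space follows
# each comma in the header, and the second row has no 'ra' value.
sample = io.StringIO(
    "# name, star_name, ra\n"
    "11 Com b,11 Com,185.1791667\n"
    "11 UMi b,11 UMi,\n"
)

fields = ["star_name", "ra"]

# skipinitialspace=True skips the space after each comma, so the header
# fields match the plain names listed in `fields`.
df = pd.read_csv(sample, skipinitialspace=True, usecols=fields)

names = df["star_name"]
ras = df["ra"]

print(list(df.columns))
print(ras.isna().tolist())  # True marks a missing value
```

Because the lookup is by header name, the same call keeps working if the columns are reordered, as long as the names themselves do not change.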

answered Sep 19 '22 00:09

Daniel Thaagaard Andreasen

Tags: python, pandas, csv