I have an output file from a legacy piece of software, which is shown below. I want to extract values from it so that, for example, I can set a variable called direct_solar_irradiance to 648.957, and target ground pressure to 1013.00.
So far, I have been extracting individual lines and processing them like below (repeated many times for the different values I want to extract):
values = lines[97].split()
self.irradiance_direct, self.irradiance_diffuse, self.irradiance_env = values
However, I have now found that extra lines are added to the middle of the output when certain parameters are selected. This means, of course, that the 97th line will no longer have the values I need on it.
Is there a good Pythonic way to extract these values, given that there may be extra lines added into the output under certain circumstances? I guess I need to search for known pieces of text in the file, and then extract the numbers referred to by them, but the only ways I can think of doing that are very clunky.
So:
Is there a nice Pythonic way to search for these strings and extract the values that I want?
If not, is there some other way to sensibly do this? (for example, some kind of cool text-file parsing library that I know nothing about).
******************************* 6sV version 1.0B ******************************
* *
* geometrical conditions identity *
* ------------------------------- *
* user defined conditions *
* *
* month: 14 day : 1 *
* solar zenith angle: 10.00 deg solar azimuthal angle: 20.00 deg *
* view zenith angle: 30.00 deg view azimuthal angle: 40.00 deg *
* scattering angle: 159.14 deg azimuthal angle difference: 20.00 deg *
* *
* atmospheric model description *
* ----------------------------- *
* atmospheric model identity : *
* midlatitude summer (uh2o=2.93g/cm2,uo3=.319cm-atm) *
* aerosols type identity : *
* Maritime aerosol model *
* optical condition identity : *
* visibility : 8.49 km opt. thick. 550 nm : 0.5000 *
* *
* spectral condition *
* ------------------ *
* monochromatic calculation at wl 0.400 micron *
* *
* Surface polarization parameters *
* ---------------------------------- *
* *
* *
* Surface Polarization Q,U,Rop,Chi 0.00000 0.00000 0.00000 0.00 *
* *
* *
* target type *
* ----------- *
* homogeneous ground *
* monochromatic reflectance 1.000 *
* *
* target elevation description *
* ---------------------------- *
* ground pressure [mb] 1013.00 *
* ground altitude [km] 0.000 *
* *
* plane simulation description *
* ---------------------------- *
* plane pressure [mb] 1013.00 *
* plane altitude absolute [km] 0.000 *
* atmosphere under plane description: *
* ozone content 0.000 *
* h2o content 0.000 *
* aerosol opt. thick. 550nm 0.000 *
* *
* atmospheric correction activated *
* -------------------------------- *
* BRDF coupling correction *
* input apparent reflectance : 0.500 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* apparent reflectance 1.1287696 appar. rad.(w/m2/sr/mic) 588.646 *
* total gaseous transmittance 1.000 *
* *
*******************************************************************************
* *
* coupling aerosol -wv : *
* -------------------- *
* wv above aerosol : 1.129 wv mixed with aerosol : 1.129 *
* wv under aerosol : 1.129 *
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* app. polarized refl. 0.0000 app. pol. rad. (w/m2/sr/mic) 0.000 *
* direction of the plane of polarization 0.00 *
* total polarization ratio 0.000 *
* *
*******************************************************************************
* *
* int. normalized values of : *
* --------------------------- *
* % of irradiance at ground level *
* % of direct irr. % of diffuse irr. % of enviro. irr *
* 0.351 0.354 0.295 *
* reflectance at satellite level *
* atm. intrin. ref. background ref. pixel reflectance *
* 0.000 0.000 1.129 *
* *
* int. absolute values of *
* ----------------------- *
* irr. at ground level (w/m2/mic) *
* direct solar irr. atm. diffuse irr. environment irr *
* 648.957 655.412 544.918 *
* rad at satel. level (w/m2/sr/mic) *
* atm. intrin. rad. background rad. pixel radiance *
* 0.000 0.000 588.646 *
* *
* *
* sol. spect (in w/m2/mic) *
* 1663.594 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* downward upward total *
* global gas. trans. : 1.00000 1.00000 1.00000 *
* water " " : 1.00000 1.00000 1.00000 *
* ozone " " : 1.00000 1.00000 1.00000 *
* co2 " " : 1.00000 1.00000 1.00000 *
* oxyg " " : 1.00000 1.00000 1.00000 *
* no2 " " : 1.00000 1.00000 1.00000 *
* ch4 " " : 1.00000 1.00000 1.00000 *
* co " " : 1.00000 1.00000 1.00000 *
* *
* *
* rayl. sca. trans. : 0.84422 1.00000 0.84422 *
* aeros. sca. " : 0.94572 1.00000 0.94572 *
* total sca. " : 0.79616 1.00000 0.79616 *
* *
* *
* *
* rayleigh aerosols total *
* *
* spherical albedo : 0.23410 0.12354 0.29466 *
* optical depth total: 0.36193 0.55006 0.91199 *
* optical depth plane: 0.00000 0.00000 0.00000 *
* reflectance I : 0.00000 0.00000 0.00000 *
* reflectance Q : 0.00000 0.00000 0.00000 *
* reflectance U : 0.00000 0.00000 0.00000 *
* polarized reflect. : 0.00000 0.00000 0.00000 *
* degree of polar. : nan 0.00 nan *
* dir. plane polar. : -45.00 -45.00 -45.00 *
* phase function I : 1.38819 0.27621 0.71751 *
* phase function Q : -0.09117 -0.00856 -0.04134 *
* phase function U : -1.34383 0.02142 -0.52039 *
* primary deg. of pol: -0.06567 -0.03099 -0.05762 *
* sing. scat. albedo : 1.00000 0.98774 0.99261 *
* *
* *
*******************************************************************************
*******************************************************************************
*******************************************************************************
* atmospheric correction result *
* ----------------------------- *
* input apparent reflectance : 0.500 *
* measured radiance [w/m2/sr/mic] : 260.747 *
* atmospherically corrected reflectance *
* Lambertian case : 0.52995 *
* BRDF case : 0.52995 *
* coefficients xa xb xc : 0.00241 0.00000 0.29466 *
* y=xa*(measured radiance)-xb; acr=y/(1.+xc*y) *
Python's built-in open() function opens a file and returns a file object (also called a handle), which is used to read or modify the file. The mode can be specified when opening the file.
Method #1: Using split(). The split() method breaks a string into a list of words, and it is the most generic way to accomplish this particular task. Its drawback is that punctuation marks stay attached to the neighbouring words.
A more complete, perhaps more robust solution will require the use of either a parser using a custom grammar (pyparsing) or some sort of FSM-based processor (TextFSM).
Both options look like they'll be non-trivial to use with this output. A (possibly) lighter-weight solution would be to identify each line based on known labels, then extract appropriately (as suggested by other posters).
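To make the label-based idea concrete, here is a minimal sketch that finds a line by a known label and reads the first number after it. The helper name find_labelled_value and the SAMPLE excerpt are made up for illustration, and the sketch assumes the value sits on the same line as its label.

```python
import re

# Matches simple decimal numbers such as 1013.00 or -0.09117 (no exponents).
NUMBER = re.compile(r"[-+]?\d*\.?\d+")

# Hypothetical excerpt; the real output has many more lines.
SAMPLE = """\
* ground pressure [mb] 1013.00 *
* ground altitude [km] 0.000 *
* plane pressure [mb] 1013.00 *
"""


def find_labelled_value(lines, label):
    """Return the first number that follows `label` on its line, or None."""
    for line in lines:
        if label in line:
            # Only look at the text after the label, so digits inside the
            # label itself (e.g. "550nm") are not picked up.
            rest = line.split(label, 1)[1]
            match = NUMBER.search(rest)
            if match:
                return float(match.group())
    return None


ground_pressure = find_labelled_value(SAMPLE.splitlines(), "ground pressure")
```

Because the lookup is by label rather than by line number, extra lines inserted elsewhere in the output do not affect it. Labels must be specific enough to be unique ("ground pressure" rather than just "pressure").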
There are several ways to implement this. I would suggest mapping 'extractor' callables to known line labels, then iterating over the lines and calling the matched extractors. Each callable would take the line and a context object/dict as arguments and add attributes to the context as required. Something along the lines of https://gist.github.com/1035938
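A rough sketch of the extractor-mapping idea follows. It is not the code from the gist, and every name in it (EXTRACTORS, parse, the extractor functions, the SAMPLE excerpt) is hypothetical. One assumption added here that the description above does not mention: an extractor may return a follow-up callable to handle the next line, which covers cases where the values sit on the line below their header.

```python
import re

NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def numbers_in(line):
    """Return all simple decimal numbers on a line as floats."""
    return [float(tok) for tok in NUMBER.findall(line)]


def ground_pressure(line, context):
    # The value is on the same line as the label.
    context["ground_pressure"] = numbers_in(line)[-1]
    return None


def irradiance_header(line, context):
    # The values are on the following line, so hand that line to a follow-up.
    return irradiance_values


def irradiance_values(line, context):
    direct, diffuse, env = numbers_in(line)
    context["direct_solar_irradiance"] = direct
    context["diffuse_irradiance"] = diffuse
    context["environment_irradiance"] = env
    return None


# Known line labels mapped to the callables that handle them.
EXTRACTORS = {
    "ground pressure": ground_pressure,
    "direct solar irr.": irradiance_header,
}


def parse(lines):
    context = {}
    pending = None  # follow-up extractor waiting for the next line, if any
    for line in lines:
        if pending is not None:
            pending = pending(line, context)
            continue
        for label, extractor in EXTRACTORS.items():
            if label in line:
                pending = extractor(line, context)
                break
    return context


# Hypothetical excerpt, including an extra line that shifts positions.
SAMPLE = """\
* ground pressure [mb] 1013.00 *
* some extra line added by an optional parameter *
* direct solar irr. atm. diffuse irr. environment irr *
* 648.957 655.412 544.918 *
"""

result = parse(SAMPLE.splitlines())
```

Adding support for another value then means writing one small function and registering its label in EXTRACTORS, without touching the loop.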