
Python: How to save statsmodels results as image file?

I'm using statsmodels to make OLS estimates. The results can be studied in the console using print(model.summary()). I'd like to store the very same table as a .png file. Below is a snippet with a reproducible example.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# DataFrame with some random integers, indexed by date
np.random.seed(123)
rows = 10
df = pd.DataFrame(np.random.randint(90, 110, size=(rows, 2)), columns=list('AB'))
# pd.datetime was removed in pandas 1.0; date_range accepts a date string directly
df.index = pd.date_range('2017-01-01', periods=rows, name='dates')
print(df)

# OLS estimates using statsmodels.api
x = df['A']
y = df['B']

model = sm.OLS(y,sm.add_constant(x)).fit()

# Output
print(model.summary())


I've made some naive attempts based on suggestions I found elsewhere, but I suspect I'm way off target:

import os
import sys

os.chdir('C:/images')
sys.stdout = open("model.png", "w")
print(model.summary())
sys.stdout.close()

So far this only raises a very long error message.

Thank you for any suggestions!

asked Oct 10 '17 at 10:10 by vestland


1 Answer

This is a pretty unusual task, and the approach in the question cannot work. You are trying to combine a string (which has no positions in any metric space) with an image (which is based on absolute positions, at least for pixel-based formats such as PNG and JPEG). Printing the summary into a file named model.png only writes text into a file with an image extension.

No matter what you do, you need some text-rendering engine!

I tried to use Pillow, but the results are ugly, probably because its text rendering is quite limited and anti-aliasing applied in post-processing cannot recover much. But maybe I did something wrong.

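from PIL import Image, ImageDraw, ImageFont

image = Image.new('RGB', (800, 400))          # black canvas by default
draw = ImageDraw.Draw(image)
font = ImageFont.truetype("arial.ttf", 16)    # assumes Arial is installed (e.g. on Windows)
draw.text((0, 0), str(model.summary()), font=font)
image = image.convert('1')                    # 1-bit black and white
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
image = image.resize((600, 300), Image.LANCZOS)
image.save('output.png')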
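
Part of the ugliness above probably comes from the proportional font and the fixed canvas that is then shrunk. Here is a minimal sketch that instead measures the text first and sizes the image to fit. The helper names (load_mono_font, text_to_png_pillow), the candidate font file names, and the output file name are my own assumptions, and it needs a reasonably recent Pillow for multiline_textbbox:

```python
import math
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont


def load_mono_font(size=16, candidates=("DejaVuSansMono.ttf", "consola.ttf", "cour.ttf")):
    """Return the first monospace font that can be found (file names are assumptions)."""
    for name in candidates:
        try:
            return ImageFont.truetype(name, size)
        except OSError:
            continue
    # Fallback: Pillow's built-in font, which may not be monospace
    return ImageFont.load_default()


def text_to_png_pillow(text, path="output_pillow.png", font_size=16, margin=10):
    """Render multi-line text onto a white canvas sized to fit the text."""
    font = load_mono_font(font_size)
    # Measure the text on a throwaway 1x1 image before creating the real canvas
    probe = ImageDraw.Draw(Image.new("RGB", (1, 1)))
    left, top, right, bottom = probe.multiline_textbbox((0, 0), text, font=font)
    width = math.ceil(right - left) + 2 * margin
    height = math.ceil(bottom - top) + 2 * margin
    image = Image.new("RGB", (width, height), "white")
    draw = ImageDraw.Draw(image)
    draw.multiline_text((margin - left, margin - top), text, font=font, fill="black")
    image.save(path)
    return Path(path)


# Placeholder text; in the question's setting this would be str(model.summary())
result = text_to_png_pillow("coef    std err\n1.00    0.25")
```

With the model from the question, the call would be text_to_png_pillow(str(model.summary())). No resizing is done afterwards, so the glyphs stay as sharp as Pillow renders them.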

Since you use statsmodels, I assume you already have matplotlib installed, and it can be used too. Here is an approach that works reasonably well, although not perfectly (I initially got some shifted lines and did not know why; the OP managed to repair these by using a monospace font):

import matplotlib.pyplot as plt

plt.rc('figure', figsize=(12, 7))
# Old approach with the default proportional font (lines were shifted):
# plt.text(0.01, 0.05, str(model.summary()), {'fontsize': 12})
# Approach improved by the OP: a monospace font keeps the columns aligned
plt.text(0.01, 0.05, str(model.summary()), {'fontsize': 10}, fontproperties='monospace')
plt.axis('off')
plt.tight_layout()
plt.savefig('output.png')

Output:

[Image: the summary table rendered by matplotlib in a monospace font]

Edit: The OP managed to improve the matplotlib approach by using a monospace font! I incorporated that here, and it is reflected in the output image.
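
One possible refinement of the matplotlib approach is to wrap it in a function and let savefig crop the image to the text. This is only a sketch: the function name text_to_png, the Agg backend choice, and the default dpi are my assumptions, not something taken from the question.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: headless rendering, no window is needed
import matplotlib.pyplot as plt


def text_to_png(text, path="output.png", fontsize=10, dpi=150):
    """Render text in a monospace font and save it as an image cropped to the text."""
    fig = plt.figure(figsize=(12, 7))
    # Anchor the text at the top-left corner of the figure
    fig.text(0.01, 0.99, text, family="monospace", fontsize=fontsize,
             va="top", ha="left")
    # bbox_inches="tight" trims the empty area around the text
    fig.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(fig)
    return Path(path)


# Placeholder text; in the question's setting this would be str(model.summary())
saved = text_to_png("coef    std err\n1.00    0.25", "summary_demo.png")
```

For the model in the question, text_to_png(str(model.summary())) should give a similar picture to the one above, without having to tune the figure size by hand.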

Take this as a demo and look into Python's text-rendering options. Maybe the matplotlib approach can be improved further, but maybe you need to use something like pycairo. There is some related discussion on Stack Overflow.

Remark: On my system your code does give those warnings!

Edit: It seems you can ask statsmodels for a LaTeX representation of the summary. I recommend using that: write it to a file and use subprocess to call pdflatex or something similar. matplotlib can render LaTeX too (I won't test that, as I'm currently on Windows), but in that case the text-to-window ratio has to be tuned again, whereas a full LaTeX document has a fixed page format (A5, for example).

answered Sep 18 '22 at 20:09 by sascha