Python: How to optimize function parameters?

Background:

I'd like to solve a wide array of optimization problems such as asset weights in a portfolio, and parameters in trading strategies where the variables are passed to functions containing a bunch of other variables as well.

Until now, I've been able to do these things easily in Excel using the Solver Add-In. But I think it would be much more efficient and even more widely applicable using Python. For the sake of clarity, I'm going to boil the question down to the essence of portfolio optimization.

My question (short version):

Here's a dataframe and a corresponding plot with asset returns.

Dataframe 1:

                A1      A2
2017-01-01  0.0075  0.0096
2017-01-02 -0.0075 -0.0033
.
.
2017-01-10  0.0027  0.0035

Plot 1 - Asset returns

[image: bar chart of the daily returns for A1 and A2]

Based on that, I would like to find the weights of the optimal portfolio with regard to risk / return (Sharpe ratio), represented by the green dot in the plot below. (The red dot is the so-called minimum variance portfolio, which represents a separate optimization problem.)

Plot 2 - Efficient frontier and optimal portfolios:

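[image: scatter plot of simulated portfolios (risk vs. return), with the maximum Sharpe ratio portfolio marked in green and the minimum variance portfolio in red]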

How can I do this with numpy or scipy?

The details:

The following code section contains the function returns() to build a dataframe with random returns for two assets, as well as the function pf_sharpe() to calculate the Sharpe ratio of a portfolio of those returns for two given weights.

# imports
import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

np.random.seed(1234)

# Reproducible data sample
def returns(rows, names):
    ''' Function to create data sample with random returns
    
    Parameters
    ==========
    rows : number of rows in the dataframe
    names: list of names to represent assets
    
    Example
    =======
    
    >>> returns(rows = 2, names = ['A', 'B'])
    
                  A       B
    2017-01-01  0.0027  0.0075
    2017-01-02 -0.0050 -0.0024
    '''
    listVars= names
    rng = pd.date_range('1/1/2017', periods=rows, freq='D')
    df_temp = pd.DataFrame(np.random.randint(-100,100,size=(rows, len(listVars))), columns=listVars) 
    df_temp = df_temp.set_index(rng)
    df_temp = df_temp / 10000

    return df_temp


# Sharpe ratio
def pf_sharpe(df, w1, w2):
    ''' Function to calculate risk / reward ratio
        based on a pandas dataframe with two return series
    
    Parameters
    ==========
    df : pandas dataframe
    w1 : portfolio weight for asset 1
    w2 : portfolio weight for asset 2
    
    '''
    
    weights = [w1,w2]      
    
    # Calculate portfolio returns and volatility
    pf_returns = (np.sum(df.mean() * weights) * 252)
    pf_volatility = (np.sqrt(np.dot(np.asarray(weights).T, np.dot(df.cov() * 252, weights))))
       
    # Calculate sharpe ratio
    pf_sharpe = pf_returns / pf_volatility
    
    return pf_sharpe

# Make df with random returns and calculate
# sharpe ratio for a 80/20 split between assets
df_returns = returns(rows = 10, names = ['A1', 'A2'])
df_returns.plot(kind = 'bar')

sharpe = pf_sharpe(df = df_returns, w1 = 0.8, w2 = 0.2)
print(sharpe)

# Output:
# 5.09477512073

Now I'd like to find the portfolio weights that optimize the Sharpe ratio. I think you could express the optimization problem as follows:

maximize:
    pf_sharpe()

by changing:
    w1, w2

under the constraints:
    0 < w1 < 1
    0 < w2 < 1
    w1 + w2 = 1
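
For reference, this is roughly how such constraints are written down for SciPy. The sketch below only encodes and checks them; it does not optimize anything. The helper is_feasible is a made-up name for illustration. Note that SciPy bounds are inclusive, so they express 0 <= w <= 1 rather than the strict inequalities above.

```python
import numpy as np

# SciPy's dictionary convention: 'eq' means fun(w) == 0, 'ineq' means fun(w) >= 0
sum_to_one = {"type": "eq", "fun": lambda w: np.sum(w) - 1.0}   # w1 + w2 = 1
bounds = [(0.0, 1.0), (0.0, 1.0)]                               # 0 <= w1, w2 <= 1

def is_feasible(w, tol=1e-9):
    """Hypothetical helper: check a candidate weight vector against the constraints."""
    w = np.asarray(w, dtype=float)
    in_bounds = all(lo - tol <= wi <= hi + tol for wi, (lo, hi) in zip(w, bounds))
    return bool(in_bounds and abs(sum_to_one["fun"](w)) <= tol)

# An 80/20 split satisfies the constraints, an 80/30 split does not
print(is_feasible([0.8, 0.2]), is_feasible([0.8, 0.3]))
```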

What I've tried so far:

I found a possible setup in the post Python Scipy Optimization.minimize using SLSQP showing maximized results. Below is what I have so far, and it addresses a central aspect of my question directly:

[...]where the variables are passed to functions containing a bunch of other variables as well.

As you can see, my initial challenge prevents me from even testing whether my bounds and constraints will be accepted by optimize.minimize(). I haven't even taken into consideration that this is a maximization and not a minimization problem (hopefully fixable by changing the sign of the function).
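
Changing the sign is indeed the usual workaround, since minimize only minimizes. A toy sketch, with the function f made up purely for illustration:

```python
from scipy.optimize import minimize

def f(x):
    """Toy objective with its maximum at x = 2 (illustration only)."""
    return 3 - (x[0] - 2) ** 2

# There is no "maximize" switch, so minimize -f instead
res = minimize(lambda x: -f(x), x0=[0.0])

x_best = res.x[0]      # maximizer of f
f_best = -res.fun      # flip the sign back to recover the maximum of f
print(x_best, f_best)
```

This should print values close to 2 and 3.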

Attempts:

# bounds
b = (0,1)
bnds = (b,b)

# constraints
def constraint1(w1,w2):
    return w1 - w2

cons = ({'type': 'eq', 'fun':constraint1})

# initial guess
x0 = [0.5, 0.5]

# Testing the initial guess
print(pf_sharpe(df = df_returns, weights = x0))

# Optimization attempts

attempt1 = optimize.minimize(pf_sharpe(), x0, method = 'SLSQP', bounds = bnds, constraints = cons)
attempt2 = optimize.minimize(pf_sharpe(df = df_returns, weights),  x0, method = 'SLSQP', bounds = bnds, constraints = cons)
attempt3 = optimize.minimize(pf_sharpe(weights, df = df_returns), x0, method = 'SLSQP', bounds = bnds, constraints = cons)

Results:

Attempt1 is closest to the scipy setup here, but understandably fails because neither df nor weights have been specified.
Attempt2 fails with SyntaxError: positional argument follows keyword argument
Attempt3 fails with NameError: name 'weights' is not defined
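
What the three attempts have in common is that pf_sharpe(...) is called inside the argument list, so Python tries to evaluate it before minimize ever runs. minimize expects the function object itself and then calls it as fun(x, *args), where x is the array being varied. Also, with from scipy.optimize import minimize as the import, the call would be minimize(...) rather than optimize.minimize(...). A toy sketch of the mechanism, where objective, scale and shift are made-up names:

```python
from functools import partial
from scipy.optimize import minimize

def objective(x, scale, shift):
    """Toy function: x is what minimize varies; scale and shift stay fixed."""
    return scale * (x[0] - shift) ** 2

x0 = [0.0]

# Option 1: pass the function object and supply the fixed values through args
res_args = minimize(objective, x0, args=(2.0, 1.5))

# Option 2: bind the fixed values first, leaving a function of x only
res_partial = minimize(partial(objective, scale=2.0, shift=1.5), x0)

# Option 3: same idea with a lambda
res_lambda = minimize(lambda x: objective(x, 2.0, 1.5), x0)

print(res_args.x, res_partial.x, res_lambda.x)
```

All three variants should end up near x = 1.5. The key point is that the variable being optimized has to be the first positional argument and arrives as a single array.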

I was under the impression that df could freely be specified, and that x0 in optimize.minimize would be considered the variables to be tested as 'representatives' for the weights in the function specified by pf_sharpe().
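
That impression is close to how minimize works, provided the function is passed uncalled and takes the weights as one array in its first argument. A minimal sketch under those assumptions: neg_sharpe is a hypothetical rewrite of pf_sharpe, the returns are made up, and the risk-free rate is taken as 0. Note that constraint1 in the attempts above returns w1 - w2, which as an 'eq' constraint would enforce w1 = w2; the sketch uses sum(w) - 1 instead.

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# Made-up daily returns for two assets (illustrative data only)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(0.001, 0.01, size=(250, 2)), columns=["A1", "A2"])

def neg_sharpe(weights, df):
    """Negative annualized Sharpe ratio (hypothetical helper, risk-free rate 0)."""
    weights = np.asarray(weights)
    ret = np.sum(df.mean().values * weights) * 252
    vol = np.sqrt(weights @ (df.cov().values * 252) @ weights)
    return -ret / vol

x0 = [0.5, 0.5]
bounds = [(0, 1), (0, 1)]                                # 0 <= w1, w2 <= 1
cons = {"type": "eq", "fun": lambda w: np.sum(w) - 1}    # w1 + w2 = 1

# The function object goes in uncalled; df is forwarded through args
res = minimize(neg_sharpe, x0, args=(df,),
               method="SLSQP", bounds=bounds, constraints=cons)

print(res.x, -res.fun)
```

Here res.x holds the weights and -res.fun the corresponding Sharpe ratio. The same pattern should carry over to more than two assets by lengthening x0 and bounds.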

As you surely understand, my transition from Excel to Python in this regard has not been the easiest, and there is plenty I don't understand here. Anyway, I'm hoping some of you may offer some suggestions or clarifications!

Thank you!

Appendix 1 - Simulation approach:

This particular portfolio optimization problem can easily be solved by simulating a bunch of portfolio weights. And I did exactly that to produce the portfolio plot above. Here's the whole function if anyone is interested:

# Portfolio simulation
def portfolioSim(df, simRuns):
    ''' Function to take a df with asset returns,
        runs a number of simulated portfolio weights,
        plots return and risk for those weights,
        and finds minimum risk portfolio
        and max risk / return portfolio
    
    Parameters
    ==========
    df : pandas dataframe with returns
    simRuns : number of simulations
    
    '''  
    prets = []
    pvols = []
    pwgts = []
    names = list(df)
    
    for p in range (simRuns):
        
        # Assign random weights
        weights = np.random.random(len(names))
        weights /= np.sum(weights)
    
        # Calculate risk and returns with random weights
        # (use the df argument rather than the global df_returns)
        prets.append(np.sum(df.mean() * weights) * 252)
        pvols.append(np.sqrt(np.dot(weights.T, np.dot(df.cov() * 252, weights))))
        pwgts.append(weights)
            
    prets = np.array(prets)
    pvols = np.array(pvols)
    pwgts = np.array(pwgts)
    pshrp = prets / pvols
    
    # Store calculations in a df
    df1 = pd.DataFrame({'return':prets})         
    df2 = pd.DataFrame({'risk':pvols})    
    df3 = pd.DataFrame(pwgts)
    df3.columns = names
    df4 = pd.DataFrame({'sharpe':pshrp})
    df_temp = pd.concat([df1, df2, df3, df4], axis = 1)
    
    # Plot results
    plt.figure(figsize=(8, 4))
    plt.scatter(pvols, prets, c=prets / pvols, cmap = 'viridis', marker='o')
    
    # Min risk
    min_vol_port = df_temp.iloc[df_temp['risk'].idxmin()]   
    plt.plot([min_vol_port['risk']], [min_vol_port['return']], marker='o', markersize=12, color="red")
    
    # Max sharpe
    max_sharpe_port = df_temp.iloc[df_temp['sharpe'].idxmax()]    
    plt.plot([max_sharpe_port['risk']], [max_sharpe_port['return']], marker='o', markersize=12, color="green")

    # Return the max Sharpe portfolio so the result can be inspected
    return max_sharpe_port

# Test run
portfolioSim(df = df_returns, simRuns = 250)
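
The same simulation idea can also be written without the Python loop. This is a sketch, not a drop-in replacement: simulate_portfolios is a made-up name, the data are made up, and plotting is left out.

```python
import numpy as np
import pandas as pd

def simulate_portfolios(df, n_sims, seed=0):
    """Vectorized variant (hypothetical helper): one row per simulated portfolio."""
    rng = np.random.default_rng(seed)
    weights = rng.random((n_sims, df.shape[1]))
    weights /= weights.sum(axis=1, keepdims=True)       # each row sums to 1

    mu = df.mean().values * 252                          # annualized mean returns
    cov = df.cov().values * 252                          # annualized covariance

    rets = weights @ mu
    # Row-wise quadratic form w' * cov * w for every simulated weight vector
    vols = np.sqrt(np.einsum("ij,jk,ik->i", weights, cov, weights))

    out = pd.DataFrame(weights, columns=df.columns)
    out["return"] = rets
    out["risk"] = vols
    out["sharpe"] = rets / vols
    return out

# Made-up data for illustration
rng = np.random.default_rng(1)
df_demo = pd.DataFrame(rng.normal(0.001, 0.01, size=(100, 2)), columns=["A1", "A2"])

sims = simulate_portfolios(df_demo, n_sims=1000)
best = sims.loc[sims["sharpe"].idxmax()]      # simulated max Sharpe portfolio
safest = sims.loc[sims["risk"].idxmin()]      # simulated min variance portfolio
print(best)
```

Computing the mean vector and covariance matrix once, instead of inside the loop, is where most of the time saving would come from.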

Appendix 2 - Excel Solver approach:

Here is how I would approach the problem using Excel Solver. Instead of linking to a file, I've attached a screenshot and included the most important formulas in a code section. I'm guessing not many of you are going to be interested in reproducing this, but I've included it to show that it can be done quite easily in Excel. Grey ranges represent formulas. Ranges that can be changed and used as arguments in the optimization problem are highlighted in yellow. The green range is the objective function.

Here's an image of the worksheet and Solver setup:

[image: screenshot of the Excel worksheet and the Solver setup]

Excel formulas:

C3  =AVERAGE(C7:C16)
C4  =AVERAGE(D7:D16)
H4  =COVARIANCE.P(C7:C16;D7:D16)
G5  =COVARIANCE.P(C7:C16;D7:D16)
G10 =G8+G9
G13 =MMULT(TRANSPOSE(G8:G9);C3:C4)
G14 =SQRT(MMULT(TRANSPOSE(G8:G9);MMULT(G4:H5;G8:G9)))
H13 =G12/G13
H14 =G13*252
G16 =G13/G14
H16 =H13/H14

End notes:

As you can see from the screenshot, Excel Solver suggests a 47% / 53% split between A1 and A2 to obtain an optimal Sharpe ratio of 5.6. Running the Python function sr_opt = portfolioSim(df = df_returns, simRuns = 25000) yields a Sharpe ratio of 5.3 with corresponding weights of 47% and 53% for A1 and A2:

print(sr_opt)
#Output
#return    0.361439
#risk      0.067851
#A1        0.465550
#A2        0.534450
#sharpe    5.326933
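
One possible explanation for part of the gap between the Excel and Python figures: the worksheet uses COVARIANCE.P, the population covariance (dividing by n), while pandas' DataFrame.cov() defaults to the sample covariance (dividing by n - 1). The sketch below uses made-up returns, not the data from this question, to show the size of the effect for 10 rows:

```python
import numpy as np
import pandas as pd

# Made-up returns, 10 rows like the worksheet (illustration only)
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(0.002, 0.005, size=(10, 2)), columns=["A1", "A2"])
w = np.array([0.5, 0.5])
n = len(df)

cov_sample = df.cov().values                              # divides by n - 1 (pandas default)
cov_population = np.cov(df.values, rowvar=False, ddof=0)  # divides by n, like COVARIANCE.P

ret = w @ df.mean().values * 252
sharpe_sample = ret / np.sqrt(w @ (cov_sample * 252) @ w)
sharpe_population = ret / np.sqrt(w @ (cov_population * 252) @ w)

# The two ratios differ by a factor of sqrt(n / (n - 1))
print(sharpe_population / sharpe_sample, np.sqrt(n / (n - 1)))
```

For n = 10 the factor is about 1.054, and 5.33 times 1.054 is roughly 5.6, so the two results may well agree once the same convention is used. This is an inference from the formulas shown, not something verified against the worksheet.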

The method applied in Excel is GRG Nonlinear. I understand that changing the SLSQP argument to a non-linear method would get me somewhere, and I've looked into nonlinear solvers in SciPy as well, but with little success. Maybe SciPy isn't even the best option here?
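
For what it's worth, SLSQP (Sequential Least Squares Programming) already handles nonlinear objectives with bounds and constraints, so the method argument is probably not the obstacle. If a second solver is wanted for comparison, trust-constr is one option in scipy.optimize. A sketch assuming known annualized means mu and covariance cov, both made-up numbers:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize

# Made-up annualized inputs for two assets (illustration only)
mu = np.array([0.10, 0.06])
cov = np.array([[0.040, 0.006],
                [0.006, 0.010]])

def neg_sharpe(w):
    """Negative Sharpe ratio for weight vector w (risk-free rate 0)."""
    return -(w @ mu) / np.sqrt(w @ cov @ w)

x0 = np.array([0.5, 0.5])

# SLSQP with the dictionary-style constraint
res_slsqp = minimize(neg_sharpe, x0, method="SLSQP",
                     bounds=[(0.0, 1.0), (0.0, 1.0)],
                     constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})

# trust-constr with Bounds and LinearConstraint objects (1*w1 + 1*w2 = 1)
res_trust = minimize(neg_sharpe, x0, method="trust-constr",
                     bounds=Bounds([0.0, 0.0], [1.0, 1.0]),
                     constraints=[LinearConstraint([[1.0, 1.0]], 1.0, 1.0)])

print(res_slsqp.x, res_trust.x)
```

For these inputs the unconstrained tangency weights are proportional to inv(cov) @ mu, which works out to about 26% / 74%, and both solvers should land close to that.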

asked Apr 09 '18 11:04

vestland

1 Answer

A more detailed answer. The first part of your code remains the same:

import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

np.random.seed(1234)

# Reproducible data sample
def returns(rows, names):
    ''' Function to create data sample with random returns

    Parameters
    ==========
    rows : number of rows in the dataframe
    names: list of names to represent assets

    Example
    =======

    >>> returns(rows = 2, names = ['A', 'B'])

                  A       B
    2017-01-01  0.0027  0.0075
    2017-01-02 -0.0050 -0.0024
    '''
    listVars= names
    rng = pd.date_range('1/1/2017', periods=rows, freq='D')
    df_temp = pd.DataFrame(np.random.randint(-100,100,size=(rows, len(listVars))), columns=listVars) 
    df_temp = df_temp.set_index(rng)
    df_temp = df_temp / 10000

    return df_temp

The function pf_sharpe is modified so that its first argument is the parameter to be optimised, here the weight of the first asset. Instead of passing the constraint w1 + w2 = 1 to the optimizer, we can define w2 as 1 - w1 inside pf_sharpe, which is equivalent but simpler and faster. Also, minimize minimizes its objective, whereas you want to maximize the Sharpe ratio, so pf_sharpe now returns the ratio multiplied by -1.

# Sharpe ratio
def pf_sharpe(weight, df):
    ''' Function to calculate risk / reward ratio
        based on a pandas dataframe with two return series
    '''   
    weights = [weight[0], 1-weight[0]]
    # Calculate portfolio returns and volatility
    pf_returns = (np.sum(df.mean() * weights) * 252)
    pf_volatility = (np.sqrt(np.dot(np.asarray(weights).T, np.dot(df.cov() * 252, weights))))

    # Calculate sharpe ratio
    pf_sharpe = pf_returns / pf_volatility

    return -pf_sharpe

# initial guess
x0 = [0.5]

df_returns = returns(rows = 10, names = ['A1', 'A2'])

# Optimization attempts

out = minimize(pf_sharpe, x0, method='SLSQP', bounds=[(0, 1)], args=(df_returns,))

optimal_weights = [out.x[0], 1 - out.x[0]]  # plain floats instead of 1-element arrays
print(optimal_weights)
print(-pf_sharpe(out.x, df_returns))

This returns an optimized Sharpe ratio of 6.16 (better than 5.3), with w1 practically 1 and w2 practically 0.
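
Because the problem now has a single bounded variable, a result like this can be cross-checked cheaply, for example by evaluating the objective on a grid or by using scipy.optimize.minimize_scalar. The sketch below uses made-up inputs mu and cov rather than the question's data, and neg_sharpe_1d is a hypothetical stand-in for pf_sharpe:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Made-up annualized inputs (illustration only, not the question's data)
mu = np.array([0.10, 0.06])
cov = np.array([[0.040, 0.006],
                [0.006, 0.010]])

def neg_sharpe_1d(w1):
    """Negative Sharpe ratio as a function of w1 alone, with w2 = 1 - w1."""
    w = np.array([w1, 1.0 - w1])
    return -(w @ mu) / np.sqrt(w @ cov @ w)

# Brute-force check on an evenly spaced grid of w1 values
grid = np.linspace(0.0, 1.0, 1001)
values = np.array([neg_sharpe_1d(w1) for w1 in grid])
w1_grid = grid[values.argmin()]

# Bounded scalar minimizer on the same interval
res = minimize_scalar(neg_sharpe_1d, bounds=(0.0, 1.0), method="bounded")

print(w1_grid, res.x, -res.fun)
```

To run the same check against the function above, something like lambda w1: pf_sharpe([w1], df_returns) could take the place of neg_sharpe_1d. If the grid and the optimizer disagree, the grid is usually the easier one to trust.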

answered Oct 25 '22 00:10

Brenlla

Tags: python, optimization, numpy, scipy
