
Pandas dot product with Multiindex

My problem is quite common in finance.

Given a vector w (1xN) of weights and a covariance matrix Q (NxN) of asset returns, one can calculate the variance of the portfolio using the quadratic form w' * Q * w, where * is the matrix product.
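As a minimal sketch of that quadratic form for a single date, assuming a hypothetical 3-asset portfolio (the weights and covariance values below are made up purely for illustration):

```python
import numpy as np

# Hypothetical weights for three assets (illustrative values only)
w = np.array([0.5, 0.3, 0.2])

# Hypothetical symmetric covariance matrix of the asset returns
Q = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])

# Quadratic form w' Q w gives the portfolio variance (a scalar)
portfolio_variance = w @ Q @ w

# Its square root is the portfolio volatility (standard deviation)
portfolio_volatility = np.sqrt(portfolio_variance)
```

With a 1-D array, `w @ Q @ w` needs no explicit transpose, because NumPy treats `w` as a row or column vector as required.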

I want to understand the best way to perform this operation when I have a history of weights W (T x N) and a 3D structure of covariance matrices (T x N x N).

import numpy as np
import pandas as pd

returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()

weights = pd.DataFrame(np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])

My solution so far has been to convert the pandas DataFrames to numpy, perform the calculation in a loop, and then convert back to pandas. Note that I need to explicitly check the alignment of labels, since in reality covariance and weights could be calculated by different processes.

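# One NxN covariance DataFrame per date; .unique() avoids repeating the lookup N times per date
cov_dict = {key: covariance.xs(key, axis=0, level=0) for key in covariance.index.get_level_values(0).unique()}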

def naive_numpy(weights, cov_dict):

    expected_risk = {}

    # Extract columns, index before passing to numpy arrays
    # Columns
    cov_assets = cov_dict[next(iter(cov_dict))].columns
    avail_assets = [el for el in cov_assets if el in weights]

    # Indexes
    cov_dates = list(cov_dict.keys())
    avail_dates = weights.index.intersection(cov_dates)

    sel_weights = weights.loc[avail_dates, avail_assets]

    # Main loop and calculation
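    for t, value in zip(sel_weights.index, sel_weights.values):
        # Subset the covariance to the same assets, in the same order, as the weights
        cov_t = cov_dict[t].loc[avail_assets, avail_assets].values
        expected_risk[t] = np.sqrt(np.dot(value, np.dot(cov_t, value)))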

    # Back to a pandas Series, indexed like the input weights
    expected_risk = pd.Series(expected_risk).reindex(weights.index).sort_index()

    return expected_risk

Is there a pure-pandas way to achieve the same result? Or is there any way to make this code more efficient? (Despite using numpy, it is still quite slow.)

FLab asked Feb 21 '18 10:02




1 Answer

I think numpy is definitely the best option, though you lose that efficiency if you loop over values/dates.

My suggestion for calculating the rolling volatility of a portfolio (with no looping):

returns = pd.DataFrame(0.1 * np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])
covariance = returns.rolling(20).cov()
weights = pd.DataFrame(np.random.randn(100, 4), columns=['A', 'B', 'C', 'D'])

rows, columns = weights.shape

# Go to numpy:
w = weights.values
# Assumes covariance rows are ordered by (date, asset) and match the dates and columns of weights
cov = covariance.values.reshape(rows, columns, columns)

A = np.matmul(w.reshape(rows, 1, columns), cov)
var = np.matmul(A, w.reshape(rows, columns, 1)).reshape(rows)
std_dev = np.sqrt(var)

# Back to pandas (in case you want that):
pd.Series(std_dev, index=weights.index)
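
The reshape above relies on covariance and weights already sharing the same dates and asset order. If the two come from different processes, as the question mentions, one possible variation is to align the labels in pandas first and then contract with `np.einsum`. This is only a sketch under assumptions: the `assets` list, the seeded random data, and the names `dates`, `cov_aligned` and `vol` are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

# Illustrative data, seeded so the sketch is reproducible
rng = np.random.default_rng(0)
assets = ['A', 'B', 'C', 'D']
returns = pd.DataFrame(0.1 * rng.standard_normal((100, 4)), columns=assets)
covariance = returns.rolling(20).cov()
weights = pd.DataFrame(rng.standard_normal((100, 4)), columns=assets)

# Dates present in both the weights and the covariance history
cov_dates = covariance.index.get_level_values(0).unique()
dates = weights.index.intersection(cov_dates)

# Force the covariance into (date, asset) row order and the same asset column order
full_index = pd.MultiIndex.from_product([dates, assets])
cov_aligned = covariance.reindex(index=full_index, columns=assets)

# Move to numpy: w is (T, N), cov is (T, N, N)
w = weights.loc[dates, assets].to_numpy()
cov = cov_aligned.to_numpy().reshape(len(dates), len(assets), len(assets))

# For each date t, compute sum over i, j of w[t, i] * cov[t, i, j] * w[t, j]
var = np.einsum('ti,tij,tj->t', w, cov, w)

# Back to pandas; dates without a covariance estimate stay NaN
vol = pd.Series(np.sqrt(var), index=dates).reindex(weights.index)
```

The first 19 entries of `vol` are NaN because `rolling(20)` has no full window there. The einsum call computes the same quantity as the two `np.matmul` calls, without the intermediate reshapes.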
ecortazar answered Oct 13 '22 15:10