I have written a program with the following piece of code:
import pandas as pd
import numpy as np
from typing import Tuple

def split_data(self, df: pd.DataFrame, split_quantile: float) -> Tuple(pd.DataFrame, pd.DataFrame):
    '''Split data sets into two parts - train and test data sets.'''
    df = df.sort_values(by='datein').reset_index(drop=True)
    quantile = int(np.quantile(df.index, split_quantile))
    return (
        df[df.index <= quantile].reset_index(drop=True),
        df[df.index > quantile].reset_index(drop=True)
    )
The program returns the following error: TypeError: Type Tuple cannot be instantiated; use tuple() instead. I understand that I can fix my code by replacing Tuple(pd.DataFrame, pd.DataFrame) with tuple(), but then I lose the information that my tuple consists of two pandas data frames.
Could you please help me fix the error without losing that information?
Use square brackets:
Tuple[pd.DataFrame, pd.DataFrame]
From the docs:
Tuple type; Tuple[X, Y] is the type of a tuple of two items with the first item of type X and the second of type Y. The type of the empty tuple can be written as Tuple[()].
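Applied to the function from the question, the fix might look like the sketch below. It assumes the function can be written as a plain function (the self parameter is dropped here), and the sample data frame and its 'value' column are hypothetical, added only to exercise the code.

```python
from typing import Tuple

import numpy as np
import pandas as pd


def split_data(df: pd.DataFrame, split_quantile: float) -> Tuple[pd.DataFrame, pd.DataFrame]:
    '''Split a data set into two parts - train and test data sets.'''
    # Sort by date so the split is chronological, then renumber the rows.
    df = df.sort_values(by='datein').reset_index(drop=True)
    # Row position that marks the end of the train part.
    quantile = int(np.quantile(df.index, split_quantile))
    return (
        df[df.index <= quantile].reset_index(drop=True),
        df[df.index > quantile].reset_index(drop=True),
    )


# Hypothetical sample data, not part of the original question.
sample = pd.DataFrame({
    'datein': pd.date_range('2021-01-01', periods=10),
    'value': range(10),
})
train, test = split_data(sample, 0.8)
```

The only change that matters for the error is the return annotation: square brackets subscript the Tuple type, while parentheses try to call it.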
EDIT: Since the release of Python 3.9, you can do this with the built-in tuple type rather than importing Tuple from typing. For example:
>>> tuple[pd.DataFrame, pd.DataFrame]
tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]
You still have to use square brackets.
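To see why the brackets matter, here is a small sketch that uses only the standard library and assumes Python 3.9 or newer. The variable names are illustrative. Subscripting produces a type alias that records the element types, whereas calling Tuple raises the TypeError from the question.

```python
from typing import Tuple, get_args, get_origin

# Subscripting builds an alias that remembers the element types.
alias = Tuple[int, str]
origin = get_origin(alias)   # the runtime class behind the alias
args = get_args(alias)       # the element types, in order

# Calling Tuple with parentheses tries to instantiate it and fails.
try:
    Tuple(int, str)
except TypeError as exc:
    error = str(exc)

# On Python 3.9+, the built-in tuple can be subscripted the same way.
builtin_alias = tuple[int, str]
builtin_args = get_args(builtin_alias)
```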