Suppose I have a dictionary of data loaded from JSON, such as the example below:
data = {'a':'120120121',
'b':'12301101',
'c':'120120121',
'd':'12301101',
'e':'120120121',
'f':'12301101',
'g':'120120121',
'h':'12301101',
'i':'120120121',
'j':'12301101'}
Is it possible to randomly split the dictionary 70:30 using Python?
The output should be like:
training_data = {'a':'120120121',
'b':'12301101',
'c':'120120121',
'e':'120120121',
'g':'120120121',
'i':'120120121',
'j':'12301101'}
test_data = {'d':'12301101',
'f':'12301101',
'h':'12301101'}
The easiest way is to use sklearn.model_selection.train_test_split on a pandas Series built from the dictionary, then convert each part back to a dictionary if that is the structure you want:
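import pandas as pd
from sklearn.model_selection import train_test_split

# Wrap the dict in a Series so train_test_split can split it by position
s = pd.Series(data)
# Split 70:30 at random, then convert each part back to a dict
training_data, test_data = [part.to_dict() for part in train_test_split(s, train_size=0.7)]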
print(training_data)
# {'b': '12301101', 'j': '12301101', 'a': '120120121', 'f': '12301101',
# 'e': '120120121', 'c': '120120121', 'h': '12301101'}
print(test_data)
# {'i': '120120121', 'd': '12301101', 'g': '120120121'}
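If you would rather not pull in pandas and scikit-learn for this, the same 70:30 random split can be sketched with only the standard library. This is a minimal illustration, not part of the answer above: the helper name split_dict and its train_fraction and seed parameters are made up for this example, and rounding the training size to the nearest integer is an assumption about how you want uneven sizes handled.

```python
import random


def split_dict(data, train_fraction=0.7, seed=None):
    """Randomly split a dict into (training, test) dicts.

    Hypothetical helper for illustration; the names are not from any library.
    """
    rng = random.Random(seed)      # a seed makes the split reproducible
    keys = list(data)
    rng.shuffle(keys)              # random order of the keys
    n_train = round(len(keys) * train_fraction)
    train_keys = set(keys[:n_train])
    # Rebuild both dicts in the original key order
    training = {k: v for k, v in data.items() if k in train_keys}
    test = {k: v for k, v in data.items() if k not in train_keys}
    return training, test


data = {'a': '120120121', 'b': '12301101', 'c': '120120121', 'd': '12301101',
        'e': '120120121', 'f': '12301101', 'g': '120120121', 'h': '12301101',
        'i': '120120121', 'j': '12301101'}

training_data, test_data = split_dict(data, train_fraction=0.7, seed=0)
print(len(training_data), len(test_data))
```

Which keys land in each part depends on the seed; pass seed=None for a different split on every run. With train_test_split, the random_state parameter plays the same role.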