Split training data into train and validation sets using tensorflow_datasets.load (TF 2.1)

Tags:

python

tensorflow

tensorflow-datasets

I'm trying to run the following Colab project, but when I try to split the training data into training and validation sets, I get this error:

KeyError: "Invalid split train[:70%]. Available splits are: ['train']"

I use the following code:

(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)

How can I fix this error?

asked Jan 25 '20 02:01

Pouya Ahmadvand

1 Answer

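According to the TensorFlow Datasets docs, the approach you presented is now supported. You can split a dataset by passing a slice string such as split="train[:70%]" to the split parameter of tfds.load. If you still get the KeyError, you are probably running an older tensorflow-datasets release, and upgrading it should help.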

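import tensorflow_datasets as tfds

# The first 70% of 'train' becomes the training set, the remaining 30% the validation set.
(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=['train[:70%]', 'train[70%:]'],
    with_info=True,
    as_supervised=True,
)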

With the above code, training_set has 2569 entries and validation_set has 1101.
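As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic behind a 70/30 percent split. The helper name expected_split_sizes is hypothetical, and the total of 3670 is simply the sum of the two sizes reported above. The rounding to the nearest integer is an assumption that only matters when the boundary is not a whole number, which is not the case here.

```python
def expected_split_sizes(total, percent):
    """Return the sizes of the [:percent%] and [percent%:] parts of a split.

    Assumption: the boundary sits at total * percent / 100, rounded to the
    nearest integer. For 3670 examples and 70% the boundary is exact.
    """
    boundary = round(total * percent / 100)
    return boundary, total - boundary


# 3670 = 2569 + 1101, the two sizes reported in the answer.
train_size, val_size = expected_split_sizes(3670, 70)
print(train_size, val_size)
```

If the dataset has been loaded with with_info=True, the total can be read from dataset_info.splits['train'].num_examples instead of being hard-coded.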

Thank you Saman for the comment on API deprecation:
In earlier versions of TensorFlow Datasets it was possible to use the tfds.Split API, which is now deprecated:

(training_set, validation_set), dataset_info = tfds.load(
    'tf_flowers',
    split=[
        tfds.Split.TRAIN.subsplit(tfds.percent[:70]),
        tfds.Split.TRAIN.subsplit(tfds.percent[70:])
    ],
    with_info=True,
    as_supervised=True,
)
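When migrating code away from the deprecated subsplit calls above, the percent bounds can be turned into the newer slice strings mechanically. The following is only a sketch: percent_slice is a hypothetical helper, not part of tensorflow_datasets, and it only formats strings that could then be passed to the split parameter of tfds.load.

```python
def percent_slice(split_name, start=None, stop=None):
    """Build a slice string such as 'train[:70%]' from percent bounds.

    Hypothetical helper for illustration; a bound of None leaves that
    side of the slice open.
    """
    lo = "" if start is None else f"{start}%"
    hi = "" if stop is None else f"{stop}%"
    return f"{split_name}[{lo}:{hi}]"


# Equivalent of subsplit(tfds.percent[:70]) and subsplit(tfds.percent[70:]).
splits = [percent_slice("train", stop=70), percent_slice("train", start=70)]
print(splits)  # ['train[:70%]', 'train[70%:]']
```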

answered Nov 14 '22 20:11

sebastian-sz
