What type of neural network can handle variable input and output sizes?

I'm trying to use the approach described in this paper https://arxiv.org/abs/1712.01815 to make the algorithm learn a new game.

There is only one problem that does not directly fit into this approach. The game I am trying to learn has no fixed board size. So currently the input tensor has dimensions m*n*11, where m and n are the dimensions of the game board and can vary each time the game is played. So first of all I need a neural network able to make use of such varying input sizes.

The size of the output is also a function of the board size: the output is a vector with one entry for every possible move on the board, so it grows as the board gets bigger.

I have read about recurrent and recursive neural networks, but they all seem to relate to NLP, and I'm not sure how to translate that to my problem.

Any ideas on NN architectures able to handle my case would be welcome.

431

asked Apr 04 '18 16:04

Damian Szkaut

1 Answer

What you need are Pointer Networks (https://arxiv.org/abs/1506.03134).

Here is an introductory quote from a post about them:

Pointer networks are a new neural architecture that learns pointers to positions in an input sequence. This is new because existing techniques need to have a fixed number of target classes, which isn't generally applicable— consider the Travelling Salesman Problem, in which the number of classes is equal to the number of inputs. An additional example would be sorting a variably sized sequence. - https://finbarr.ca/pointer-networks/

It's an attention-based model.

Essentially a pointer network is used to predict pointers back to the input, meaning your output layer isn't actually fixed, but variable.
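To make the "variable output" point concrete, here is a minimal sketch of the pointing step, using only the Python standard library. It is an illustration under assumptions, not the paper's exact formulation: the paper scores each input with an additive attention function, while this sketch swaps in a plain dot product for brevity, and the names `pointer_distribution`, `encoder_states`, and `query` are made up for the example.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pointer_distribution(encoder_states, query):
    """Return one probability per input position.

    Assumption: dot-product scoring stands in for the additive
    attention used in the Pointer Networks paper.
    """
    scores = [dot(state, query) for state in encoder_states]
    return softmax(scores)

random.seed(0)
dim = 4
query = [random.gauss(0, 1) for _ in range(dim)]
for n_inputs in (3, 7):
    states = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_inputs)]
    probs = pointer_distribution(states, query)
    # The number of "classes" follows the number of inputs.
    print(n_inputs, len(probs))
```

The same weights work for 3 inputs or 7, because the softmax runs over input positions rather than over a fixed set of classes.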

One use case where I have used them is translating raw text into SQL queries.

Input: "HOW MANY CARS WERE SOLD IN US IN 1983"
Output: SELECT COUNT(Car_id) FROM Car_table WHERE (Country='US' AND Year='1983')

The issue with raw text like this is that it only makes sense with respect to a specific table (in this case a car table with a set of variables around car sales, similar to your different boards for board games). This means the question can't be the only input. The input that actually goes into the pointer network is a combination of:

Query
Metadata of the table (column names)
Token vocabulary for all categorical columns
Keywords from SQL syntax (SELECT, WHERE, etc.)

All of these are appended together.

The output layer then simply points back to specific indexes of the input. It points to Country and Year (from the column names in the metadata), to US and 1983 (from the tokens in the vocabulary of the categorical columns), and to SELECT, WHERE, etc. (from the SQL syntax component of the input).

The sequence of these indexes into the appended input is then used as the output of your computation graph, and it is optimized using an existing training dataset, WikiSQL.
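As a rough sketch of that bookkeeping (no model involved), the snippet below builds an appended input for the car example and derives the pointer indexes that a training target would contain. Everything here is an assumption made for illustration: the helper names, the simplified tokenization (no parentheses or quotes), treating the table name as part of the metadata, and taking the first matching position when a token appears more than once.

```python
def build_appended_input(question, metadata, cell_values, sql_keywords):
    """Concatenate every source of tokens the output may point at."""
    return sql_keywords + metadata + cell_values + question

def pointer_targets(appended, output_tokens):
    """Index of each output token in the appended input.

    Assumption: the first matching position is used as the target.
    """
    return [appended.index(token) for token in output_tokens]

def decode(appended, pointers):
    """Turn a sequence of pointers back into tokens."""
    return [appended[i] for i in pointers]

question = "HOW MANY CARS WERE SOLD IN US IN 1983".split()
metadata = ["Car_table", "Car_id", "Country", "Year"]  # table and column names
cell_values = ["US", "1983"]                           # categorical vocabulary
sql_keywords = ["SELECT", "COUNT", "FROM", "WHERE", "AND", "="]

appended = build_appended_input(question, metadata, cell_values, sql_keywords)

# Simplified token form of the SQL query from the example above.
output_tokens = ["SELECT", "COUNT", "Car_id", "FROM", "Car_table",
                 "WHERE", "Country", "=", "US", "AND", "Year", "=", "1983"]

pointers = pointer_targets(appended, output_tokens)
print(pointers)
print(" ".join(decode(appended, pointers)))
```

The network never emits a token directly; it emits `pointers`, and the query is recovered by looking those positions up in the appended input.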

Your case is quite similar: you need to pass the inputs, the metadata of the game, and whatever you need as part of your output, all appended together as a single input. The pointer network then simply makes selections from that input (points to them).
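For the board game, one way to read this is to treat every cell of the m*n*11 tensor as one input element and let the network point at a cell. The sketch below is only an illustration of that idea under strong assumptions: one candidate move per cell, a single shared weight vector standing in for a real encoder and decoder, a hypothetical `legal` mask, and at least one legal move on the board.

```python
import math
import random

def move_policy(board, weights, legal):
    """Return an m x n grid of move probabilities.

    board:   m x n x c nested lists (c features per cell)
    weights: c floats shared by every cell (stand-in for a trained model)
    legal:   m x n booleans; illegal cells get probability 0
    Assumes at least one legal cell.
    """
    scores = []
    for row, legal_row in zip(board, legal):
        for cell, ok in zip(row, legal_row):
            score = sum(w * x for w, x in zip(weights, cell))
            scores.append(score if ok else -math.inf)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) is 0.0
    total = sum(exps)
    probs = [e / total for e in exps]
    n = len(board[0])
    return [probs[i:i + n] for i in range(0, len(probs), n)]

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(11)]
for m, n in ((3, 3), (5, 8)):
    board = [[[random.random() for _ in range(11)] for _ in range(n)]
             for _ in range(m)]
    legal = [[True] * n for _ in range(m)]
    legal[0][0] = False  # pretend the corner is not playable
    policy = move_policy(board, weights, legal)
    print(m, n, len(policy), len(policy[0]))
```

Because the weights are shared across cells and the softmax runs over cells, the same parameters produce a policy for a 3x3 board and for a 5x8 board.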

104

answered Sep 22 '22 05:09

Akshay Sehgal

Tags: machine-learning, neural-network, conv-neural-network, rnn
