
Do we need to use beam search in the training process?

Using beam search in a seq2seq model's decoder generally gives better results, and there are several TensorFlow implementations of it. But with the softmax loss applied at each cell (time step), beam search can't be used during training. So is there some modified objective or optimization method for training with beam search?

asked May 28 '17 by Shamane Siriwardhana

People also ask

What is beam search in machine learning?

In computer science, beam search is a heuristic search algorithm that explores a graph by expanding the most promising node in a limited set. Beam search is an optimization of best-first search that reduces its memory requirements.

How is beam search applied to machine translation?

The beam search strategy generates the translation word by word from left to right while keeping a fixed number (the beam) of active candidates at each time step. Increasing the beam size can improve translation performance at the expense of significantly reducing decoder speed.
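As an illustration of that strategy, here is a minimal beam-search decoder over a toy bigram model. The vocabulary and probabilities are invented purely for this sketch; a real system would take scores from a trained model instead:

```python
import math

# Toy next-token log-probabilities: given the previous token, each
# possible next token and its log-probability. "</s>" ends a sequence.
# These numbers are made up for illustration only.
LOGPROBS = {
    "<s>":  {"the": math.log(0.6), "a": math.log(0.4)},
    "the":  {"cat": math.log(0.5), "dog": math.log(0.5)},
    "a":    {"cat": math.log(0.9), "dog": math.log(0.1)},
    "cat":  {"</s>": 0.0},
    "dog":  {"</s>": 0.0},
}

def beam_search(beam_size, max_len=5):
    """Keep the `beam_size` highest-scoring partial hypotheses per step."""
    beams = [(0.0, ["<s>"])]           # (cumulative log-prob, tokens)
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "</s>":    # finished hypotheses carry over
                candidates.append((score, tokens))
                continue
            for tok, lp in LOGPROBS[tokens[-1]].items():
                candidates.append((score + lp, tokens + [tok]))
        # prune: keep only the beam_size best candidates
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_size]
    return beams

for score, tokens in beam_search(beam_size=2):
    print(" ".join(tokens), round(math.exp(score), 2))
```

With beam size 2 the decoder keeps both "the …" and "a …" alive and ends up ranking "a cat" (probability 0.36) above "the cat" (0.30), even though "the" was the more likely first word.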

What is the tradeoff between the greedy search and beam search algorithms?

Beam Search makes two improvements over Greedy Search. With Greedy Search, we took just the single best word at each position; Beam Search expands this and takes the best 'N' words. And where Greedy Search considered each position in isolation, Beam Search scores whole partial sequences.

What is beam search decoding and why is it used in machine translation?

Beam search is the go-to method for decoding auto-regressive machine translation models. While it yields consistent improvements in terms of BLEU, it is only concerned with finding outputs with high model likelihood, and is thus agnostic to whatever end metric or score practitioners care about.

How does beam search work?

Beam search is a heuristic search technique that always expands the W best nodes at each level. It progresses level by level, moving downwards only from the best W nodes at each level, and constructs its search tree using breadth-first search.


Which is better greedy search or beam width?

The hyperparameter 'N' is known as the beam width. Intuitively it makes sense that this gives us better results over Greedy Search, because what we are really interested in is the best complete sentence, and we might miss that if we picked only the best individual word at each position.
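To make the "best complete sentence" point concrete, here is a tiny worked comparison on an invented two-word model, where greedy search commits to the most likely first word and thereby misses the highest-probability sentence:

```python
# Made-up probabilities of complete two-word sentences (w1, w2).
P = {
    ("the", "cat"): 0.30, ("the", "dog"): 0.30,   # "the" totals 0.60
    ("a",   "cat"): 0.36, ("a",   "dog"): 0.04,   # "a"   totals 0.40
}

# Greedy: pick the best first word, then the best continuation of it.
first_word_prob = {"the": 0.60, "a": 0.40}
w1 = max(first_word_prob, key=first_word_prob.get)
greedy = max(p for pair, p in P.items() if pair[0] == w1)

# Beam search with width >= 2 (here, exhaustive search over this tiny
# space) recovers the best complete sentence instead.
best_pair, best_p = max(P.items(), key=lambda kv: kv[1])

print("greedy:", w1, greedy)          # "the", 0.3
print("beam  :", best_pair, best_p)   # ("a", "cat"), 0.36
```

Greedy locks in "the" (marginal probability 0.60) and can do no better than 0.30, while the best complete sentence, "a cat" at 0.36, starts with the individually less likely word.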

What is beam (business event analysis & modelling)?

Business Event Analysis & Modelling (BEAM) is an agile requirement gathering for Data Warehouses, with the goal of aligning requirement analysis with business processes rather than just reports. It has its roots in Agile Data Warehouse Design by Lawrence Corr and Jim Stagnitto [1].


2 Answers

As Oliver mentioned, in order to use beam search in the training procedure we have to use beam-search optimization, which is described in the paper Sequence-to-Sequence Learning as Beam-Search Optimization.

We can't use beam search in the training procedure with the current loss function, because the current loss is a log loss taken at each time step — a greedy approach. This is also clearly explained in the paper Sequence to Sequence Learning with Neural Networks; section 3.2 covers this case neatly.

The training objective in section 3.2 is to maximize the average log-probability of the correct translation T given the source sentence S:

1/|S| * Σ_{(T,S)∈S} log p(T|S)

"where S is the training set. Once training is complete, we produce translations by finding the most likely translation according to the LSTM:"

T̂ = argmax_T p(T|S)

So the original seq2seq architecture uses beam search only at test time. If we want to use beam search at training time, we have to use a different loss and optimization method, as in the paper above.
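That per-time-step log loss can be sketched as follows under teacher forcing (the token distributions are invented for illustration). Each step is scored against the gold prefix independently, which is why no search over hypotheses happens during training:

```python
import math

# Teacher-forced training loss: the sum over time steps of
# -log p(gold token | gold prefix). The softmax outputs below are
# made up; a real model would produce them from the encoder state.
gold = ["a", "cat", "</s>"]
step_probs = [                        # model's softmax at each step,
    {"the": 0.6, "a": 0.4},           # conditioned on the *gold* prefix
    {"cat": 0.9, "dog": 0.1},
    {"</s>": 1.0},
]

nll = -sum(math.log(p[tok]) for tok, p in zip(gold, step_probs))
print("teacher-forced NLL:", nll)     # equals -log p(gold sequence)
```

Because every step conditions on the gold prefix rather than on the model's own (beam-searched) predictions, the loss decomposes into independent per-step terms — this is the mismatch that beam-search optimization methods aim to remove.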

answered Nov 10 '22 by Shamane Siriwardhana


Sequence-to-Sequence Learning as Beam-Search Optimization is a paper that describes the steps necessary to use beam search in the training process: https://arxiv.org/abs/1606.02960

The following issue contains a script that can perform the beam search, but it does not contain any of the training logic: https://github.com/tensorflow/tensorflow/issues/654

answered Nov 10 '22 by Oliver