
Is iteration necessary in the following piece of code?

Here's a piece of code from the xss_clean method of the Input_Core class of the Kohana framework:

do
{
    // Remove really unwanted tags
    $old_data = $data;
    $data = preg_replace('#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i', '', $data);
}
while ($old_data !== $data);

Is the do ... while loop necessary? I would think that the preg_replace call would do all the work in just one iteration.

Emanuil Rusev asked Mar 06 '10 21:03



1 Answer

Well, it's necessary if a replacement can create a new match for the next iteration to catch. It's not very wasteful, though, because at worst it's only one additional check.

Going by the code it matches, it seems unlikely that it will create new matches by replacement, however: it's very strict about what it matches.

EDIT: To be more specific, it tries to match an opening angle bracket, optionally followed by a slash, followed by one of several keywords, optionally followed by any number of characters that are not a closing angle bracket, and finally a closing angle bracket. If the input follows that syntax, it'll be swallowed whole. If it's malformed (e.g. multiple opening and closing angle brackets), it'll generate garbage until it can't find substrings matching the initial sequence anymore.
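The pieces described above can be spelled out in an annotated version of the pattern. This is only a sketch: the keyword list is shortened to two entries, the `~` delimiter and the `x` flag are my choices so that each piece can carry a comment, and the sample inputs are made up.

```php
<?php
// Same structure as the pattern in the question, but with a shortened
// keyword list (assumption) and the x flag so comments are allowed.
$pattern = '~
    <                    # opening angle bracket
    /*                   # zero or more slashes
    (?:script|iframe)    # one of the keywords
    [^>]*+               # any characters that are not a closing angle bracket
    >                    # closing angle bracket
~ix';

// Well-formed tags are swallowed whole in a single pass.
$well_formed = preg_replace($pattern, '', '<script>alert(1)</script>');
echo $well_formed, "\n"; // alert(1)

// Malformed input: the match ends at the first '>', so a stray '>' is left over.
$malformed = preg_replace($pattern, '', '<script <b>>text');
echo $malformed, "\n";   // >text
```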

So, no. Unless you have input like <<iframe>iframe>, where removing the inner <iframe> leaves a new <iframe> behind, no repetition is necessary. But then you're dealing with a level of tag soup the regex isn't good enough for anyway (e.g. it will fail on < iframe> with the extra space).
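Here is a small sketch of that case. The pattern is copied verbatim from the question; the helper names strip_once and strip_until_stable are hypothetical and only exist for this comparison.

```php
<?php
// Pattern copied verbatim from the question.
$pattern = '#</*(?:applet|b(?:ase|gsound|link)|embed|frame(?:set)?|i(?:frame|layer)|l(?:ayer|ink)|meta|object|s(?:cript|tyle)|title|xml)[^>]*+>#i';

// Hypothetical helper: a single replacement pass, no loop.
function strip_once(string $pattern, string $data): string
{
    return preg_replace($pattern, '', $data);
}

// Hypothetical helper: repeat until nothing changes, like the do ... while.
function strip_until_stable(string $pattern, string $data): string
{
    do {
        $old_data = $data;
        $data = preg_replace($pattern, '', $data);
    } while ($old_data !== $data);

    return $data;
}

$nested = '<<iframe>iframe>';

// One pass removes the inner <iframe> and leaves a fresh one behind.
$single = strip_once($pattern, $nested);
var_dump($single); // string(8) "<iframe>"

// The loop catches the tag created by the first replacement.
$looped = strip_until_stable($pattern, $nested);
var_dump($looped); // string(0) ""

// The extra space defeats the pattern, with or without the loop.
$spaced = strip_until_stable($pattern, '< iframe>');
var_dump($spaced); // string(9) "< iframe>"
```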

EDIT2: It's also a bit odd that the pattern matches zero or more slashes at the beginning of the tag (it should be zero or one). The final *+ is not a mistake, though: in PCRE, *+ is a possessive quantifier (zero or more, without giving characters back on backtracking), so [^>]*+ simply consumes everything up to the closing angle bracket.
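Both observations can be checked directly. The sketch below assumes PHP's PCRE engine, where *+ is a possessive quantifier (zero or more, never giving characters back on backtracking); the test strings and the reduced patterns are made up for illustration.

```php
<?php
// '/*' accepts any number of slashes, not just zero or one.
$many_slashes = preg_match('#^</*script>$#i', '<///script>');
var_dump($many_slashes); // int(1)

// Before a '>', greedy [^>]* and possessive [^>]*+ give the same result,
// because [^>] can never consume the '>' that follows.
$greedy     = preg_replace('#<b[^>]*>#', '', 'x<b class="y">z');
$possessive = preg_replace('#<b[^>]*+>#', '', 'x<b class="y">z');
var_dump($greedy === $possessive); // bool(true)

// The difference shows when backtracking would be needed:
// greedy a* gives one 'a' back, possessive a*+ keeps them all and fails.
$greedy_match     = preg_match('#^a*a$#', 'aaa');
$possessive_match = preg_match('#^a*+a$#', 'aaa');
var_dump($greedy_match);     // int(1)
var_dump($possessive_match); // int(0)
```

In the pattern from the question the possessive form does not change what gets matched; it only stops the engine from retrying shorter runs when the closing bracket is missing.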

Alan Plum answered Sep 28 '22 19:09