Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular expression on voxel space

Is there a way to loosely describe an object (via pattern matching finite automata, for example) in 3d voxel grid in the same way we can loosely describe patterns in one-dimensional string with regexp?

Let's say I want to describe a cuboid of "A" type voxels with lower facet composed of "B" or "C" type voxels with height 3 and width 5, and match this description to voxel field to find examples of pattern. I can do some search for exact models (kind-of-like-Boyer-Moore-in-3D) but I need to specify variable dimensions for some objects (like variable length for aforementioned cuboid).

like image 392
Kuroki Kaze Avatar asked Sep 21 '11 20:09

Kuroki Kaze


People also ask

What is the regular expression for space?

The most common forms of whitespace you will use with regular expressions are the space (␣), the tab (\t), the new line (\n) and the carriage return (\r) (useful in Windows environments), and these special characters match each of their respective whitespaces.

Do you need to escape space in regex?

Also, there's nothing special about the space character in regular expressions, you should be able to search for it the same as you would search for any other character. That is, unless you disabled pattern whitespace, which would hardly be necessary in this case.

What is voxel grid?

A voxel grid geometry is a 3D grid of values organized into layers of rows and columns. Each row, column, and layer intersection in the grid is called a voxel or small 3D cube.

What is voxel data?

Data. A voxel represents a single sample, or data point, on a regularly spaced, three-dimensional grid. This data point can consist of a single piece of data, such as an opacity, or multiple pieces of data, such as a color in addition to opacity.


1 Answers

Regular expressions are a compact way to express the syntax of a limited (and still infinite) set of languages. With regular expressions you don't need to tell where to look for the next symbol, since it is known that you are working on a string and iterating over its characters taking them as the symbols of the language... but in 3D you will need to tell what way to go.

You can think about it as a 3D Turing machine, that is a Turing machine that has an internal state and can read symbols from a 3D "tape", since we are only verifying let's ignore writing to the tape. Then, this Turing machine will walk the 3D "tape" aka the 3D voxel grid and read voxels as symbols, after reading each symbol the internal state of the Turing machine will change according to certain laws. Once the execution is over the final state of the machine tell you if it were a match or not. Now these laws in a Von Newman Architecture are an interpretation of the data from tape as instruction, but we want a Harvard architecture, that is that the instructions are separated from the data. Now what you want is a way to describe those instructions for the Turing machine. [You can think of this as the turtle of Logo but in 3D].

Following the spirit of the regular expressions we would prefer a language that resembles the actual structure. If we make it text based it will be a descriptive language (because an imperative one is no better than your favorite Turing complete one), it will have to say for example (in English):

There is a voxel type A and then moving (x1, y1, z1) from that there is a voxel of type B or C and then moving (x2, y2, z3) from that there is a voxel type D

This describes what the Turing machine is looking for, with a backtracking algorithm that test all potential matches it will work as expected.

Please note that I don't know the set of possible values of the voxels. That is, I don’t know the alphabet. So I've just said type A, type B, type C and type D as examples, one of those may be the representation of no voxel, and the others may be colors or whatever you are using. There can be as many types of voxels as needed. If the type of the voxel is complex the description of it got to be inserted there.

I've been thinking of practical use this kind of language and one problem that arises very quickly is rotations, I have to decide that three voxels type A over the X axis is considered the same a three voxels type A over the Z axis, the better is to allow to describe that in the language.

Now it is very similar to describe a path if the voxels are nodes, I've done a language to describe 2D paths as part of a private project (to store them in a database, go figure...), it is very simple, it reserve a character for every direction and uses a number for the steps, for example: "2d5l1u". Doing the same for 3D and adding a way to group and match will do. To solve the rotation problem it will be necessary to expand the direction to allow disjunctions to express alternative configurations for the match. This will become clearer on some example of how it may work that I've thought of (I've not worked a formal syntax in EBNF or similar):

Matching a line of three voxels type A over the X axis:

(A1X){3}

Here I intercalate matching "A" with movement "1X", use parenthesis "(" and ")" for grouping and the curly brackets "{" and "}" to quantify. This unrolls to this:

A1XA1XA1X

The last "1X" doesn't affect the result, so it may as well be:

A1XA1XA

And it clearly says: Match a type A voxel, move 1 over the X and match a type A voxel, move 1 over the X and match a type A voxel.

Matching a line of three voxels type A over the X axis or the Z axis:

(A1X){3}|(A1Z){3}

Alternative:

(A1[X|Z]){3}

Here I use brackets "[" and "]" to make a 'class', the location of it tells it is about the direction and it only includes the possible axis, do not confuse with:

[(A1X)|(A1Z)]{3}

This will match three voxels type A but they may not be on the same axis, they only got to be contiguous and share either the X axis or the Z axis with it neighbor.

Matching a 3x3 set of voxels type a over the plane X, Y:

(((A1X){3})1Y){3}

This matches a line over the X axis and the moves 1 over the Y axis to match another and so on. It implies that after grouping a repetition "([(A1X)]{16})" we backtrack to where the match begun to execute the following move "1Y". To unroll it would be:

(A1XA1XA1X)1Y(A1XA1XA1X)1Y(A1XA1XA1X)1Y

Look the remaining parenthesis, those means to backtrack to where the match begun. So the program will check what is inside of the group and when it is done it will go back where it was before entering the group and continue executing after it.

Matching a pair of voxels type A separated by a voxel of ignored type (over any axis):

A1(X|Y|Z).1(X|Y|Z)A1(X|Y|Z)

Influenced by regular expressions we use the dot "." to represent any type of voxel.

I still don't decide if it is better to use negative values than to use other letters for other axis, also I consider that the number 1 could be optional. Also other parts of the syntax of regular expressions such as "+", "*", and "?" got to be though out more carefully. It may be good to enforce "{" and "}" for any quantification until proven that there is no ambiguity.

As you may have notice it will not be a problem to add another direction of movement or entirely another axis, so this port very well to say four dimensions, as in:

(A1[X|Y|Z|W]){3}

It may also be good to use the dot "." to represent any direction:

(A1.){3}

There is a problem with assuming any direction when not specified and that is that this language is defined to identify what is a direction and distinguish them from voxel types based on the location inside of the expression. So "(A1B1){3}" will not map to "(A1.B1.){3}" because it will take "B" as the direction to move, it may be possible to infer the meaning by the trailing number at the end, but I don't know if it will be unambigous.

Lastly this will match any valid tetris piece in the plane X, Y made of voxels type A:

(A1[X|Y]){4}

If we assume that the world is only bidimensional and that we allow to ignore the number one, it reduces to this:

(A.){4}

And I'm happy with that. Nevertheless, you should consider a more complex, less compact and more readable notation for complex structures.

And that is my theory for generalizing regular expressions to two, three or more dimensions.

EDIT:

If the type of voxel is complex or causes ambiguity I propose to write it with angular brackets "<" and ">", so for example you can use the hex value of the raw voxel data:

(<0088FF>.){4}
like image 112
Theraot Avatar answered Sep 22 '22 20:09

Theraot