Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Matching brackets in a string

What is the most efficient or elegant method for matching brackets in a string such as:

"f @ g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]] // z"

for the purpose of identifying and replacing [[ Part ]] brackets with the single character forms?

I want to get:

enter image description here

With everything else intact, such as the prefix @ and postfix // forms intact


An explanation of Mathematica syntax for those unfamiliar:

Functions use single square brackets for arguments: func[1, 2, 3]

Part indexing is done with double square brackets: list[[6]] or with single-character Unicode double brackets: list〚6〛

My intent is to identify the matching [[ ]] form in a string of ASCII text, and replace it with the Unicode characters 〚 〛

like image 400
Mr.Wizard Avatar asked Apr 25 '11 07:04

Mr.Wizard


People also ask

How do you match brackets?

Two brackets are considered to be matching if the an opening bracket i.e. ( , [ , or { occurs to the left of a closing bracket i.e. ) , ] , or } of the exact same type. A matching pair of brackets will be considered balanced if all enclosing brackets are matched correctly.

Which brackets are used in string?

The string must be consists of only opening and closing brackets i.e. '(' and ')'. An equal point is an index such that the number of opening brackets before it is equal to the number of closing brackets from and after.

How do you check if a given string contains valid parentheses?

Push an opening parenthesis on top of the stack. In case of a closing bracket, check if the stack is empty. If not, pop in a closing parenthesis if the top of the stack contains the corresponding opening parenthesis. If the parentheses are valid,​ then the stack will be empty once the input string finishes.


2 Answers

Ok, here is another answer, a bit shorter:

Clear[replaceDoubleBrackets];
replaceDoubleBrackets[str_String, openSym_String, closeSym_String] := 
Module[{n = 0},
  Apply[StringJoin, 
   Characters[str] /. {"[" :> {"[", ++n}, 
     "]" :> {"]", n--}} //. {left___, {"[", m_}, {"[", mp1_}, 
      middle___, {"]", mp1_}, {"]", m_}, right___} /; 
       mp1 == m + 1 :> {left, openSym, middle, 
        closeSym, right} /. {br : "[" | "]", _Integer} :> br]]

Example:

In[100]:= replaceDoubleBrackets["f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]", "(", ")"]

Out[100]= "f[g[h(i(j[2], k(1, m(1, n[2]))))]]"

EDIT

You can also use Mathematica built-in facilities, if you want to replace double brackets specifically with the symbols you indicated:

Clear[replaceDoubleBracketsAlt];
replaceDoubleBracketsAlt[str_String] :=
  StringJoin @@ Cases[ToBoxes@ToExpression[str, InputForm, HoldForm],
     _String, Infinity]

In[117]:= replaceDoubleBracketsAlt["f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]"]

Out[117]= f[g[h[[i[[j[2],k[[1,m[[1,n[2]]]]]]]]]]]

The result would not show here properly, but it is a Unicode string with the symbols you requested.

like image 166
Leonid Shifrin Avatar answered Oct 19 '22 16:10

Leonid Shifrin


When I wrote my first solution, I hadn't noticed that you just wanted to replace the [[ with in a string, and not an expression. You can always use HoldForm or Defer as

enter image description here

but I think you already knew that, and you want the expression as a string, just like the input (ToString@ on the above doesn't work)

As all the answers so far focus on string manipulations, I'll take a numeric approach instead of wrestling with strings, which is more natural to me. The character code for [ is 91 and ] is 93. So doing the following

enter image description here

gives the locations of the brackets as a 0/1 vector. I've negated the closing brackets, just to aid the thought process and for use later on.

NOTE: I have only checked for divisibility by 91 and 93, as I certainly don't expect you to be entering any of the following characters, but if, for some reason you choose to, you can easily AND the result above with a boolean list of equality with 91 or 93.

enter image description here

From this, the positions of the first of Part's double bracket pair can be found as

enter image description here

The fact that in mma, expressions do not start with [ and that more than two [ cannot appear consecutively as [[[... has been implicitly assumed in the above calculation.

Now the closing pair is trickier to implement, but simple to understand. The idea is the following:

  • For each non-zero position in closeBracket, say i, go to the corresponding position in openBracket and find the first non-zero position to the left of it (say j).
  • Set doubleCloseBrackets[[i-1]]=closeBracket[[i]]+openBracket[[j]]+doubleOpenBrackets[[j]].
  • You can see that doubleCloseBrackets is the counterpart of doubleOpenBrackets and is non-zero at the position of the first of Part's ]] pair.

enter image description here

enter image description here

So now we have a set of Boolean positions for the first open bracket. We simply have to replace the corresponding element in charCode with the equivalent of and similarly, with the Boolean positions for the first close bracket, we replace the corresponding element in charCode with the equivalent of .

enter image description here

Finally, by deleting the element next to the ones that were changed, you can get your modified string with [[]] replaced by 〚 〛

enter image description here

NOTE 2:

A lot of my MATLAB habits have crept in the above code, and is not entirely idiomatic in Mathematica. However, I think the logic is correct, and it works. I'll leave it to you to optimize it (me thinks you can do away with Do[]) and make it a module, as it would take me a lot longer to do it.

Code as text

Clear["Global`*"]
str = "f[g[h[[i[[j[2], k[[1, m[[1, n[2]]]]]]]]]]]";
charCode = ToCharacterCode@str;
openBracket = Boole@Divisible[charCode, First@ToCharacterCode["["]];
closeBracket = -Boole@
    Divisible[charCode, First@ToCharacterCode["]"]];
doubleOpenBracket = 
  Append[Differences@Accumulate[openBracket], 0] openBracket;
posClose = Flatten@Drop[Position[closeBracket, Except@0, {1}], 1];

doubleCloseBracket = ConstantArray[0, Dimensions@doubleOpenBracket];
openBracketDupe = openBracket + doubleOpenBracket;
Do[
  tmp = Last@
    Flatten@Position[openBracketDupe[[1 ;; i]], Except@0, {1}];
  doubleCloseBracket[[i - 1]] = 
   closeBracket[[i]] + openBracketDupe[[tmp]];
  openBracketDupe[[tmp]] = 0;,
  {i, posClose}];

changeOpen = 
  Cases[Range[First@Dimensions@charCode]  doubleOpenBracket, Except@0];
changeClosed = 
  Cases[Range[First@Dimensions@charCode]  doubleCloseBracket, 
   Except@0];
charCode[[changeOpen]] = ToCharacterCode["\[LeftDoubleBracket]"];
charCode[[changeClosed]] = ToCharacterCode["\[RightDoubleBracket]"];
FromCharacterCode@
 Delete[Flatten@charCode, 
  List /@ (Riffle[changeOpen, changeClosed] + 1)]
like image 44
abcd Avatar answered Oct 19 '22 15:10

abcd