I was very confused about this question. I know about finding the edit distance between 2 strings using recursion and dynamic programming as an improvement, however am confused about how to go with this one.
Not sure if my thinking is correct. But we have a string of parenthesis which is unbalanced say
String s = "((())))";
How to find the String with balanced Parenthesis which requires minimum number of edits ?
Can some one explain this with an example ?
I am still not sure if I am explaining it correctly.
A parentheses string is balanced if: Any left parenthesis '(' must have a corresponding two consecutive right parenthesis '))' . Left parenthesis '(' must go before the corresponding two consecutive right parenthesis '))' .
Are the parentheses balanced? The most basic version of this problem asks, given a string containing only ( and ) , are the parentheses in the string balanced? For the parentheses to be balanced, each open parenthesis must have a corresponding close parenthesis, in the correct order. For example: ((())) is balanced.
If there is a closing bracket, check the top of the stack. If the top of the stack contains the opening bracket match of the current closing bracket, then pop and move ahead in the string. If the top of the stack is not the opening bracket match of the current closing bracket, the parentheses are not balanced.
The valid parentheses problem involves checking that: all the parentheses are matched, i.e., every opening parenthesis has a corresponding closing parenthesis. the matched parentheses are in the correct order, i.e., an opening parenthesis should come before the closing parenthesis.
Given a string consisting of left and right parentheses, we are asked to balance it by performing a minimal number of delete, insert, and replace operations.
To begin with, let's look at the input string and distinguish matched pairs from unmatched characters. We can mark all the characters belonging to matched pairs by executing the following algorithm:
The marked pairs are already balanced at zero cost, so the optimal course of action is to do nothing further with them.
Now let's consider the unmarked characters. Notice that no unmarked '(' is followed by an unmarked ')', or else the pair would have been marked. Therefore, if we scan the unmarked characters from left to right, we will find zero or more ')' characters followed by zero or more '(' characters.
To balance the sequence of ')' characters, it is optimal to rewrite every other one to '(', starting with the first one and excluding the last one. If there is an odd number of ')' characters, it is optimal to delete the last one.
As for the sequence of '(' characters, it is optimal to rewrite every other one to ')', starting with the second one. If there is a leftover '(' character, we delete it. The following Python code implements the steps described above and displays the intermediate results.
def balance(s):  # s is a string of '(' and ')' characters in any order
  n = len(s)
  print('original string: %s' % s)
  # Mark all matched pairs
  marked = n * [ False ]
  left_parentheses = []
  for i, ch in enumerate(s):
    if ch == '(':
      left_parentheses.append(i)
    else:
      if len(left_parentheses) != 0:
        marked[i] = True
        marked[left_parentheses.pop()] = True
  # Display the matched pairs and unmatched characters.
  matched, remaining = [], []
  for i, ch in enumerate(s):
    if marked[i]:
      matched.append(ch)
      remaining.append(' ')
    else:
      matched.append(' ')
      remaining.append(ch)
  print('  matched pairs: %s' % ''.join(matched))
  print('      unmatched: %s' % ''.join(remaining))
  cost = 0
  deleted = n * [ False ]
  new_chars = list(s)
  # Balance the unmatched ')' characters.
  right_count, last_right = 0, -1
  for i, ch in enumerate(s):
    if not marked[i] and ch == ')':
      right_count += 1
      if right_count % 2 == 1:
        new_chars[i] = '('
        cost += 1
        last_right = i
  if right_count % 2 == 1:      # Delete the last ')' if we couldn't match it.
    deleted[last_right] = True  # The cost was incremented during replacement.
  # Balance the unmatched '(' characters.
  left_count, last_left = 0, -1
  for i, ch in enumerate(s):
    if not marked[i] and ch == '(':
      left_count += 1
      if left_count % 2 == 0:
        new_chars[i] = ')'
        cost += 1
      else:
        last_left = i
  if left_count % 2 == 1:      # Delete the last '(' if we couldn't match it.
    deleted[last_left] = True  # This character wasn't replaced, so we must
    cost += 1                  # increment the cost now.
  # Display the outcome of replacing and deleting.
  balanced = []
  for i, ch in enumerate(new_chars):
    if marked[i] or deleted[i]:
      balanced.append(' ')
    else:
      balanced.append(ch)
  print('        balance: %s' % ''.join(balanced))
  # Display the cost of balancing and the overall balanced string.
  print('           cost: %d' % cost)
  result = []
  for i, ch in enumerate(new_chars):
    if not deleted[i]:  # Skip deleted characters.
      result.append(ch)
  print('     new string: %s' % ''.join(result))
balance(')()(()())))()((())((')
For the test case ')()(()())))()((())((', the output is as follows.
original string: )()(()())))()((())((
  matched pairs:  ()(()())  () (())
      unmatched: )        ))  (    ((
        balance: (        )   (    )
           cost: 4
     new string: (()(()()))()((()))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With