
How to find repeated subsequences of numbers in a big string of numbers?

Can anyone help me solve this problem?

The problem is:

Assumption 1: We have an unknown number of substrings (s1, s2, s3, …). Each substring is a sequence of 100 numbers (integers between 20000000 and 80000000) chosen at random. We know neither the numbers that make up these substrings nor how many substrings there are. What matters is the order of the numbers within a substring, not any relation between them.

Assumption 2: We have a very long string containing millions of numbers. This long string is built by repeating the substrings from Assumption 1. We call this string "S".

A simplified example: each substring contains four numbers instead of 100, and each number is between 20 and 80 instead of between 20000000 and 80000000. Given the string "S" below, the algorithm must find the substrings s1, s2 and s3.

S= 71,59,32,51,45,22,53,25,66,72,71,26,32,28,45,72,59,51,53,66,59,51,53,66,59,51,53,66,22,59,51,25,72,32,26,53,28,66,45,72,71,32,45,72,71,32,45,72, ... .

The output of this algorithm is like below:

S1= 59,51,53,66
S2= 22,25,26,28
S3= 71,32,45,72
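
Since only the order of the numbers inside a substring matters, one way to sanity-check a candidate result is to test whether it occurs in "S" in order, possibly with other numbers in between. The following is a minimal sketch of such a check, not a solution to the problem itself; the class name `SubsequenceCheck` and the method `isSubsequence` are hypothetical names chosen for illustration, and the input is only the first ten numbers of the example "S".

```java
import java.util.Arrays;
import java.util.List;

class SubsequenceCheck {

    // Returns true if every element of candidate appears in s in the same order,
    // possibly with other numbers in between. (Hypothetical helper, not from the question.)
    static boolean isSubsequence(List<Integer> candidate, List<Integer> s) {
        int matched = 0;
        for (int value : s) {
            // Advance only when the next expected number of the candidate is seen.
            if (matched < candidate.size() && candidate.get(matched) == value) {
                matched++;
            }
        }
        return matched == candidate.size();
    }

    public static void main(String[] args) {
        // First ten numbers of the example string S.
        List<Integer> prefixOfS = Arrays.asList(71, 59, 32, 51, 45, 22, 53, 25, 66, 72);
        System.out.println(isSubsequence(Arrays.asList(59, 51, 53, 66), prefixOfS)); // s1 in order
        System.out.println(isSubsequence(Arrays.asList(71, 32, 45, 72), prefixOfS)); // s3 in order
        System.out.println(isSubsequence(Arrays.asList(66, 53, 51, 59), prefixOfS)); // s1 reversed
    }
}
```

A check like this only confirms that a candidate is consistent with "S"; it does not by itself discover the substrings.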

NOTE: If we are lucky, a substring may appear in "S" uncombined, that is, not mixed with the others and repeated one copy directly after another.
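
The lucky case in the note could be detected with a simple scan, sketched below under two assumptions that are not stated as guarantees in the question: the block length is known in advance (4 in the simplified example, 100 in the real problem), and a block counts as found when it is immediately followed by an identical copy. The names `AdjacentRepeats` and `findAdjacentRepeats` are hypothetical, and the sample input is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class AdjacentRepeats {

    // Scans s left to right and collects each distinct block of blockSize numbers
    // that is immediately followed by an identical copy of itself.
    static List<List<Integer>> findAdjacentRepeats(int[] s, int blockSize) {
        List<List<Integer>> found = new ArrayList<>();
        int i = 0;
        while (i + 2 * blockSize <= s.length) {
            boolean repeated = true;
            for (int k = 0; k < blockSize; k++) {
                if (s[i + k] != s[i + blockSize + k]) {
                    repeated = false;
                    break;
                }
            }
            if (repeated) {
                List<Integer> block = new ArrayList<>();
                for (int k = 0; k < blockSize; k++) {
                    block.add(s[i + k]);
                }
                if (!found.contains(block)) {
                    found.add(block);
                }
                i += 2 * blockSize; // skip both copies of the block
            } else {
                i++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample: two blocks, each repeated back to back, separated by a stray 9.
        int[] sample = {1, 2, 3, 4, 1, 2, 3, 4, 9, 5, 6, 7, 8, 5, 6, 7, 8};
        System.out.println(findAdjacentRepeats(sample, 4)); // prints [[1, 2, 3, 4], [5, 6, 7, 8]]
    }
}
```

This scan cannot tell where a substring really starts. If a back-to-back run is preceded by the tail of an earlier copy, the scan may report a rotation instead, for instance 45,72,71,32 rather than 71,32,45,72, so its output is best treated as a set of candidates to verify.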

I want an algorithm that finds the number of substrings (s1, s2, s3, …) and also the substrings themselves (s1, s2, s3, …) that make up the string "S".

Thanks a lot.

user3588552 asked Oct 30 '22 04:10


1 Answer

Hope this will work:

import java.util.*;

public class ComputeSubSequence {

    public static void main(String[] args) {
        String rootString = "59,22,51,25,53,66,26,28,59,51,22,53,25,66,71,26,32,28,45,59,72,51,71,53,66,32,45,72,22,25,26,59,51,28,71,53,32,66,45,72";
        int sizeOfSubString = 4;
        List<String> rootList = Arrays.asList(rootString.split("\\s*,\\s*"));

        // Count each number in a single pass. LinkedHashMap keeps the numbers
        // in order of first appearance, which the grouping below relies on.
        Map<Integer, Integer> frequencyByNumber = new LinkedHashMap<>();
        for (String token : rootList) {
            frequencyByNumber.merge(Integer.valueOf(token), 1, Integer::sum);
        }

        // Group numbers that occur equally often. Frequencies are compared as
        // exact integers (a textual contains() check would also match 3 against 13).
        Map<Integer, List<Integer>> numbersByFrequency = new LinkedHashMap<>();
        for (Map.Entry<Integer, Integer> entry : frequencyByNumber.entrySet()) {
            numbersByFrequency.computeIfAbsent(entry.getValue(), f -> new ArrayList<>()).add(entry.getKey());
        }

        // Flatten the groups, in order of first appearance of each frequency.
        List<Integer> listOfNames = new ArrayList<>();
        for (List<Integer> group : numbersByFrequency.values()) {
            listOfNames.addAll(group);
        }

        // Cut the flattened list into chunks of sizeOfSubString and print each one.
        // Substrings that share a frequency are separated only by first-appearance order.
        int count = 1;
        for (int end = sizeOfSubString; end <= listOfNames.size(); end += sizeOfSubString) {
            List<Integer> subString = listOfNames.subList(end - sizeOfSubString, end);
            System.out.println("S" + count + "=" + subString.toString().replace("]", "").replace("[", ""));
            count++;
        }
    }
}

VishalZ answered Nov 09 '22 22:11