
Filter out digits from a CSV File using Java

I am new to CSV parsing. I have a CSV file in which the 3rd column (a description field) may contain one or more 6-digit numbers mixed in with other text. I need to extract those numbers and write them to the adjacent column of the same row.

For example:

3rd column                       4th column
=============                    ===========
123456adjfghviu77                123456

shgdasd234567                    234567

123456abc:de234567:c567890d      123456-234567-567890

12654352474                        

Please help. This is what I have done so far.

        String strFile="D:/Input.csv";
        CSVReader reader=new CSVReader(new FileReader(strFile));

        String[] nextline;
        //int lineNumber=0;
        String str="^[\\d|\\s]{5}$";
        String regex="[^\\d]+";

        FileWriter fw = new FileWriter("D:/Output.csv");
        PrintWriter pw = new PrintWriter(fw);


        while((nextline=reader.readNext())!=null){
            //lineNumber++;
            //System.out.println("Line : "+lineNumber);
            if(nextline[2].toString().matches(str)){
                pw.print(nextline[1]);
                pw.append('\n');
                System.out.println(nextline[2]);
            }

        }
        pw.flush();

Asked by Ritesh, May 04 '26 20:05


1 Answer

I suggest matching only the 6-digit chunks and building a new string as you collect the matches:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "123456abc:de234567:c567890d";
StringBuilder result = new StringBuilder();
// Match 6-digit chunks that have no digit directly before or after them
Pattern pattern = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    if (result.length() == 0) {                       // If the result is still empty,
        result.append(matcher.group(0));              // add the 6-digit chunk
    } else {
        result.append("-").append(matcher.group(0));  // else add a delimiter, then the chunk
    }
}
System.out.println(result.toString());  // Prints 123456-234567-567890; write this value to your new column
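
To connect this to the rows in the question, here is a minimal sketch that wraps the same matching loop in a helper and appends its result as a new last column. The class name `SixDigitColumn`, the methods `extract` and `addColumn`, and the first two fields of each sample row are made up for illustration. It also assumes no field contains a quoted comma, so it uses a plain `split` instead of a real CSV parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SixDigitColumn {
    // Six digits with no digit directly before or after (the pattern from this answer)
    private static final Pattern SIX_DIGITS = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");

    // Returns all standalone 6-digit chunks joined with "-", or "" if there are none
    static String extract(String description) {
        List<String> chunks = new ArrayList<>();
        Matcher matcher = SIX_DIGITS.matcher(description);
        while (matcher.find()) {
            chunks.add(matcher.group());
        }
        return String.join("-", chunks);
    }

    // Appends the numbers found in the 3rd field as a new last column.
    // Assumption: no quoted commas in the data, so a plain split is enough here.
    static String addColumn(String csvLine) {
        String[] fields = csvLine.split(",", -1);
        String description = fields.length > 2 ? fields[2] : "";
        return csvLine + "," + extract(description);
    }

    public static void main(String[] args) {
        // Hypothetical rows; only the 3rd field of each comes from the question
        String[] rows = {
            "1,a,123456adjfghviu77",
            "2,b,shgdasd234567",
            "3,c,123456abc:de234567:c567890d",
            "4,d,12654352474"
        };
        for (String row : rows) {
            System.out.println(addColumn(row));
        }
    }
}
```

Inside the question's `readNext()` loop, the same helper could be called as `extract(nextline[2])` and its result written after the existing fields. If the data may contain quoted commas, it is safer to keep using the CSV library for both reading and writing.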


UPDATE: I have changed the pattern from "\\d{6}" to "(?<!\\d)\\d{6}(?!\\d)" so that it only matches 6-digit chunks with no digit immediately before or after them. With the plain "\\d{6}", an 11-digit value such as 12654352474 would wrongly yield 126543.
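
To see what the lookbehind and lookahead change, here is a small sketch that runs both patterns over the 11-digit value from the question's last row. The class name `PatternComparison` and the helper `findAll` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternComparison {
    // Plain pattern: any run of six digits, even inside a longer number
    static final Pattern LOOSE = Pattern.compile("\\d{6}");
    // Pattern from the answer: six digits with no digit directly before or after
    static final Pattern STRICT = Pattern.compile("(?<!\\d)\\d{6}(?!\\d)");

    // Collects every match of the given pattern in the input, in order
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String longNumber = "12654352474";  // 11 digits, from the question's last row
        System.out.println(findAll(LOOSE, longNumber));   // prints [126543]
        System.out.println(findAll(STRICT, longNumber));  // prints []
    }
}
```

The negative lookbehind `(?<!\\d)` rejects a match that starts right after a digit, and the negative lookahead `(?!\\d)` rejects one that is followed by a digit, so only runs of exactly six digits get through.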


Answered by Wiktor Stribiżew, May 06 '26 09:05