OpenCV EAST Text detector implementation in Java

Tags:

java

opencv

I am trying to port the text detection sample at the link below to Java. The original code is in C++.

https://github.com/opencv/opencv/blob/master/samples/dnn/text_detection.cpp

However, I am having trouble converting the lines below (131-136 in the .cpp file) to Java:

...
    const float* scoresData = scores.ptr<float>(0, 0, y);
    const float* x0_data = geometry.ptr<float>(0, 0, y);
    const float* x1_data = geometry.ptr<float>(0, 1, y);
...

I tried using every method of the OpenCV Mat class, but most of them throw exceptions!

My code so far:

    Net net = Dnn.readNet("C:\\frozen_east_text_detection.pb");

    Mat blob = Dnn.blobFromImage(resizedImg, 1.0, resizedImg.size(), new Scalar(123.68, 116.78, 103.94), true,
            false);
    net.setInput(blob);
    List<Mat> outs = new ArrayList<>();
    net.forward(outs, LAYER_NAMES);

    Mat scores = outs.get(0);
    Mat geometry = outs.get(1);

    int numRows = scores.size(2);
    int numCols = scores.size(3);

    List<RotatedRect> boxes = new ArrayList<>();
    List<Double> confidences = new ArrayList<>();

    System.out.printf("numRows = %d\n", scores.size(2)); 
    System.out.printf("numCols = %d\n", scores.size(3));

I am not very familiar with pointers, but as far as I can tell, the native scores object is a 4-D array. In Java it is a Mat object, and Java has no way to index an object the way C++ does above, or the way Python does below (taken from a Python port of the C++ program on another website):

for y in range(0, numRows):
    scoresData = scores[0, 0, y]
    xData0 = geometry[0, 0, y]
    xData1 = geometry[0, 1, y]
    xData2 = geometry[0, 2, y]
    xData3 = geometry[0, 3, y]
    anglesData = geometry[0, 4, y]

asked Nov 20 '18 21:11

inquizitive

1 Answer

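Here is the equivalent code in Java, which I found in this gist by berak. It assumes scores has been reshaped to height rows and geometry to 5 * height rows, as in the complete listing further down (where height and width are called H and W):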

Mat scoresData = scores.row(y);
Mat x0Data = geometry.submat(0, height, 0, width).row(y);
Mat x1Data = geometry.submat(height, 2 * height, 0, width).row(y);
Mat x2Data = geometry.submat(2 * height, 3 * height, 0, width).row(y);
Mat x3Data = geometry.submat(3 * height, 4 * height, 0, width).row(y);
Mat anglesData = geometry.submat(4 * height, 5 * height, 0, width).row(y);
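
Why this works: the geometry output is a contiguous 1 x 5 x H x W float blob, so viewing it as 5 * H rows of W values stacks the five planes vertically. Row y of plane c then sits at row c * H + y, which is the same run of floats that geometry.ptr<float>(0, c, y) points at in the C++ sample. Below is a minimal plain-Java sketch of that index arithmetic. It does not use OpenCV, and the class and method names are made up for illustration:

```java
public class PlaneIndexSketch {
    // Offset of element (0, c, y, x) in a contiguous 1 x C x H x W buffer.
    static int flatOffset(int c, int y, int x, int H, int W) {
        return (c * H + y) * W + x;
    }

    // Row index of (plane c, row y) once the same buffer is viewed as
    // (C * H) rows by W columns, i.e. with the planes stacked vertically.
    static int stackedRow(int c, int y, int H) {
        return c * H + y;
    }

    public static void main(String[] args) {
        int H = 80, W = 80;        // EAST output maps for a 320 x 320 input
        int c = 4, y = 10, x = 3;  // angle plane, row 10, column 3

        int row = stackedRow(c, y, H);
        System.out.println("stacked row = " + row);                        // 330
        System.out.println("flat offset = " + flatOffset(c, y, x, H, W));  // 26403
        System.out.println("row * W + x = " + (row * W + x));              // 26403
    }
}
```

The last two lines print the same number, which is the point: indexing the stacked 2-D view by (c * H + y, x) reaches the same element as indexing the 4-D blob by (0, c, y, x).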

Here is the complete text detection code in Java:

import java.util.ArrayList;
import java.util.List;
import org.opencv.core.*;
import org.opencv.dnn.Dnn;
import org.opencv.dnn.Net;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import org.opencv.utils.Converters;

public class SimpleSample {
    static {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
    }

    public static void main(String[] args) {
        float scoreThresh = 0.5f;
        float nmsThresh = 0.4f;
        // Model from https://github.com/argman/EAST
        // You can find it here : https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/download_models.py#L309
        Net net = Dnn.readNetFromTensorflow("c:/data/mdl/frozen_east_text_detection.pb");
        // input image
        Mat frame = Imgcodecs.imread("nantext.png");
        Imgproc.cvtColor(frame, frame, Imgproc.COLOR_RGBA2RGB);

        Size siz = new Size(320, 320);
        int W = (int)(siz.width / 4); // width of the output geometry / score maps
        int H = (int)(siz.height / 4); // height of those; after the reshape below, geometry holds 5 vertically stacked maps (4 distances + 1 angle), scores holds 1
        Mat blob = Dnn.blobFromImage(frame, 1.0,siz, new Scalar(123.68, 116.78, 103.94), true, false);
        net.setInput(blob);
        List<Mat> outs = new ArrayList<>(2);
        List<String> outNames = new ArrayList<String>();
        outNames.add("feature_fusion/Conv_7/Sigmoid");
        outNames.add("feature_fusion/concat_3");
        net.forward(outs, outNames);

        // Decode predicted bounding boxes.
        Mat scores = outs.get(0).reshape(1, H);
        // My lord and savior : http://answers.opencv.org/question/175676/javaandroid-access-4-dim-mat-planes/
        Mat geometry = outs.get(1).reshape(1, 5 * H); // don't hardcode it !
        List<Float> confidencesList = new ArrayList<>();
        List<RotatedRect> boxesList = decode(scores, geometry, confidencesList, scoreThresh);

        // Apply non-maximum suppression procedure.
        MatOfFloat confidences = new MatOfFloat(Converters.vector_float_to_Mat(confidencesList));
        RotatedRect[] boxesArray = boxesList.toArray(new RotatedRect[0]);
        MatOfRotatedRect boxes = new MatOfRotatedRect(boxesArray);
        MatOfInt indices = new MatOfInt();
        Dnn.NMSBoxesRotated(boxes, confidences, scoreThresh, nmsThresh, indices);

        // Render detections
        Point ratio = new Point((float)frame.cols()/siz.width, (float)frame.rows()/siz.height);
        int[] indexes = indices.toArray();
        for(int i = 0; i<indexes.length;++i) {
            RotatedRect rot = boxesArray[indexes[i]];
            Point[] vertices = new Point[4];
            rot.points(vertices);
            for (int j = 0; j < 4; ++j) {
                vertices[j].x *= ratio.x;
                vertices[j].y *= ratio.y;
            }
            for (int j = 0; j < 4; ++j) {
                Imgproc.line(frame, vertices[j], vertices[(j + 1) % 4], new Scalar(0, 0,255), 1);
            }
        }
        Imgcodecs.imwrite("out.png", frame);
    }

    private static List<RotatedRect> decode(Mat scores, Mat geometry, List<Float> confidences, float scoreThresh) {
        // size of 1 geometry plane
        int W = geometry.cols();
        int H = geometry.rows() / 5;
        //System.out.println(geometry);
        //System.out.println(scores);

        List<RotatedRect> detections = new ArrayList<>();
        for (int y = 0; y < H; ++y) {
            Mat scoresData = scores.row(y);
            Mat x0Data = geometry.submat(0, H, 0, W).row(y);
            Mat x1Data = geometry.submat(H, 2 * H, 0, W).row(y);
            Mat x2Data = geometry.submat(2 * H, 3 * H, 0, W).row(y);
            Mat x3Data = geometry.submat(3 * H, 4 * H, 0, W).row(y);
            Mat anglesData = geometry.submat(4 * H, 5 * H, 0, W).row(y);

            for (int x = 0; x < W; ++x) {
                double score = scoresData.get(0, x)[0];
                if (score >= scoreThresh) {
                    double offsetX = x * 4.0;
                    double offsetY = y * 4.0;
                    double angle = anglesData.get(0, x)[0];
                    double cosA = Math.cos(angle);
                    double sinA = Math.sin(angle);
                    double x0 = x0Data.get(0, x)[0];
                    double x1 = x1Data.get(0, x)[0];
                    double x2 = x2Data.get(0, x)[0];
                    double x3 = x3Data.get(0, x)[0];
                    double h = x0 + x2;
                    double w = x1 + x3;
                    Point offset = new Point(offsetX + cosA * x1 + sinA * x2, offsetY - sinA * x1 + cosA * x2);
                    Point p1 = new Point(-1 * sinA * h + offset.x, -1 * cosA * h + offset.y);
                    Point p3 = new Point(-1 * cosA * w + offset.x,      sinA * w + offset.y); // original trouble here !
                    RotatedRect r = new RotatedRect(new Point(0.5 * (p1.x + p3.x), 0.5 * (p1.y + p3.y)), new Size(w, h), -1 * angle * 180 / Math.PI);
                    detections.add(r);
                    confidences.add((float) score);
                }
            }
        }
        return detections;
    }
}
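
To check the box arithmetic in decode() by hand, it helps to run it for a single cell with easy numbers. The sketch below repeats the same formulas in plain Java without OpenCV. The Box class is a hypothetical stand-in for RotatedRect, and I am reading x0..x3 as the distances to the top, right, bottom and left edges, which is how the formulas behave at angle 0:

```java
public class EastCellDecodeSketch {
    // Hypothetical result holder; the real code builds an org.opencv.core.RotatedRect.
    static final class Box {
        final double cx, cy, width, height, angleDeg;
        Box(double cx, double cy, double width, double height, double angleDeg) {
            this.cx = cx; this.cy = cy;
            this.width = width; this.height = height;
            this.angleDeg = angleDeg;
        }
    }

    // Same formulas as the inner loop of decode(), for one cell (x, y) of the output map.
    static Box decodeCell(int x, int y, double x0, double x1, double x2, double x3, double angle) {
        double cosA = Math.cos(angle);
        double sinA = Math.sin(angle);
        double h = x0 + x2;
        double w = x1 + x3;
        // Each output cell covers a 4 x 4 patch of the network input, hence the factor 4.
        double offX = x * 4.0 + cosA * x1 + sinA * x2;
        double offY = y * 4.0 - sinA * x1 + cosA * x2;
        double p1x = -sinA * h + offX, p1y = -cosA * h + offY;
        double p3x = -cosA * w + offX, p3y = sinA * w + offY;
        return new Box(0.5 * (p1x + p3x), 0.5 * (p1y + p3y), w, h, -angle * 180.0 / Math.PI);
    }

    public static void main(String[] args) {
        // Cell (10, 5) maps to input pixel (40, 20); the box is axis-aligned (angle 0).
        Box b = decodeCell(10, 5, 8, 12, 4, 6, 0.0);
        System.out.println("center = (" + b.cx + ", " + b.cy + ")");  // center = (43.0, 18.0)
        System.out.println("size   = " + b.width + " x " + b.height); // size   = 18.0 x 12.0
    }
}
```

With these inputs the box spans x from 34 to 52 and y from 12 to 24 around pixel (40, 20), so a center of (43, 18) and a size of 18 x 12 is what you would expect.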

answered Sep 22 '22 01:09

Stanley
