
Highlighting Text in Java

We are developing a plagiarism detection framework. In it, I have to highlight the possibly plagiarized phrases in the document. The document is first preprocessed with stop-word removal, stemming and number removal, so it is difficult to map the preprocessed tokens back to the text to highlight. As an example:

Original text: "Extreme programming is one approach of agile software development which emphasizes on frequent releases in short development cycles which are called time boxes. This result in reducing the costs spend for changes, by having multiple short development cycles, rather than one long one. Extreme programming includes pair-wise programming (for code review, unit testing). Also it avoids implementing features which are not included in the current time box, so the schedule creep can be minimized. "

Phrase I want to highlight: Extreme programming includes pair-wise programming

Preprocessed tokens: Extrem program pair-wise program

Is there any way I can highlight the preprocessed tokens in the original document?

Thanks

Nuwan asked Jun 30 '11 04:06


2 Answers

You'd better use JTextPane or JEditorPane instead of JTextArea.

A text area is a "plain" text component, which means that although it can display text in any font, all of the text is in the same font.

So JTextArea is not a convenient component for text formatting.

By contrast, with JTextPane or JEditorPane it is quite easy to change the style of (and so highlight) any part of the loaded text.

See How to Use Editor Panes and Text Panes for details.
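
As a rough illustration of the styled-document route, here is a minimal sketch. The class and method names (StyledHighlightSketch, documentOf, highlightAll) are made up for this example, and the yellow background is an arbitrary choice. It works directly on a StyledDocument, which is the model behind a JTextPane, and still only finds exact matches of the phrase.

```java
import java.awt.Color;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

class StyledHighlightSketch {

    // Hypothetical helper: wraps plain text in a styled document.
    static StyledDocument documentOf(String text) {
        StyledDocument doc = new DefaultStyledDocument();
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e); // offset 0 is always valid
        }
        return doc;
    }

    // Gives every exact occurrence of phrase a yellow background
    // and returns how many occurrences were marked.
    static int highlightAll(StyledDocument doc, String phrase) {
        if (phrase == null || phrase.isEmpty()) {
            return 0; // an empty phrase would match everywhere
        }
        String text;
        try {
            text = doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        SimpleAttributeSet marked = new SimpleAttributeSet();
        StyleConstants.setBackground(marked, Color.YELLOW);

        int count = 0;
        int pos = text.indexOf(phrase);
        while (pos >= 0) {
            // false = add the background, keep any other attributes
            doc.setCharacterAttributes(pos, phrase.length(), marked, false);
            count++;
            pos = text.indexOf(phrase, pos + phrase.length());
        }
        return count;
    }

    public static void main(String[] args) {
        StyledDocument doc = documentOf(
                "rather than one long one. Extreme programming includes "
                + "pair-wise programming (for code review, unit testing).");
        int n = highlightAll(doc, "Extreme programming includes pair-wise programming");
        System.out.println(n + " occurrence(s) highlighted");
        // To display the result, pass the document to a text pane:
        // new JTextPane(doc)
    }
}
```

Because the styling lives in the document rather than in the component, the same method can be run before the text pane is even created.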

Update:

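The following code highlights the desired part of your text. It's not exactly what you want: it simply finds the exact phrase in the text. It uses the component's Highlighter, which is available on every JTextComponent, so it works even with the JTextArea used here.

But I hope that once you plug in your own algorithms, you can easily modify it to fit your needs.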

import java.lang.reflect.InvocationTargetException;
import javax.swing.*;
import javax.swing.text.*;
import java.awt.*;

public class LineHighlightPainter {

    String revisedText = "Extreme programming is one approach "
            + "of agile software development which emphasizes on frequent"
            + " releases in short development cycles which are called "
            + "time boxes. This result in reducing the costs spend for "
            + "changes, by having multiple short development cycles, "
            + "rather than one long one. Extreme programming includes "
            + "pair-wise programming (for code review, unit testing). "
            + "Also it avoids implementing features which are not included "
            + "in the current time box, so the schedule creep can be minimized. ";
    String token = "Extreme programming includes pair-wise programming";

    public static void main(String args[]) {
        try {
            SwingUtilities.invokeAndWait(new Runnable() {

                public void run() {
                    new LineHighlightPainter().createAndShowGUI();
                }
            });
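        } catch (InterruptedException ex) {
            // Restore the interrupt flag instead of silently swallowing it
            Thread.currentThread().interrupt();
        } catch (InvocationTargetException ex) {
            // createAndShowGUI() threw an exception; report it
            ex.printStackTrace();
        }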
    }

    public void createAndShowGUI() {
        JFrame frame = new JFrame("LineHighlightPainter demo");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

        JTextArea area = new JTextArea(9, 45);
        area.setLineWrap(true);
        area.setWrapStyleWord(true);
        area.setText(revisedText);

        // Highlighting part of the text in the instance of JTextArea
        // based on token.
        highlight(area, token);

        frame.getContentPane().add(new JScrollPane(area), BorderLayout.CENTER);
        frame.pack();
        frame.setVisible(true);
    }

    // Creates highlights around all occurrences of pattern in textComp
    public void highlight(JTextComponent textComp, String pattern) {
        // First remove all old highlights
        removeHighlights(textComp);

        // An empty pattern matches at every position and would loop forever
        if (pattern == null || pattern.isEmpty()) {
            return;
        }

        try {
            Highlighter hilite = textComp.getHighlighter();
            Document doc = textComp.getDocument();
            String text = doc.getText(0, doc.getLength());

            int pos = 0;
            // Search for pattern
            while ((pos = text.indexOf(pattern, pos)) >= 0) {
                // Create highlighter using private painter and apply around pattern
                hilite.addHighlight(pos, pos + pattern.length(), myHighlightPainter);
                pos += pattern.length();
            }

        } catch (BadLocationException e) {
            // Not expected: every offset comes from indexOf on the document's own text
            e.printStackTrace();
        }
    }

    // Removes only our private highlights
    public void removeHighlights(JTextComponent textComp) {
        Highlighter hilite = textComp.getHighlighter();
        Highlighter.Highlight[] hilites = hilite.getHighlights();

        for (int i = 0; i < hilites.length; i++) {
            if (hilites[i].getPainter() instanceof MyHighlightPainter) {
                hilite.removeHighlight(hilites[i]);
            }
        }
    }
    // An instance of the private subclass of the default highlight painter
    Highlighter.HighlightPainter myHighlightPainter = new MyHighlightPainter(Color.red);

    // A private subclass of the default highlight painter
    class MyHighlightPainter
            extends DefaultHighlighter.DefaultHighlightPainter {

        public MyHighlightPainter(Color color) {
            super(color);
        }
    }
}

This example is based on Highlighting Words in a JTextComponent.
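
To get from the exact-phrase search above to the preprocessed tokens in the question, one possible approach is to record each token's offsets in the original text while preprocessing, match the preprocessed phrase against the surviving tokens, and highlight from the start of the first matched token to the end of the last. The sketch below is only an illustration of that idea: the stop-word list, the stem method and all class and method names are hypothetical stand-ins for your own preprocessing pipeline, chosen so that the example from the question works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenOffsetSketch {

    // A surviving token: its preprocessed form plus its span in the original text.
    static final class Token {
        final String norm;
        final int start;
        final int end;

        Token(String norm, int start, int end) {
            this.norm = norm;
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical stop-word list; "includes" is listed only because
    // the example in the question drops it.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "is", "of", "the", "in", "on", "which", "are", "for", "includes"));

    // Words made of letters and digits, optionally joined by hyphens.
    static final Pattern WORD =
            Pattern.compile("[\\p{L}\\p{N}]+(?:-[\\p{L}\\p{N}]+)*");

    // Toy stand-in for a real stemmer: strips a few suffixes.
    static String stem(String word) {
        if (word.contains("-")) {
            return word; // leave hyphenated words alone, as in the question
        }
        for (String suffix : new String[] {"ming", "ing", "es", "e", "s"}) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Preprocesses the text but remembers where each kept token came from.
    static List<Token> preprocess(String original) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(original);
        while (m.find()) {
            String word = m.group().toLowerCase(Locale.ROOT);
            boolean isNumber = word.chars().allMatch(Character::isDigit);
            if (isNumber || STOP_WORDS.contains(word)) {
                continue; // dropped by preprocessing, so never matched
            }
            tokens.add(new Token(stem(word), m.start(), m.end()));
        }
        return tokens;
    }

    // Returns {start, end} offsets in the original text of the first run of
    // tokens equal to the preprocessed phrase, or null if there is none.
    static int[] locate(String original, String preprocessedPhrase) {
        List<Token> tokens = preprocess(original);
        String[] wanted = preprocessedPhrase.trim()
                .toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + wanted.length <= tokens.size(); i++) {
            int j = 0;
            while (j < wanted.length && tokens.get(i + j).norm.equals(wanted[j])) {
                j++;
            }
            if (j == wanted.length) {
                return new int[] {tokens.get(i).start, tokens.get(i + j - 1).end};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "Extreme programming is one approach of agile software "
                + "development. Extreme programming includes pair-wise "
                + "programming (for code review, unit testing).";
        int[] span = locate(text, "Extrem program pair-wise program");
        // prints: Extreme programming includes pair-wise programming
        System.out.println(text.substring(span[0], span[1]));
    }
}
```

The two offsets returned by locate can be passed straight to hilite.addHighlight(start, end, myHighlightPainter) in the code above, in place of the indexOf search. Stop words that were removed between the first and last matched token (here, "includes") end up inside the highlighted span automatically.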

MockerTim answered Oct 04 '22 04:10


From a technical point of view, you can either choose or develop a markup language and add annotations or tags to the original document, or create a second file that records all potential plagiarisms.

With markup, your text could look like this:

[...] rather than one long one. <plag ref="1234">Extreme programming 
includes pair-wise programming</plag> (for code review, unit testing). [...]

(where ref points to a metadata record that describes the original source)
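
A minimal sketch of how such tags could be inserted, assuming the suspicious spans are already known as character offsets and do not overlap. The class name, the method name and the {start, end, ref} array layout are invented for this illustration. The spans are processed from the end of the text backwards so that inserting a tag never shifts the offsets of the spans still to be handled.

```java
import java.util.Arrays;
import java.util.Comparator;

class PlagMarkupSketch {

    // Each span is {start, end, ref}; spans are assumed not to overlap.
    static String annotate(String text, int[][] spans) {
        int[][] sorted = spans.clone();
        // Work from the last span to the first so earlier offsets stay valid
        Arrays.sort(sorted, Comparator.comparingInt((int[] s) -> s[0]).reversed());

        StringBuilder out = new StringBuilder(text);
        for (int[] s : sorted) {
            out.insert(s[1], "</plag>");                      // closing tag first
            out.insert(s[0], "<plag ref=\"" + s[2] + "\">");  // then the opening tag
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "rather than one long one. Extreme programming includes "
                + "pair-wise programming (for code review, unit testing).";
        String phrase = "Extreme programming includes pair-wise programming";
        int start = text.indexOf(phrase);
        int[][] spans = {{start, start + phrase.length(), 1234}};
        System.out.println(annotate(text, spans));
    }
}
```

If the document itself can contain characters such as < or &, they would need to be escaped before the tags are added; that step is left out here.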

Andreas Dolk answered Oct 04 '22 03:10