
I have tried to optimize my program's memory use, but GC is still making it lag

I have written a piece of software in Java that checks whether proxies are working by sending an HTTP request through each proxy.

It takes around 30,000 proxies from a database, then attempts to check if they are operational. The proxies received from the database used to be returned as an ArrayList<String>, but have been changed to Deque<String> for reasons stated below.

The program uses a ProxyRequest object that stores the IP and port as a String and an int respectively. ProxyRequest has a method isWorkingProxy() that attempts to send a request through the proxy and returns a boolean indicating whether it succeeded.

ProxyRequest is extended by RunnableProxyRequest, which calls super.isWorkingProxy() in its overridden run() method. Based on the result of super.isWorkingProxy(), RunnableProxyRequest updates a MySQL database.

Note that the MySQL update runs inside a synchronized block.

It runs on 750 threads using a FixedThreadPool (on a VPS), but towards the end, it becomes very slow (stuck on ~50 threads), which obviously implies the garbage collector is working. This is the problem.

I have attempted the following to reduce the lag, but neither seems to work:

1) Using a Deque<String> proxies and calling Deque.pop() to obtain each proxy String. This (I believe) continuously shrinks the Deque<String>, which should reduce lag caused by the GC.

2) Calling con.setConnectTimeout(this.timeout); with this.timeout = 5000. This way, the connection should return a result within 5 seconds. If not, the task is finished and its thread should no longer be busy in the thread pool.

Besides this, I don't know any other way I can improve performance.

Can anyone recommend a way to improve performance so that the threads stop lagging towards the end because of the GC? I know there is a Stack Overflow question about this (Java threads slow down towards the end of processing), but I have tried everything in the answer and it has not worked for me.

Thank you for your time.

Code snippets:

Loop adding threads to the FixedThreadPool:

//This code is executed recursively (at the end, main(args) is called again)
//Create the threadpool for requests
//Threads is an argument that is set to 750.
ThreadPoolExecutor executor = (ThreadPoolExecutor)Executors.newFixedThreadPool(threads);
Deque<String> proxies = DB.getProxiesToCheck();

while(proxies.isEmpty() == false) {
    try {
        String[] split = proxies.pop().split(":");

        Runnable[] checks = new Runnable[] {
            //HTTP check
            new RunnableProxyRequest(split[0], split[1], Proxy.Type.HTTP, false),
            //SSL check
            new RunnableProxyRequest(split[0], split[1], Proxy.Type.HTTP, true),
            //SOCKS check
            new RunnableProxyRequest(split[0], split[1], Proxy.Type.SOCKS, false)
            //Add more checks to this list as time goes...
        };

        for(Runnable check : checks) {
            executor.submit(check);
        }

    } catch(IndexOutOfBoundsException e) {
        continue;
    }
}
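
For reference, the snippet above never shows the pool being shut down before main(args) runs again. The following is only a minimal sketch of one way to drain a pool at the end of a pass, not the code the program actually uses. The class PoolDrainSketch, the method runPass and the counting task are hypothetical stand-ins for the real RunnableProxyRequest checks.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDrainSketch {

    // Hypothetical helper: submits one task per queued proxy, then waits
    // for the pool to finish. Returns how many tasks completed.
    public static int runPass(Deque<String> proxies, int threads, long maxWaitSeconds)
            throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();

        while (!proxies.isEmpty()) {
            String[] split = proxies.pop().split(":");
            // Stand-in task: the real code would submit RunnableProxyRequest objects here.
            executor.submit(() -> {
                if (split.length == 2) {
                    completed.incrementAndGet();
                }
            });
        }

        // Stop accepting new tasks, but let the queued ones run.
        executor.shutdown();
        if (!executor.awaitTermination(maxWaitSeconds, TimeUnit.SECONDS)) {
            // Deadline passed: ask the remaining tasks to stop via interrupts.
            executor.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Deque<String> proxies = new ArrayDeque<>();
        proxies.push("10.0.0.1:8080");
        proxies.push("10.0.0.2:3128");
        System.out.println("completed: " + runPass(proxies, 4, 10));
    }
}
```

Note that shutdownNow() only interrupts the worker threads, and an interrupt does not generally unblock a thread that is waiting on a blocking socket read, so timeouts on the connection itself are still needed.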

ProxyRequest class:

//Proxy details
private String proxyIp;
private int proxyPort;
private Proxy.Type testingType;

//Request details
private boolean useSsl;

public ProxyRequest(String proxyIp, String proxyPort, Proxy.Type testingType, boolean useSsl) {
    this.proxyIp = proxyIp;
    try {
        this.proxyPort = Integer.parseInt(proxyPort);
    } catch(NumberFormatException e) {
        this.proxyPort = -1;
    }
    this.testingType = testingType;
    this.useSsl = useSsl;
}

public boolean isWorkingProxy() {
    //Case of an invalid proxy
    if(proxyPort == -1) {
        return false;
    }

    HttpURLConnection con = null;

    //Perform checks on URL
    //IF any exception occurs here, the proxy is obviously bad.
    try {
        URL url = new URL(this.getTestingUrl());
        //Create proxy
        Proxy p = new Proxy(this.testingType, new InetSocketAddress(this.proxyIp, this.proxyPort));
        //No redirect
        HttpURLConnection.setFollowRedirects(false);
        //Open connection with proxy
        con = (HttpURLConnection)url.openConnection(p);
        //Set the request method
        con.setRequestMethod("GET");
        //Set max timeout for a request.
        con.setConnectTimeout(this.timeout);
    } catch(MalformedURLException e) {
        System.out.println("The testing URL is bad. Please fix this.");
        return false;
    } catch(Exception e) {
        return false;
    }

    try(
            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
            ) {

        String inputLine = null; StringBuilder response = new StringBuilder();
        while((inputLine = in.readLine()) != null) {
            response.append(inputLine);
        }

        //A valid proxy!
        return con.getResponseCode() > 0;

    } catch(Exception e) {
        return false;
    }
}

RunnableProxyRequest class:

public class RunnableProxyRequest extends ProxyRequest implements Runnable {

    public RunnableProxyRequest(String proxyIp, String proxyPort, Proxy.Type testingType, boolean useSsl) {
        super(proxyIp, proxyPort, testingType, useSsl);
    }

    @Override
    public void run() {
        String test = super.getTest();

        if(super.isWorkingProxy()) {
            System.out.println("-- Working proxy: " + super.getProxy() + " | Test: " +  test);
            this.updateDB(true, test);
        } else {
            System.out.println("-- Not working: " + super.getProxy() + " | Test: " +  test);
            this.updateDB(false, test);
        }
    }

    private void updateDB(boolean success, String testingType) {
        switch(testingType) {
            case "SSL":
                DB.updateSsl(super.getProxyIp(), super.getProxyPort(), success);
                break;
            case "HTTP":
                DB.updateHttp(super.getProxyIp(), super.getProxyPort(), success);
                break;
            case "SOCKS":
                DB.updateSocks(super.getProxyIp(), super.getProxyPort(), success);
                break;
            default:
                break;
        }
    }
}

DB class:

//Locker for async 
private static Object locker = new Object();

private static void executeUpdateQuery(String query, String proxy, int port, boolean toSet) {
    synchronized(locker) {
        //Some prepared statements here.
    }
}
Asked by Raghav, Nov 08 '22 02:11


1 Answer

Thanks to Peter Lawrey for guiding me to the solution! :)
His comment:

@ILoveKali I have found network libraries are not aggressive enough in shutting down a connection when things go really wrong. Timeouts tend to work best when the connection is fine. YMMV

So I did some research and found that I also had to call setReadTimeout(this.timeout). Previously, I was only calling setConnectTimeout(this.timeout)!

Thanks to this post (HttpURLConnection timeout defaults) that explained the following:

Unfortunately, in my experience, it appears using these defaults can lead to an unstable state, depending on what happens with your connection to the server. If you use an HttpURLConnection and don't explicitly set (at least read) timeouts, your connection can get into a permanent stale state. By default. So always set setReadTimeout to "something" or you might orphan connections (and possibly threads depending on how your app runs).
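
To illustrate the difference, here is a minimal sketch rather than my actual checker. It assumes a local ServerSocket that accepts TCP connections in its backlog but never sends a reply, which stands in for a proxy that connects fine and then goes silent. The class ReadTimeoutDemo and the method respondsInTime are hypothetical names, and the request is sent directly (Proxy.NO_PROXY) to keep the example self-contained.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.URI;

public class ReadTimeoutDemo {

    // Hypothetical helper: true if the server answered within the given timeouts.
    public static boolean respondsInTime(String address, int timeoutMillis) {
        HttpURLConnection con = null;
        try {
            con = (HttpURLConnection) URI.create(address).toURL().openConnection(Proxy.NO_PROXY);
            con.setRequestMethod("GET");
            // Limits how long the TCP connect may take.
            con.setConnectTimeout(timeoutMillis);
            // Limits how long each read may block once connected.
            con.setReadTimeout(timeoutMillis);
            return con.getResponseCode() > 0;
        } catch (IOException e) {
            // SocketTimeoutException is an IOException, so timeouts end up here.
            return false;
        } finally {
            if (con != null) {
                con.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // The OS completes the TCP handshake, but nothing ever answers the request.
        try (ServerSocket silent = new ServerSocket(0)) {
            String address = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            long start = System.nanoTime();
            boolean ok = respondsInTime(address, 500);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("responded=" + ok + " after ~" + elapsedMs + " ms");
        }
    }
}
```

With the setReadTimeout line removed, the same call would be expected to block for as long as the silent server stays open, because the connect succeeds and only the read hangs.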

So the final answer is: the GC was doing just fine and was not responsible for the lag. The threads were simply stuck FOREVER, with the active count frozen at a single number, because I did not set the read timeout. As a result, isWorkingProxy() never returned and kept waiting on the read.
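
In hindsight, looking at what the threads were doing would have shown this sooner than guessing at the GC. Below is a minimal sketch of one way to do that from inside the program, using Thread.getAllStackTraces(). The class ThreadStateSketch and its methods are hypothetical, and a thread dump taken with the JDK's jstack tool gives similar information from outside the process.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSketch {

    // Counts all live threads in this JVM by their current state.
    public static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Prints the name, state and top stack frame of every live thread.
    public static void printTopFrames() {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = e.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            System.out.println(e.getKey().getName() + " [" + e.getKey().getState() + "] " + top);
        }
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        printTopFrames();
    }
}
```

Pool threads that are blocked on network I/O typically still report RUNNABLE, with a socket read method at or near the top of the stack, so many threads sitting on the same read frame points at missing timeouts rather than at the garbage collector.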

Answered by Raghav, Nov 14 '22 22:11