
Should I keep an open connection to Google cloud storage?

To communicate with Google Cloud Storage, I'm following this example: https://developers.google.com/storage/docs/json_api/v1/json-api-java-samples

Should I keep an open connection to the cloud? Wouldn't this cause memory or resource issues?

/** Global instance of the HTTP transport. */
private static HttpTransport httpTransport;

/** Global instance of the Storage client. */
private static Storage client;

Otherwise, should I close the connection after each get/delete request? What's the best practice?

I'm working on an application which will be deployed on Linux. The application will receive an HTTP POST request with a file to upload to the cloud. When the application first loads, I initialize the following as global variables:

// Initialize the transport.
httpTransport = GoogleNetHttpTransport.newTrustedTransport();

// Initialize the data store factory.
dataStoreFactory = new FileDataStoreFactory(DATA_STORE_DIR);

// Authorization.
Credential credential = authorize();

// Set up global Storage instance.
client = new Storage.Builder(httpTransport, JSON_FACTORY, credential)
    .setApplicationName(APPLICATION_NAME).build();

Is this the best practice? Will this implementation cause me memory/resources issues?
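For context, reusing one static client per process (as above) is generally safe because the Google HTTP transport and `Storage` client are designed to be shared across threads. A minimal sketch of that pattern using the initialization-on-demand holder idiom — note that `StorageClient` here is a hypothetical stand-in for the real `Storage` type, so the sketch stays self-contained:

```java
// Sketch of per-process client reuse via the initialization-on-demand
// holder idiom. StorageClient is a placeholder standing in for the real
// com.google.api.services.storage.Storage client built in the question.
public class StorageHolder {
    /** Placeholder for the real Storage client type. */
    static class StorageClient {
        final String applicationName;
        StorageClient(String applicationName) {
            this.applicationName = applicationName;
        }
    }

    // The JVM guarantees Holder.INSTANCE is initialized at most once, on
    // first access, so every request handler shares one client instance
    // instead of opening a new connection per request.
    private static class Holder {
        static final StorageClient INSTANCE = new StorageClient("my-app");
    }

    public static StorageClient getClient() {
        return Holder.INSTANCE;
    }
}
```

Every call to `getClient()` returns the same instance, which mirrors the "global Storage instance" in the question's setup.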

asked Nov 11 '22 by snabel

1 Answer

I think you missed Google's best-practices document. See "Uploading data to Cloud Storage" at https://cloud.google.com/storage/docs/best-practices, which should clear up your doubts:

  • If you use XMLHttpRequest (XHR) callbacks to get progress updates, do not close and re-open the connection if you detect that progress has stalled. Doing so creates a bad positive feedback loop during times of network congestion. When the network is congested, XHR callbacks can get backlogged behind the acknowledgement (ACK/NACK) activity from the upload stream, and closing and reopening the connection when this happens uses more network capacity at exactly the time when you can least afford it.
  • For upload traffic, we recommend setting reasonably long timeouts. For a good end-user experience, you can set a client-side timer that updates the client status window with a message (e.g., "network congestion") when your application hasn't received an XHR callback for a long time. Don't just close the connection and try again when this happens.
  • If you use Google Compute Engine instances with processes that POST to Cloud Storage to initiate a resumable upload, then you should use Compute Engine instances in the same locations as your Cloud Storage buckets. You can then use a geo IP service to pick the Compute Engine region to which you route customer requests, which will help keep traffic localized to a geo-region.
  • Avoid breaking a transfer into smaller chunks if possible and instead upload the entire content in a single chunk. Avoiding chunking removes fixed latency costs and improves throughput, as well as reducing QPS against Google Cloud Storage.
  • Situations where you should consider uploading in chunks include when your source data is being generated dynamically, your clients have request size limitations (which is true for many browsers), or your clients are unable to stream bytes in a single request without first loading the full request into memory. If your clients receive an error, they can query the server for the commit offset and resume uploading remaining bytes from that offset.
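The "query the server for the commit offset" step in the last bullet can be sketched in plain Java. In the Cloud Storage resumable-upload protocol, a status check on an interrupted upload returns HTTP 308 with a `Range` header such as `bytes=0-999`, meaning bytes 0 through 999 are committed and the client should resume at offset 1000; `fromRangeHeader` below is an illustrative helper, not part of any Google library:

```java
// Sketch: computing the resume offset from a resumable-upload status check.
// A 308 response carries a Range header like "bytes=0-999"; no Range header
// means no bytes have been committed yet.
public class ResumeOffset {
    public static long fromRangeHeader(String rangeHeader) {
        if (rangeHeader == null || rangeHeader.isEmpty()) {
            return 0L; // nothing committed yet; resume from the beginning
        }
        // "bytes=0-999" -> last committed byte index is 999, resume at 1000
        int dash = rangeHeader.lastIndexOf('-');
        long lastCommitted = Long.parseLong(rangeHeader.substring(dash + 1));
        return lastCommitted + 1;
    }
}
```

The client would then re-send the remaining bytes starting at the returned offset, rather than restarting the whole upload.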
answered Nov 14 '22 by Youdhveer