I know that with the Flurl HTTP .NET library I can set a global proxy by using a custom HttpClientFactory, but is there a way to choose a custom proxy for each request?
With many other programming languages, setting a proxy is as easy as setting an option. For example, with Node.js I can do:
    const request = require('request');
    let opts = { url: 'http://random.org', proxy: 'http://myproxy' };
    request(opts, callback);
The ideal way to do that with Flurl would be something like this, which is currently not possible:
    await "http://random.org".WithProxy("http://myproxy").GetAsync();
I also know that creating a FlurlClient/HttpClient for every request is not an option because of the socket exhaustion issue, which I've experienced myself in the past.
The scenario for this is when you need to have a pool of proxies that are rotated in some way, so that each HTTP request potentially uses a different proxy URL.
So after some discussion with the Flurl creator (#228 and #374), the solution we've come up with is to use a custom FlurlClient manager class, which is in charge of creating the required FlurlClients and the linked HttpClient instances. This is needed because each FlurlClient can only use one proxy at a time, due to limitations in how the .NET HttpClient is designed.
If you're looking for the actual solution (and code), you can skip to the end of this answer. The following section still helps if you want to understand better.
[UPDATE: I've also built an HTTP client library that takes care of all the stuff below, allowing you to set a per-request proxy out of the box. It's called PlainHttp.]
So, the first idea we explored was to create a custom FlurlClientFactory that implements the IFlurlClientFactory interface.
The factory keeps a pool of FlurlClients, and when a new request needs to be sent, the factory is invoked with the Url as the input parameter. Some logic is then performed to decide whether the request should go through a proxy or not. The URL could potentially be used as the discriminator for choosing the proxy to use for the particular request. In my case, a random proxy would be chosen for each request, and then a cached FlurlClient would be returned.
In the end, the factory would create one FlurlClient per proxy URL (which would then be used for all the requests that have to go through that proxy).
Some code for this solution can be found here. After registering the custom factory, there would not be much else to do. Standard requests like await "http://random.org".GetAsync(); would be automagically proxied, if the factory decided to do so.
Unfortunately, this solution has a drawback. It turns out that the custom factory is invoked multiple times during the process of building a request with Flurl. In my experience, it is called at least 3 times. This could lead to issues, because the factory might not return the same FlurlClient for the same input URL.
The solution is to build a custom FlurlClientManager class, to completely bypass the FlurlClient factory mechanism and keep a custom pool of clients that are provided on demand.
While this solution is specifically built to work with the awesome Flurl library, a very similar thing can be done using the HttpClient class directly.
    using System;
    using System.Collections.Concurrent;
    using System.Net;
    using System.Net.Http;
    using Flurl;
    using Flurl.Http;

    /// <summary>
    /// Static class that manages cached IFlurlClient instances
    /// </summary>
    public static class FlurlClientManager
    {
        /// <summary>
        /// Cache for the clients, keyed by host (normal clients) or by proxy URL (proxied clients)
        /// </summary>
        private static readonly ConcurrentDictionary<string, IFlurlClient> Clients =
            new ConcurrentDictionary<string, IFlurlClient>();

        /// <summary>
        /// Gets a cached client for the host associated to the input URL
        /// </summary>
        /// <param name="url"><see cref="Url"/> or <see cref="string"/></param>
        /// <returns>A cached <see cref="FlurlClient"/> instance for the host</returns>
        public static IFlurlClient GetClient(Url url)
        {
            if (url == null)
            {
                throw new ArgumentNullException(nameof(url));
            }

            return PerHostClientFromCache(url);
        }

        /// <summary>
        /// Gets a cached client with a proxy attached to it
        /// </summary>
        /// <returns>A cached <see cref="FlurlClient"/> instance with a proxy</returns>
        public static IFlurlClient GetProxiedClient()
        {
            string proxyUrl = ChooseProxy();
            return ProxiedClientFromCache(proxyUrl);
        }

        private static string ChooseProxy()
        {
            // Do something and return a proxy URL
            return "http://myproxy";
        }

        private static IFlurlClient PerHostClientFromCache(Url url)
        {
            // Reuse the cached client for this host, replacing it only if it was disposed
            return Clients.AddOrUpdate(
                key: url.ToUri().Host,
                addValueFactory: u => {
                    return CreateClient();
                },
                updateValueFactory: (u, client) => {
                    return client.IsDisposed ? CreateClient() : client;
                }
            );
        }

        private static IFlurlClient ProxiedClientFromCache(string proxyUrl)
        {
            // Reuse the cached client for this proxy, replacing it only if it was disposed
            return Clients.AddOrUpdate(
                key: proxyUrl,
                addValueFactory: u => {
                    return CreateProxiedClient(proxyUrl);
                },
                updateValueFactory: (u, client) => {
                    return client.IsDisposed ? CreateProxiedClient(proxyUrl) : client;
                }
            );
        }

        private static IFlurlClient CreateProxiedClient(string proxyUrl)
        {
            HttpMessageHandler handler = new SocketsHttpHandler()
            {
                Proxy = new WebProxy(proxyUrl),
                UseProxy = true,
                PooledConnectionLifetime = TimeSpan.FromMinutes(10)
            };

            HttpClient client = new HttpClient(handler);

            return new FlurlClient(client);
        }

        private static IFlurlClient CreateClient()
        {
            HttpMessageHandler handler = new SocketsHttpHandler()
            {
                PooledConnectionLifetime = TimeSpan.FromMinutes(10)
            };

            HttpClient client = new HttpClient(handler);

            return new FlurlClient(client);
        }
    }
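This static class keeps a global pool of FlurlClients. As with the previous solution, the pool consists of one client per proxy URL for proxied requests and, as the code above shows, one client per host for all other requests.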
In this implementation of the class, the proxy is chosen by the class itself (using whatever policy you want, e.g. round robin or random), but it can be adapted to take a proxy URL as the input. In that case, remember that with this implementation clients are never disposed after they're created, so you might want to think about that.
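As an illustration of the round-robin policy mentioned above, here is a minimal sketch of what a rotating proxy chooser could look like. The ProxyRotator class and the proxy URLs are hypothetical placeholders, not part of Flurl; a call to ProxyRotator.Next() could stand in for the body of ChooseProxy().

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of Flurl: rotates through a fixed proxy list.
public static class ProxyRotator
{
    // Placeholder proxy URLs; replace them with your own pool.
    private static readonly string[] Proxies =
    {
        "http://proxy1:8080",
        "http://proxy2:8080",
        "http://proxy3:8080"
    };

    // Starts at -1 so that the first increment yields index 0.
    private static int _counter = -1;

    // Thread-safe round robin: each call returns the next proxy in the list.
    public static string Next()
    {
        // Cast to uint so the index stays non-negative after the counter wraps around.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return Proxies[ticket % (uint)Proxies.Length];
    }
}
```

Interlocked.Increment keeps the counter consistent when several requests pick a proxy at the same time, which a plain counter++ would not guarantee.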
This implementation also uses the SocketsHttpHandler.PooledConnectionLifetime option, available since .NET Core 2.1, to solve the DNS issues that arise when your HttpClient instances have a long lifetime. On .NET Framework, the ServicePoint.ConnectionLeaseTimeout property should be used instead.
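For the .NET Framework case, a rough sketch of setting the lease timeout for one endpoint might look like the following. The LeaseTimeoutExample class and its Configure method are made-up names for illustration, and this only covers the simple case of requests sent directly to the endpoint.

```csharp
using System;
using System.Net;

// Hypothetical helper for .NET Framework; the class and method names are illustrative.
public static class LeaseTimeoutExample
{
    // Limits how long connections to the given endpoint are reused.
    public static ServicePoint Configure(string baseUrl, TimeSpan lease)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(new Uri(baseUrl));

        // ConnectionLeaseTimeout is expressed in milliseconds.
        servicePoint.ConnectionLeaseTimeout = (int)lease.TotalMilliseconds;

        return servicePoint;
    }
}
```

For example, LeaseTimeoutExample.Configure("http://random.org", TimeSpan.FromMinutes(10)) would mirror the 10-minute lifetime used in the manager class.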
Using the manager class is easy. For normal requests, use:

    await FlurlClientManager.GetClient(url).Request(url).GetAsync();

For proxied requests, use:

    await FlurlClientManager.GetProxiedClient().Request(url).GetAsync();
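As mentioned earlier, a very similar thing can be done with HttpClient directly. The following is only a sketch under a few assumptions: HttpClientManager is a hypothetical name, the proxy URL is passed in by the caller rather than chosen by the class, and SocketsHttpHandler requires .NET Core 2.1 or later. It wraps each client in a Lazy so that at most one HttpClient is created per proxy URL, even when several threads ask for the same proxy at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Net.Http;

// Hypothetical HttpClient-only variant of the manager class; not part of any library.
public static class HttpClientManager
{
    // One lazily created HttpClient per proxy URL.
    private static readonly ConcurrentDictionary<string, Lazy<HttpClient>> Clients =
        new ConcurrentDictionary<string, Lazy<HttpClient>>();

    // Returns the cached client for the given proxy, creating it on first use.
    public static HttpClient GetProxiedClient(string proxyUrl)
    {
        if (proxyUrl == null)
        {
            throw new ArgumentNullException(nameof(proxyUrl));
        }

        return Clients
            .GetOrAdd(proxyUrl, key => new Lazy<HttpClient>(() => Create(key)))
            .Value;
    }

    private static HttpClient Create(string proxyUrl)
    {
        var handler = new SocketsHttpHandler
        {
            Proxy = new WebProxy(proxyUrl),
            UseProxy = true,
            // Recycle pooled connections periodically so DNS changes are picked up.
            PooledConnectionLifetime = TimeSpan.FromMinutes(10)
        };

        return new HttpClient(handler);
    }
}
```

A proxied request would then be sent with something like await HttpClientManager.GetProxiedClient("http://myproxy").GetAsync("http://random.org"). As with the Flurl version, clients in this sketch are never disposed.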