I'm trying to port a simple async TCP server in F# to C# 4. The server receives a connection, reads a single request and streams back a sequence of responses before closing the connection.
Async in C# 4 looks tedious and error prone so I thought I'd try using WCF instead. This server is not unlikely to see 1,000 simultaneous requests in the wild so I think both throughput and latency are of interest.
I've written a minimal duplex WCF web service and console client in C#. Although I'm using WCF instead of raw sockets, this is already 175 lines of code compared to 80 lines for the original. But I'm more concerned about the performance and scalability:
Firstly, I'm using the default settings for everything so I'm wondering if there is anything I can tweak to improve these performance figures?
Secondly, I'm wondering if anyone is using WCF for this kind of thing or if it is the wrong tool for the job?
Here's my WCF server in C#:
public class Stock
{
    public DateTime FirstDealDate { get; set; }
    public DateTime LastDealDate { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
    public decimal Open { get; set; }
    public decimal High { get; set; }
    public decimal Low { get; set; }
    public decimal Close { get; set; }
    public decimal VolumeWeightedPrice { get; set; }
    public decimal TotalQuantity { get; set; }
}

[ServiceContract(CallbackContract = typeof(IPutStock))]
public interface IStock
{
    [OperationContract]
    void GetStocks();
}

public interface IPutStock
{
    [OperationContract]
    void PutStock(Stock stock);
}
<%@ ServiceHost Language="C#" Debug="true" Service="DuplexWcfService2.Stocks" CodeBehind="Service1.svc.cs" %>
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Multiple)]
public class Stocks : IStock
{
    IPutStock callback;

    #region IStock Members
    public void GetStocks()
    {
        callback = OperationContext.Current.GetCallbackChannel<IPutStock>();
        Stock st = null;
        st = new Stock
        {
            FirstDealDate = System.DateTime.Now,
            LastDealDate = System.DateTime.Now,
            StartDate = System.DateTime.Now,
            EndDate = System.DateTime.Now,
            Open = 495,
            High = 495,
            Low = 495,
            Close = 495,
            VolumeWeightedPrice = 495,
            TotalQuantity = 495
        };
        // Stream the same record back to the caller 1,000 times.
        for (int i=0; i<1000; ++i)
            callback.PutStock(st);
    }
    #endregion
}
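<?xml version="1.0"?>
<configuration>
  <system.web>
    <compilation debug="true" targetFramework="4.0" />
  </system.web>
  <system.serviceModel>
    <services>
      <service name="DuplexWcfService2.Stocks">
        <endpoint address="" binding="wsDualHttpBinding" contract="DuplexWcfService2.IStock">
          <identity><dns value="localhost"/></identity>
        </endpoint>
        <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange"/>
      </service>
    </services>
    <behaviors><serviceBehaviors><behavior>
      <serviceMetadata httpGetEnabled="true"/>
      <serviceDebug includeExceptionDetailInFaults="true"/>
    </behavior></serviceBehaviors></behaviors>
    <serviceHostingEnvironment multipleSiteBindingsEnabled="true" />
  </system.serviceModel>
  <system.webServer>
    <modules runAllManagedModulesForAllRequests="true"/>
  </system.webServer>
</configuration>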
Here's the C# WCF client:
[CallbackBehavior(ConcurrencyMode = ConcurrencyMode.Multiple, UseSynchronizationContext = false)]
class Callback : DuplexWcfService2.IStockCallback
{
    System.Diagnostics.Stopwatch timer;
    int n;

    public Callback(System.Diagnostics.Stopwatch t)
    {
        timer = t;
        n = 0;
    }

    public void PutStock(DuplexWcfService2.Stock st)
    {
        // Count the results received so far on this callback.
        ++n;
        if (n == 1)
            Console.WriteLine("First result in " + this.timer.Elapsed.TotalSeconds + "s");
        if (n == 1000)
            Console.WriteLine("1,000 results in " + this.timer.Elapsed.TotalSeconds + "s");
    }
}

class Program
{
    static void Test(int i)
    {
        var timer = System.Diagnostics.Stopwatch.StartNew();
        var ctx = new InstanceContext(new Callback(timer));
        var proxy = new DuplexWcfService2.StockClient(ctx);
        proxy.GetStocks();
        Console.WriteLine(i + " connected");
    }

    static void Main(string[] args)
    {
        for (int i=0; i<10; ++i)
        {
            int j = i;
            new System.Threading.Thread(() => Test(j)).Start();
        }
    }
}
Here's my async TCP client and server code in F#:
type AggregatedDeals =
  {
    FirstDealTime: System.DateTime
    LastDealTime: System.DateTime
    StartTime: System.DateTime
    EndTime: System.DateTime
    Open: decimal
    High: decimal
    Low: decimal
    Close: decimal
    VolumeWeightedPrice: decimal
    TotalQuantity: decimal
  }

// Read one message: a 4-byte length header followed by the serialized body.
let read (stream: System.IO.Stream) = async {
  let! header = stream.AsyncRead 4
  let length = System.BitConverter.ToInt32(header, 0)
  let! body = stream.AsyncRead length
  let fmt = System.Runtime.Serialization.Formatters.Binary.BinaryFormatter()
  use stream = new System.IO.MemoryStream(body)
  return fmt.Deserialize(stream)
}

// Write one message using the same length-prefixed framing.
let write (stream: System.IO.Stream) value = async {
  let body =
    let fmt = System.Runtime.Serialization.Formatters.Binary.BinaryFormatter()
    use stream = new System.IO.MemoryStream()
    fmt.Serialize(stream, value)
    stream.ToArray()
  let header = System.BitConverter.GetBytes body.Length
  do! stream.AsyncWrite header
  do! stream.AsyncWrite body
}

let endPoint = System.Net.IPEndPoint(System.Net.IPAddress.Loopback, 4502)

let server() = async {
  let listener = System.Net.Sockets.TcpListener(endPoint)
  listener.Start()
  while true do
    let client = listener.AcceptTcpClient()
    async {
      use stream = client.GetStream()
      let! _ = stream.AsyncRead 1
      for i in 1..1000 do
        let aggregatedDeals =
          {
            FirstDealTime = System.DateTime.Now
            LastDealTime = System.DateTime.Now
            StartTime = System.DateTime.Now
            EndTime = System.DateTime.Now
            Open = 1m
            High = 1m
            Low = 1m
            Close = 1m
            VolumeWeightedPrice = 1m
            TotalQuantity = 1m
          }
        do! write stream aggregatedDeals
    } |> Async.Start
}

let client() = async {
  let timer = System.Diagnostics.Stopwatch.StartNew()
  use client = new System.Net.Sockets.TcpClient()
  client.Connect endPoint
  use stream = client.GetStream()
  do! stream.AsyncWrite [|0uy|]
  for i in 1..1000 do
    let! _ = read stream
    if i=1 then lock stdout (fun () ->
      printfn "First result in %fs" timer.Elapsed.TotalSeconds)
  lock stdout (fun () ->
    printfn "1,000 results in %fs" timer.Elapsed.TotalSeconds)
}

do
  server() |> Async.Start
  seq { for i in 1..100 -> client() }
  |> Async.Parallel
  |> Async.RunSynchronously
  |> ignore
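For comparison, the length-prefixed framing that the F# read and write functions rely on (a 4-byte length header followed by the body) could be sketched in C# 4 roughly as follows. This is only an illustration under stated assumptions: the Framing class and its method names are made up for this example, the payload is treated as raw bytes rather than a BinaryFormatter-serialized record, and the calls are synchronous, unlike the F# original.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original programs.
static class Framing
{
    // Writes a 4-byte length header (machine byte order, as BitConverter
    // produces it) followed by the payload bytes.
    public static void WriteFrame(Stream stream, byte[] body)
    {
        byte[] header = BitConverter.GetBytes(body.Length);
        stream.Write(header, 0, header.Length);
        stream.Write(body, 0, body.Length);
    }

    // Reads exactly 'count' bytes. Stream.Read may return fewer bytes than
    // requested, so the call is repeated until the buffer is full.
    static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended inside a frame.");
            offset += read;
        }
        return buffer;
    }

    // Reads one frame: the 4-byte header first, then a body of that length.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = BitConverter.ToInt32(header, 0);
        if (length < 0)
            throw new InvalidDataException("Negative frame length.");
        return ReadExactly(stream, length);
    }
}
```

Writing two frames to a MemoryStream, resetting Position to 0 and calling ReadFrame twice should return the two payloads in order. Over a NetworkStream the same loop in ReadExactly is what guards against partial reads.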
WCF selects very safe values for almost all its defaults. This follows the philosophy of not letting novice developers shoot themselves in the foot. However, if you know which throttles to change and which bindings to use, you can get reasonable performance and scaling.
On my Core i5-2400 (quad core, no hyper-threading, 3.10 GHz) the solution below runs 1,000 clients with 1,000 callbacks each in an average total running time of 20 seconds. That's 1,000,000 WCF calls in 20 seconds, or roughly 50,000 calls per second.
Unfortunately I couldn’t get your F# program to run for a direct comparison. If you run my solution on your box, could you please post some F# vs C# WCF performance comparison numbers?
Disclaimer: The below is intended to be a proof of concept. Some of these settings don’t make sense for production.
What I did:
Note that in this prototype all services and clients are in the same App Domain and sharing the same thread pool.
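Because every host and client in this prototype draws from one thread pool, the pool's minimum thread counts matter, and the Test method in the code calls ThreadPool.SetMinThreads for that reason. A hedged sketch of doing this defensively follows. PoolSetup and RaiseMinThreads are hypothetical names, not part of the program in this answer. ThreadPool.SetMinThreads returns false and changes nothing when a requested value exceeds the pool maximum, so the sketch clamps the request to the maximum and hands the result back to the caller.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original program.
static class PoolSetup
{
    // Raises the thread pool minimums to at least the requested values,
    // clamped to the current maximums. Returns what SetMinThreads reports.
    public static bool RaiseMinThreads(int workers, int ioThreads)
    {
        int maxWorkers, maxIo, minWorkers, minIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetMinThreads(out minWorkers, out minIo);

        // Never lower an existing minimum, never exceed the maximum.
        int w = Math.Min(Math.Max(workers, minWorkers), maxWorkers);
        int io = Math.Min(Math.Max(ioThreads, minIo), maxIo);

        return ThreadPool.SetMinThreads(w, io);
    }
}
```

Checking the return value is the point of the sketch: a silently rejected request leaves the pool at its defaults, and it then adds threads gradually under load.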
What I learned:
Program output running on a core i5-2400. Note the timers are used differently than in the original question (see the code).
All client hosts open.
Service Host opened. Starting timer...
Press ENTER to close the host once you see 'ALL DONE'.
Client #100 completed 1,000 results in 0.0542168 s
Client #200 completed 1,000 results in 0.0794684 s
Client #300 completed 1,000 results in 0.0673078 s
Client #400 completed 1,000 results in 0.0527753 s
Client #500 completed 1,000 results in 0.0581796 s
Client #600 completed 1,000 results in 0.0770291 s
Client #700 completed 1,000 results in 0.0681298 s
Client #800 completed 1,000 results in 0.0649353 s
Client #900 completed 1,000 results in 0.0714947 s
Client #1000 completed 1,000 results in 0.0450857 s
ALL DONE. Total number of clients: 1000 Total runtime: 19323 msec
Code all in one console application file:
using System;
using System.Collections.Generic;
using System.ServiceModel;
using System.Diagnostics;
using System.Threading;
using System.Runtime.Serialization;

namespace StockApp
{
    public class Stock
    {
        public DateTime FirstDealDate { get; set; }
        public DateTime LastDealDate { get; set; }
        public DateTime StartDate { get; set; }
        public DateTime EndDate { get; set; }
        public decimal Open { get; set; }
        public decimal High { get; set; }
        public decimal Low { get; set; }
        public decimal Close { get; set; }
        public decimal VolumeWeightedPrice { get; set; }
        public decimal TotalQuantity { get; set; }
    }

    [ServiceContract]
    public interface IStock
    {
        [OperationContract(IsOneWay = true)]
        void GetStocks(string address);
    }

    [ServiceContract]
    public interface IPutStock
    {
        [OperationContract(IsOneWay = true)]
        void PutStock(Stock stock);
    }

    [ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
    public class StocksService : IStock
    {
        public void SendStocks(object obj)
        {
            string address = (string)obj;
            ChannelFactory<IPutStock> factory = new ChannelFactory<IPutStock>("CallbackClientEndpoint");
            IPutStock callback = factory.CreateChannel(new EndpointAddress(address));

            Stock st = null; st = new Stock
            {
                FirstDealDate = System.DateTime.Now,
                LastDealDate = System.DateTime.Now,
                StartDate = System.DateTime.Now,
                EndDate = System.DateTime.Now,
                Open = 495,
                High = 495,
                Low = 495,
                Close = 495,
                VolumeWeightedPrice = 495,
                TotalQuantity = 495
            };

            for (int i = 0; i < 1000; ++i)
                callback.PutStock(st);

            //Console.WriteLine("Done calling {0}", address);

            // Release the channel and factory once all callbacks are sent.
            ((ICommunicationObject)callback).Shutdown();
            factory.Shutdown();
        }

        public void GetStocks(string address)
        {
            /// WCF service methods execute on IO threads.
            /// Passing work off to worker thread improves service responsiveness... with a measurable cost in total runtime.
            System.Threading.ThreadPool.QueueUserWorkItem(new System.Threading.WaitCallback(SendStocks), address);

            // SendStocks(address);
        }
    }

    [ServiceBehavior(InstanceContextMode = InstanceContextMode.PerSession)]
    public class Callback : IPutStock
    {
        public static int CallbacksCompleted = 0;
        System.Diagnostics.Stopwatch timer = Stopwatch.StartNew();
        int n = 0;

        public void PutStock(Stock st)
        {
            ++n;
            if (n == 1000)
            {
                //Console.WriteLine("1,000 results in " + this.timer.Elapsed.TotalSeconds + "s");

                int completed = Interlocked.Increment(ref CallbacksCompleted);
                if (completed % 100 == 0)
                    Console.WriteLine("Client #{0} completed 1,000 results in {1} s", completed, this.timer.Elapsed.TotalSeconds);

                if (completed == Program.CLIENT_COUNT)
                    Console.WriteLine("ALL DONE. Total number of clients: {0} Total runtime: {1} msec", Program.CLIENT_COUNT, Program.ProgramTimer.ElapsedMilliseconds);
            }
        }
    }

    class Program
    {
        // Number of clients; 1000 matches the run shown in the output above.
        public const int CLIENT_COUNT = 1000;

        public static System.Diagnostics.Stopwatch ProgramTimer;

        static void StartCallPool(object uriObj)
        {
            string callbackUri = (string)uriObj;
            ChannelFactory<IStock> factory = new ChannelFactory<IStock>("StockClientEndpoint");
            IStock proxy = factory.CreateChannel();

            proxy.GetStocks(callbackUri);

            ((ICommunicationObject)proxy).Shutdown();
            factory.Shutdown();
        }

        static void Test()
        {
            ThreadPool.SetMinThreads(CLIENT_COUNT, CLIENT_COUNT * 2);

            // Create all the hosts that will receive callbacks.
            List<ServiceHost> callBackHosts = new List<ServiceHost>();
            for (int i = 0; i < CLIENT_COUNT; ++i)
            {
                string port = string.Format("{0}", i).PadLeft(3, '0');
                string baseAddress = "net.tcp://localhost:7" + port + "/";
                ServiceHost callbackHost = new ServiceHost(typeof(Callback), new Uri[] { new Uri(baseAddress) });
                callbackHost.Open();
                callBackHosts.Add(callbackHost);
            }
            Console.WriteLine("All client hosts open.");

            ServiceHost stockHost = new ServiceHost(typeof(StocksService));
            stockHost.Open();

            Console.WriteLine("Service Host opened. Starting timer...");
            ProgramTimer = Stopwatch.StartNew();

            foreach (var callbackHost in callBackHosts)
            {
                ThreadPool.QueueUserWorkItem(new WaitCallback(StartCallPool), callbackHost.BaseAddresses[0].AbsoluteUri);
            }

            Console.WriteLine("Press ENTER to close the host once you see 'ALL DONE'.");
            Console.ReadLine();

            foreach (var h in callBackHosts)
                h.Shutdown();
            stockHost.Shutdown();
        }

        static void Main(string[] args)
        {
            Test();
        }
    }

    public static class Extensions
    {
        // Close the object gracefully; abort it if closing fails.
        static public void Shutdown(this ICommunicationObject obj)
        {
            try
            {
                obj.Close();
            }
            catch (Exception ex)
            {
                Console.WriteLine("Shutdown exception: {0}", ex.Message);
                obj.Abort();
            }
        }
    }
}
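<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <system.serviceModel>
    <services>
      <service name="StockApp.StocksService">
        <host>
          <baseAddresses>
            <add baseAddress="net.tcp://localhost:8123/StockApp/"/>
          </baseAddresses>
        </host>
        <endpoint address="" binding="netTcpBinding" bindingConfiguration="tcpConfig" contract="StockApp.IStock">
          <identity><dns value="localhost"/></identity>
        </endpoint>
      </service>
      <service name="StockApp.Callback">
        <!-- Base address defined at runtime. -->
        <endpoint address="" binding="netTcpBinding" bindingConfiguration="tcpConfig" contract="StockApp.IPutStock">
          <identity><dns value="localhost"/></identity>
        </endpoint>
      </service>
    </services>
    <client>
      <endpoint name="StockClientEndpoint"
                address="net.tcp://localhost:8123/StockApp/"
                binding="netTcpBinding"
                bindingConfiguration="tcpConfig"
                contract="StockApp.IStock" >
      </endpoint>
      <!-- CallbackClientEndpoint address defined at runtime. -->
      <endpoint name="CallbackClientEndpoint"
                binding="netTcpBinding"
                bindingConfiguration="tcpConfig"
                contract="StockApp.IPutStock" >
      </endpoint>
    </client>
    <behaviors><serviceBehaviors><behavior>
      <!--<serviceMetadata httpGetEnabled="true"/>-->
      <serviceDebug includeExceptionDetailInFaults="true"/>
      <serviceThrottling maxConcurrentCalls="1000" maxConcurrentSessions="1000" maxConcurrentInstances="1000" />
    </behavior></serviceBehaviors></behaviors>
    <bindings>
      <netTcpBinding>
        <binding name="tcpConfig" listenBacklog="100" maxConnections="1000">
          <security mode="None"/>
          <reliableSession enabled="false" />
        </binding>
      </netTcpBinding>
    </bindings>
  </system.serviceModel>
</configuration>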
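The throttle and binding values in the configuration above can also be set in code. The following is a minimal sketch, assuming .NET Framework 4 with a reference to System.ServiceModel. HostSetup, CreateBinding and ApplyThrottle are hypothetical helper names, and the numbers simply mirror the config file rather than being recommendations.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

// Hypothetical helper, not part of the original program.
static class HostSetup
{
    // Equivalent of the "tcpConfig" binding: no security, larger backlog
    // and connection limits. Reliable sessions are off by default.
    public static NetTcpBinding CreateBinding()
    {
        var binding = new NetTcpBinding(SecurityMode.None);
        binding.ListenBacklog = 100;
        binding.MaxConnections = 1000;
        return binding;
    }

    // Equivalent of the <serviceThrottling> element: reuse the host's
    // throttling behavior if one exists, otherwise add a new one.
    public static void ApplyThrottle(ServiceHost host)
    {
        var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }
        throttle.MaxConcurrentCalls = 1000;
        throttle.MaxConcurrentSessions = 1000;
        throttle.MaxConcurrentInstances = 1000;
    }
}
```

ApplyThrottle would need to run before host.Open(), since the service description is read when the host opens.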
Update: I just tried the above solution with a netNamedPipeBinding:
<netNamedPipeBinding>
  <binding name="pipeConfig" maxConnections="1000">
    <security mode="None"/>
  </binding>
</netNamedPipeBinding>
It actually got 3 seconds slower (from 20 to 23 seconds). Since everything in this example runs on one machine, where I would expect named pipes to do well, I'm not sure why. If anyone has some insights, please comment.