
How do I share an observable with publish and connect?

I have an observable data stream that I am applying operations to, splitting into two separate streams, applying more (distinct) operations to each of the two streams, and merging together again. I am trying to share the observable between two subscribers using Publish and Connect but each of the subscribers seems to be using a separate stream. That is, in the example below, I see "Doing an expensive operation" printed once for each item in the stream for both of the subscribers. (Imagine the expensive operation as being something that should happen only once between all subscribers, as such I am trying to reuse the stream.) I have used Publish and Connect to try and share the merged observable with both subscribers, but it seems to have the wrong effect.

Example with the issue:

var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) { IsBackground = false });
var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
var expensive = timer.Select(i =>
{
    // Converting to strings is an expensive operation
    Console.WriteLine("Doing an expensive operation");
    return string.Format("#{0}", i);
});

var a = expensive.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
var b = expensive.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });

var connectable = Observable.Merge(a, b).Publish();
connectable.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
connectable.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));
connectable.Connect();

I see the following output:

Doing an expensive operation
Doing an expensive operation
Subscriber A got: { Source = A, Value = #0 }
Doing an expensive operation
Doing an expensive operation
Subscriber B got: { Source = B, Value = #1 }

(Output continues, truncated for brevity.)

How can I share the observable with both subscribers?

Whymarrh asked Aug 27 '15 22:08



2 Answers

You have published the wrong observable.

Your current code merges first and then publishes: Observable.Merge(a, b).Publish(). Because a and b are both defined against expensive, Merge subscribes to expensive twice, once through a and once through b, so the expensive Select still runs twice for every item.

The subscriptions create these pipelines:

[Diagram: Original]

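You can see what Publish was doing if you remove .Publish() (and the matching .Connect() call) from your code. Each of the two subscribers then gets its own copy of the merged pipeline, so expensive is subscribed to four times and the output becomes: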

Doing an expensive operation
Doing an expensive operation
Doing an expensive operation
Doing an expensive operation
Subscriber A got: { Source = A, Value = #0 }
Doing an expensive operation
Doing an expensive operation
Doing an expensive operation
Doing an expensive operation
Subscriber B got: { Source = B, Value = #1 }

This creates these pipelines:

[Diagram: No Publish]

So, by shifting the .Publish() back up to expensive you eliminate the problem. That's where you really needed it because it is the expensive operation after all.

This is the code you needed:

var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) { IsBackground = false });
var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
var expensive = timer.Select(i =>
{
    // Converting to strings is an expensive operation
    Console.WriteLine("Doing an expensive operation");
    return string.Format("#{0}", i);
});

var connectable = expensive.Publish();

var a = connectable.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
var b = connectable.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });

var merged = Observable.Merge(a, b);

merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));

connectable.Connect();

That nicely produces the following:

Doing an expensive operation
Subscriber A got: { Source = A, Value = #0 }
Doing an expensive operation
Subscriber B got: { Source = B, Value = #1 }
Doing an expensive operation
Subscriber A got: { Source = A, Value = #2 }
Doing an expensive operation
Subscriber B got: { Source = B, Value = #3 }

And this gives you these pipelines:

[Diagram: Expensive Publish]

You can see from this image that there is still duplication. That's fine because these parts aren't expensive.

The duplication is actually important. Shared parts of the pipelines make their endpoints vulnerable to errors and thus to early termination. The less sharing the better for the robustness of the code. It's only when you have an expensive operation that you should worry about publishing. Otherwise you should just let the pipelines be themselves.

Here's an example to show it. Without a published source, an error in one source terminates only its own pipeline; it doesn't pull down the others.

[Diagram: Separate]

But once you introduce a shared observable then a single error will bring down all of the pipelines.

[Diagram: Shared]
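The difference can be sketched in code. This is a minimal illustration, not part of the original answer: the subjects sourceA, sourceB and source, and the subscriber labels A to D, are hypothetical stand-ins for real pipelines, and it assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

class Program
{
    static void Main()
    {
        // Separate pipelines: each subscriber has its own (hypothetical) source.
        var sourceA = new Subject<int>();
        var sourceB = new Subject<int>();
        sourceA.Subscribe(
            x => Console.WriteLine("A got: {0}", x),
            ex => Console.WriteLine("A failed: {0}", ex.Message));
        sourceB.Subscribe(
            x => Console.WriteLine("B got: {0}", x),
            ex => Console.WriteLine("B failed: {0}", ex.Message));

        sourceA.OnNext(1);
        sourceB.OnNext(1);
        sourceA.OnError(new InvalidOperationException("boom")); // only A terminates
        sourceB.OnNext(2);                                      // B still receives this

        // Shared pipeline: both subscribers hang off one published source.
        var source = new Subject<int>();
        var shared = source.Publish();
        shared.Subscribe(
            x => Console.WriteLine("C got: {0}", x),
            ex => Console.WriteLine("C failed: {0}", ex.Message));
        shared.Subscribe(
            x => Console.WriteLine("D got: {0}", x),
            ex => Console.WriteLine("D failed: {0}", ex.Message));
        shared.Connect();

        source.OnNext(1);
        source.OnError(new InvalidOperationException("boom")); // C and D both terminate
        source.OnNext(2);                                      // ignored: the source has already terminated
    }
}
```

In the separate case, B should keep receiving values after A fails. In the shared case, the single error should reach both C and D, and neither sees the final value.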

Enigmativity answered Oct 18 '22 20:10



One possible fix:

var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) { IsBackground = false });
var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
var expensive = timer.Select(i =>
{
    // Converting to strings is an expensive operation
    Console.WriteLine("Doing an expensive operation");
    return string.Format("#{0}", i);
});

var subj = new ReplaySubject<string>();
expensive.Subscribe(subj);

var a = subj.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
var b = subj.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });

var merged = Observable.Merge(a, b);
merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));

The above example essentially creates a new intermediate observable that emits the results of the expensive operation. This allows you to subscribe to the results of the expensive operation, not to an expensive transformation applied to a timer.

With this you'll see:

Doing an expensive operation
Subscriber A got: { Source = A, Value = #0 }
Doing an expensive operation
Subscriber B got: { Source = B, Value = #1 }

(Output continues, truncated for brevity.)

Alternatively, you could move the calls to Publish and Connect:

var foregroundScheduler = new NewThreadScheduler(ts => new Thread(ts) {IsBackground = false});
var timer = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), foregroundScheduler);
var expensive = timer.Select(i =>
{
    // Converting to strings is an expensive operation
    Console.WriteLine("Doing an expensive operation");
    return string.Format("#{0}", i);
}).Publish();

var a = expensive.Where(s => int.Parse(s.Substring(1)) % 2 == 0).Select(s => new { Source = "A", Value = s });
var b = expensive.Where(s => int.Parse(s.Substring(1)) % 2 != 0).Select(s => new { Source = "B", Value = s });

var merged = Observable.Merge(a, b);
merged.Where(x => x.Source.Equals("A")).Subscribe(s => Console.WriteLine("Subscriber A got: {0}", s));
merged.Where(x => x.Source.Equals("B")).Subscribe(s => Console.WriteLine("Subscriber B got: {0}", s));

expensive.Connect();

Why ReplaySubject, not just Subject or some other subject?

A Subject in the .NET Rx implementation is what the ReactiveX documentation calls a PublishSubject: it emits to an observer only those items that the source Observable emits after that observer subscribes. A ReplaySubject, on the other hand, emits to every observer all of the items that the source Observable has emitted, regardless of when the observer subscribes. If we used a plain Subject in the first example, subscribers to subj would miss anything emitted between the time subj subscribes to expensive and the time they subscribe to subj.
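As a rough sketch of that difference (an illustration only, assuming the System.Reactive package; the variable names plain and replay are made up for this example), push one item into each kind of subject before anyone subscribes and one item afterwards:

```csharp
using System;
using System.Reactive.Subjects;

class Program
{
    static void Main()
    {
        var plain = new Subject<string>();
        var replay = new ReplaySubject<string>();

        // Emitted before anyone has subscribed.
        plain.OnNext("#0");
        replay.OnNext("#0");

        // Late subscribers: only the ReplaySubject hands over the earlier item.
        plain.Subscribe(s => Console.WriteLine("Subject subscriber got: {0}", s));
        replay.Subscribe(s => Console.WriteLine("ReplaySubject subscriber got: {0}", s));

        // Emitted after subscription: both subscribers receive it.
        plain.OnNext("#1");
        replay.OnNext("#1");
    }
}
```

The subscriber on the plain Subject should see only #1, while the subscriber on the ReplaySubject should see both #0 and #1.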

Whymarrh answered Oct 18 '22 19:10
