Imagine an object like this
public class ContentType
{
public string Alias { get; set;}
public string ParentAlias { get; set;}
}
And a flat collection of these objects
List<ContentType> contentTypes...;
How can I use a linq chained syntax query to determine if there is a circular reference in the collection.
//Example
ContentType #50
Alias: Truck
ParentAlias: Vehicle
ContentType #90
Alias: Vehicle
ParentAlias: Truck
That would be a circular dependency which would break the code that creates the content types (it would get stuck in an infinite loop walking the parent hierarchies..)
So before I process the parent/child content types I would like to first detect if there is a circular dependency in the collection and halt the operation if one is detected.
So we'll start with two helper methods. First we'll want a method that yields all of the ancestors for an item, when given that item and a delegate that gets the parent of an item:
public static IEnumerable<T> Ancestors<T>(T item, Func<T, T> parentSelector)
{
while (item != null)
{
item = parentSelector(item);
yield return item;
}
}
We'll also write a method to determine if a sequence repeats by storing all of the previously yielded items and seeing if each new item is in that set:
public static bool Repeats<T>(
this IEnumerable<T> sequence,
IEqualityComparer<T> comparer = null)
{
comparer = comparer ?? EqualityComparer<T>.Default;
var set = new HashSet<T>(comparer);
foreach (var item in sequence)
if (!set.Add(item))
return true;
return false;
}
From here we can determine if any sequence contains cycles by computing the ancestors of each item and determining if any of those collections repeat:
public static bool ContainsCycles<T>(IEnumerable<T> sequence,
Func<T, T> parentSelector,
IEqualityComparer<T> comparer = null)
{
comparer = comparer ?? EqualityComparer<T>.Default;
return sequence.Any(item => Ancestors(item, parentSelector).Repeats(comparer));
}
All that's left is to write a method that computes the parent of each item, since that's not an operation your class already supports, which can be done by just creating a lookup from the alias to the item and then using it:
IEnumerable<ContentType> types = CreateContentTypes();
var lookup = types.ToDictionary(type => type.Alias);
bool anyCycles = ContainsCycles(types, type => lookup[type.ParentAlias]);
As far as performance, and possible improvements, if you're dealing with particularly large trees/subtrees, you could cache the results of the intermediate calculations. For example, if we have a node A, and a parent B, and a grandparent C, which is a root, then when doing a calculation to determine if A is in a cycle we also need to determine if B is in a cycle. If we already determined if B was in a cycle earlier, and cached it, we could skip that step. if we didn't, then we could cache it when doing the cycle calculator for A, and then we won't need to do it again when checking B later.
This does complicate the code a fair bit, so if you don't have a particularly large/deep graph, you may choose not to bother caching these intermediate results and just choose to re-calculate them for each item:
public static bool IsInCycle<T>(
this IEnumerable<T> sequence,
HashSet<T> itemsNotInASequence,
IEqualityComparer<T> comparer = null)
{
comparer = comparer ?? EqualityComparer<T>.Default;
var set = new HashSet<T>(comparer);
foreach (var item in sequence)
{
if (itemsNotInASequence.Contains(item))
return false;
else if (!set.Add(item))
return true;
}
itemsNotInASequence.UnionWith(set);
return false;
}
public static bool ContainsCycles<T>(IEnumerable<T> sequence,
Func<T, T> parentSelector,
IEqualityComparer<T> comparer = null)
{
comparer = comparer ?? EqualityComparer<T>.Default;
var itemsNotInASequence = new HashSet<T>(comparer);
return sequence.All(item => Ancestors(item, parentSelector)
.IsInCycle(itemsNotInASequence));
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With