What I am trying to do is find the OpenXmlElements between a CommentRangeStart and the corresponding CommentRangeEnd.
I have tried two methods to achieve this. The problem is that a CommentRangeEnd does not need to be on the same level as its start; it can be nested in a child element. See the very simple structure below (note that this is not valid Open XML, it only shows the general idea).
<w:commentstart/>
<w:paragraph>
    <w:run />
    <w:commentend />
</w:paragraph>
The two things I have tried are the following. First: I wrote an enumerable which returns sibling elements until the matching end is reached.
public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
{
    OpenXmlElement element = commentStart.NextSibling();
    // Stop when there are no more siblings or the matching comment end is reached.
    // The null check also covers a start element that has no next sibling at all.
    while (element != null && !IsMatchingCommentEnd(element, commentStart.Id.Value))
    {
        yield return element;
        element = element.NextSibling();
    }
}
public static bool IsMatchingCommentEnd(OpenXmlElement element, string commentId)
{
    // Only a CommentRangeEnd with the same id counts as a match
    CommentRangeEnd commentEnd = element as CommentRangeEnd;
    if (commentEnd != null)
    {
        return commentEnd.Id == commentId;
    }
    return false;
}
Second: After realising that the start and end might not be on the same level, I continued to hunt around and found Eric White's answer for dealing with elements between bookmark elements. I retrofitted that for my example, but the start and end not having the same parent (i.e. not being on the same level) was still an issue, so I could not use it.
Is there a better way to look at this? I need a way to handle these elements because I have to work with the text that is being commented on.
Edit: Clarification of what I am trying to achieve: I am taking a document edited in Word and, for a comment in the document, I want to get the commented text between the start and end of the range for a specific comment id.
Edit 2: I have put up a working version of what I am currently thinking, but my concern is that it is potentially quite fragile given the different combinations users can produce in Word. It also works with the raw XML, which is not really an issue, but I would have liked to change to the OpenXML SDK. Currently it looks like I am going to need to parse the entire document and collect the items I need, instead of working with one specific comment. https://github.com/mhbuck/DocumentCommentParser/
Main issue encountered: The CommentRangeStart and CommentRangeEnd can be at different nesting levels within the XML document. The root node is potentially the only common ancestor element.
You can try to use the Descendants<T>() method to enumerate all the descendants of a node that are of a given type. So, your code can look similar to this (I've written it without using yield to make it more readable ;)):
public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
{
    List<OpenXmlElement> commentedNodes = new List<OpenXmlElement>();
    string commentId = commentStart.Id.Value;
    OpenXmlElement element = commentStart;
    while (true)
    {
        element = element.NextSibling();
        // check that the item exists
        if (element == null)
        {
            break;
        }
        // check that the item is the matching comment end
        if (IsMatchingCommentEnd(element, commentId))
        {
            break;
        }
        // check whether the matching comment end is among the current element's descendants
        // (Descendants<T>() never returns null, so no null check is needed)
        bool endIsNested = false;
        foreach (CommentRangeEnd rangeEndNode in element.Descendants<CommentRangeEnd>())
        {
            if (IsMatchingCommentEnd(rangeEndNode, commentId))
            {
                // matching range end element found in the current element's descendants;
                // an improvement could be made here to manually select the descendants before the CommentRangeEnd node
                endIsNested = true;
                break;
            }
        }
        if (endIsNested)
        {
            // a break inside the foreach only leaves the foreach, so leave the while loop here
            break;
        }
        commentedNodes.Add(element);
    }
    return commentedNodes;
}
As marked in one of the comments, it now stops when it finds the CommentRangeEnd element among the current element's descendants.
I haven't tested this code yet, so if you have any issues with it, let me know in the comments.
Note that this method won't work if the start element is deeper in the document's hierarchy than the end element. In some cases it also won't return all of the content covered by a comment, because the sibling that contains the nested end element is skipped as a whole. If you need it, I can later update the answer with an alternative solution to handle this case. Please also explain why you need to find those comments, because an alternative method might be a better fit.
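For the case where the start and end are nested at different depths, one option is to stop relying on siblings and walk the document in document order instead. The following is a minimal, untested sketch of that idea. It assumes only the commented text is needed (not the elements themselves), and the class name `CommentTextSketch` and method name `GetCommentedText` are made up for illustration. It relies on `Descendants()` enumerating elements in document order, so it does not matter how deeply either marker is nested.

```csharp
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class CommentTextSketch
{
    // Hypothetical helper: concatenates the text of every w:t element that lies
    // between the CommentRangeStart and CommentRangeEnd carrying the given id.
    public static string GetCommentedText(OpenXmlElement root, string commentId)
    {
        var builder = new StringBuilder();
        bool inRange = false;

        // Descendants() yields elements in document order (depth first),
        // so the start and end do not need to share a parent.
        foreach (OpenXmlElement element in root.Descendants())
        {
            if (element is CommentRangeStart start && start.Id?.Value == commentId)
            {
                inRange = true;
            }
            else if (element is CommentRangeEnd end && end.Id?.Value == commentId)
            {
                break;
            }
            else if (inRange && element is Text text)
            {
                builder.Append(text.Text);
            }
        }

        return builder.ToString();
    }
}

// Possible usage (the file name and comment id are placeholders):
// using (var doc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open("commented.docx", false))
// {
//     string text = CommentTextSketch.GetCommentedText(doc.MainDocumentPart.Document.Body, "0");
// }
```

This sketch joins the text of consecutive paragraphs without any separator, and it only ever sees whole runs, which matches how Word normally splits runs at comment boundaries. If paragraph breaks matter, a separator could be appended whenever a Paragraph element is encountered while inside the range.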