I'm trying to obtain the text shown in an MS Word window in C# using Microsoft.Office.Interop.Word. Please note that I don't want the whole document or even the whole page; just the content the user currently sees.
The following code seems to work with simple documents:
Application word = new Application();
word.Visible = true;
object fileName = @"example.docx";
word.Documents.Add(ref fileName, Type.Missing, Type.Missing, true);
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;
Range r1 = word.ActiveWindow.RangeFromPoint((int)rect.Left, (int)rect.Top);
Range r2 = word.ActiveWindow.RangeFromPoint((int)rect.Left + (int)rect.Width, (int)rect.Top + (int)rect.Height);
r1.End = r2.Start;
Console.WriteLine(r1.Text.Replace("\r", "\r\n"));
However, when the document includes other structures such as headers, only parts of the text are returned.
So, what's the correct way to achieve this?
Thanks a lot!
Updated Code
Rect rect = AutomationElement.FocusedElement.Current.BoundingRectangle;
foreach (Range r in word.ActiveDocument.StoryRanges) {
    int left = 0, top = 0, width = 0, height = 0;
    try {
        try {
            word.ActiveWindow.GetPoint(out left, out top, out width, out height, r);
        } catch {
            left = (int)rect.Left;
            top = (int)rect.Top;
            width = (int)rect.Width;
            height = (int)rect.Height;
        }
        Rect newRect = new Rect(left, top, width, height);
        Rect inter;
        if ((inter = Rect.Intersect(rect, newRect)) != Rect.Empty) {
            Range r1 = word.ActiveWindow.RangeFromPoint((int)inter.Left, (int)inter.Top);
            Range r2 = word.ActiveWindow.RangeFromPoint((int)inter.Right, (int)inter.Bottom);
            r.SetRange(r1.Start, r2.Start);
            Console.WriteLine(r.Text.Replace("\r", "\r\n"));
        }
    } catch { }
}
There may be some problems with this:
var enumerator = r1.StoryRanges.GetEnumerator();
while (enumerator.MoveNext())
{
    Range current = (Range) enumerator.Current;
}
Have you looked at "How to programmatically extract the text of the currently viewed page of an Office.Interop.Word.Document object"?
You are probably seeing the side effects of range selecting across page elements.
In most cases, if you drag the cursor from the top left of the screen to the bottom right, only the main body text is selected (no headers or footers). Also, if the document has columns, and those columns start or end off screen, then selecting from the first column through to the last selects all the text in between, including the part that is off screen.
To my knowledge there is no easy way to achieve your goal unless you are willing to ignore the inconsistencies, or want to deal with all the use cases specifically (images, columns, tables, etc.).
If you can tell us what you are trying to do then we can offer alternatives, otherwise please mark an answer as correct.