Word Wrapping with Regular Expressions

EDIT FOR CLARITY - I know there are ways to do this in multiple steps, with LINQ, or with plain C# string manipulation. I am using a single regex call because I want practice with complex regex patterns. - END EDIT

I am trying to write a single regular expression that will perform word wrapping. It's extremely close to the desired output, but I can't quite get it to work.

Regex.Replace(text, @"(?<=^|\G)(.{1,20}(\s|$))", "$1\r\n", RegexOptions.Multiline)

This correctly wraps lines that are too long, but it also adds a line break where the input already has one.

Input

"This string is really long. There are a lot of words in it.\r\nHere's another line in the string that's also very long."

Expected Output

"This string is \r\nreally long. There \r\nare a lot of words \r\nin it.\r\nHere's another line \r\nin the string that's \r\nalso very long."

Actual Output

"This string is \r\nreally long. There \r\nare a lot of words \r\nin it.\r\n\r\nHere's another line \r\nin the string that's \r\nalso very long.\r\n"

Note the double "\r\n" between sentences, where the input already had a line break, and the extra "\r\n" appended at the end.
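One way to see where the extra breaks come from is to list what each match captures. The sketch below is only a diagnostic, not a fix; the class name `MatchDump` and the `Escape` helper are made up for illustration. It relies on two .NET behaviors: without `RegexOptions.Singleline`, `.` matches "\r" but not "\n", and `\s` matches "\n".

```csharp
using System;
using System.Text.RegularExpressions;

class MatchDump
{
    static void Main()
    {
        string text = "This string is really long. There are a lot of words in it.\r\n"
                    + "Here's another line in the string that's also very long.";

        // Same pattern and options as in the question.
        string pattern = @"(?<=^|\G)(.{1,20}(\s|$))";

        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Multiline))
        {
            // Print group 1 with control characters made visible.
            Console.WriteLine("[" + Escape(m.Groups[1].Value) + "]");
        }
        // Prints:
        // [This string is ]
        // [really long. There ]
        // [are a lot of words ]
        // [in it.\r\n]
        // [Here's another line ]
        // [in the string that's ]
        // [also very long.]
    }

    // Hypothetical helper: show "\r" and "\n" as literal escape sequences.
    static string Escape(string s) => s.Replace("\r", "\\r").Replace("\n", "\\n");
}
```

The fourth match already contains the input's own "\r\n" ("." consumed the "\r" and `\s` consumed the "\n"), so appending "\r\n" to it doubles the break. The last match ends at `$`, so it gets a trailing "\r\n" as well.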

Perhaps there's a way to conditionally apply different replacement patterns? That is, if the match ends in "\r\n", use the replacement pattern "$1"; otherwise, use "$1\r\n".
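A replacement string cannot branch on what was matched, but `Regex.Replace` also has an overload that takes a `MatchEvaluator` callback, which can. The following is a minimal sketch of that idea, keeping the pattern from the question unchanged; it is still a single `Regex.Replace` call, though the condition lives in C# rather than in the pattern. The class name `WordWrapSketch` is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class WordWrapSketch
{
    static void Main()
    {
        string text = "This string is really long. There are a lot of words in it.\r\n"
                    + "Here's another line in the string that's also very long.";

        // Same pattern and options as in the question.
        string pattern = @"(?<=^|\G)(.{1,20}(\s|$))";

        string wrapped = Regex.Replace(
            text,
            pattern,
            m =>
            {
                // Group 2 is the separator: a whitespace character, or empty when $ matched.
                string separator = m.Groups[2].Value;

                // "\n": the chunk already ends with the input's own line break.
                // "":   \s is tried first, so $ only wins at the end of the input.
                // In both cases, leave the chunk as it is.
                if (separator == "\n" || separator.Length == 0)
                {
                    return m.Groups[1].Value;
                }

                // Otherwise the chunk was cut at a space: add the line break.
                return m.Groups[1].Value + "\r\n";
            },
            RegexOptions.Multiline);

        // Show the result with control characters made visible.
        Console.WriteLine(wrapped.Replace("\r", "\\r").Replace("\n", "\\n"));
        // Prints:
        // This string is \r\nreally long. There \r\nare a lot of words \r\nin it.\r\nHere's another line \r\nin the string that's \r\nalso very long.
    }
}
```

The separator is compared with `==` (an ordinal comparison) rather than a culture-sensitive `EndsWith("\n")`, because culture-sensitive string searches can treat "\r\n" as a single unit on some .NET versions.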

As a starting point, I used a similar question about wrapping a string that contains no white space: "Regular expression to find unbroken text and insert space".