Maybe a basic question, but let's say I have a string that is 2000 characters long and I need to split it into chunks of at most 512 characters each.
Is there a nice way, like a loop or so for doing this?
To split a string on a specific delimiter in Java, call the split() method on the string object and pass the delimiter as the argument. Note that split() interprets its argument as a regular expression, so characters such as "." or "|" must be escaped. The method returns a String array with the pieces as its elements.
Python's split() method splits a string on a separator. It accepts an optional separator argument (any string, not just a single character) and an optional maxsplit argument. If no separator is given, the string is split on runs of whitespace.
Something like this:
    private IList<string> SplitIntoChunks(string text, int chunkSize)
    {
        List<string> chunks = new List<string>();
        int offset = 0;
        while (offset < text.Length)
        {
            int size = Math.Min(chunkSize, text.Length - offset);
            chunks.Add(text.Substring(offset, size));
            offset += size;
        }
        return chunks;
    }
Or just to iterate over:
    private IEnumerable<string> SplitIntoChunks(string text, int chunkSize)
    {
        int offset = 0;
        while (offset < text.Length)
        {
            int size = Math.Min(chunkSize, text.Length - offset);
            yield return text.Substring(offset, size);
            offset += size;
        }
    }
Note that this splits into chunks of UTF-16 code units, which isn't quite the same as splitting into chunks of Unicode code points, which in turn may not be the same as splitting into chunks of glyphs.
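To make the code-unit caveat concrete, here is a minimal sketch of a variant that never cuts between the two halves of a surrogate pair. The class name SurrogateSafeChunker is made up for illustration, and the approach only protects code points, not multi-code-point glyphs such as combining sequences. Chunks may come out one code unit shorter than chunkSize when a pair would otherwise be split.

```csharp
using System;
using System.Collections.Generic;

public static class SurrogateSafeChunker
{
    // Hypothetical helper: yields chunks of at most chunkSize UTF-16 code units,
    // shortening a chunk by one unit if it would end in the middle of a surrogate pair.
    public static IEnumerable<string> SplitIntoChunks(string text, int chunkSize)
    {
        // A chunk size of 1 could never hold a surrogate pair, so require at least 2.
        // (Because this is an iterator, the check runs when enumeration starts.)
        if (chunkSize < 2)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        int offset = 0;
        while (offset < text.Length)
        {
            int size = Math.Min(chunkSize, text.Length - offset);
            int end = offset + size;

            // If the last unit of this chunk is a high surrogate and the next unit
            // is its low surrogate, leave the whole pair for the next chunk.
            if (end < text.Length
                && char.IsHighSurrogate(text[end - 1])
                && char.IsLowSurrogate(text[end]))
            {
                size--;
            }

            yield return text.Substring(offset, size);
            offset += size;
        }
    }
}
```

For example, the string "a😀b" is four UTF-16 code units long; with a chunk size of 2 this sketch yields "a", "😀" and "b" instead of cutting the emoji in half.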
Though this question meanwhile has an accepted answer, here's a short version with the help of regular expressions. Purists may not like it (understandably) but when you need a quick solution and you are handy with regexes, this can be it. Performance is rather good, surprisingly:
    string[] split = Regex.Split(yourString, @"(?<=\G.{512})");
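What does it do? It uses a positive lookbehind together with \G, which anchors at the position where the previous match ended. It will also catch the last bit, even if the length isn't divisible by 512. Note that . does not match a newline by default, so pass RegexOptions.Singleline if the string can contain line breaks.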
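If the lookbehind feels too clever, a related regex approach (not the one from the answer above, just a swapped-in alternative) is to collect greedy matches of up to chunkSize characters with Regex.Matches. This is a rough sketch; the class name RegexChunker is hypothetical, and RegexOptions.Singleline is used so that line breaks count as ordinary characters.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexChunker
{
    // Hypothetical helper: each greedy match takes as many characters as it can,
    // up to chunkSize, so the final match holds whatever remainder is left.
    public static List<string> SplitIntoChunks(string text, int chunkSize)
    {
        var chunks = new List<string>();
        string pattern = ".{1," + chunkSize + "}";

        // Singleline makes '.' match '\n' as well, so newlines do not break chunks.
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.Singleline))
        {
            chunks.Add(m.Value);
        }
        return chunks;
    }
}
```

Like the loop-based versions, this counts UTF-16 code units, so the same caveat about code points and glyphs applies.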
Using Jon's implementation and the yield keyword:
    IEnumerable<string> Chunks(string text, int chunkSize)
    {
        for (int offset = 0; offset < text.Length; offset += chunkSize)
        {
            int size = Math.Min(chunkSize, text.Length - offset);
            yield return text.Substring(offset, size);
        }
    }