I have a String which has some ASCII control characters in it (namely RS (0x1e) and US (0x1f)). I have defined them in my code as such:
static public final byte RS = 0x1E;
static public final byte US = 0x1F;
Later in my code, I want to split a string using these characters:
String[] records = content.split(String.valueOf(RS));
But that doesn't work correctly. After some fiddling I found that this
String[] records = content.split("\u001e");
does work, but in that case I have to remember the codes. I also use the RS static byte in other parts of my code, so just changing its type is not a real option. I could of course create an RS_STRING constant or something, but that means double work.
Is there a clean solution for this?
Declaring the character as a char rather than a byte fixed it for me - the following works fine:
char RS = 0x1E;
String s = new String(new char[]{'d', RS, 'e'});
System.out.println(s.split(String.valueOf(RS)).length); //Prints 2
However, using a byte as the type causes it to fail:
byte RS = 0x1E;
String s = new String(new char[]{'d', (char)RS, 'e'});
System.out.println(s.split(String.valueOf(RS)).length); //Prints 1
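The likely reason for the difference is overload resolution: String.valueOf has no overload that takes a byte, so a byte argument is widened to int and formatted as a decimal number. The split then looks for the text "30" instead of the control character. A minimal sketch of that behaviour, where the class and method names are made up for illustration:

```java
class ValueOfDemo {
    static final byte RS = 0x1E;

    // The byte is widened to int, so String.valueOf(int) is chosen
    // and the result is the decimal text "30" (two characters).
    static String viaByte() {
        return String.valueOf(RS);
    }

    // Casting to char first selects String.valueOf(char), which yields
    // a one-character string containing the control character itself.
    static String viaChar() {
        return String.valueOf((char) RS);
    }

    public static void main(String[] args) {
        System.out.println(viaByte());                  // 30
        System.out.println(viaChar().length());         // 1
        System.out.println((int) viaChar().charAt(0));  // 30, i.e. 0x1E
    }
}
```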
You can of course cast the char back to byte if you need to refer to it as such in other parts of your code.
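If the constants have to stay bytes, the cast can also go the other way, at the point where the string is split. The sketch below assumes the separators are plain 7-bit ASCII values (as RS and US are); splitOn is a hypothetical helper, not part of any library. Since String.split treats its argument as a regular expression, the delimiter is passed through Pattern.quote so the helper also behaves for separators that happen to be regex metacharacters.

```java
import java.util.regex.Pattern;

class SeparatorSplit {
    static final byte RS = 0x1E;
    static final byte US = 0x1F;

    // Hypothetical helper: turn the byte constant into a one-character
    // delimiter (assumes a 7-bit ASCII value) and split on it literally.
    static String[] splitOn(String content, byte separator) {
        String delimiter = String.valueOf((char) separator);
        return content.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Two records separated by RS, each holding two fields separated by US.
        String content = "a" + (char) US + "b" + (char) RS + "c" + (char) US + "d";
        for (String record : splitOn(content, RS)) {
            System.out.println(splitOn(record, US).length); // 2 for each record
        }
    }
}
```

This keeps a single definition of RS and US, so no separate RS_STRING constant is needed.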