Split a string by a byte

Tags: java, string, split

I have a String that contains some ASCII control characters, namely RS (0x1e) and US (0x1f). I have defined them in my code like this:

static public final byte RS  = 0x1E;
static public final byte US  = 0x1F;

Later in my code, I want to split a string using these characters:

String[] records = content.split(String.valueOf(RS));

But that doesn't work correctly. After some fiddling I found that this

String[] records = content.split("\u001e");

does work, but in that case I have to remember the codes. I also use the RS static byte in other parts of the code, so simply changing its type is not a real option. I could of course create an RS_STRING constant or something similar, but that means double work.

Is there a clean solution for this?

Bart Friederichs asked Apr 29 '15 14:04



1 Answer

Declaring the character as a char rather than a byte fixed it for me - the following works fine:

char RS  = 0x1E;
String s = new String(new char[]{'d', RS, 'e'});
System.out.println(s.split(String.valueOf(RS)).length); //Prints 2

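However, using a byte as the type causes it to fail. String has no valueOf(byte) overload, so the byte is widened to int and String.valueOf(RS) returns the text "30" rather than the control character: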

byte RS  = 0x1E;
String s = new String(new char[]{'d', (char)RS, 'e'});
System.out.println(s.split(String.valueOf(RS)).length); //Prints 1
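To see what each variant actually hands to split, the two conversions can be compared directly. This is only a small sketch; the class name ValueOfDemo and its helper methods are made up for illustration and are not part of the question's code.

```java
public class ValueOfDemo {
    static final byte RS = 0x1E;

    // No String.valueOf(byte) exists, so the byte widens to int
    // and the result is the decimal text of the value.
    static String asNumberText() {
        return String.valueOf(RS);
    }

    // Casting to char first selects String.valueOf(char),
    // which yields a one-character string holding the control character.
    static String asCharText() {
        return String.valueOf((char) RS);
    }

    public static void main(String[] args) {
        System.out.println(asNumberText());                 // prints 30
        System.out.println(asCharText().equals("\u001e"));  // prints true
    }
}
```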

You can of course cast the char back to byte if you need to refer to it as such in other parts of your code.
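If the constants have to stay declared as byte, as in the question, another option is to cast at the point of use instead. The following is a minimal sketch under that assumption; the RecordSplitter class and the splitOn helper are hypothetical names. Pattern.quote is not needed for RS or US, but it keeps the helper safe if it is ever given a byte that is a regex metacharacter.

```java
import java.util.regex.Pattern;

public class RecordSplitter {
    // Constants kept as bytes, as in the question.
    static public final byte RS = 0x1E;
    static public final byte US = 0x1F;

    // Hypothetical helper: turns the byte into a one-character delimiter
    // and splits on it literally (split treats its argument as a regex).
    static String[] splitOn(String content, byte separator) {
        String delimiter = String.valueOf((char) (separator & 0xFF));
        return content.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Two records separated by RS; the first has two fields separated by US.
        String content = "a" + (char) US + "b" + (char) RS + "c";
        String[] records = splitOn(content, RS);
        System.out.println(records.length);  // prints 2
        String[] fields = splitOn(records[0], US);
        System.out.println(fields.length);   // prints 2
    }
}
```

This keeps a single definition of each code, so there is no separate RS_STRING to maintain.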

Michael Berry answered Sep 23 '22 20:09