Given a proto file:
syntax = "proto3";
package hello;
message TopGreeting {
  NestedGreeting greeting = 1;
}
message NestedGreeting {
  Greeting greeting = 1;
}
message Greeting {
  string message = 1;
}
and the code:
public class Main {
    public static void main(String[] args) {
        System.out.printf("From top: %s%n", newGreeting("오늘은 무슨 요일입니까?"));
        System.out.printf("Directly: %s%n", "오늘은 무슨 요일입니까?");
        System.out.printf("ByteString: %s", newGreeting("오늘은 무슨 요일입니까?").toByteString().toStringUtf8());
    }

    private static Hello.TopGreeting newGreeting(String message) {
        Hello.Greeting greeting = Hello.Greeting.newBuilder()
                .setMessage(message)
                .build();
        Hello.NestedGreeting nestedGreeting = Hello.NestedGreeting.newBuilder()
                .setGreeting(greeting)
                .build();
        return Hello.TopGreeting.newBuilder()
                .setGreeting(nestedGreeting)
                .build();
    }
}
Output:
From top: greeting {
  greeting {
    message: "\354\230\244\353\212\230\354\235\200 \353\254\264\354\212\250 \354\232\224\354\235\274\354\236\205\353\213\210\352\271\214?"
  }
}
Directly: 오늘은 무슨 요일입니까?
ByteString:
%
#
!오늘은 무슨 요일입니까?
How do I print the message in a human-readable way? As you can see, converting to ByteString prints the UTF-8 characters correctly, but it also prints some garbage such as % and #.
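Answering my own question: I solved this by digging through the Protobuf source code.
// Requires: import com.google.protobuf.TextFormat;
// Here, greeting is the message to print, e.g. the result of newGreeting(...).
System.out.println(TextFormat.printer().escapingNonAscii(false).printToString(greeting));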
Output:
greeting {
  greeting {
    message: "오늘은 무슨 요일입니까?"
  }
}
toString uses the same mechanism but with escapingNonAscii(true) (the default when omitted).
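As for the stray characters in the ByteString output from the question: they are not garbage, they are the wire-format framing. Each nested message and the string are written as a tag byte (field 1, length-delimited, which is 0x0A, a newline) followed by a length byte. The lengths here are 37, 35 and 33, which happen to be the ASCII codes for %, # and !. The following is a minimal sketch that reproduces those bytes by hand using only the JDK. WireFormatSketch and its methods are hypothetical names for illustration, not part of the Protobuf API, and the sketch assumes every payload is shorter than 128 bytes so the length fits in one byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class WireFormatSketch {
    // Wraps a payload as field number 1 with wire type 2 (length-delimited).
    // Assumption: the payload is at most 127 bytes, so the varint length is a single byte.
    static byte[] wrapAsField1(byte[] payload) {
        if (payload.length > 127) {
            throw new IllegalArgumentException("sketch only handles single-byte lengths");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((1 << 3) | 2);               // tag: field 1, wire type 2 -> 0x0A ('\n')
        out.write(payload.length);             // length of what follows
        out.write(payload, 0, payload.length); // the payload itself
        return out.toByteArray();
    }

    // Mirrors TopGreeting { NestedGreeting { Greeting { message } } } from the question.
    static byte[] encodeTopGreeting(String message) {
        byte[] greeting = wrapAsField1(message.getBytes(StandardCharsets.UTF_8));
        byte[] nested = wrapAsField1(greeting);
        return wrapAsField1(nested);
    }

    public static void main(String[] args) {
        // The Korean sentence from the question, written with Unicode escapes.
        String message = "\uC624\uB298\uC740 \uBB34\uC2A8 \uC694\uC77C\uC785\uB2C8\uAE4C?";
        byte[] bytes = encodeTopGreeting(message);
        // Print the six framing bytes that precede the UTF-8 text.
        for (int i = 0; i < 6; i++) {
            System.out.print(bytes[i] + " ");
        }
        System.out.println();
    }
}
```

This is why toByteString().toStringUtf8() is not a reliable way to print a message: it decodes the binary encoding as if it were text, so the framing bytes show up as whatever characters they happen to map to.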
Also see this answer for how to convert octal sequences to UTF-8 characters in case you only have access to the logs, not the source code.
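If all you have is a log line with the escaped form, the conversion can be sketched with the JDK alone: treat each \NNN octal escape as one raw byte, collect the bytes, and decode the result as UTF-8. OctalEscapeDecoder below is a hypothetical helper, not something Protobuf provides. It only handles octal escapes and leaves other escapes (such as \n or \") untouched.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class OctalEscapeDecoder {
    private static boolean isOctal(char c) {
        return c >= '0' && c <= '7';
    }

    // Converts \NNN octal escapes (1 to 3 digits) to raw bytes, then decodes everything as UTF-8.
    static String decode(String escaped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = escaped.length();
        int i = 0;
        while (i < n) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < n && escaped.charAt(i + 1) == '\\') {
                // An escaped backslash: copy both characters so following digits are not misread.
                out.write('\\');
                out.write('\\');
                i += 2;
            } else if (c == '\\' && i + 1 < n && isOctal(escaped.charAt(i + 1))) {
                i++; // skip the backslash
                int value = 0;
                int digits = 0;
                while (digits < 3 && i < n && isOctal(escaped.charAt(i))) {
                    value = value * 8 + (escaped.charAt(i) - '0');
                    i++;
                    digits++;
                }
                out.write(value & 0xFF);
            } else {
                // Any other character is copied through as its UTF-8 bytes.
                int codePoint = escaped.codePointAt(i);
                byte[] utf8 = new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
                out.write(utf8, 0, utf8.length);
                i += Character.charCount(codePoint);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first two syllables of the escaped message from the question's output.
        String logged = "\\354\\230\\244\\353\\212\\230";
        System.out.println(decode(logged));
    }
}
```

Whether the decoded text displays correctly still depends on the console's encoding, so writing the result to a UTF-8 file can be a safer check.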