
Easier way to convert a character to an integer?

Tags:

julia

Still getting a feel for what's in the Julia standard library. I can convert strings to their integer representation via the Int() constructor, but when I call Int() with a Char I don't get the integer value of the digit:

julia> Int('3')
51

Currently I'm calling string() first:

intval = Int(string(c)) # doesn't work anymore, see note below

Is this the accepted way of doing this? Or is there a more standard method? It comes up quite a bit in my Project Euler exercises.


Note: This question was originally asked before Julia 1.0. Since it was asked the int function was renamed to Int and became a method of the Int type object. The method Int(::String) for parsing a string to an integer was removed because of the potentially confusing difference in behavior between that and Int(::Char) discussed in the accepted answer.

Asked by Rick on May 31 '14



1 Answer

The short answer is you can do parse(Int, c) to do this correctly and efficiently. Read on for more discussion and details.

The code in the question as originally asked doesn't work anymore because Int(::String) was removed from the language due to the confusing difference in behavior between it and Int(::Char). Prior to Julia 1.0, the former parsed a string as an integer whereas the latter gave the Unicode code point of the character, which meant that Int("3") would return 3 whereas Int('3') would return 51. The modern working equivalent of what the questioner was using would be parse(Int, string(c)). However, you can skip converting the character to a string (which is quite inefficient) and just write parse(Int, c).
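To make the difference concrete, here is a small sketch (runnable on Julia 1.0+) comparing the three spellings:

```julia
c = '3'

# The modern equivalent of the original code: convert to a String, then parse.
# This works, but allocates a temporary string.
@assert parse(Int, string(c)) == 3

# parse has a Char method, so the conversion is unnecessary:
@assert parse(Int, c) == 3

# Int(::Char) still returns the Unicode code point, not the digit value:
@assert Int(c) == 51
```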

What does Int(::Char) do, and why does Int('3') return 51? That is the code point value assigned to the character '3' by the Unicode Consortium, which was also its ASCII code point before that. Obviously, this is not the same as the digit value of the character. It would be nice if these matched, but they don't. The code points 0-9 are a bunch of non-printing "control characters", starting with the NUL byte that terminates C strings. The code points for decimal digits are at least contiguous, however:

julia> [Int(c) for c in "0123456789"]
10-element Vector{Int64}:
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57

Because of this you can compute the value of a digit by subtracting the code point of 0 from it:

julia> [Int(c) - Int('0') for c in "0123456789"]
10-element Vector{Int64}:
 0
 1
 2
 3
 4
 5
 6
 7
 8
 9

Since subtraction of Char values works and subtracts their code points, this can be simplified to [c-'0' for c in "0123456789"]. Why not do it this way? You can! That is exactly what you'd do in C code. If you know your code will only ever encounter c values that are decimal digits, then this works well. It doesn't, however, do any error checking whereas parse does:

julia> c = 'f'
'f': ASCII/Unicode U+0066 (category Ll: Letter, lowercase)

julia> parse(Int, c)
ERROR: ArgumentError: invalid base 10 digit 'f'
Stacktrace:
 [1] parse(::Type{Int64}, c::Char; base::Int64)
   @ Base ./parse.jl:46
 [2] parse(::Type{Int64}, c::Char)
   @ Base ./parse.jl:41
 [3] top-level scope
   @ REPL[38]:1

julia> c - '0'
54

Moreover, parse is a bit more flexible. Suppose you want to accept f as a hex "digit" encoding the value 15. To do that with parse you just need to use the base keyword argument:

julia> parse(Int, 'f', base=16)
15

julia> parse(Int, 'F', base=16)
15

As you can see it parses upper or lower case hex digits correctly. In order to do that with the subtraction method, your code would need to do something like this:

'0' <= c <= '9' ? c - '0' :
'A' <= c <= 'F' ? c - 'A' + 10 :
'a' <= c <= 'f' ? c - 'a' + 10 : error()

That is actually quite close to the implementation of the parse(Int, c) method. Of course, at that point it's much clearer and easier to just call parse(Int, c), which does this for you and is well optimized.
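For illustration, the comparison chain above can be wrapped in a function. The name hexdigit is hypothetical (it is not a Base function); in real code, parse(Int, c, base=16) is the idiomatic choice:

```julia
# Hypothetical helper mirroring the ternary chain above.
# Char - Char subtraction yields an Int in Julia.
function hexdigit(c::Char)
    '0' <= c <= '9' ? c - '0' :
    'A' <= c <= 'F' ? c - 'A' + 10 :
    'a' <= c <= 'f' ? c - 'a' + 10 :
    throw(ArgumentError("invalid hex digit: $c"))
end

@assert hexdigit('7') == 7
@assert hexdigit('F') == 15
@assert hexdigit('f') == 15
@assert hexdigit('f') == parse(Int, 'f', base=16)
```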

Answered by StefanKarpinski on Oct 07 '22