
Clojure's equivalent to Python's encode('hex') and decode('hex')

Tags:

hex

clojure

Is there an idiomatic way of encoding and decoding a string in Clojure as hexadecimal? Example from Python:

'Clojure'.encode('hex')
# ⇒ '436c6f6a757265'
'436c6f6a757265'.decode('hex')
# ⇒ 'Clojure'

To show some effort on my part:

(defn hexify [s]
  (apply str
    (map #(format "%02x" (int %)) s)))

(defn unhexify [hex]
  (apply str
    (map 
      (fn [[x y]] (char (Integer/parseInt (str x y) 16))) 
      (partition 2 hex))))

(hexify "Clojure")
;; ⇒ "436c6f6a757265"

(unhexify "436c6f6a757265")
;; ⇒ "Clojure"
Iceland_jack asked Apr 08 '12 12:04



2 Answers

Since all posted solutions have some flaws, I'm sharing my own:

(defn hexify "Convert byte sequence to hex string" [coll]
  (let [hex [\0 \1 \2 \3 \4 \5 \6 \7 \8 \9 \a \b \c \d \e \f]]
    (letfn [(hexify-byte [b]
              (let [v (bit-and b 0xFF)]
                [(hex (bit-shift-right v 4)) (hex (bit-and v 0x0F))]))]
      (apply str (mapcat hexify-byte coll)))))

(defn hexify-str [s]
  (hexify (.getBytes s)))

and

(defn unhexify "Convert hex string to byte sequence" [s]
  (letfn [(unhexify-2 [c1 c2]
            (unchecked-byte
              (+ (bit-shift-left (Character/digit c1 16) 4)
                 (Character/digit c2 16))))]
    (map #(apply unhexify-2 %) (partition 2 s))))

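(defn unhexify-str [s]
  ;; Decode the bytes as a whole (platform default charset, mirroring
  ;; hexify-str); (char b) throws for the negative bytes above 0x7f.
  (String. (byte-array (unhexify s))))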

Pros:

  • High performance
  • Generic byte stream <--> string conversions with specialized wrappers
  • Handles leading zeros in the hex result
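
To illustrate the leading-zero point: the sketch below assumes the flaw in question is a conversion that does not pad each byte to two digits, which the answer does not spell out. Integer/toHexString is used only as an example of such a conversion.

```clojure
;; Integer/toHexString does not pad, so bytes below 0x10 lose a digit.
(def unpadded (apply str (map #(Integer/toHexString (int %)) [1 35])))
;; Padding every byte to two digits keeps the byte boundaries recoverable.
(def padded (apply str (map #(format "%02x" %) [1 35])))

(println unpadded) ; prints 123
(println padded)   ; prints 0123
```

Only the padded form can be split back into pairs of digits unambiguously.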
Grzegorz Luczywo answered Sep 22 '22 09:09


Your implementations don't work for non-ASCII characters:

(defn hexify [s]
  (apply str
    (map #(format "%02x" (int %)) s)))

(defn unhexify [hex]
  (apply str
    (map 
      (fn [[x y]] (char (Integer/parseInt (str x y) 16))) 
        (partition 2 hex))))

(= "\u2195" (unhexify (hexify "\u2195")))
;; ⇒ false (should be true)

To overcome this you need to serialize the bytes of the string using the required character encoding, which can be multi-byte per character.
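
As a small sketch of what that serialization looks like, the snippet below inspects the UTF-8 bytes of the same test character. The names s, signed and unsigned are illustrative only.

```clojure
;; U+2195 is a single char, but UTF-8 encodes it as three bytes.
(def s "\u2195")
(def signed (mapv long (.getBytes s "UTF-8")))
;; Masking with 0xff recovers the unsigned value of each byte.
(def unsigned (mapv #(bit-and % 0xff) signed))

(println (count s)) ; prints 1
(println signed)    ; prints [-30 -122 -107]
(println unsigned)  ; prints [226 134 149], i.e. e2 86 95
```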

There are a few complications with this.

  • All integer types on the JVM are signed, apart from char.
  • In particular, there is no unsigned byte type: a byte ranges from -128 to 127.

In idiomatic Java you would take the low byte of an integer, masking it with 0xff wherever you use it:

    int intValue = 0x80;
    byte byteValue = (byte)(intValue & 0xff); // use only the low byte

    System.out.println("int:\t" + intValue);
    System.out.println("byte:\t" + byteValue);

    // output:
    // int:   128
    // byte:  -128

Clojure has unchecked-byte, which effectively does the same.
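
A minimal sketch of that behaviour, using the same 0x80 value as the Java snippet (the var names are illustrative):

```clojure
;; unchecked-byte keeps only the low 8 bits and reinterprets them as signed.
(def low-byte (unchecked-byte 0x80))
;; Masking with 0xff turns the signed byte back into its unsigned value.
(def restored (bit-and low-byte 0xff))
;; The checked cast refuses the same input instead of wrapping around.
(def checked (try (byte 0x80)
                  (catch IllegalArgumentException _ :out-of-range)))

(println low-byte) ; prints -128
(println restored) ; prints 128
(println checked)  ; prints :out-of-range
```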

For example, using UTF-8 you can do this:

(defn hexify [s]
  (apply str (map #(format "%02x" %) (.getBytes s "UTF-8"))))

(defn unhexify [s]
  (let [bytes (into-array Byte/TYPE
                (map (fn [[x y]]
                       (unchecked-byte (Integer/parseInt (str x y) 16)))
                     (partition 2 s)))]
    (String. bytes "UTF-8")))

; with the above implementation:

;=> (hexify "\u2195")
"e28695"
;=> (unhexify "e28695")
"↕"
;=> (= "\u2195" (unhexify (hexify "\u2195")))
true
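
As an aside, the same round trip can probably be delegated to java.util.HexFormat. This is a sketch under the assumption that the code runs on JDK 17 or newer, where that class was added. The names hexify-jdk and unhexify-jdk are hypothetical and not part of any library.

```clojure
;; Assumes JDK 17+, where java.util.HexFormat exists.
(defn hexify-jdk [s]
  ;; formatHex emits two lowercase hex digits per byte.
  (.formatHex (java.util.HexFormat/of) (.getBytes ^String s "UTF-8")))

(defn unhexify-jdk [hex]
  ;; parseHex returns a byte array, decoded here as UTF-8.
  (String. (.parseHex (java.util.HexFormat/of) ^String hex) "UTF-8"))

(println (hexify-jdk "Clojure"))          ; prints 436c6f6a757265
(println (unhexify-jdk "436c6f6a757265")) ; prints Clojure
```

Because the string is converted to bytes with an explicit charset, multi-byte characters such as "\u2195" round-trip as well.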
sw1nn answered Sep 18 '22 09:09