Is there an idiomatic way of encoding and decoding a string in Clojure as hexadecimal? Example from Python:
'Clojure'.encode('hex')
# ⇒ '436c6f6a757265'
'436c6f6a757265'.decode('hex')
# ⇒ 'Clojure'
To show some effort on my part:
(defn hexify [s]
  (apply str
         (map #(format "%02x" (int %)) s)))
(defn unhexify [hex]
  (apply str
         (map
          (fn [[x y]] (char (Integer/parseInt (str x y) 16)))
          (partition 2 hex))))
(hexify "Clojure")
;; ⇒ "436c6f6a757265"
(unhexify "436c6f6a757265")
;; ⇒ "Clojure"
Using format and a join to convert a byte array to a hex string: loop over each byte in the array and pass it to format (Java's String.format underneath). The %02X specifier prints a hexadecimal (X) value padded to two places (02); the two-digit pieces are then joined into a single string. This is a relatively slow approach for large byte arrays.
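The format-and-join approach described above might look like this in Clojure. This is a minimal sketch, not a tuned implementation; the name `bytes->hex` is made up for illustration.

```clojure
(require '[clojure.string :as str])

(defn bytes->hex
  "Format each byte as two uppercase hex digits and join the pieces.
  Hypothetical helper name, used only for this illustration."
  [bs]
  (str/join (map #(format "%02X" %) bs)))

(bytes->hex (.getBytes "Clojure" "UTF-8"))
;; => "436C6F6A757265"
```

Each byte goes through a separate `format` call, which is where the cost on large arrays comes from.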
Since all posted solutions have some flaws, I'm sharing my own:
(defn hexify "Convert byte sequence to hex string" [coll]
  (let [hex [\0 \1 \2 \3 \4 \5 \6 \7 \8 \9 \a \b \c \d \e \f]]
    (letfn [(hexify-byte [b]
              (let [v (bit-and b 0xFF)]
                [(hex (bit-shift-right v 4)) (hex (bit-and v 0x0F))]))]
      (apply str (mapcat hexify-byte coll)))))
(defn hexify-str [s]
  ;; Name the charset explicitly; a bare .getBytes uses the platform default.
  (hexify (.getBytes s "UTF-8")))
and
(defn unhexify "Convert hex string to byte sequence" [s]
  (letfn [(unhexify-2 [c1 c2]
            (unchecked-byte
             (+ (bit-shift-left (Character/digit c1 16) 4)
                (Character/digit c2 16))))]
    (map #(apply unhexify-2 %) (partition 2 s))))
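(defn unhexify-str [s]
  ;; Decode the bytes as UTF-8. (map char ...) would throw on negative
  ;; bytes, i.e. on any non-ASCII input.
  (String. (byte-array (unhexify s)) "UTF-8"))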
Pros:
Your implementations don't work for non-ASCII characters:
(defn hexify [s]
  (apply str
         (map #(format "%02x" (int %)) s)))
(defn unhexify [hex]
  (apply str
         (map
          (fn [[x y]] (char (Integer/parseInt (str x y) 16)))
          (partition 2 hex))))
(= "\u2195" (unhexify (hexify "\u2195")))
false ; should be true
To overcome this, you need to serialize the string to bytes using the required character encoding, in which a single character can occupy several bytes.
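To make the multi-byte point concrete, here is a small sketch that looks at the UTF-8 bytes of the same character used in the failing test. The var name `utf8-bytes` is chosen only for this example.

```clojure
;; U+2195 (↕) is a single character but three bytes in UTF-8.
(def utf8-bytes (vec (.getBytes "\u2195" "UTF-8")))

(count "\u2195")
;; => 1

utf8-bytes
;; => [-30 -122 -107]   (JVM bytes are signed)

(mapv #(bit-and % 0xFF) utf8-bytes)
;; => [226 134 149]     (0xe2 0x86 0x95)
```

A hex encoder that works on these bytes, rather than on the character's code point, emits exactly two digits per byte and can therefore be reversed unambiguously.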
There is a wrinkle with this: Java's byte type is signed, so values from 0x80 to 0xFF only fit after a narrowing cast.
In idiomatic Java you would mask off the low byte of an integer and cast it, like this, wherever you use it:
int intValue = 0x80;
byte byteValue = (byte)(intValue & 0xff); // use only the low byte
System.out.println("int:\t" + intValue);
System.out.println("byte:\t" + byteValue);
// output:
// int:  128
// byte: -128
Clojure has unchecked-byte, which effectively does the same.
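As a rough Clojure counterpart to the Java snippet, the sketch below compares the checked `byte` cast with `unchecked-byte`. The var names `checked`, `wrapped`, and `unsigned` are illustrative only, and the behaviour shown assumes the default setting where `*unchecked-math*` is false.

```clojure
;; Checked cast: 128 does not fit in a signed byte, so `byte` throws.
(def checked
  (try (byte 0x80)
       (catch IllegalArgumentException _ :out-of-range)))

;; Unchecked cast: keeps the low 8 bits and reinterprets them as signed.
(def wrapped (unchecked-byte 0x80))

;; Masking with 0xFF recovers the unsigned value.
(def unsigned (bit-and wrapped 0xFF))

[checked wrapped unsigned]
;; => [:out-of-range -128 128]
```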
For example, using UTF-8 you can do this:
(defn hexify [s]
  (apply str (map #(format "%02x" %) (.getBytes s "UTF-8"))))
(defn unhexify [s]
  (let [bytes (into-array Byte/TYPE
                          (map (fn [[x y]]
                                 (unchecked-byte (Integer/parseInt (str x y) 16)))
                               (partition 2 s)))]
    (String. bytes "UTF-8")))
; with the above implementation:
;=> (hexify "\u2195")
"e28695"
;=> (unhexify "e28695")
"↕"
;=> (= "\u2195" (unhexify (hexify "\u2195")))
true