I am trying to extract all the leading 7-digit hexadecimal strings from a file that contains lines such as:
3fce110:: ..\Utilities\c\misc.c(431): YESFREED (120 bytes) Misc
Hexadecimal numbers are often written with the prefix '0x', which is not part of the number. A single hexadecimal digit represents four binary digits.
A string of hexadecimal digits is a combination of the digits 0-9 and the characters A-F, just as a binary string is a combination of 0s and 1s. E.g., "245FC" is a hexadecimal string.
An example regular expression that combines some of the operators and constructs to match a hexadecimal number is \b0[xX]([0-9a-fA-F]+)\b .
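As a quick check of that expression, here is a minimal sketch using grep -E -o. The sample string is made up for illustration, and \b is a GNU extension, so this assumes GNU grep (or another grep that supports \b).

```shell
# Hypothetical sample text, not taken from the original file.
sample='flags=0x1F mask=0Xff00 count=42 bad=0xZZ'

# -E: extended regexps, -o: print only the matched text.
# \b (word boundary) is a GNU extension, assumed to be available here.
matches=$(printf '%s\n' "$sample" | grep -Eo '\b0[xX][0-9a-fA-F]+\b')
printf '%s\n' "$matches"
```

Only the two prefixed values are printed; "42" has no 0x prefix and "0xZZ" has no hex digits after the prefix.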
Using hexadecimal makes it very easy to convert back and forth from binary because each hexadecimal digit corresponds to exactly 4 bits (log2(16) = 4) and each byte is two hexadecimal digits. In contrast, a decimal digit corresponds to log2(10) = 3.322 bits and a byte is 2.408 decimal digits.
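The digit-by-digit correspondence can be sketched in plain shell. The function name hex_to_bin is made up for this illustration; it simply looks up the 4-bit group for each hex digit and concatenates the groups. Note that the helper variables are global, since POSIX sh has no local.

```shell
# hex_to_bin (hypothetical helper): convert a hex string to binary,
# one 4-bit group per hexadecimal digit.
hex_to_bin() {
    hex=$1
    out=
    while [ -n "$hex" ]; do
        rest=${hex#?}           # everything after the first character
        digit=${hex%"$rest"}    # the first character
        case $digit in
            0) bits=0000 ;; 1) bits=0001 ;; 2) bits=0010 ;; 3) bits=0011 ;;
            4) bits=0100 ;; 5) bits=0101 ;; 6) bits=0110 ;; 7) bits=0111 ;;
            8) bits=1000 ;; 9) bits=1001 ;;
            [aA]) bits=1010 ;; [bB]) bits=1011 ;; [cC]) bits=1100 ;;
            [dD]) bits=1101 ;; [eE]) bits=1110 ;; [fF]) bits=1111 ;;
            *) echo "not a hex digit: $digit" >&2; return 1 ;;
        esac
        out=$out$bits
        hex=$rest
    done
    printf '%s\n' "$out"
}

hex_to_bin 245FC   # prints 00100100010111111100
```

Five hex digits give exactly 20 bits, and no arithmetic is needed because each digit is translated independently.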
egrep -o '^[0-9a-f]{7}\b' file.txt
egrep is the same as grep -E; it uses extended regexps.
-o prints only the matching part of each line.
^ anchors the match to the beginning of the line.
[0-9a-f]{7} matches seven hexadecimal characters. If you want to match uppercase letters, add A-F here or add the -i flag.
\b checks for a word boundary; it ensures we don't match hex numbers more than 7 digits long.

If all the lines in the file follow the given format, a couple of other methods also work:
$ grep -o '^[^:]*' file
3fce110
$ awk -F: '{print $1}' file
3fce110
$ cut -d: -f1 file
3fce110
$ sed 's/:.*//' file
3fce110
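One difference worth noting: the delimiter-based commands print whatever precedes the first colon, whether or not it is a hex string, while the anchored regex skips lines that don't start with seven hex digits. A small sketch, assuming GNU grep; the second line of the sample file is made up for illustration.

```shell
# Temporary sample file; the "Summary" line is hypothetical.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
3fce110:: ..\Utilities\c\misc.c(431): YESFREED (120 bytes) Misc
Summary: 1 block
EOF

by_regex=$(grep -Eo '^[0-9a-f]{7}\b' "$tmp")   # validates the hex prefix
by_delim=$(cut -d: -f1 "$tmp")                 # only splits on ':'
rm -f "$tmp"

printf 'regex:\n%s\n' "$by_regex"
printf 'cut:\n%s\n' "$by_delim"
```

Here the regex version prints only 3fce110, while cut also prints Summary. If every line really does follow the given format, the two approaches give the same result.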