I have a regular expression for IPv6 addresses as given below
IPV4ADDRESS [ \t]*(([[:digit:]]{1,3}"."){3}([[:digit:]]{1,3}))[ \t]*
x4 ([[:xdigit:]]{1,4})
xseq ({x4}(:{x4}){0,7})
xpart ({xseq}|({xseq}::({xseq}?))|::{xseq})
IPV6ADDRESS [ \t]*({xpart}(":"{IPV4ADDRESS})?)[ \t]*
It is correctly all formats of IPv6 addresses including
1) non-compressed IPv6 addresses
2) compressed IPv6 addresses
3) IPv6 addresses in legacy formats.(supporting IPv4)
Ideal examples of IPv6 addresses in legacy formats would be
2001:1234::3210:5.6.7.8
OR
2001:1234:1234:5432:4578:5678:5.6.7.8
As you can see above there are 10 groups separated by either `":" or ".".`
As opposed to 8 groups in normal IPv6 addresses.This is because the last 4 groups that are separated by `"." should be compressed into least significant 32-bits of the IPv6 addresses.Hence we need 10 groups to satisfy 128 bits.
However If I use the following address format
2001:1234:4563:3210:5.6.7.8
Here each group separated by ":" represents 16-bits.the last four groups separted by "." represents 8 bits.Total number of bits is 64 + 32 = 96 bits.32 bits are missing
The regular expression is accepting it as a valid IPv6 address format.I am unable to figure out how to fix the regular expression to discard such values.Any help is highly appreciated.
Here's the grammar for IPv6 addresses as given in RFC 3986 and subsequently affirmed in RFC 5954:
IPv6address = 6( h16 ":" ) ls32
/ "::" 5( h16 ":" ) ls32
/ [ h16 ] "::" 4( h16 ":" ) ls32
/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
/ [ *4( h16 ":" ) h16 ] "::" ls32
/ [ *5( h16 ":" ) h16 ] "::" h16
/ [ *6( h16 ":" ) h16 ] "::"
h16 = 1*4HEXDIG
ls32 = ( h16 ":" h16 ) / IPv4address
IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
dec-octet = DIGIT ; 0-9
/ %x31-39 DIGIT ; 10-99
/ "1" 2DIGIT ; 100-199
/ "2" %x30-34 DIGIT ; 200-249
/ "25" %x30-35 ; 250-255
Using this, we can build a standards-compliant regular expression for IPv6 addresses.
dec_octet ([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])
ipv4address ({dec_octet}"."){3}{dec_octet}
h16 ([[:xdigit:]]{1,4})
ls32 ({h16}:{h16}|{ipv4address})
ipv6address (({h16}:){6}{ls32}|::({h16}:){5}{ls32}|({h16})?::({h16}:){4}{ls32}|(({h16}:){0,1}{h16})?::({h16}:){3}{ls32}|(({h16}:){0,2}{h16})?::({h16}:){2}{ls32}|(({h16}:){0,3}{h16})?::{h16}:{ls32}|(({h16}:){0,4}{h16})?::{ls32}|(({h16}:){0,5}{h16})?::{h16}|(({h16}:){0,6}{h16})?::)
Disclaimer: untested.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With