
Regular Expression for England only Postcode

I have an ASP.NET website and I want to use a RegularExpressionValidator to check if a UK postcode is English (i.e. it's not Scottish, Welsh or Northern Irish).

It should be possible to tell whether the postcode is English using just the letters from the first segment (called the Postcode Area). In total there are 124 postcode areas, and a list of them is available.

From that list, the following postcode areas are not in England.

  • ZE,KW,IV,HS,PH,AB,DD,PA,FK,G,KY,KA,DG,TD,EH,ML (Scotland)
  • LL,SY,LD,HR,NP,CF,SA (Wales)
  • BT (N.Ireland)

The input to the regex may be the whole postcode, or it might just be the postcode area.
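
One way to cope with both input shapes is to pull out the leading letters first and then look them up. The following is a minimal Python sketch rather than a definitive implementation: the names `NON_ENGLISH_AREAS`, `postcode_area` and `is_english_area` are hypothetical, the set holds only the areas listed above, and nothing here checks that the letters form one of the 124 real postcode areas.

```python
import re
from typing import Optional

# Areas listed above as not being in England (Scotland, Wales, N. Ireland).
NON_ENGLISH_AREAS = {
    "ZE", "KW", "IV", "HS", "PH", "AB", "DD", "PA", "FK", "G", "KY", "KA",
    "DG", "TD", "EH", "ML",                      # Scotland
    "LL", "SY", "LD", "HR", "NP", "CF", "SA",    # Wales
    "BT",                                        # Northern Ireland
}


def postcode_area(text: str) -> Optional[str]:
    """Return the one or two leading letters, or None if the input does not
    look like a postcode area or a postcode that starts with one."""
    # The area must be followed by a digit (full postcode) or by nothing
    # except whitespace (area on its own).
    found = re.match(r"\s*([A-Za-z]{1,2})(?=\d|\s*$)", text)
    return found.group(1).upper() if found else None


def is_english_area(text: str) -> bool:
    """True when the area is recognised and not in the excluded set."""
    area = postcode_area(text)
    return area is not None and area not in NON_ENGLISH_AREAS
```

With this sketch, `is_english_area("SW1A 1AA")` and `is_english_area("gl")` are both accepted, while `"G1 1AA"`, `"EH"` and `"BT1 1AA"` are rejected.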

Can anyone help me create a regular expression that will match only if a given postcode is English?

EDIT - Solution

With help from several posters I was able to create the following regex, which I've tested successfully against over 1500 test cases.

^(AL|B|B[ABDHLNRS]|C[ABHMORTVW]|D[AEHLNTY]|E|E[CNX]|FY|G[LUY]|H[ADGPUX]|I[GMP]|JE|KT|L|L[AENSU]|M|ME|N|N[EGNRW]|O[LX]|P[ELOR]|R[GHM]|S|S[EGKLMNOPRSTW]|T[AFNQRSW]|UB|W|W[ACDFNRSV]|YO)\d{1,2}\s?(\d[\w]{2})?
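
For anyone wanting to try the pattern outside ASP.NET, here is a small Python sketch that compiles it unchanged and runs a few sample postcodes through it. The variable name `ENGLISH_POSTCODE` and the sample inputs are only illustrative; they are not the test cases mentioned above.

```python
import re

# The regex from the post, copied as-is (stray invisible characters removed).
ENGLISH_POSTCODE = re.compile(
    r"^(AL|B|B[ABDHLNRS]|C[ABHMORTVW]|D[AEHLNTY]|E|E[CNX]|FY|G[LUY]"
    r"|H[ADGPUX]|I[GMP]|JE|KT|L|L[AENSU]|M|ME|N|N[EGNRW]|O[LX]|P[ELOR]"
    r"|R[GHM]|S|S[EGKLMNOPRSTW]|T[AFNQRSW]|UB|W|W[ACDFNRSV]|YO)"
    r"\d{1,2}\s?(\d[\w]{2})?"
)

for sample in ["M1 1AE", "NE1 4ST", "EH1 1YZ", "CF10 1AA", "BT1 1AA", "NE"]:
    verdict = "match" if ENGLISH_POSTCODE.match(sample) else "no match"
    print(f"{sample!r}: {verdict}")
```

Two things are visible from the pattern itself: it requires at least one digit after the area letters, so a bare area such as `NE` does not match, and it has no `$` anchor, so anything after the matched part is ignored.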

Robbie asked Mar 07 '12 20:03


4 Answers

I've already answered once, making the point that it's not possible to come up with a 100% correct England-only regex (since the postcode areas don't lie along political boundaries).

However I've dug a bit deeper into this, and ... well it is possible, but it's a lot of work.

To verify an England-only postcode, you need to exclude the non-English postcodes. The easy ones are:

  • BT (Northern Ireland)
  • IM (Isle of Man)
  • JE (Jersey)
  • GG (Guernsey)
  • BF (British Forces)
  • BX (non-geographic UK postcodes)
  • GIR (Girobank, which is also non-geographic)

(I'm not going to mention UK-style postcodes for territories outside the UK, like St Helena, Gibraltar etc. Technically speaking, the Isle of Man and Channel Islands aren't part of the UK either, but they're much nearer by, and more closely tied into the Royal Mail system in the UK.)

The purely Scottish postcode areas are (as you mentioned):

ZE,KW,IV,HS,PH,AB,DD,PA,FK,G,KY,KA,EH,ML

DG and TD are nominally Scottish, and are for the most part in Scotland. However some areas extend over the Scotland-England border as follows:

  • DG16 - a tiny bit in England
  • TD9 - a tiny bit in England
  • TD12 - half in England
  • TD15 - mostly in England

The breakdown is as follows:

DG16 is in Scotland except for the following English postcodes:

  • DG16 5H[TUZ]
  • DG16 5J[AB]

TD9 is in Scotland except for TD9 0T[JPRSTUW]

TD12 has only one sector (TD12 4), which is spread roughly half and half across England and Scotland:

  • TD12 4[ABDEHJLN] are in Scotland
  • TD12 4[QRSTUWX] are in England

TD15 is the most complicated. There are 3 sectors, of which TD15 2 and TD15 9 are entirely in England.

TD15 1 is split across England and Scotland.

Postcodes beginning as follows are in Scotland:

  • TD15 1T
  • TD15 1X

... except for these English postcodes:

  • TD15 1T[ABQUX]
  • TD15 1XX

All other postcodes in TD15 1 are in England. That includes those beginning as follows:

  • TD15 1B
  • TD15 1S (i.e. TD15 1S[ABEJLNPWXY])
  • TD15 1U (i.e. TD15 1U[BDENPQRTUXY])

... which are all in England, with the exception of the following postcodes, which are in Scotland:

  • TD15 1BT
  • TD15 1S[UZ]
  • TD15 1U[FGHJLSZ]
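
The DG/TD breakdown above is easier to apply as code than as one regex. Here is a rough Python sketch of how I'd encode it. The function name `border_country` and the choice to return `None` for anything the lists don't cover are my own assumptions, and the unit patterns are copied from the lists above, so they are only as accurate as those lists.

```python
import re
from typing import Optional

# Patterns for the last three characters (sector digit + two letters),
# taken from the breakdown above.
DG16_ENGLISH = re.compile(r"5(H[TUZ]|J[AB])")
TD9_ENGLISH = re.compile(r"0T[JPRSTUW]")
TD12_ENGLISH = re.compile(r"4[QRSTUWX][A-Z]")
TD12_SCOTTISH = re.compile(r"4[ABDEHJLN][A-Z]")
# TD15 1T and 1X are Scottish apart from 1T[ABQUX] and 1XX;
# 1BT, 1S[UZ] and 1U[FGHJLSZ] are the other Scottish postcodes in TD15 1.
TD15_SCOTTISH = re.compile(r"1(T[^ABQUX]|X[^X]|BT|S[UZ]|U[FGHJLSZ])")


def border_country(postcode: str) -> Optional[str]:
    """Classify a full DG or TD postcode as 'England' or 'Scotland'.

    Returns None for other areas, partial postcodes, and units that the
    breakdown above does not mention.
    """
    compact = re.sub(r"\s+", "", postcode.upper())
    parts = re.fullmatch(r"((?:DG|TD)\d{1,2})(\d[A-Z]{2})", compact)
    if parts is None:
        return None
    district, unit = parts.groups()

    if district == "DG16":
        return "England" if DG16_ENGLISH.fullmatch(unit) else "Scotland"
    if district == "TD9":
        return "England" if TD9_ENGLISH.fullmatch(unit) else "Scotland"
    if district == "TD12":
        if TD12_ENGLISH.fullmatch(unit):
            return "England"
        if TD12_SCOTTISH.fullmatch(unit):
            return "Scotland"
        return None  # not covered by the lists above
    if district == "TD15":
        if unit[0] in "29":
            return "England"  # sectors TD15 2 and TD15 9
        if unit[0] == "1":
            return "Scotland" if TD15_SCOTTISH.fullmatch(unit) else "England"
        return None  # only three sectors are described above
    # Every other DG/TD district is treated as Scottish.
    return "Scotland"
```

For example, `border_country("TD15 1XX")` gives `"England"` and `border_country("TD15 1UZ")` gives `"Scotland"`. An England-only check would then accept a DG/TD postcode only when this returns `"England"`.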

The English postcode areas CA and NE lie on the other side of the England-Scotland border, however they never extend into Scotland.

In fact, the last two letters of a UK postcode are based on how the postman actually delivers post (as far as I'm aware), so there's no guarantee that a postcode falls inside a political boundary. If a group of houses straddles the border, the entire postcode (i.e. at the most fine-grained level) may not lie wholly within either England or Scotland. E.g. TD9 0TJ and TD15 1UZ are very close to the border, and I don't know for sure whether they're entirely on one side or not.

The England-Wales border is also complicated, however I'll leave this as an exercise for the reader.

jim answered Oct 11 '22 03:10


There are 124 Postcode Areas in the UK.

-- PAF® statistics August 2012, via List of postcodes in the United Kingdom (Wikipedia).

I recommend breaking your problem down into two parts (think functions):

  1. Is the postcode valid?

    UK Postcode Regex (Comprehensive)

  2. Is the postcode English?

    This can be broken down further:

    • Not Scottish:
      • ! /^(ZE|KW|IV|HS|PH|AB|DD|PA|FK|G|KY|KA|DG|TD|EH|ML)[0-9]/
    • Not Welsh:
      • ! /^(LL|SY|LD|HR|NP|CF|SA)[0-9]/
    • Not Northern Irish, Manx, from the Channel Islands, ...
      • et cetera...
    • or you could just check that the Postcode Area is among the hundred or so English ones, depending on how you want to optimise ☻

Note that the syntax will vary according to your programming language. Doing all this in one regular expression would soon become unmanageable.
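
As a rough illustration of the two-function split in Python: the `VALID` pattern below is a deliberately simplified stand-in that I'm assuming for the sketch, not the comprehensive regex linked above, and the names `is_valid` and `is_english` are hypothetical. Only the Scottish, Welsh and Northern Irish exclusions are filled in; the Isle of Man, Channel Islands and so on are left out.

```python
import re

# Simplified validity check (an assumption for this sketch): area letters,
# district, optional space, then digit + two letters.
VALID = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$")

# One pattern per exclusion, as in the list above.
NOT_ENGLISH = [
    re.compile(r"^(ZE|KW|IV|HS|PH|AB|DD|PA|FK|G|KY|KA|DG|TD|EH|ML)[0-9]"),  # Scotland
    re.compile(r"^(LL|SY|LD|HR|NP|CF|SA)[0-9]"),                            # Wales
    re.compile(r"^BT[0-9]"),                                                # N. Ireland
]


def is_valid(postcode: str) -> bool:
    """Step 1: does the input have the shape of a UK postcode?"""
    return VALID.match(postcode.strip().upper()) is not None


def is_english(postcode: str) -> bool:
    """Step 2: valid, and not matched by any of the exclusion patterns."""
    normalised = postcode.strip().upper()
    return is_valid(normalised) and not any(
        pattern.match(normalised) for pattern in NOT_ENGLISH
    )
```

Keeping the exclusions as a list means another region can be added as one more entry without touching the validity pattern.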

Johnsyweb answered Oct 11 '22 03:10


It's not possible to come up with an England-only regex, because the postcode areas don't lie along political boundaries, at least not at the postcode area or district level.

For example, CH1 is in England, and CH5 is in Wales.

At the postcode district level there are still problems, for example TD12 is half in England, half in Scotland.

The only area you can rely on is BT (Northern Ireland).

jim answered Oct 11 '22 02:10


Use ^(AB|AL|B| ... )$, where the ... is where you fill the rest of the valid ones in, separated by pipes (|).

EDIT: There's a boatload of information here: http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom

If you were to include the in/out codes, it would be something like ^(AB|AL|B| ... )([\d\w]{3})\s([\d\w]{3})$, which would get the rest of the code.

EDIT

^(A[BL]|B[ABDHLNRST]?|C[ABFHMORTVW]|D[ADEGHLNTY]|E[CNX]?|F[KY]|G[LUY]|H[ADGPRSUX]|I[GMPV]|JE|K[ATWY]|L[ADELNSU]?|M[EL]?|N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTWY]?|T[AFNQRSW]|UB|W[ACDFNRSV]?|YO|ZE)([\w\d]{1,2})\s?([\w\d]{3})$

Part of this regex is taken from another one of the answers. It matches the postcode area, then one or two ({1,2}) letters or digits ([\w\d]), an optional space (\s?), then three letters or digits. Hope that helps.
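
Here is a quick Python sketch of the regex above in use, showing what each capture group picks up. The name `UK_POSTCODE` and the sample inputs are just for illustration.

```python
import re

# The regex from the EDIT above, copied as-is.
UK_POSTCODE = re.compile(
    r"^(A[BL]|B[ABDHLNRST]?|C[ABFHMORTVW]|D[ADEGHLNTY]|E[CNX]?|F[KY]"
    r"|G[LUY]|H[ADGPRSUX]|I[GMPV]|JE|K[ATWY]|L[ADELNSU]?|M[EL]?"
    r"|N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTWY]?|T[AFNQRSW]"
    r"|UB|W[ACDFNRSV]?|YO|ZE)([\w\d]{1,2})\s?([\w\d]{3})$"
)

for sample in ["AB10 1XG", "SW1A 1AA", "B1 1AA", "XX1 1AA", "ABCD 123"]:
    found = UK_POSTCODE.match(sample)
    # groups() is (area, rest of the outward code, inward code)
    print(sample, "->", found.groups() if found else None)
```

Because `\w` accepts letters as well as digits, a string like `ABCD 123` also matches, so this checks the area prefix and overall length rather than the exact postcode format. The alternation includes Scottish areas such as AB and ZE, so it is not England-only.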

Derreck Dean answered Oct 11 '22 02:10