
Ignoring cookies list efficiently in NGINX reverse proxy setup

I am currently testing the microcache feature in an NGINX reverse proxy setup for dynamic content.

One big issue is that requests carrying session/login cookies must be excluded from the cache, otherwise people will end up logged in to random accounts on the site(s).

Currently I am ignoring popular CMS cookies like this:

if ($http_cookie ~* "(joomla_[a-zA-Z0-9_]+|userID|wordpress_(?!test_)[a-zA-Z0-9_]+|wp-postpass|wordpress_logged_in_[a-zA-Z0-9]+|comment_author_[a-zA-Z0-9_]+|woocommerce_cart_hash|woocommerce_items_in_cart|wp_woocommerce_session_[a-zA-Z0-9]+|sid_customer_|sid_admin_|PrestaShop-[a-zA-Z0-9]+)") {

    # set ignore variable to 1
    # later used in:
    # proxy_no_cache                 $IGNORE_VARIABLE;
    # proxy_cache_bypass             $IGNORE_VARIABLE;
    # makes sense ?

}

However, this becomes a problem if I want to add more cookies to the ignore list. Not to mention that using too many "if" statements in NGINX is not recommended as per the docs.

My question is whether this could be done using a map. I saw that regex in a map looks different (or maybe I am wrong).

Or is there another way to efficiently ignore/bypass cookies?

I have searched a lot on Stack Overflow, and whilst there are many different examples, I could not find anything specific to my needs.

Thank you

Update:

After a lot of reading and "digging" on the internet (we might as well just say Google), I found some quite interesting examples.

However, I am very confused by these, as I do not fully understand the regex usage and I am afraid to implement something without understanding it.

Example 1:

map $http_cookie $cache_uid {
  default nil;
  ~SESS[[:alnum:]]+=(?<session_id>[[:alnum:]]+) $session_id;
}
  1. In this example the regex looks very different from the ones used in "if" blocks. I don't understand why the pattern is not wrapped in "" and starts directly with a ~ sign.

  2. I don't understand what [[:alnum:]]+ means. I searched for this but was unable to find documentation (or maybe I missed it).

  3. I can see that the author sets "nil" as the default; this does not apply to my case.

Example 2:

map $http_cookie $cache_uid {
  default  '';
  ~SESS[[:alnum:]]+=(?<session_id>[[:graph:]]+)  $session_id;
}
  1. Same points as in Example 1, but this time I see [[:graph:]]+. What is that?

My Example (not tested):

map $http_cookie $bypass_cache {

    "~*wordpress_(?!test_)[a-zA-Z0-9_]+"  1;
    "~*wp-postpass|wordpress_logged_in_[a-zA-Z0-9]+"  1;
    "~*comment_author_[a-zA-Z0-9_]+"  1;
    "~*[a-zA-Z0-9]+_session"  1;

    default      0;
}

In my pseudo example, the regex must be wrong since I did not find any map cookie examples with such regex.

So once again my goal is to have a map style list of cookies that I can bypass the cache for, with proper regex.

Any advice/examples would be much appreciated.

Mecanik asked Jul 11 '19 09:07




2 Answers

What exactly are you trying to do?

The way you're doing it, trying to blacklist only certain cookies from being cached through if ($http_cookie …, is the wrong approach: one day someone will find a cookie that is not blacklisted but that your backend nonetheless accepts, and you will end up with cache poisoning or other security issues down the line.

There's also no reason to use the http://nginx.org/r/map approach to get the values of individual cookies either: all of this is already available through the http://nginx.org/r/$cookie_ variables, which makes the map code for parsing $http_cookie redundant.

Are there any cookies which you actually want to cache? If not, why not just use proxy_no_cache $http_cookie; to disallow caching when any cookies are present?
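
A minimal sketch of that "no caching when any cookie is present" setup. The zone name microcache, the cache path and the backend address are hypothetical, not from the question, and the added proxy_cache_bypass line is an assumption about what you'd also want (it stops cookie-carrying requests from being served an existing cached copy):

```nginx
# http context. Hypothetical names: "microcache" zone, "backend" upstream.
proxy_cache_path /var/cache/nginx/microcache levels=1:2
                 keys_zone=microcache:10m max_size=100m inactive=10m;

upstream backend {
    server 127.0.0.1:8080;
}

server {
    listen 80;

    location / {
        proxy_pass        http://backend;
        proxy_cache       microcache;
        proxy_cache_valid 200 1s;

        # $http_cookie is empty when the request has no Cookie header.
        # Any value that is non-empty and not "0" triggers these directives.
        proxy_no_cache     $http_cookie;   # do not save the response
        proxy_cache_bypass $http_cookie;   # do not answer from the cache
    }
}
```

Note that by default nginx also refuses to cache a response that carries a Set-Cookie header, so a login response is not stored even for a visitor who arrived without cookies.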


What you'd probably want to do is first have a spec of what must be cached and under what circumstances, only then resorting to expressing such logic in a programming language like nginx.conf.

For example, a better approach would be to see which URLs should always be cached, clearing out the Cookie header to ensure that cache poisoning isn't possible (proxy_set_header Cookie "";). Otherwise, if any cookies are present, it may either make sense to not cache anything at all (proxy_no_cache $http_cookie;), or to structure the cache such that certain combinations of authentication credentials are used for http://nginx.org/r/proxy_cache_key; in this case, it might also make sense to reconstruct the Cookie request header manually through a whitelist-based approach to avoid cache-poisoning issues.
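
Roughly, those three options could look like the sketch below. The /assets/ and /account/ paths, the session_id cookie name, the microcache zone and the backend address are all made up for illustration; substitute whatever your spec says:

```nginx
# http context. All names and paths here are hypothetical.
proxy_cache_path /var/cache/nginx/microcache levels=1:2
                 keys_zone=microcache:10m max_size=100m inactive=10m;

server {
    listen 80;

    # 1) URLs that should always be cached: strip cookies in both directions.
    location /assets/ {
        proxy_pass           http://127.0.0.1:8080;
        proxy_cache          microcache;
        proxy_cache_valid    200 10m;
        proxy_set_header     Cookie "";     # backend never sees a session
        proxy_ignore_headers Set-Cookie;    # cache even if backend sets one
        proxy_hide_header    Set-Cookie;    # and never pass it to clients
    }

    # 2) Per-credential cache: the whitelisted cookie is part of the key,
    #    and only that cookie is forwarded to the backend.
    location /account/ {
        proxy_pass        http://127.0.0.1:8080;
        proxy_cache       microcache;
        proxy_cache_valid 200 1s;
        proxy_cache_key   "$scheme$proxy_host$request_uri|$cookie_session_id";
        proxy_set_header  Cookie "session_id=$cookie_session_id";
    }

    # 3) Everything else: no caching at all once any cookie is present.
    location / {
        proxy_pass         http://127.0.0.1:8080;
        proxy_cache        microcache;
        proxy_cache_valid  200 1s;
        proxy_no_cache     $http_cookie;
        proxy_cache_bypass $http_cookie;
    }
}
```

In variant 2 a request without the cookie still sends an empty session_id= to the backend, so the backend has to treat an empty value as anonymous.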

cnst answered Oct 22 '22 13:10


Your own example (the map on $http_cookie) is what you actually need:

map $http_cookie $bypass_cache {

    "~*wordpress_(?!test_)[a-zA-Z0-9_]+"  1;
    "~*wp-postpass|wordpress_logged_in_[a-zA-Z0-9]+"  1;
    "~*comment_author_[a-zA-Z0-9_]+"  1;
    "~*[a-zA-Z0-9]+_session"  1;

    default      0;
}

Basically, what you are saying here is that the $bypass_cache value will be 1 if one of the regexes matches, else 0.

So as long as you get the patterns right, it will work. Only you can put that list together, since only you know which cookies should bypass the cache.
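
On the syntax you asked about: in a map, a key starting with ~ is a case-sensitive regular expression and ~* is a case-insensitive one. The quotes are optional and only needed when the pattern contains characters the config parser would trip over, such as spaces, braces or semicolons. [[:alnum:]] and [[:graph:]] are POSIX character classes, which PCRE supports: the first matches letters and digits, the second any visible, non-space character.

A sketch of how the variable might then be used. The microcache zone and the backend address are hypothetical, not from your question:

```nginx
# http context. Hypothetical names: "microcache" zone, 127.0.0.1:8080 backend.
proxy_cache_path /var/cache/nginx/microcache levels=1:2
                 keys_zone=microcache:10m max_size=100m inactive=10m;

map $http_cookie $bypass_cache {
    default 0;

    # Regex keys are tried in order of appearance; the first match wins.
    "~*wordpress_(?!test_)[a-zA-Z0-9_]+"  1;
    "~*wp-postpass"                       1;
    "~*comment_author_[a-zA-Z0-9_]+"      1;
    "~*[a-zA-Z0-9]+_session"              1;
}

server {
    listen 80;

    location / {
        proxy_pass        http://127.0.0.1:8080;
        proxy_cache       microcache;
        proxy_cache_valid 200 1s;

        # "1" means: neither store the response nor serve it from the cache.
        # "0" and the empty string leave caching enabled.
        proxy_no_cache     $bypass_cache;
        proxy_cache_bypass $bypass_cache;
    }
}
```

Adding a cookie later is then one more line in the map, with no extra "if" blocks.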

Tarun Lalwani answered Oct 22 '22 13:10