
Adding Encryption to Solr/Lucene indexes

I am currently using Solr to perform search services over some sensitive records.

Solr/Lucene provides fast searching by storing inverted indexes of the sensitive information in plain text on disk, so there is a requirement to encrypt these index files so that unauthorized people cannot access them by bypassing the system's security.

I found similar patches open on Apache JIRA: an AES encrypted directory and a codec for index-level encryption.

The AES encrypted directory looks promising, but the patch was implemented for Lucene 3.1. As I am using a newer version, I am not sure whether it can be used with Lucene 5 or higher.

Is there a way to implement a security measure that encrypts the indexes, or is it possible to write a custom plugin that encrypts/decrypts the indexes at the I/O level (i.e. FSDirectory)?

Kewl_guy89 asked Apr 13 '16 16:04



1 Answer

The discussion in the comment section of LUCENE-6966, which you linked, is really interesting. I would answer with this quote of Robert Muir: there is nothing baked into Solr for this, and there probably never will be.

More importantly, with file-level encryption, data would reside in an unencrypted form in memory which is not acceptable to our security team and, therefore, a non-starter for us.

This speaks volumes. You should fire your security team! You are wasting your time worrying about this: if you are using lucene, your data will be in memory, in plaintext, in ways you cannot control, and there is nothing you can do about that!

Trying to guarantee anything better than "at rest" is serious business, sounds like your team is over their head.

So you should consider encrypting the storage Solr uses at the OS level. This is transparent to Solr, and someone who gets at the underlying storage should not be able to copy the Solr data in readable form.

This is also the conclusion that the article Encrypting Solr/Lucene indexes by Erick Erickson of Lucidworks draws in the end:

The short form is that this is one of those ideas that doesn't stand up to scrutiny. If you're concerned about security at this level, it's probably best to consider other options, from securing your communications channels to using an encrypting file system to physically divorcing your system from public networks. Of course, you should never, ever, let your working Solr installation be accessible directly from the outside world, just consider the following: http://server:port/solr/update?stream.body=<delete><query>*:*</query></delete>!
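If you go the OS-level route, it can be worth verifying that the Solr data really lives on an encrypted volume. The following is only a rough sketch, assuming a Linux host with dm-crypt/LUKS and the JSON output of `lsblk --json -o NAME,TYPE,MOUNTPOINT`. The helper name `is_mount_encrypted`, the sample device names and the `/var/solr` mount point are made up for illustration and are not part of Solr or Lucene.

```python
import json


def is_mount_encrypted(lsblk_json: str, mount_point: str) -> bool:
    """Return True if the device mounted at mount_point is, or is stacked
    on top of, a dm-crypt mapping (lsblk reports those with type "crypt")."""
    tree = json.loads(lsblk_json)

    def walk(node, under_crypt):
        under_crypt = under_crypt or node.get("type") == "crypt"
        # Older lsblk versions print "mountpoint" (a string), newer ones can
        # print "mountpoints" (a list); accept both shapes.
        mounts = list(node.get("mountpoints") or [])
        if node.get("mountpoint"):
            mounts.append(node["mountpoint"])
        if mount_point in mounts:
            return under_crypt
        for child in node.get("children", []):
            result = walk(child, under_crypt)
            if result is not None:
                return result
        return None  # mount point not found below this node

    for device in tree.get("blockdevices", []):
        result = walk(device, False)
        if result is not None:
            return result
    return False  # unknown mount point: treat as not encrypted


# Hypothetical lsblk output: /var/solr sits on a LUKS mapping, / does not.
SAMPLE = json.dumps({
    "blockdevices": [
        {"name": "sda", "type": "disk", "mountpoint": None, "children": [
            {"name": "sda1", "type": "part", "mountpoint": "/"},
            {"name": "sda2", "type": "part", "mountpoint": None, "children": [
                {"name": "solr_crypt", "type": "crypt",
                 "mountpoint": "/var/solr"},
            ]},
        ]},
    ]
})

print(is_mount_encrypted(SAMPLE, "/var/solr"))  # True
print(is_mount_encrypted(SAMPLE, "/"))          # False
```

Note that the function expects the exact mount point of the file system, not an arbitrary subdirectory such as a core's data directory. It also only tells you about encryption at rest; as the quotes above point out, the index data is still in plaintext in memory while Solr is running.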

cheffe answered Oct 20 '22 15:10
