I am looking into porting a website to CouchDB and it looks very interesting.
However, a big problem is that CouchDB does not seem to support read authentication; all documents within a database are accessible to all readers.
It is suggested elsewhere to use different databases for different reader-groups or to implement reader authentication in another (middle) tier, neither of which is an option for this project where the access is determined by complex, per document ACLs.
I was thinking of implementing the authentication in lists and restricting all access to CouchDB to these lists. This restriction could be enforced by simple mod_rewrite clauses in the Apache used as reverse proxy. The lists would simply fetch each row and check the userCtx against the document's ACL. Something like:
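function(head, req) {
  var row;
  while ((row = getRow())) {
    // Only send rows whose ACL lists the current user.
    if (row.value.ACL && row.value.ACL[req.userCtx.name]) {
      // send() expects a string, so serialize the row value first.
      send(JSON.stringify(row.value));
    } else {
      throw({unauthorized : "You are not allowed to access this resource"});
    }
  }
}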
Since I have no experience with CouchDB, and I haven't read about this approach anywhere, I'd like to know whether this approach could work.
Is this a way to implement read access or am I abusing lists for the wrong purpose? Should I not expect such a simple solution is possible with CouchDB?
Apache mod_rewrite is a middle tier, so it is not clear what you mean when you say a middle tier is not an option.
Implementing your security policy based on data in CouchDB is perfectly fine. However, the cost is that you are responsible for making sure the implementation is correct. It's not as bad as it sounds. Remember, people have been doing this with MySQL web apps for a long time.
The thing to keep in mind is that CouchDB does not support document-level read permissions because it is impractical to track those permissions as the data weaves through all the maps and reduces of the views. For example, say we have a bidding system.
In other words, if you are wrapping the CouchDB API, you will at least need to whitelist those queries which are allowed. And remember, the vhost and rewrite rules run within CouchDB so simply looking at the incoming query may not be enough.
Hopefully that sheds some light on why read control is at the database level.
Usually it is sufficient to restrict access to certain views - this can be done via lists as you proposed (thanks for the idea). Using unguessable IDs for documents, you already have some kind of access control for documents. I would avoid iterating through the rows and checking for permissions there, but I don't think that's much of a problem either.
Some have mentioned here that the purpose of lists is to change the format - I don't agree, as even the official CouchDB guide states that lists could even produce json documents.
Another way is to restrict users per database and use selective replication, so that one database only contains the data a certain group of users is allowed to access. See "couchdb read authentication". This is not actually per-user, but it may still be an option for you. For details on filtered replication see http://wiki.apache.org/couchdb/Replication
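The selective replication mentioned above relies on a filter function stored in a design document. A minimal sketch of what that might look like follows; the `groups` field on documents, the `group` query parameter, the function name `byGroup`, and the database and design document names are all assumptions made for illustration.

```javascript
// Hypothetical replication filter. In a design document it would be stored
// as the source of an anonymous function under "filters".
function byGroup(doc, req) {
  // Let deletions through so removed documents also vanish from the target.
  if (doc._deleted) return true;
  // Assumes each document lists the groups allowed to read it.
  return Array.isArray(doc.groups) && doc.groups.indexOf(req.query.group) !== -1;
}

// Body one might POST to /_replicate; all names here are placeholders.
var replicationRequest = {
  source: "main",
  target: "group_editors",
  filter: "acl/by_group",
  query_params: { group: "editors" }
};
```

Each reader group would then get its own target database, with read access controlled at the database level as usual.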
Edit: I just came up with a great idea to enforce per document user permissions via lists with better performance:
The advantage is that CouchDB, as far as I know, internally uses caching for views. I'm not sure about how the caching works with lists. Also I think iterating and filtering in views is generally faster than in lists.
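One way to move the filtering into the view, offered here only as an assumption about how such a scheme could look, is to emit one row per permitted user and let the list pass on only the rows keyed by the caller's own name. The names `aclMap` and `aclList` are hypothetical, `doc.ACL` is assumed to be an object keyed by user name as in the question, and the last part merely stands in for the CouchDB query server so the sketch can be tried locally.

```javascript
// Map function: one row per user who may read the document.
function aclMap(doc) {
  if (!doc.ACL) return;
  Object.keys(doc.ACL).forEach(function (user) {
    if (doc.ACL[user]) emit(user, { _id: doc._id });
  });
}

// List function: only pass on rows whose key is the caller's own name.
function aclList(head, req) {
  var name = req.userCtx.name, row, first = true;
  send("[");
  while ((row = getRow())) {
    if (row.key !== name) continue; // row belongs to someone else: drop it
    send((first ? "" : ",") + JSON.stringify(row.value));
    first = false;
  }
  send("]");
}

// Tiny stand-ins for the query server globals (not part of CouchDB itself).
var rows = [], output = "";
function emit(key, value) { rows.push({ key: key, value: value }); }
function getRow() { return rows.shift(); }
function send(chunk) { output += chunk; }

aclMap({ _id: "a", ACL: { alice: true, bob: false } });
aclMap({ _id: "b", ACL: { bob: true } });
aclList(null, { userCtx: { name: "alice" } });
```

If the client also queries with its own name as `key`, CouchDB only has to read that user's rows from the index, while the `row.key` check still guards against a client that omits or forges the key.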
List functions are a reasonable way to enforce read ACLs in simple cases, but this approach has several drawbacks.
First, you need something in front of CouchDB to block every read request that is not piped through the list function implementing the ACL. _all_docs, requests with reduce=true, direct GETs of docs: they all, and many others, must be blocked. The simplest way is to use Apache and regexp masks.
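The blocking rule can be expressed as a whitelist rather than a list of forbidden URLs. The sketch below shows the idea as a plain function rather than Apache configuration; the function name `isAllowedRead`, the `db` and `ddoc` path segments, and the assumption that the whitelisted views have no reduce function are all hypothetical.

```javascript
// Only GETs on list endpoints of one design document are let through.
var ALLOWED_PATH = /^\/db\/_design\/ddoc\/_list\/[A-Za-z0-9_-]+\/[A-Za-z0-9_-]+$/;

function isAllowedRead(method, url) {
  var q = url.indexOf("?");
  var path = q === -1 ? url : url.slice(0, q);
  var query = q === -1 ? "" : url.slice(q + 1);
  if (method !== "GET") return false;
  // Anything outside the list pattern (_all_docs, direct doc GETs, ...) fails here.
  if (!ALLOWED_PATH.test(path)) return false;
  // Reject any attempt to switch reduce on; only an explicit "false" is tolerated.
  var reduce = new URLSearchParams(query).get("reduce");
  if (reduce !== null && reduce !== "false") return false;
  return true;
}
```

Matching against the raw, undecoded path keeps percent-encoded slashes from slipping past the pattern, since `%` is not among the allowed characters.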
Second, you must understand that there is no simple way to control access to attachments. Although you can block any read request that does not match your /db/_design/ddoc/_list/list/view pattern, you cannot build an effective view+list pair to provide access control for attachments.
It's absolutely impossible in CouchDB 1.5 and earlier, because a view index cannot include attachment data. It's nearly impossible in CouchDB 1.6, since processing base64-encoded attachments as JSON is a CPU and RAM hog.
Third, in any case, this method is sloooooow. The reason is simple: list functions are not streams. First the entire response of the view function is grabbed and serialized, then the list processor deserializes it again, then the result is processed by the list function, and then it is serialized once more.
I'm not sure using lists is the best option for restricting access to resources, since lists are functions used to render the output of a view in a specific format (RSS, CSV, config files, HTML, ...).
Have you considered using a document containing users and their permissions? I found a post by Kore Nordmann which explains how to convert the classical user/group/permissions from relational databases to the CouchDB model:
Depending on its permissions, a user would have access to only a set of defined views.
CouchDB offers validation functions but they only get called when a document is created or updated. The O’Reilly book states that "The authentication system is pluggable, so you can integrate with existing services to authenticate users to CouchDB using an http layer, LDAP integration, or through other means". But since you mentioned a middle tier is not an option, the list could be a temporary solution until more authentication support is added to CouchDB.