Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best approach for performance when filtering for permissions in Laravel

I am working on an application where a user can have access to many forms through many different scenarios. I am trying to build the approach with the best performance when returning an index of forms to the user.

A user can have access to forms via the following scenarios:

  • Owns Form
  • Team owns Form
  • Has permissions to a group that owns a Form
  • Has permissions to a team that owns a Form
  • Has permission to a Form

As you can see there are 5 possible ways that the user can access a form. My problem is how to i most efficiently return an array of the accessible forms to the user.

Form Policy:

I have tried to get all of the Forms from model and then filter the forms by the form policy. This seems to be a performance issue as on each filter iteration the form gets passed through a contains() eloquent method 5 times as shown below. The more forms in the database means this becomes slower.

FormController@index

public function index(Request $request)
{
   $forms = Form::all()
      ->filter(function($form) use ($request) {
         return $request->user()->can('view',$form);
   });
}

FormPolicy@view

public function view(User $user, Form $form)
{
   return $user->forms->contains($form) ||
      $user->team->forms->contains($form) ||
      $user->permissible->groups->forms($contains);
}

Although the above method works it is a performance bottle neck.

From what i can see my following options are:

  • FormPolicy filter (current approach)
  • Query all permissions (5) and merge into single collection
  • Query all identifiers for all permissions (5), then query the Form model using the identifiers in an IN() statement

My question:

Which method would provide the best performance and is there any other option which would provide better performance?

like image 376
Tim Avatar asked Oct 17 '19 23:10

Tim


2 Answers

I believe you can use Lazy Collections for that (Laravel 6.x) and eager load the relationships before they are accessed.

public function index(Request $request)
{
   // Eager Load relationships
   $request->user()->load(['forms', 'team.forms', 'permissible.group']);
   // Use cursor instead of all to return a LazyCollection instance
   $forms = Form::cursor()->filter(function($form) use ($request) {
         return $request->user()->can('view', $form);
   });
}
public function view(User $user, Form $form)
{
   return $user->forms->contains($form) ||
      $user->team->forms->contains($form) ||
      // $user->permissible->groups->forms($contains); // Assuming this line is a typo
      $user->permissible->groups->contains($form);
}
like image 88
IGP Avatar answered Oct 24 '22 12:10

IGP


Short answer

The third option: Query all identifiers for all permissions (5), then query the Form model using the identifiers in an IN() statement

$teamMorphType  = Relation::getMorphedModel('team');
$groupMorphType = Relation::getMorphedModel('group');
$formMorphType  = Relation::getMorphedModel('form');

$permissible = [
    $teamMorphType  => [$user->team_id],
    $groupMorphType => [],
    $formMorphType  => [],
];

foreach ($user->permissible as $permissible) {
    switch ($permissible->permissible_type) {
        case $teamMorphType:
        case $groupMorphType:
        case $formMorphType:
            $permissible[$permissible->permissible_type][] = $permissible->permissible_id;
            break;
    }
}

$forms = Form::query()
             ->where('user_id', '=', $user->id)
             ->orWhereIn('id', $permissible[$fromMorphType])
             ->orWhereIn('team_id', $permissible[$teamMorphType])
             ->orWhereIn('group_id', $permissible[$groupMorphType])
             ->get();

Long answer

On one hand, (almost) everything that you can do in code, is better performance-wise, than do it in queries.

On the other hand, getting more data from the database than necessary would be too much data already (RAM usage and so on).

From my perspective, you need something in between, and only you will know where would be the balance, depending on the numbers.

I would suggest running several queries, the last option you proposed (Query all identifiers for all permissions (5), then query the Form model using the identifiers in an IN() statement):

  1. Query all identifiers, for all permissions (5 queries)
  2. Merge all the forms results in memory, and get unique values array_unique($ids)
  3. Query the Form model, using the identifiers in an IN() statement.

You can try the three options you proposed and monitor the performance, using some tool to run the query multiple times, but I am 99% sure that the last one will give you the best performance.

This may also change a lot, depending on which database are you using, but if we are talking about MySQL, for instance; In A very big query would use more database resources, which will not only spend more time than simple queries, but also will lock the table from writes, and this can produce deadlock errors (unless you use a slave server).

On the other side, if the number of forms ids is very big, you can have errors for too many placeholders, so you may want to chunk the queries in groups of, let's say, 500 ids (this depends a lot, as the limit is in size, not in number of bindings), and merge the results in memory. Even if you don't get a database error, you may see a big difference in performance as well (I am still talking about MySQL).


Implementation

I will assume that this is the database scheme:

users
  - id
  - team_id

forms
  - id
  - user_id
  - team_id
  - group_id

permissible
  - user_id
  - permissible_id
  - permissible_type

So permissible would be an already configured polymorphic relationship.

Therefore, the relations would be:

  • Owns Form: users.id <-> form.user_id
  • Team owns Form: users.team_id <-> form.team_id
  • Has permissions to a group that owns a Form: permissible.user_id <-> users.id && permissible.permissible_type = 'App\Team'
  • Has permissions to a team that owns a Form: permissible.user_id <-> users.id && permissible.permissible_type = 'App\Group'
  • Has permission to a Form: permissible.user_id <-> users.id && permissible.permissible_type = 'App\From'

Simplify version:

$teamMorphType  = Relation::getMorphedModel('team');
$groupMorphType = Relation::getMorphedModel('group');
$formMorphType  = Relation::getMorphedModel('form');

$permissible = [
    $teamMorphType  => [$user->team_id],
    $groupMorphType => [],
    $formMorphType  => [],
];

foreach ($user->permissible as $permissible) {
    switch ($permissible->permissible_type) {
        case $teamMorphType:
        case $groupMorphType:
        case $formMorphType:
            $permissible[$permissible->permissible_type][] = $permissible->permissible_id;
            break;
    }
}

$forms = Form::query()
             ->where('user_id', '=', $user->id)
             ->orWhereIn('id', $permissible[$fromMorphType])
             ->orWhereIn('team_id', $permissible[$teamMorphType])
             ->orWhereIn('group_id', $permissible[$groupMorphType])
             ->get();

Detailed version:

// Owns Form
// users.id <-> forms.user_id
$userId = $user->id;

// Team owns Form
// users.team_id <-> forms.team_id
// Initialise the array with a first value.
// The permissions polymorphic relationship will have other teams ids to look at
$teamIds = [$user->team_id];

// Groups owns Form was not mention, so I assume there is not such a relation in user.
// Just initialise the array without a first value.
$groupIds = [];

// Also initialise forms for permissions:
$formIds = [];

// Has permissions to a group that owns a Form
// permissible.user_id <-> users.id && permissible.permissible_type = 'App\Team'
$teamMorphType = Relation::getMorphedModel('team');
// Has permissions to a team that owns a Form
// permissible.user_id <-> users.id && permissible.permissible_type = 'App\Group'
$groupMorphType = Relation::getMorphedModel('group');
// Has permission to a Form
// permissible.user_id <-> users.id && permissible.permissible_type = 'App\Form'
$formMorphType = Relation::getMorphedModel('form');

// Get permissions
$permissibles = $user->permissible()->whereIn(
    'permissible_type',
    [$teamMorphType, $groupMorphType, $formMorphType]
)->get();

// If you don't have more permissible types other than those, then you can just:
// $permissibles = $user->permissible;

// Group the ids per type
foreach ($permissibles as $permissible) {
    switch ($permissible->permissible_type) {
        case $teamMorphType:
            $teamIds[] = $permissible->permissible_id;
            break;
        case $groupMorphType:
            $groupIds[] = $permissible->permissible_id;
            break;
        case $formMorphType:
            $formIds[] = $permissible->permissible_id;
            break;
    }
}

// In case the user and the team ids are repeated:
$teamIds = array_values(array_unique($teamIds));
// We assume that the rest of the values will not be repeated.

$forms = Form::query()
             ->where('user_id', '=', $userId)
             ->orWhereIn('id', $formIds)
             ->orWhereIn('team_id', $teamIds)
             ->orWhereIn('group_id', $groupIds)
             ->get();

Resources used:

Database performance:

  • Queries to the database (excluding the user): 2; one to get the permissible and another one to get the forms.
  • No joins!!
  • The minimum ORs possible (user_id = ? OR id IN (?..) OR team_id IN (?...) OR group_id IN (?...).

PHP, in memory, performance:

  • foreach looping the permissible with a switch inside.
  • array_values(array_unique()) to avoid repeating the ids.
  • In memory, 3 arrays of ids ($teamIds, $groupIds, $formIds)
  • In memory, relevant permissions eloquent collection (this can be optimised, if needed).

Pros and cons

PROS:

  • Time: The sum of times of single queries is less than the time of a big query with joins and OR.
  • DB Resources: The MySQL resources used by a query with join and or statements, are bigger than the used by the sum of its separate queries.
  • Money: Fewer database resources (processor, RAM, disc read, etc.), which are more expensive than PHP resources.
  • Locks: In case you are not querying a read-only slave server, your queries will make fewer rows read locks (the read lock is shared in MySQL, so it won't lock another read, but it will block any write).
  • Scalable: This approach allows you to make more performance optimizations such as chunk the queries.

CONS:

  • Code resources: Making calculations in code, rather than in the database, will obviously consume more resources in the code instance, but especially in the RAM, storing the middle information. In our case, this would be just an array of ids, which shouldn't be a problem really.
  • Maintenance: If you use Laravel's properties and methods, and you do any change in the database, it will be easier to update in code than if you make more explicit queries and processing.
  • Overkilling?: In some cases, if the data is not that big, optimising the performance may be overkilling.

How to measure performance

Some clues about how to measure the performance?

  1. Slow query logs
  2. ANALYZE TABLE
  3. SHOW TABLE STATUS LIKE
  4. EXPLAIN; Extended EXPLAIN Output Format; using explain; explain output
  5. SHOW WARNINGS

Some interesting profiling tools:

  • phpMyAdmin
  • Percona Query Digest (pt-query-digest)
  • log-top
like image 45
Gonzalo Avatar answered Oct 24 '22 11:10

Gonzalo