
Is there a way to find all children of a Matlab class?

The Matlab function superclasses returns the names of all parents of a given class.

Is there an equivalent that finds all classes derived from a given class, i.e. child classes? The function allchild seems to be restricted to graphical handles.

If not, what strategy could be adopted to get such a list? Is brute-force path scanning the only option?

Let's restrict ourselves to the classes in Matlab's path.

Ratbert asked Jun 15 '16 08:06


1 Answer

Intro:

During the course of the solution I seem to have found an undocumented static method of the meta.class class which returns all cached classes (pretty much everything that gets erased when somebody calls clear classes) and also (entirely by accident) made a tool that checks classdef files for errors.


Since we want to find all subclasses, the sure way to go is by making a list of all known classes and then checking for each one if it's derived from any other one. To achieve this we separate our effort into 2 types of classes:

  • "Bulk classes" - here we employ the what function to list the files that are just "lying around" on the MATLAB path. It outputs a structure s (described in the docs of what) with the following fields: 'path' 'm' 'mlapp' 'mat' 'mex' 'mdl' 'slx' 'p' 'classes' 'packages'. We then select some of them to build a list of classes. To identify what kind of contents an .m or a .p file has (what we care about is class/not-class), we use exist. This method is demonstrated by Loren in her blog. In my code, this is mb_list.
  • "Package classes" - this includes class files that are indexed by MATLAB as part of its internal package structure. The algorithm involved in getting this list involves calling meta.package.getAllPackages and then recursively traversing this top-level package list to get all sub-packages. Then a class list is extracted from each package, and all lists are concatenated into one long list - mp_list.

The function takes two input flags (includeBulkFiles, includePackages) that determine whether each type of class is included in the output list.

The full code is below:

function [mc_list,subcls_list] = q37829489(includeBulkFiles,includePackages)
%% Input handling
% Default to an empty meta.class array so that the union further down also
% works when packages are excluded (including an explicit "false"):
mp_list = meta.class.empty;
if nargin < 2 || isempty(includePackages)
  includePackages = false;
end
if nargin < 1 || isempty(includeBulkFiles)
  includeBulkFiles = false;
end
% `mb_list` is always assigned later from the output of meta.class.getAllClasses.
%% Output checking
if nargout < 2
  warning('Second output not assigned!');
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Get classes list from bulk files "laying around" the MATLAB path:
if includeBulkFiles
  % Obtain MATLAB path:
  p = strsplit(path,pathsep).';
  if ~ismember(pwd,p)
    p = [pwd;p];
  end
  nPaths = numel(p);
  s = what; s = repmat(s,nPaths+20,1); % Preallocation; +20 is to accommodate rare cases 
  s_pos = 1;                           %  where "what" returns a 2x1 struct.
  for ind1 = 1:nPaths  
    tmp = what(p{ind1});
    s(s_pos:s_pos+numel(tmp)-1) = tmp;
    s_pos = s_pos + numel(tmp);
  end
  s(s_pos:end) = []; % truncation of placeholder entries.
  clear p nPaths s_pos tmp
  %% Generate a list of classes:
  % from .m files:
  m_files = vertcat(s.m);
  % from .p files:
  p_files = vertcat(s.p);
  % get a list of potential class names:
  [~,name,~] = cellfun(@fileparts,[m_files;p_files],'uni',false);
  % get listed classes:
  listed_classes = vertcat(s.classes); % s is a struct array, so collect every entry
  % combine all potential class lists into one:
  cls_list = vertcat(name,listed_classes);
  % test which ones are actually classes:
  isClass = cellfun(@(x)exist(x,'class')==8,cls_list); %"exist" method; takes long
  %[u,ia,ic] = unique(ext(isClass(1:numel(ext)))); %DEBUG:

  % for valid classes, get metaclasses from name; if a classdef contains errors,
  % will cause cellfun to print the reason using ErrorHandler.
  [~] = cellfun(@meta.class.fromName,cls_list(isClass),'uni',false,'ErrorHandler',...
     @(ex,in)meta.class.empty(0*fprintf(1,'The classdef for "%s" contains an error: %s\n'...
                                         , in, ex.message)));
  % The result of the last computation used to be assigned into mc_list, but this 
  % is no longer required as the same information (and more) is returned later
  % by calling "mb_list = meta.class.getAllClasses" since these classes are now cached.
  clear cls_list isClass ind1 listed_classes m_files p_files name s
end
%% Get class list from classes belonging to packages (takes long!):

if includePackages
  % Get a list of all package classes:
  mp_list = meta.package.getAllPackages; mp_list = vertcat(mp_list{:});  
  % see http://www.mathworks.com/help/matlab/ref/meta.package.getallpackages.html

  % Recursively flatten package list:
  mp_list = flatten_package_list(mp_list);

  % Extract classes out of packages:
  mp_list = vertcat(mp_list.ClassList);
end
%% Combine lists:
% Get a list of all classes that are in memory:
mb_list = meta.class.getAllClasses; 
mc_list = union(vertcat(mb_list{:}), mp_list);
%% Map relations:
try
  [subcls_list,discovered_classes] = find_superclass_relations(mc_list);
  while ~isempty(discovered_classes)
    mc_list = union(mc_list, discovered_classes);
    [subcls_list,discovered_classes] = find_superclass_relations(mc_list);
  end
catch ex % Turns out this helps....
  disp(['Getting classes failed with error: ' ex.message ' Retrying...']);
  [mc_list,subcls_list] = q37829489(includeBulkFiles,includePackages); % keep the caller's flags
end

end

function [subcls_list,discovered_classes] = find_superclass_relations(known_metaclasses)
%% Build hierarchy:
sup_list = {known_metaclasses.SuperclassList}.';
% Count how many superclasses each class has:
n_supers = cellfun(@numel,sup_list);
% Preallocate a Subclasses container: 
subcls_list = cell(numel(known_metaclasses),1); % should be meta.MetaData
% Iterate over all classes and register each one with its direct superclasses:
% discovered_classes = meta.class.empty(1,0); % right type, but causes segfault
discovered_classes = meta.class.empty;
for depth = max(n_supers):-1:1
  % The function of this top-most loop was initially to build a hierarchy starting 
  % from the deepest leaves, but due to lack of ideas on "how to take it from here",
  % it only serves to save some processing by skipping classes with "no parents".
  tmp = known_metaclasses(n_supers == depth);
  for ind1 = 1:numel(tmp)
    % Fortunately, SuperclassList only shows *DIRECT* superclasses. So we
    % only need to find the superclasses in the known classes list and add
    % the current class to that list.
    curr_cls = tmp(ind1);
    % It's a shame bsxfun only works for numeric arrays, or else we would employ: 
    % bsxfun(@eq,mc_list,tmp(ind1).SuperclassList.');
    for ind2 = 1:numel(curr_cls.SuperclassList)
      pos = find(curr_cls.SuperclassList(ind2) == known_metaclasses,1);
      % Did we find the superclass in the known classes list?
      if isempty(pos)
        discovered_classes(end+1,1) = curr_cls.SuperclassList(ind2); %#ok<AGROW>
  %       disp([curr_cls.SuperclassList(ind2).Name ' is not a previously known class.']);
        continue
      end      
      subcls_list{pos} = [subcls_list{pos} curr_cls];
    end    
  end  
end
end

% The full flattened list for MATLAB R2016a contains about 20k classes.
function flattened_list = flatten_package_list(top_level_list)
  flattened_list = top_level_list;
  for ind1 = 1:numel(top_level_list)
    flattened_list = [flattened_list;flatten_package_list(top_level_list(ind1).PackageList)];
  end
end

The outputs of this function are 2 vectors, which in Java terms can be thought of as a Map<meta.class, List<meta.class>>:

  • mc_list - an object vector of class meta.class, where each entry contains information about one specific class known to MATLAB. These are the "keys" of our Map.
  • subcls_list - A (rather sparse) vector of cells, containing known direct subclasses of the classes appearing in the corresponding position of mc_list. These are the "values" of our Map, which are essentially List<meta.class>.

Once we have these two lists, it's only a matter of finding the position of your class-of-interest in mc_list and getting the list of its subclasses from subcls_list. If indirect subclasses are required, the same process is repeated for the subclasses too.
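
The lookup described above could be sketched roughly as follows. This is only a minimal sketch, not part of the original script: the function name all_subclass_names is hypothetical, and it assumes that mc_list and subcls_list are aligned as described and that class names are unique within mc_list. It compares class names instead of meta.class objects.

```matlab
function desc_names = all_subclass_names(root_name, mc_list, subcls_list)
% ALL_SUBCLASS_NAMES (hypothetical helper) Names of all direct and indirect
% subclasses of the class called root_name.
%   mc_list     - vector of objects with a Name property (e.g. meta.class)
%   subcls_list - cell array aligned with mc_list, as returned by q37829489
all_names  = {mc_list.Name};
desc_names = {};
queue      = {root_name};            % classes whose children are still to be visited
while ~isempty(queue)
  curr = queue{1};
  queue(1) = [];
  pos = find(strcmp(all_names, curr), 1);
  if isempty(pos)
    continue                         % class is not in the index
  end
  kids = subcls_list{pos};           % direct subclasses (may be empty)
  for k = 1:numel(kids)
    kid_name = kids(k).Name;
    if ~any(strcmp(desc_names, kid_name))
      desc_names{end+1} = kid_name;  %#ok<AGROW>
      queue{end+1}      = kid_name;  %#ok<AGROW>
    end
  end
end
end
```

With the outputs of the function above, a call such as all_subclass_names('handle', mc_list, subcls_list) would then return the names of every known class derived from handle, subject to the same missing-class caveat discussed further down.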

Alternatively, one can represent the hierarchy using e.g. a logical sparse adjacency matrix, A, where $a_{i,j} = 1$ signifies that class i is a subclass of class j. The transpose of this matrix then signifies the opposite relation, that is, $a^T_{i,j} = 1$ means i is a superclass of j. Keeping these properties of the adjacency matrix in mind allows very rapid searches and traversals of the hierarchy (avoiding the need for "expensive" comparisons of meta.class objects).
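
A possible way to build and query such a matrix is sketched below. The helper names build_adjacency and descendants_of are hypothetical (they are not in the original code), and the sketch assumes the same aligned mc_list / subcls_list pair with unique class names.

```matlab
function A = build_adjacency(mc_list, subcls_list)
% BUILD_ADJACENCY (hypothetical helper) Sparse logical matrix in which
% A(i,j) is true when class i is a direct subclass of class j.
n     = numel(mc_list);
names = {mc_list.Name};
rows  = zeros(0,1);
cols  = zeros(0,1);
for j = 1:n
  kids = subcls_list{j};             % direct subclasses of class j
  for k = 1:numel(kids)
    i = find(strcmp(names, kids(k).Name), 1);
    if ~isempty(i)
      rows(end+1,1) = i;             %#ok<AGROW>
      cols(end+1,1) = j;             %#ok<AGROW>
    end
  end
end
A = sparse(rows, cols, true(size(rows)), n, n);
end

function idx = descendants_of(A, j)
% DESCENDANTS_OF (hypothetical helper) Indices of all direct and indirect
% subclasses of class j, found by expanding one inheritance level at a time.
reached  = false(size(A,1), 1);
frontier = full(A(:, j));            % direct subclasses of j
while any(frontier)
  reached  = reached | frontier;
  % children of the current frontier that were not visited yet:
  frontier = full(any(A(:, frontier), 2)) & ~reached;
end
idx = find(reached);
end
```

The direct superclasses of class i are then find(A(i,:)), and {mc_list(descendants_of(A, j)).Name} maps the indices back to class names, so the index only has to be compared by name once, while it is built.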

Several notes:

  • For reasons unknown (caching?) the code may fail with an error (e.g. Invalid or deleted object.); in that case re-running it helps. I have added a try/catch that does this automatically.
  • There are 2 instances in the code where arrays are grown inside a loop. This is of course unwanted and should be avoided. The code was left like that due to a lack of better ideas.
  • If the "discovery" part of the algorithm cannot be avoided (by somehow finding all the classes in the first place), one can (and should) optimize it so that every iteration only operates on previously unknown classes.
  • An interesting unintended benefit of running this code is that it scans all known classdefs and reports any errors in them - this can be a useful tool to run every once in a while for anyone who works on MATLAB OOP :)
  • Thanks @Suever for some helpful pointers.

Comparison with Oleg's method:

To compare these results with Oleg's example, I will use the output of a run of the above script on my computer (containing ~20k classes; uploaded here as a .mat file). We can then access the class map the following way:

hRoot = meta.class.fromName('sde');
subcls_list{mc_list==hRoot}

ans = 

  class with properties:

                     Name: 'sdeddo'
              Description: ''
      DetailedDescription: ''
                   Hidden: 0
                   Sealed: 0
                 Abstract: 0
              Enumeration: 0
          ConstructOnLoad: 0
         HandleCompatible: 0
          InferiorClasses: {0x1 cell}
        ContainingPackage: [0x0 meta.package]
             PropertyList: [9x1 meta.property]
               MethodList: [18x1 meta.method]
                EventList: [0x1 meta.event]
    EnumerationMemberList: [0x1 meta.EnumeratedValue]
           SuperclassList: [1x1 meta.class]

subcls_list{mc_list==subcls_list{mc_list==hRoot}} % simulate recursion

ans = 

  class with properties:

                     Name: 'sdeld'
              Description: ''
      DetailedDescription: ''
                   Hidden: 0
                   Sealed: 0
                 Abstract: 0
              Enumeration: 0
          ConstructOnLoad: 0
         HandleCompatible: 0
          InferiorClasses: {0x1 cell}
        ContainingPackage: [0x0 meta.package]
             PropertyList: [9x1 meta.property]
               MethodList: [18x1 meta.method]
                EventList: [0x1 meta.event]
    EnumerationMemberList: [0x1 meta.EnumeratedValue]
           SuperclassList: [1x1 meta.class]

Here we can see that the last output is only 1 class (sdeld), when we were expecting 3 of them (sdeld, sdemrd, heston) - this means that some classes are missing from this list [1].

In contrast, if we check a common parent class such as handle, we see a completely different picture:

subcls_list{mc_list==meta.class.fromName('handle')}

ans = 

  1x4059 heterogeneous class (NETInterfaceCustomMetaClass, MetaClassWithPropertyType, MetaClass, ...) array with properties:

    Name
    Description
    DetailedDescription
    Hidden
    Sealed
    Abstract
    Enumeration
    ConstructOnLoad
    HandleCompatible
    InferiorClasses
    ContainingPackage
    PropertyList
    MethodList
    EventList
    EnumerationMemberList
    SuperclassList

To sum up: this method attempts to index all known classes on the MATLAB path. Building the class list/index takes several minutes, but this is a one-time process that pays off later when the list is searched. It seems to miss some classes, but the relations it finds are not restricted to the same packages, paths etc. For this reason it inherently supports multiple inheritance.


1 - I currently have no idea what causes this.

Dev-iL answered Sep 28 '22 05:09