
How to extract functions used in a python code file?

I would like to create a list of all the functions used in a code file. For example, suppose we have the following code in a file named 'add_random.py':

```python
import numpy as np
from numpy import linalg

def foo():
    print np.random.rand(4) + np.random.randn(4)
    print linalg.norm(np.random.rand(4))
```

I would like to extract the following list: [numpy.random.rand, numpy.random.randn, numpy.linalg.norm, numpy.random.rand]

The list contains the functions used in the code, each under its actual name in the form 'module.submodule.function'. Is there something built into Python that can help me do this?

Shishir Pandey asked Sep 24 '14 10:09


1 Answer

You can extract all call expressions with:

import ast

class CallCollector(ast.NodeVisitor):
    def __init__(self):
        self.calls = []
        self.current = None

    def visit_Call(self, node):
        # new call, trace the function expression
        self.current = ''
        self.visit(node.func)
        self.calls.append(self.current)
        self.current = None

    def generic_visit(self, node):
        if self.current is not None:
            print "warning: {} node in function expression not supported".format(
                node.__class__.__name__)
        super(CallCollector, self).generic_visit(node)

    # record the func expression 
    def visit_Name(self, node):
        if self.current is None:
            return
        self.current += node.id

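    def visit_Attribute(self, node):
        if self.current is None:
            # not inside a call's function expression: keep walking, record nothing
            self.generic_visit(node)
            return
        # resolve the left-hand side first, then append this attribute
        self.visit(node.value)
        self.current += '.' + node.attr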

Use this with an ast parse tree:

tree = ast.parse(yoursource)
cc = CallCollector()
cc.visit(tree)
print cc.calls

Demo:

>>> tree = ast.parse('''\
... def foo():
...     print np.random.rand(4) + np.random.randn(4)
...     print linalg.norm(np.random.rand(4))
... ''')
>>> cc = CallCollector()
>>> cc.visit(tree)
>>> cc.calls
['np.random.rand', 'np.random.randn', 'linalg.norm']

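The above walker only handles names and attributes in the function expression; if you need more complex expression support, you'll have to extend this. It also does not descend into call arguments, which is why the nested np.random.rand(4) inside linalg.norm(...) is missing from the demo output.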
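One possible extension is sketched below. It is written for Python 3, where print is a function and therefore shows up as a call too. It keeps walking into call arguments so nested calls are found, and it maps import aliases back to module names to get closer to the 'module.submodule.function' form asked for. The names dotted_name and QualifiedCallCollector are hypothetical, and the sketch assumes imports sit at module level before they are used; scoping, relative imports and reassigned names are ignored.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return '.'.join(reversed(parts))
    return None  # e.g. a subscript or the result of another call


class QualifiedCallCollector(ast.NodeVisitor):
    def __init__(self):
        self.aliases = {}  # local name -> qualified name (assumption: no rebinding)
        self.calls = []

    def visit_Import(self, node):
        for alias in node.names:
            if alias.asname:
                # "import numpy as np" makes np stand for numpy
                self.aliases[alias.asname] = alias.name

    def visit_ImportFrom(self, node):
        if node.module is None or node.level:
            return  # relative imports are left unresolved in this sketch
        for alias in node.names:
            # "from numpy import linalg" makes linalg stand for numpy.linalg
            local = alias.asname or alias.name
            self.aliases[local] = node.module + '.' + alias.name

    def visit_Call(self, node):
        name = dotted_name(node.func)
        if name is not None:
            head, _, rest = name.partition('.')
            head = self.aliases.get(head, head)
            self.calls.append(head + ('.' + rest if rest else ''))
        # keep walking so calls nested in the arguments are collected as well
        self.generic_visit(node)


source = '''
import numpy as np
from numpy import linalg

def foo():
    print(np.random.rand(4) + np.random.randn(4))
    print(linalg.norm(np.random.rand(4)))
'''

collector = QualifiedCallCollector()
collector.visit(ast.parse(source))
print(collector.calls)
```

Calls are recorded outermost first, so each print appears before the calls in its arguments. The source is only parsed, never executed, so numpy does not need to be installed to run this.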

Note that collecting names like this is not a trivial task, and any indirection will not be handled. For example, if your code builds a dictionary of functions and looks them up at call time, or swaps out function objects dynamically, static analysis like the above cannot tell which function is actually called.
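As a small illustration of that limitation (Python 3; the dispatch table and the apply function are made up for this example), the only call in the snippet below has a subscript as its function expression, so there is no dotted name for a collector to record:

```python
import ast

source = '''
import math

dispatch = {'root': math.sqrt, 'down': math.floor}

def apply(name, x):
    return dispatch[name](x)
'''

# Node type of the function expression of every call in the source.
call_funcs = [
    type(node.func).__name__
    for node in ast.walk(ast.parse(source))
    if isinstance(node, ast.Call)
]
print(call_funcs)  # ['Subscript']
```

math.sqrt and math.floor appear in the tree only as plain attribute references inside the dictionary literal; which of them runs depends on the value of name at run time.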

Martijn Pieters answered Oct 04 '22 06:10