I'm having difficulty understanding the import statement and its variations.
Suppose I'm using the lxml module for scraping websites.
The following examples show...
from lxml.html import parse
parse( 'http://somesite' )
...Google's Python style guide prefers the basic import statement, to preserve the namespaces.
I'd prefer to do that, but when I try this:
import lxml
lxml.html.parse( 'http://somesite' )
...then I get the following error message:
AttributeError: 'module' object has no attribute 'html'
Can anyone help me understand what is going on? I'd much prefer to use modules within their namespaces, but need some assistance understanding the semantics.
The basic import statement syntax is: import modulename. Python ships with a standard library of modules for common operations, and it also provides built-in functions that need no import at all. int(), for example, converts a value to an integer, and sum() adds up the items of an iterable such as a list.
Imports are generally placed in three groups: standard library imports (Python's built-in modules), related third-party imports (modules that are installed and do not belong to the current application), and local application imports (modules that belong to the current application).
Python code in one module gains access to the code in another module by the process of importing it. The import statement is the most common way of invoking the import machinery, but it is not the only way. Functions such as importlib.import_module() and the built-in __import__() can also be used to invoke the import machinery.
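As a rough illustration of invoking the import machinery without an import statement, here is a minimal sketch. It uses the standard-library xml.dom.minidom module purely as a stand-in (it is not part of the question), and the variable names are made up for the example.

```python
import importlib

# import_module returns the module named by the full dotted path.
minidom = importlib.import_module("xml.dom.minidom")

# The built-in __import__ returns the top-level package for a dotted name,
# which is one reason import_module is usually the more convenient call.
top_level = __import__("xml.dom.minidom")

print(minidom.__name__)    # xml.dom.minidom
print(top_level.__name__)  # xml
```

Note that both calls load the submodule; they differ only in which module object they hand back.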
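The difference between import and from ... import in Python is which name gets bound. import modulename binds the module itself, so its members are reached as modulename.member. from modulename import member binds only the named member or members in the current namespace; the module is still loaded in full either way.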
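A small sketch of what each form binds, using the standard-library fractions module as an arbitrary stand-in. The two function names are hypothetical and exist only so that locals() shows exactly which name each import created.

```python
import sys

def names_bound_by_plain_import():
    import fractions
    # Only the module name itself is bound in this scope.
    return sorted(locals()), "fractions" in sys.modules

def names_bound_by_from_import():
    from fractions import Fraction
    # Only the member is bound, yet the whole module is loaded.
    return sorted(locals()), "fractions" in sys.modules

print(names_bound_by_plain_import())  # (['fractions'], True)
print(names_bound_by_from_import())   # (['Fraction'], True)
```

In both cases the module ends up in sys.modules; the choice between the two forms is about names, not about how much gets loaded.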
import lxml.html as LH
doc = LH.parse('http://somesite')
lxml.html is a module. When you import lxml, the html module is not imported into the lxml namespace. This is a developer's decision. Some packages automatically import some modules, some don't. In this case, you have to do it yourself with import lxml.html.
import lxml.html as LH imports the html module and binds it to the name LH in the current module's namespace. So you can access the parse function with LH.parse.
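If you would rather keep the fully qualified name, as the question asks, a plain import lxml.html also works: it binds the name lxml and loads the html submodule as an attribute of it. The sketch below is only an illustration under a few assumptions: it parses an inline HTML string with lxml.html.fromstring instead of fetching the placeholder URL, the helper name first_paragraph_text is made up, and the try/except is there only so the snippet does not crash where lxml is not installed.

```python
try:
    import lxml.html  # binds `lxml` and loads the `html` submodule
except ImportError:   # lxml may not be installed where this is run
    lxml = None

def first_paragraph_text(markup):
    """Parse an HTML string through the fully qualified name lxml.html."""
    root = lxml.html.fromstring(markup)
    return root.findtext(".//p")

if lxml is not None:
    print(first_paragraph_text("<html><body><p>hello</p></body></html>"))
```

With a real page you would call lxml.html.parse(...) in the same namespaced way.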
If you want to delve deeper into when a package (like lxml) imports modules (like lxml.html) automatically, open a terminal, start an interactive session (IPython is shown here) and type:
In [16]: import lxml
In [17]: lxml
Out[17]: <module 'lxml' from '/usr/lib/python2.7/dist-packages/lxml/__init__.pyc'>
Here you see the path to the lxml package's __init__.py file (displayed as its compiled form, __init__.pyc).
If you look at the contents you find it is empty. So no submodules are imported. If you look in numpy's __init__.py, you see lots of code, amongst which is:
import linalg
import fft
import polynomial
import random
import ctypeslib
import ma
These are all submodules which are imported into the numpy namespace. So from a user's perspective, import numpy automatically gives you access to numpy.linalg, numpy.fft, etc.
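You can also check this from code instead of reading __init__.py. The sketch below uses the standard-library json package as a stand-in, on the assumption (true for CPython) that its __init__.py imports the decoder submodule; the helper name submodule_is_attribute is hypothetical.

```python
import json
import sys

def submodule_is_attribute(package, submodule_name):
    """True if the package already exposes the submodule as an attribute."""
    return hasattr(package, submodule_name)

print(json.__file__)                            # path to json's __init__ file
print(submodule_is_attribute(json, "decoder"))  # imported by json/__init__.py
print("json.decoder" in sys.modules)            # loaded as a side effect
```

The same call with a submodule that the package does not import, and that nothing else has imported yet, returns False until you import that submodule explicitly.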
Let's take an example of a package pkg with two modules in it, a.py and b.py:
--pkg
|
| -- a.py
|
| -- b.py
|
| -- __init__.py
In __init__.py you import a.py and not b.py:
from . import a  # explicit relative import; a bare "import a" only works on Python 2
So if you open your terminal and do:
>>> import pkg
>>> pkg.a
>>> pkg.b
AttributeError: 'module' object has no attribute 'b'
As you can see, because we imported a.py in pkg's __init__.py, we were able to access it as an attribute of pkg, but b is not there. To access the latter we should use:
>>> import pkg.b # OR: from pkg import b
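To try this without creating files by hand, here is a self-contained sketch that builds the same layout in a temporary directory. The package is named pkg_demo (a hypothetical name standing in for pkg, chosen to avoid clashing with anything installed), and its __init__.py uses the explicit relative form from . import a.

```python
import importlib
import os
import sys
import tempfile

def build_demo_package(root):
    """Create pkg_demo/ with a.py, b.py and an __init__.py that imports only a."""
    pkg_dir = os.path.join(root, "pkg_demo")
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "from . import a\n",
        "a.py": "NAME = 'a'\n",
        "b.py": "NAME = 'b'\n",
    }
    for filename, source in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as handle:
            handle.write(source)

tmp_root = tempfile.mkdtemp()
build_demo_package(tmp_root)
sys.path.insert(0, tmp_root)   # make the temporary package importable
importlib.invalidate_caches()  # the directory was created after startup

import pkg_demo

has_a_before = hasattr(pkg_demo, "a")  # imported by __init__.py
has_b_before = hasattr(pkg_demo, "b")  # not imported yet

import pkg_demo.b  # loads b and sets it as an attribute of pkg_demo

has_b_after = hasattr(pkg_demo, "b")
print(has_a_before, has_b_before, has_b_after)  # True False True
```

The temporary directory is left on disk; remove it with shutil.rmtree(tmp_root) if that matters.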
HTH,