Complete Guide to Imports in Python: Absolute, Relative, and More

How to plan your code so imports are clear and clean
Aquiles Carattino 2019-10-04 importing import relative absolute package

Importing is not only a matter of using external libraries; it also allows you to keep your code clean and organized. In this tutorial, we cover everything from the very basics of importing to complex topics such as lazy loading of modules in packages.

Introduction to importing

In Python, it is important to distinguish between modules and packages in order to communicate clearly. Modules are, in essence, files with a .py extension. They can define variables, functions, and classes, and they can also run code. Packages are collections of modules in a hierarchical structure, which in the end means organizing the module files in folders.
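The distinction can be checked on imported objects. The following is a minimal sketch, assuming a standard CPython installation in which json is shipped as a folder and string as a single file; the helper is_package is a name made up for this illustration.

```python
import json    # in CPython this is a package: a folder with an __init__.py
import string  # in CPython this is a plain module: a single string.py file


def is_package(module):
    """Return True if the imported object is a package and not a plain module."""
    # Packages carry a __path__ attribute listing the folders with their submodules.
    return hasattr(module, "__path__")


print(is_package(json))
print(is_package(string))
```

The __path__ attribute is what allows Python to look for further modules inside a package.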

If we have a file called module.py, we can simply use the following syntax to import it:

import module

We can also use modules that are bundled with Python, such as sys:

import sys

In this case, sys is a module that belongs to the Python Standard Library. It provides functions to interact with the interpreter itself. For example, we can use it to find out if any arguments were passed while executing a script. We can create a file, test_argv.py, with the following code:

import sys

print(sys.argv)

And we run it to see its output:

python test_argv.py -b 1
['test_argv.py', '-b', '1']

Python comes bundled with plenty of libraries for different tasks. You can find them all listed in the documentation of the Python Standard Library.

Importing the entire sys module may not be what we want since we are only using one of its elements. In this case, we can also be selective with the importing procedure while keeping the same output:

from sys import argv

print(argv)

Using the full import or just a selection of what we need depends to a great extent on personal preferences, and on how different packages were designed.

Importing *

In the examples above, we have either imported one entire module or just one element from sys. To import more elements from the same module, we can specify them on the same line:

from sys import argv, exit

print(argv)
exit()

We can import as many things as we need from the same module. To avoid lines becoming too long and hard to read, it is possible to stack the items vertically. For example:

from sys import (api_version,
                 argv,
                 base_exec_prefix,
                 exit,
                 )

Note that in this case we must use a set of ( and ) to make a clear list of imports. It is also possible to import all the elements from a module by using a * in the statement, for example:

from sys import *

print(api_version)
print(argv)
exit()

However, this is a highly discouraged practice. When we import things without control, some functions may get overwritten without us even realizing it. The code becomes harder to read and understand. Let's see it with the following example:

from time import *
from asyncio import *

print('Here')
sleep(1)
print('After')

Most programmers will be familiar with the sleep function from the time module, which halts the execution of a program for a given number of seconds. If we run the script, however, we will notice that there is no delay between the lines 'Here' and 'After'. This can be puzzling at first, and for larger projects can indeed become daunting. In this case, both time and asyncio define a function sleep which behaves in very different ways. If in such a simple and compact example the risks of the * imports become evident, it is easy to understand why almost all developers avoid using the * in their programs.

The case of time and asyncio is special, because both of them belong to the Python Standard Library and are very well documented. However, not all programs are as organized and clean. Therefore, it becomes harder and harder to remember all the modules and functions defined in packages. Moreover, some names are so handy (like sleep), that it is easy to find them defined in different packages and for many purposes.

The script above becomes much clearer with the following syntax:

import time
import asyncio

print('Here')
asyncio.sleep(1)
print('After')
time.sleep(1)
print('Finally')

There are no doubts of what is going on, and where problems may be arising, even if we haven't used the asyncio library before.
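To see why the star-import version never paused, it helps to look at what asyncio.sleep returns when it is called without await. The following is a small sketch rather than a recommended pattern; the variable names are chosen only for this illustration.

```python
import asyncio
import inspect
import time

start = time.perf_counter()
coro = asyncio.sleep(1)  # only creates a coroutine object; nothing waits yet
elapsed = time.perf_counter() - start

is_coroutine = inspect.iscoroutine(coro)
coro.close()  # discard the coroutine so Python does not warn that it was never awaited

print(is_coroutine)
print(f"Calling asyncio.sleep(1) took {elapsed:.4f} seconds")
```

The call returns immediately because the waiting only happens once the coroutine is awaited inside an event loop, which is very different from what time.sleep does.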

Importing As

Sometimes we face the situation in which we want to import specific functions from different modules but their names are the same. For example, both time and asyncio define sleep and we are interested in using them. To avoid a name clash when importing, Python allows us to change the name of what we are importing by doing the following:

from asyncio import sleep as async_sleep
from time import sleep as time_sleep

print('Here')
async_sleep(1)
print('After')
time_sleep(1)
print('Finally')

In this way, we can use sleep from either asyncio or time, avoiding name clashes. Of course, this flexibility must be taken seriously because it can also generate unreadable code. The following code would be very hard to understand for anybody reading it:

from time import sleep as exit

exit(1)

The use of import as is not only practical; in some cases it is the de facto way of working. For example, numpy, pandas, and matplotlib are almost always imported in the same way:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

The three lines above are ubiquitous in scientific programs. They are so common that code editors such as PyCharm can suggest importing numpy if they see a line that includes something like np. followed by a name. Changing the name of the import can be useful not only to prevent name clashes, but also to shorten the notation. Instead of doing:

matplotlib.pyplot.plot(x, y)

We simply have:

plt.plot(x, y)

Different packages have different shortcuts. For example PyQtGraph is normally shortened as pg, and for sure different fields use different abbreviations. Importing Numpy as np or Pandas as pd is not mandatory. However, since it is what the community does, it will make the code much more readable.

The use of this notation is so widespread that, for example, even in StackOverflow numpy is used as np without even showing the import statement.
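It may help to remember that an alias does not copy anything: it is just a second name bound to the same object. A minimal check, using only the standard library:

```python
import time
from time import sleep as time_sleep

# The alias and the original attribute point to the very same function object.
same_object = time_sleep is time.sleep

print(same_object)
print(time_sleep.__name__)  # the function still knows its original name
```

This is also why a misleading alias is dangerous: the new name hides what the function really is.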

Importing your code

So far, we have seen how to import packages and modules developed by other people. Importing, however, is also a great tool to structure different parts of our own code. It makes the code easier to maintain and to collaborate on. Therefore, sooner or later we are going to find ourselves importing our own code. We can start simple and slowly build up the complexity.

We can create a file first.py, with the following code:

def first_function():
    print('This is the first function')

In a file called second.py, we can add the following code:

from first import first_function

first_function()

And we run it:

$ python second.py
This is the first function

That is as easy as it gets. We define a function in a file, but we use it in another file. Once we have many files, it becomes handier to start creating some structure to organize the program. We can create a folder called package_a, and we add a new file, called third.py. The folder structure is like this:

$ tree
.
├── first.py
├── package_a
│   └── third.py
└── second.py

In third we create a new function, appropriately called third_function:

def third_function():
    print('This is the third function')

The examples are very basic, but they already start to show some patterns and caveats in the importing procedures. If we want to use the new function from the second.py, we need to import it:

from first import first_function
from package_a.third import third_function

first_function()
third_function()

If we run the code, we'll get the following output:

This is the first function
This is the third function

Pay attention to the notation we used to import third_function. We specified the folder, in this case package_a, and then we referred to the file after a dot (.). This is the way in which the hierarchy of folders and files is transformed into packages and modules in Python. In this case, replacing the folder separators by a . we end up with package_a.third, and we strip the .py extension.
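The mapping between folders and dotted names can be tried out without touching the project. The sketch below builds a throwaway folder in a temporary directory and imports from it; the names demo_package_a and tmp are assumptions made for this example, and the function returns its message instead of printing it so that the result is easy to inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build <tmp>/demo_package_a/third.py on the fly.
    package_dir = Path(tmp) / "demo_package_a"
    package_dir.mkdir()
    (package_dir / "third.py").write_text(
        "def third_function():\n"
        "    return 'This is the third function'\n"
    )

    sys.path.insert(0, tmp)        # make the temporary folder importable
    importlib.invalidate_caches()  # recommended after creating modules at runtime
    try:
        # The folder separator becomes a dot and the .py extension is dropped.
        third = importlib.import_module("demo_package_a.third")
        message = third.third_function()
    finally:
        sys.path.remove(tmp)

print(message)
```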

The use of the __init__ file

Most likely our code will not be isolated from other projects. When we install packages, they will have dependencies, and very quickly we lose track of what is actually installed. We can understand where the problem arises with a very simple example. We can assume that we have numpy already installed but we are not aware of that. If we create a new folder, called numpy, with a file called sleep.py, the folder structure will end up looking like this:

.
├── first.py
├── package_a
│   └── third.py
├── numpy
│   └── sleep.py
└── second.py

And in the file sleep.py, we can add the following function:

def sleep():
    print('Sleep')

In the same way we did before, we can update second.py to use the new function sleep:

from numpy.sleep import sleep

sleep()

Of course the code above will raise many alarms, but the best is to run it to see what happens:

Traceback (most recent call last):
  File "second.py", line 3, in <module>
    from numpy.sleep import sleep
ModuleNotFoundError: No module named 'numpy.sleep'

The biggest challenge in this case is that the exception is utterly hard to understand. It is telling us that Python tried to look for a module called sleep in the numpy package. If we open the folder numpy we find the module sleep. Therefore, there must be something else going on. In this example it is clear that Python is looking for the module sleep within the official numpy package and not in our folder.

The quick solution to this problem is to create an empty file called __init__.py in our numpy folder:

.
├── first.py
├── package_a
│   └── third.py
├── numpy
│   ├── __init__.py
│   └── sleep.py
└── second.py

We can run the code without problems this time:

$ python second.py
Sleep

It is important to understand what is going on and not just throw quick solutions at the code to see whether they overcome the immediate problem. The crux is to know how Python looks for packages on the computer. The topic is complex, and Python allows a great deal of customization. The official documentation sheds some light on the matter once we have experience.

In short, Python will first look at what we are trying to import and check whether it belongs to the standard library. If the folder was called time instead of numpy, the behavior would have been different: adding the __init__.py file wouldn't make a difference. Once Python knows the module is not in the standard library, it will check for external modules. First, it starts searching the current directory. Then, it moves to the directories where packages are installed, for example, where numpy ends up after doing pip install numpy.

This raises a very interesting question: why did our code fail in the first attempt and start working only after adding the __init__.py file? In order for Python to consider a folder a regular package, it must contain an __init__.py file. This is by design, exactly to prevent unintended name clashes unless we explicitly want them.

A folder without __init__.py can still be imported (since Python 3.3 it is treated as a namespace package), but only if Python does not find a regular package or module with the same name anywhere else in the search path. That is why package_a works even if we never defined the __init__.py file.

Bear in mind that once Python finds the package, it won't keep searching. Once it finds numpy in our local folder, it won't look for another numpy elsewhere. Therefore, we can't mix modules from different packages with the same name.
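When it is not obvious which file Python would pick for a given name, the import machinery can be asked directly. The following is a small sketch; where_is is a helper name invented for this example, and the printed locations depend on the installation.

```python
import importlib.util


def where_is(name):
    """Return the location Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For regular packages the origin is the path of their __init__.py file.
    return spec.origin


print(where_is("json"))
print(where_is("sys"))
print(where_is("a_module_that_does_not_exist"))
```

Running a check like this for numpy in the example above would show whether the local folder or the installed package is the one being used.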

The PATH list of directories

A useful thing to do is to check the directories Python uses to look for modules and packages. We can see it by running the following code:

import sys

for path in sys.path:
    print(path)

That will list something between 4 and 6 different folders. Most of them are quite logical: where Python is installed, the virtual environment folders, etc.

Adding Folder to the Path

An easy way of extending the capabilities of Python is to add folders to the list where it looks for packages. The first option is to do it at runtime. We can easily append a directory to the variable sys.path. To add the current directory to the list, we can do the following:

import os
import sys

CURR_DIR = os.path.dirname(os.path.abspath(__file__))
print(CURR_DIR)
sys.path.append(CURR_DIR)
for path in sys.path:
    print(path)

We can add any directory, not only the current one. In this approach, we modify the system path only while the program runs. If we run two different programs, each will have its own path.

Another option is to modify a variable in the operating system itself. This has the advantage that it can be made permanent and all programs will share the same information. For our application, we have to modify the PYTHONPATH environment variable. Environment variables are available on every operating system, but how to set and modify them varies.

On Linux or Mac, the command to set these variables is export. We can do the following:

export PYTHONPATH=$PYTHONPATH':/home/user/'
echo $PYTHONPATH

The first line appends the folder /home/user to the variable PYTHONPATH. Note that we have used : as a directory separator.

On Windows, we need to right-click on "Computer" and select "Properties". In the "Advanced System Settings" there is the option "Environment variables". If PYTHONPATH exists, we can modify it; if it does not exist, we can create it by clicking on "New". Bear in mind that on Windows, you have to use ; to separate directories, since : is part of the folder path (e.g.: C:\Users\Test\...).

We can check whether the modifications to the system environment variables worked by running the same code:

import sys

for path in sys.path:
    print(path)
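The effect of PYTHONPATH can also be checked without changing any system setting, by starting a second Python process with a modified copy of the environment. This is only a sketch: the temporary folder stands in for a real directory such as /home/user, and the variable is replaced rather than extended to keep the example short.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as extra_dir:
    # Copy the current environment and set PYTHONPATH only for the child process.
    child_env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    child_paths = [p for p in result.stdout.splitlines() if p]
    # Compare resolved paths, since temporary folders often sit behind symlinks.
    found = any(
        os.path.realpath(p) == os.path.realpath(extra_dir) for p in child_paths
    )

print(found)
print(repr(os.pathsep))  # the separator for several folders on this system
```

The value of os.pathsep is the : or ; separator discussed above, so code that builds the variable with it works on every operating system.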

Adding information to the Python path is a great way of developing a structure on your own computer, with code in different folders, etc. However, it is important to note that it also makes the code harder to maintain. The environment variables on one computer are not the same as on another, and Python may be loading legacy code from an obscure place on the computer. On the other hand, environment variables are very useful in contexts like a web server, where the definitions can be loaded before running a program.

As a quick side note, it is worth mentioning that Python allows us to read environment variables at runtime:

import os

print(os.environ.get('PYTHONPATH'))

Note that on Windows, the changes to environment variables are permanent, but on Linux and Mac we need to follow extra steps if we want them to stay.

PYTHONPATH and Virtual Environments

When we work with virtual environments, we can modify environment variables when we activate or deactivate them. This works seamlessly on Linux and Mac, but Windows users may require some tinkering to adapt the examples below.

If we inspect the activate script (located in the folder venv/bin), we can get inspiration about what is done with the PATH variable, for example. The first step is to store the old variable before modifying it; then we append whatever we want. When we deactivate the virtual environment, we set the old variable back.

Virtual environments have three hooks to achieve this behavior. Next to the activate script, we can also see three files called postactivate, postdeactivate and predeactivate. These hook files are created by virtualenvwrapper, so an environment created in another way may not have them. We can modify postactivate, which should be empty, and add the following:

PYTHONPATH_OLD="$PYTHONPATH"
PYTHONPATH=$PYTHONPATH":/home/user"
export PYTHONPATH
export PYTHONPATH_OLD

Next time we activate the virtual environment, we will have the directory /home/user added to the PYTHONPATH. It is good practice to go back to the original version of the Python path once we deactivate the environment. We can do it directly in the predeactivate file:

PYTHONPATH="$PYTHONPATH_OLD"
unset PYTHONPATH_OLD

We set the variable to the status it had before activating, and we remove the extra variable we created. Note that in case we don't deactivate the environment, but simply close the terminal, the changes to the PYTHONPATH won't be saved. The predeactivate script is important if you switch from one environment to another and keep using the same terminal.
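The same store, modify, and restore pattern can be written in Python when a program needs to launch other Python processes with an extended path. The following is a sketch under that assumption; extended_pythonpath is a name invented here, and changing os.environ in this way affects child processes, not the sys.path of the program that is already running.

```python
import os
from contextlib import contextmanager


@contextmanager
def extended_pythonpath(extra_dir):
    """Temporarily append extra_dir to PYTHONPATH and restore the old value afterwards."""
    old_value = os.environ.get("PYTHONPATH")  # store the old variable first
    parts = [old_value, extra_dir] if old_value else [extra_dir]
    os.environ["PYTHONPATH"] = os.pathsep.join(parts)
    try:
        yield
    finally:
        # Put things back exactly as they were, including the "not set" case.
        if old_value is None:
            os.environ.pop("PYTHONPATH", None)
        else:
            os.environ["PYTHONPATH"] = old_value


before = os.environ.get("PYTHONPATH")
with extended_pythonpath("/home/user"):
    inside = os.environ["PYTHONPATH"]
after = os.environ.get("PYTHONPATH")

print(inside)
print(before == after)
```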

PYTHONPATH and PyCharm

Users of PyCharm (and most other IDEs will probably be similar) can change the environment variables directly from within the program. If we open the Run menu and select Edit Configurations, we will be presented with the following menu:

<:image:PyCharm_config.png>

Among the options we can see "Add content roots to PYTHONPATH". This is what makes the imports work out of the box when we are in PyCharm, while running the same code directly from the terminal may give us some issues. We can also edit the environment variables if we click on the small icon to the right of where it says "environment variables".

Keeping an eye on the environment variables can avoid problems in the long run. Especially if, for example, two developers share the computer. Although strange in many settings, lab computers are normally shared between people, and the software can be edited by multiple users. Perhaps one sets environment variables pointing to specific paths which are not what the second person is expecting.

Absolute Imports

In the examples of the previous sections, we imported a function downstream in the file system. This means that the function was inside a folder next to the main script file. However, we should also study what happens if we want to import from a sibling package. Imagine we have the following situation:

├── __init__.py
├── pkg_a
│   ├──  mod_a.py
│   └── __init__.py
├── pkg_b
│   ├── mod_b.py
│   └── __init__.py
└── start.py

We have a start file in the top-level directory and two packages, pkg_a and pkg_b. Each one has its own __init__ file. The question is how we can access the contents of mod_a from within mod_b. From the start file, the import procedure is easy:

from pkg_a import mod_a
from pkg_b import mod_b

We can create some dummy code in order to have a concrete example. First, in the file mod_a, we can create a function:

def simple():
    print('This is simple A')

Which, from the start file we can use as follows:

from pkg_a.mod_a import simple

simple()

If we want to use the same function within the mod_b, the first thing we can try is to simply copy the same line. Thus, in mod_b we can try:

from pkg_a.mod_a import simple


def bsimple():
    print('This is simple B')
    simple()

To make it complete, we can trigger it directly from within the start file:

from pkg_b import mod_b

mod_b.bsimple()

If we run it, we will get the output we were expecting:

$ python start.py
This is simple B
This is simple A

However, and this is a very big HOWEVER, sometimes we don't want to run start. Instead, we want to run mod_b directly. If we try to run it, the following happens:

$ python mod_b.py
Traceback (most recent call last):
  File "mod_b.py", line 1, in <module>
    from pkg_a.mod_a import simple
ModuleNotFoundError: No module named 'pkg_a'

And here we start to realize the headaches that importing in Python can generate as soon as the program gets a bit more sophisticated. In the end, the error was expected. When we run python mod_b.py, Python will try to find pkg_a in the same folder, and not one level up. When we trigger start there is no problem, because from that directory both pkg_a and pkg_b are visible.

The same problem will appear if we run mod_b from any other location on the computer:

$ python /path/to/project/pkg_b/mod_b.py

What we did in the examples above is called absolute imports. It means that we specify the full import path to the module we want to import. What we have to remember is that the folder containing the script we run is the first place where Python looks for modules (the current folder takes that role when we use python -m or the interactive interpreter). Then it goes through the remaining paths stored in sys.path. If we want the code to work, we need to be sure that Python knows where pkg_a and pkg_b are stored.
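For a script started as python some/folder/file.py, the first entry of sys.path is the folder that contains the file, which is why pkg_a is visible from start.py but not from mod_b.py. The sketch below checks this with a throwaway script; the file name show_first_path.py and both temporary folders are assumptions made for the example.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as script_dir, \
        tempfile.TemporaryDirectory() as launch_dir:
    script = Path(script_dir) / "show_first_path.py"
    script.write_text("import sys\nprint(sys.path[0])\n")

    # Start the script from a different working folder than the one it lives in.
    result = subprocess.run(
        [sys.executable, str(script)],
        cwd=launch_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    first_entry = result.stdout.strip()
    is_script_dir = os.path.realpath(first_entry) == os.path.realpath(script_dir)
    is_launch_dir = os.path.realpath(first_entry) == os.path.realpath(launch_dir)

print(is_script_dir)
print(is_launch_dir)
```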

The proper way would be to include the folder in the PYTHONPATH environment variable, as we explained earlier. A dirtier way would be to append the folder at runtime. We can add the following lines to mod_b.py:

import os
import sys

BASE_PATH = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(BASE_PATH)

from pkg_a.mod_a import simple

This is very similar to what we have done earlier. It is important to highlight that the definition of BASE_PATH is the full path to the folder one level above where the current file (mod_b.py) is. Note also that we need to append to the sys.path before we try to import pkg_a, or it will fail in the same way it did before.
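The nested os.path.dirname calls can be hard to read once more than one level is involved. A possible alternative, shown here only as a sketch, uses pathlib; folder_above and the example path are hypothetical names chosen for the illustration.

```python
import os
from pathlib import Path


def folder_above(file_path, levels_up=1):
    """Return the folder `levels_up` levels above the folder that contains file_path."""
    # parents[0] is the folder of the file, parents[1] the folder above it, and so on.
    return Path(os.path.abspath(file_path)).parents[levels_up]


# A made-up location for mod_b.py; inside the real file we would pass __file__.
example = os.path.join(os.sep, "home", "user", "project", "pkg_b", "mod_b.py")
base_path = folder_above(example)

print(base_path)
```

The result is the same folder that the two nested dirname calls compute, so str(base_path) can be appended to sys.path in the same way.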

If we think for a second about this approach, we can quickly notice that it has several drawbacks. The most obvious one is that we should add those lines to every single file we are working with. Another problem is that we are adding a folder that contains many packages, which can cause collisions. Imagine we are theoretical physicists working on string theory and we develop a module called string. The code would look like this:

from string import m_theory

And it will give us problems because string belongs to Python's standard library.
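The collision can be reproduced safely in a temporary folder. In the sketch below, a made-up string.py that defines m_theory is placed in a folder appended to sys.path; since appended folders are searched last, the standard library module is the one that gets imported.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical local module that reuses the name of a standard library module.
    (Path(tmp) / "string.py").write_text(
        "def m_theory():\n"
        "    return 'eleven dimensions'\n"
    )
    sys.path.append(tmp)  # appended folders are searched after the standard library
    try:
        string = importlib.import_module("string")
        has_m_theory = hasattr(string, "m_theory")
        has_ascii_letters = hasattr(string, "ascii_letters")
    finally:
        sys.path.remove(tmp)

print(has_m_theory)       # our function is not there
print(has_ascii_letters)  # the standard library module was imported instead
```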

Therefore, it is always better to develop projects in their own folder, even if that forces a bit of a name repetition. In the very simple case we are dealing with here, the structure would be like this:

code
├── pkg_a
│   ├── docs
│   └── pkg_a
│       ├── __init__.py
│       └── mod_a.py
└── pkg_b
    ├── docs
    └── pkg_b
        ├── __init__.py
        └── mod_b.py

In the folder tree above we have a base folder code. Inside there are two packages, pkg_a and pkg_b. Although the names of the folders repeat (we have pkg_a twice and pkg_b twice), there are several advantages to working in this way. The most important one is granularity. We can add code/pkg_a or code/pkg_b to the PYTHONPATH. Having control is always better than getting blanket results.

The most important thing to remember is that in Python, absolute is relative. While importing, we are not specifying a path in the file system, but rather an import path. Therefore, the imports are always relative to the PYTHONPATH, even if called absolute.

Relative Imports

Another option for importing modules is to define the relative path. We can continue building on the example from the previous section. Imagine we have a folder structure like this:

code
├── pkg_a
│   ├── mod_a.py
│   └── __init__.py
├── pkg_b
│   ├──  mod_b.py
│   ├── __init__.py
│   └── pkg_a
│       ├──  mod_c.py
│       └── __init__.py
└── start.py

Let's assume that each mod_X.py defines a function called function_X (where X is the letter of the file). The function simply prints the name of the function. It should be clear that if we want to import function_c from mod_c, the start.py file should look like:

from pkg_b.pkg_a.mod_c import function_c

The situation becomes more interesting when we want to import function_a in mod_b. It is important to pay attention because there are two different pkg_a defined in our program. If we add the following to mod_b:

from pkg_a.mod_c import function_c

It would work, regardless of how we run the script:

$ python pkg_b/mod_b.py
$ cd pkg_b
$ python mod_b.py

But this is not what we wanted! We want function_a from mod_a. If we, however, add the following to mod_b:

from pkg_a.mod_a import function_a

We would get the following error:

$ python pkg_b/mod_b.py
Traceback (most recent call last):
  File "pkg_b/mod_b.py", line 1, in <module>
    from pkg_a.mod_a import function_a
ImportError: No module named pkg_a

This is where relative imports become very handy. From mod_b, the module we want to import is one folder up. To indicate that, we can use the .. notation in Python:

from ..pkg_a.mod_a import function_a


def function_b():
    print('This is function_b')
    function_a()


function_b()

Generally speaking, the first . means in this directory, while the second means going one level up, etc. However, if we run the file, we get the following error:

$ python3 mod_b.py
Traceback (most recent call last):
  File "mod_b.py", line 1, in <module>
    from ..pkg_a.mod_a import function_a
ValueError: attempted relative import beyond top-level package

It doesn't matter if we change folders, if we move one level up, we will get the same problem:

$ python3 pkg_b/mod_b.py
Traceback (most recent call last):
  File "mod_b.py", line 1, in <module>
    from ..pkg_a.mod_a import function_a
ValueError: attempted relative import beyond top-level package

At some point, this becomes nerve-wracking. It doesn't matter if we add folders to the PATH, create __init__.py files, etc. It all boils down to the fact that we are not treating our file as a module when we run it. To instruct Python to run the file as part of a package, we would do:

$ python3 -m code.pkg_b.mod_b
This is function_b
This is function_a

Bear in mind that the only way of running the code like this is if Python knows where to find the folder code. And this brings us back to the discussion of the PYTHONPATH variable. If we are in the folder that contains code and run Python from there, we won't see any problems. If we, however, are in any other folder, Python will follow the usual rules to try to find where code is.
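The difference between running a file directly and running it with -m can be reproduced in a throwaway project. The sketch below is an illustration under several assumptions: the top-level folder is called demo_project (a made-up name), every folder gets an __init__.py file, and mod_b prints directly instead of defining function_b, to keep the example short.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A reduced version of the folder tree used in this section.
FILES = {
    "demo_project/__init__.py": "",
    "demo_project/pkg_a/__init__.py": "",
    "demo_project/pkg_a/mod_a.py": (
        "def function_a():\n"
        "    print('This is function_a')\n"
    ),
    "demo_project/pkg_b/__init__.py": "",
    "demo_project/pkg_b/mod_b.py": (
        "from ..pkg_a.mod_a import function_a\n"
        "print('This is function_b')\n"
        "function_a()\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    for relative_path, content in FILES.items():
        path = Path(tmp) / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)

    # Run as part of the package, from the folder that contains demo_project.
    as_module = subprocess.run(
        [sys.executable, "-m", "demo_project.pkg_b.mod_b"],
        cwd=tmp, capture_output=True, text=True,
    )
    # Run the very same file directly as a script.
    as_script = subprocess.run(
        [sys.executable, str(Path(tmp) / "demo_project" / "pkg_b" / "mod_b.py")],
        cwd=tmp, capture_output=True, text=True,
    )

print(as_module.stdout)
print(as_script.returncode)  # non-zero: the relative import fails
```

Only the first call knows that mod_b belongs to a package, which is the information Python needs in order to resolve the leading dots.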

There is one more important detail to discuss with relative imports. We can imagine that mod_c has the following code:

from ..mod_b import function_b


def function_c():
    print('This is function c')
    function_b()


function_c()

Since mod_c is deeper in the tree, we can try to run it in different ways:

$ python -m code.pkg_b.pkg_a.mod_c
$ python -m pkg_b.pkg_a.mod_c

However, the second option is going to fail. mod_c is importing mod_b, which in turn is importing mod_a. Therefore, Python needs to be able to go all the way to the root folder code. When we plan our code, we should be mindful not only of how to write it, but also of how the program is meant to be used.
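How the leading dots are translated into full names can be checked with importlib.util.resolve_name, which takes the relative name and the package of the importing module. The sketch below only manipulates strings, so no files are needed; try_resolve is a helper invented for this example, and it catches both exception types because the one raised has changed between Python versions.

```python
from importlib.util import resolve_name


def try_resolve(name, package):
    """Resolve a relative name, or describe the failure instead of raising."""
    try:
        return resolve_name(name, package)
    except (ImportError, ValueError) as error:
        return f"failed: {error}"


# Started as code.pkg_b.pkg_a.mod_c: both relative imports can be resolved.
print(try_resolve("..mod_b", "code.pkg_b.pkg_a"))  # import made by mod_c
print(try_resolve("..pkg_a.mod_a", "code.pkg_b"))  # import made by mod_b

# Started as pkg_b.pkg_a.mod_c: the first import resolves, the second cannot.
print(try_resolve("..mod_b", "pkg_b.pkg_a"))
print(try_resolve("..pkg_a.mod_a", "pkg_b"))
```

In the second case mod_b belongs to the top-level package pkg_b, so there is nothing above it for the two dots to point to.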

The last detail to cover is that mixing relative and absolute imports requires some care. For example, the following won't work:

from ..pkg_a.mod_a import function_a
from pkg_a.mod_c import function_c


def function_b():
    print('This is function_b')
    function_a()


function_b()

We will get the following error:

$ python -m code.pkg_b.mod_b
Traceback (most recent call last):
[...]
ModuleNotFoundError: No module named 'pkg_a'

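When we decide to run our code as a module (using the -m), the folder of the file itself is no longer the first place where Python looks, so the absolute import of pkg_a.mod_c can't be found. One way of solving the problem would be to make that import relative as well: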

from .pkg_a.mod_c import function_c

In this way it becomes clear that we are importing from the pkg_a which is in the same folder as mod_b.

Mixing Absolute and Relative

It is possible to mix relative and absolute imports, and there is no secret to it as long as the absolute ones use the full import path. We can change mod_b.py like this:

from ..pkg_a.mod_a import function_a
from code.pkg_b.pkg_a.mod_c import function_c


def function_b():
    print('This is function_b')
    function_c()


function_b()

Mixing relative and absolute is definitely a possibility. The question, as almost always, is why would we do it. The fact that we can does not mean we should.

Absolute or Relative: Conclusions

Deciding whether we want to use absolute imports or relative imports is basically up to the taste of the developer or the rules established by the group. If we are developing a package that has a lot of sub-packages and modules with several layers of nesting, using absolute imports can make the code clearer. For example, this is what the same import would look like in the two different cases:

from program.pkg_1.pkg_2.pkg_3.module import my_function
from .module import my_function

For some people the first line is much clearer: there are no doubts about what we are importing. But it can get tiresome to type the entire path all the time. However, it is important to consider that typing less is not the only factor at play here.

If we are planning on allowing some files to run directly we should be mindful about the requirements for the relative imports. If we have many modules with similar names, sometimes the explicit path makes the code much clearer. It is really up to the developer to have enough sensitivity to decide whether the absolute import or the relative import makes the code clearer and the execution easier.

The example code for this article can be found on GitHub.

Article written by Aquiles Carattino