How To Create Your First Python Package and Deploy To GitHub
In certain scenarios, you may find yourself needing to reuse functionality across multiple projects. Packages provide an easy way to do that. A package can act as a standalone utility that plugs easily into other projects. In other scenarios, a package acts as a wrapper, for example around the Python requests library to access data. Suppose you have two APIs that need data fetched from a given source, say a third-party one, but you don't want the two developers to access it in different ways. You would most likely create a Python package that wraps the requests library, and both developers would use it to access the data. An example of a package that does this is the excellent airtable-python-wrapper package, which helps you access data from Airtable.
In this tutorial, we will create a package called testPackage. We will then deploy it to GitHub and show how to install the package from GitHub. The project has the following structure:

testPackage
|__ docs
|   |__ Books.md
|__ testPackage
|   |__ books.py
|   |__ __init__.py
|__ tests
|   |__ test_book.py
|__ CHANGES.txt
|__ README.md
|__ setup.py
|__ LICENSE.txt
The most important file when creating a Python package is setup.py. This file defines the metadata of the package. A simple setup.py file would look like:
import pathlib

from setuptools import find_packages, setup

HERE = pathlib.Path(__file__).parent

VERSION = "0.1.0"
PACKAGE_NAME = "shermantutorialpackage"
AUTHOR = "Ouko Sherman"
AUTHOR_EMAIL = "firstname.lastname@example.org"
URL = "https://github.com/SHERMANOUKO/tutorialpackage"

LICENSE = "MIT"
DESCRIPTION = "A simple python package"
LONG_DESCRIPTION = (HERE / "README.md").read_text()
LONG_DESC_TYPE = "text/markdown"

INSTALL_REQUIRES = ["numpy", "pandas"]
PYTHON_REQUIRES = ">=3.7"

setup(
    name=PACKAGE_NAME,
    version=VERSION,
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    long_description_content_type=LONG_DESC_TYPE,
    author=AUTHOR,
    license=LICENSE,
    author_email=AUTHOR_EMAIL,
    url=URL,
    install_requires=INSTALL_REQUIRES,
    packages=find_packages(),
    python_requires=PYTHON_REQUIRES,
)
Let's explain the arguments that go into the setup() function:
- name - This is the name under which you want your package to be distributed. If you intend to deploy the package to PyPI, it must not be a name that is already taken there. I like to keep my names unique even if I'm deploying privately, since this helps avoid conflicts with other packages. The name can only contain ASCII letters, digits, periods, underscores, and hyphens, and it must start and end with a letter or digit.
- version - Your package's version number.
- description - A short description of your package. The recommendation is to keep it to a single sentence.
- long_description - This is the long description of the package. I prefer to use a markdown file for this, usually the project's README file (as in the file structure example above).
- long_description_content_type - The content type of the long description. In this case it's text/markdown.
- author - The author of the package
- license - The license type of the package. In this case it's an MIT license.
- author_email - Email of the author
- url - The link to the home page of the package
- install_requires - Defines the minimum dependencies the package needs in order to run. If your package is installed via pip, these dependencies are installed automatically as well. It is often good practice to constrain the versions of these dependencies, e.g.
install_requires = ['pandas>=2.0']
- packages - A list of all Python import packages that should be included in the distribution package. Instead of listing each package manually, we can use find_packages() to automatically discover all packages and sub-packages.
- python_requires - Declares which Python versions your package supports. pip will refuse to install the package on an interpreter outside this range.
There are many other arguments that can be passed, but this should be enough to get you going. You can read about the other arguments in the setuptools documentation.
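As a quick sanity check on the name argument, here is a minimal sketch that tests whether a candidate name uses only the allowed characters and shows the normalized form under which package indexes compare names. The helper names is_valid_project_name and normalize_project_name are made up for this illustration; the two regular expressions follow the Python packaging name rules as I understand them, so treat this as a sketch rather than an official validator.

```python
import re

# Allowed characters, per the Python packaging name rules: ASCII letters,
# digits, ".", "_" and "-", starting and ending with a letter or digit.
_VALID_NAME = re.compile(r"([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])", re.IGNORECASE)


def is_valid_project_name(name: str) -> bool:
    """Return True if the whole string is an acceptable distribution name."""
    return _VALID_NAME.fullmatch(name) is not None


def normalize_project_name(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into a single '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for candidate in ["shermantutorialpackage", "test-Package_1", "-bad", "bad name"]:
        print(candidate, is_valid_project_name(candidate))
```

Two names that normalize to the same string are treated as the same project on an index, which is worth checking before you settle on a name.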
Apart from the setup file, the project above contains other files and folders. They are not mandatory for a Python package, but structuring the project this way keeps the code clean and easy to navigate.
docs - This folder stores the documentation of the whole package. In my case, especially with projects hosted on GitHub, I prefer using markdown files. You can also use tools such as Sphinx to generate documentation.
tests - This folder is used to store all the package's test files.
testPackage - You'll notice a folder called testPackage inside the project. This folder holds the code files for the package. The folder can take any name you wish to give it. I prefer to give it the same name as the package, but that's not mandatory. In the folder there's an __init__.py file. This file lets the Python interpreter know that the directory contains importable modules.
CHANGES.txt - Maintains a record of the changes that have taken place since the package was first deployed.
README.md - The README file, which describes the package and offers some guidance on how to get started with it and use it.
LICENSE.txt - Describes the terms and conditions on which your package can be used.
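To make the roles of the testPackage and tests folders more concrete, here is a small sketch of what books.py, __init__.py, and test_book.py might contain. The tutorial does not show these files, so everything here (the Book class, the format_citation function, and the test) is hypothetical and shown in a single block only for convenience.

```python
# --- testPackage/books.py (hypothetical contents) ---
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """A minimal, immutable record describing a book."""
    title: str
    author: str


def format_citation(book: Book) -> str:
    """Return a short 'Author, Title' string for a book."""
    return f"{book.author}, {book.title}"


# --- testPackage/__init__.py (hypothetical contents) ---
# Re-exporting names here would let users write `from testPackage import Book`:
# from .books import Book, format_citation


# --- tests/test_book.py (hypothetical contents) ---
# In the real file this would begin with:
# from testPackage.books import Book, format_citation
def test_format_citation():
    book = Book(title="Dune", author="Frank Herbert")
    assert format_citation(book) == "Frank Herbert, Dune"
```

With this layout, a test runner such as pytest could discover test_format_citation from the tests folder once the package is installed or importable from the project root.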
Deploying to GitHub
Deploying the package to GitHub is as simple as committing your code and pushing it to a GitHub repository. However, ensure that the url argument of setup() matches the URL of the GitHub repository you are deploying the package to.
To install your package from GitHub, you can clone the package repo. However, it's best to install the package using pip. If you are accessing GitHub using SSH, use:
pip install git+ssh://git@github.com/your_username/your_repo.git
Otherwise, if you are not authenticating with SSH, you can install with:
pip install git+https://github.com/your_username/your_repo.git
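If you want to drive the installation from a script, or pin it to a specific branch or tag, the sketch below builds the same kind of pip target string. The function names github_pip_target and pip_install_command are made up for this example, and the tag v0.1.0 is only a placeholder; the "@ref" suffix relies on pip's support for installing from a Git branch, tag, or commit.

```python
import sys
from typing import List, Optional


def github_pip_target(user: str, repo: str, ref: Optional[str] = None,
                      use_ssh: bool = False) -> str:
    """Build a pip-installable Git URL for a GitHub repository.

    `ref` may be a branch, tag, or commit hash; when omitted, pip installs
    from the repository's default branch.
    """
    base = "git+ssh://git@github.com" if use_ssh else "git+https://github.com"
    target = f"{base}/{user}/{repo}.git"
    if ref:
        target += f"@{ref}"
    return target


def pip_install_command(target: str) -> List[str]:
    """Return the argument list that installs `target` with the current interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", target]


if __name__ == "__main__":
    # "your_username", "your_repo" and "v0.1.0" are placeholders.
    cmd = pip_install_command(github_pip_target("your_username", "your_repo", ref="v0.1.0"))
    print(" ".join(cmd))
    # To actually run the installation:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Pinning to a tag keeps installs reproducible, since the default branch can change between two installations.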
That's it. You are now ready to create your first package. Feel free to drop a comment or email me in case you get stuck.
Cheers!! Happy Coding :-)