.. SPDX-FileCopyrightText: 2020 - 2025 CERN
.. SPDX-FileCopyrightText: 2023 - 2025 GSI Helmholtzzentrum für Schwerionenforschung
.. SPDX-FileNotice: All rights not expressly granted are reserved.
..
.. SPDX-License-Identifier: GPL-3.0-or-later OR EUPL-1.2+

Packaging Crash Course
======================

This tutorial briefly teaches you how to create a Python package, set up a CI
pipeline and publish the package to the `Acc-Py Package Index`_. It uses
Acc-Py_ to simplify the process, but also explains what happens under the
hood. For each topic, hyperlinks to further information are provided.

Each section should be reasonably self-contained. Feel free to skip boring
sections or go directly to the one that answers your question.

See also the `Acc-Py deployment walkthrough`_ for an alternative approach
that converts an unstructured repository of Python code into a deployable
Python package.

.. _Acc-Py Package Index: https://wikis.cern.ch/display/ACCPY/Python+package+index
.. _Acc-Py: https://wikis.cern.ch/display/ACCPY/
.. _Acc-Py deployment walkthrough: https://wikis.cern.ch/display/ACCPY/Deployment+walk-through

Loading Acc-Py
--------------

If you trust your Python environment, feel free to skip this section. It
serves as a baseline from which beginners can start and be confident that
none of the experimentation here will impact their other projects.

Start out by loading Acc-Py. We recommend using the latest Acc-Py Base
distribution (2021.12 at the time of this writing):

.. code-block:: shell-session

    $ source /acc/local/share/python/acc-py/base/pro/setup.sh

If you put this line into your :file:`~/.bash_profile` script [#profile]_,
it will be executed every time you log into your machine. If you don't want
this, but you also don't want to have to remember this long path, consider
putting an alias into your :file:`~/.bash_profile` instead:

.. code-block:: shell-session

    $ alias setup-acc-py='source /acc/local/share/python/acc-py/base/pro/setup.sh'

This way, you can load Acc-Py by invoking :command:`setup-acc-py` on your
command line.

.. note::
    If you want to use Acc-Py outside of the CERN network, the `Acc-Py
    Package Index`_ wiki page has instructions on how to access it from
    outside.

If you want to use multiple Python versions on the same machine, you may use
a tool like Pyenv_, Pyflow_ or Miniconda_.

.. _Pyflow: https://github.com/David-OConnor/pyflow
.. _Pyenv: https://github.com/pyenv/pyenv
.. _Miniconda: https://docs.conda.io/en/latest/miniconda.html

Further reading in the Acc-Py Wiki:

- `Acc-Py Base`__
- `Acc-Py Interactive Environment`__

__ https://wikis.cern.ch/display/ACCPY/Acc-Py+base+distribution
__ https://wikis.cern.ch/display/ACCPY/Interactive+environment

.. [#profile] See `here `_ for the difference between :file:`.bash_profile`
   and :file:`.profile`.

Creating a Virtual Environment
------------------------------

Virtual environments (or :doc:`venvs ` for short) separate the dependencies
of one project from another. This way, you can work on one project that uses
PyTorch 1.x, switch your venv, then work on another project that uses
PyTorch 2.x. Venvs also allow you to install dependencies that are not
available in the Acc-Py distribution.

This approach is much more robust than installing packages into your home
directory via :command:`pip install --user`. The latter often leads to
hard-to-understand import errors, so it is discouraged.
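Once you have created and activated a venv (the next steps show how), you can
confirm from inside Python that it is active. This is a minimal sketch, not
part of the tutorial project; it relies on the fact that a venv changes
``sys.prefix`` but leaves ``sys.base_prefix`` pointing at the base
installation:

.. code-block:: python

    import sys

    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the underlying installation
    # (the Acc-Py distribution in our case).
    in_venv = sys.prefix != sys.base_prefix
    print("virtual environment active:", in_venv)
    print("interpreter prefix:", sys.prefix)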
If you're working on your `BE-CSS VPC`_, we recommend creating your venv in
the :file:`/opt` directory, since space in your home directory is limited.
Obviously, this does not work on LXPLUS_, where your home directory is the
only choice.

.. _BE-CSS VPC: https://wikis.cern.ch/display/ACCADM/VPC+Virtual+Machines+BE-CSS
.. _LXPLUS: https://lxplusdoc.web.cern.ch/

.. code-block:: shell-session

    $ # Create a directory for all your venvs.
    $ sudo mkdir -p /opt/home/$USER/venvs
    $ # Make it your own (instead of root's).
    $ sudo chown "$USER:" /opt/home/$USER/venvs
    $ acc-py venv /opt/home/$USER/venvs/coi-example

.. note::
    The :command:`acc-py venv` command is a convenience wrapper around the
    :mod:`std:venv` standard library module. In particular, it passes the
    ``--system-site-packages`` flag. This flag ensures that everything that
    is pre-installed in the Acc-Py distribution is also available in your
    new environment. Without it, you would have to install common
    dependencies such as :doc:`NumPy ` yourself.

Once the virtual environment is created, you can activate it like this:

.. code-block:: shell-session

    $ source /opt/home/$USER/venvs/coi-example/bin/activate
    $ which python  # Where does our Python interpreter come from?
    /opt/home/.../venvs/coi-example/bin/python
    $ # deactivate  # Leave the venv again.

After activating the environment, you can give it a test run by upgrading
the Pip package manager. This change should be visible only within your
virtual environment:

.. code-block:: shell-session

    $ pip install --upgrade pip

Further reading in the Acc-Py Wiki:

- `Getting started with Acc-Py`__
- `Acc-Py Development advice`__

__ https://wikis.cern.ch/display/ACCPY/Getting+started+with+Acc-Py
__ https://wikis.cern.ch/display/ACCPY/Development+advice

Setting up the Project
----------------------

Time to get started! Go into your projects folder and initialize a project
using Acc-Py:

.. code-block:: shell-session

    $ cd ~/Projects
    $ acc-py init coi-example
    $ cd ./coi-example

.. note::
    Don't forget to hit the tab key while typing the above lines, so that
    your shell will auto-complete the words for you!

The :command:`acc-py init` command creates a basic project structure for
you. You can inspect the results via the tree_ command:

.. _tree: http://mama.indstate.edu/users/ice/tree/

.. code-block:: shell-session

    $ tree
    .
    ├── coi_example
    │   ├── __init__.py
    │   └── tests
    │       ├── __init__.py
    │       └── test_coi_example.py
    ├── README.md
    └── setup.py

This is usually enough to get started. However, there are two useful files
that Acc-Py does not create for us: :file:`.gitignore` and
:file:`pyproject.toml`. If you're not in a hurry, we suggest you create them
now. Otherwise, continue with :ref:`tutorials/packaging:Adding Dependencies`.

Further reading in the Acc-Py wiki:

- `Starting a new Python project`__
- `Project Layout`__
- `Creating a Python package from a directory of scripts`__

__ https://wikis.cern.ch/display/ACCPY/Getting+started+with+Acc-Py#GettingstartedwithAcc-Py-StartinganewPythonproject
__ https://wikis.cern.ch/display/ACCPY/Project+layout
__ https://wikis.cern.ch/display/ACCPY/Creating+a+Python+package+from+a+directory+of+scripts

Adding :file:`.gitignore` (Optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The :file:`.gitignore` file tells Git which files to ignore. Ignored files
never show up as untracked or modified when you run :command:`git status`.
This is ideal for caches, temporary files and build artifacts. Without
:file:`.gitignore`, the output of :command:`git status` would quickly become
completely useless.

While you can create this file yourself, we recommend you download
Python.gitignore_; it is comprehensive and widely used.
.. _Python.gitignore: https://github.com/github/gitignore/blob/master/Python.gitignore

.. warning::
    After downloading the file and putting it inside your project folder,
    don't forget to *rename* it to :file:`.gitignore`!

It is very common to later add project-specific names of temporary files and
`glob patterns`_ to this list. Do not hesitate to edit it! It only serves as
a starting point.

.. _glob patterns: https://en.wikipedia.org/wiki/Glob_(programming)

.. note::
    If you use an IDE like PyCharm_, it is very common that IDE-specific
    config and manifest files end up in your project directory. You *could*
    manually add these files to the :file:`.gitignore` file of every single
    project. However, it's easier in the long run to add these file names to
    the `global gitignore <git-excludelist_>`__ file that is used for your
    entire computer. This way, you don't have to ignore these files again in
    your next project.

.. _PyCharm: https://www.jetbrains.com/pycharm/
.. _git-excludelist: https://git-scm.com/book/en/v2/Customizing-Git-Git-Configuration#_core_excludesfile

Further reading:

- `A collection of useful .gitignore templates`__ on GitHub.com
- `Ignoring Files`__ in the Git Book
- `Gitignore reference`__

__ https://github.com/github/gitignore/
__ https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository#_ignoring
__ https://git-scm.com/docs/git-check-ignore

Adding :file:`pyproject.toml` (Optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Setuptools_ is still the most common tool used to build and install Python
packages. Traditionally, it expects project data (name, version,
dependencies, …) to be declared in a :file:`setup.py` file. Many people
don't like this approach: executing arbitrary Python code is a security
risk, and it's hard to accommodate alternative, more modern build tools such
as Poetry_, Flit_ or Meson_. For this reason, the Python community has been
slowly moving towards a more neutral format.

.. _Setuptools: https://setuptools.readthedocs.io/
.. _Poetry: https://python-poetry.org/docs/pyproject/#poetry-and-pep-517
.. _Flit: https://flit.pypa.io/en/latest/
.. _Meson: https://thiblahute.gitlab.io/mesonpep517/pyproject.html

This format is the :file:`pyproject.toml` file. It allows a project to
declare the build system that it uses, and it can be read without executing
untrusted Python code. In addition, many Python tools (e.g.
`Black <Black-TOML_>`__, `Isort <Isort-TOML_>`__, `Pylint <Pylint-TOML_>`__,
`Pytest <Pytest-TOML_>`__, `Setuptools-SCM <Setuptools-SCM-TOML_>`__) can be
configured in this file. This reduces clutter in your project directory and
makes it possible to do all configuration in a single file format.

.. _Black-TOML: https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#what-on-earth-is-a-pyproject-toml-file
.. _Isort-TOML: https://pycqa.github.io/isort/docs/configuration/config_files.html#pyprojecttoml-preferred-format
.. _Pylint-TOML: https://pylint.pycqa.org/en/latest/user_guide/usage/run.html#command-line-options
.. _Pytest-TOML: https://docs.pytest.org/en/latest/reference/customize.html#pyproject-toml
.. _Setuptools-SCM-TOML: https://github.com/pypa/setuptools_scm#pyprojecttoml-usage

If you're wondering what a TOML_ file is: it is a config file format like
YAML or INI, but with a focus on clarity and simplicity.

.. _TOML: https://toml.io/en/

This is what a minimal :file:`pyproject.toml` file using Setuptools looks
like:

.. code-block:: toml

    # pyproject.toml
    [build-system]
    requires = ['setuptools']
    build-backend = 'setuptools.build_meta'

The section ``build-system`` tells Pip how to install our package. The key
``requires`` gives a list of necessary Python packages. The key
``build-backend`` points at a Python function that Pip calls to handle the
rest. Between all of your Python projects, this section will almost never
change.
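Because TOML is an inert data format, this metadata can be read without
running any build code at all. As a quick illustration (assuming the file
above sits in the current directory), the standard library can parse it
directly:

.. code-block:: python

    import tomllib  # Python 3.11+; on older versions: pip install tomli

    # tomllib requires the file to be opened in binary mode.
    with open("pyproject.toml", "rb") as file:
        config = tomllib.load(file)

    print(config["build-system"]["build-backend"])
    # prints: setuptools.build_meta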
And this is a slightly more complex :file:`pyproject.toml` that also
configures a few tools. Note that without the comments, the file would only
be about 20 lines long:

.. code-block:: toml

    # We can require minimum versions and [extras]!
    [build-system]
    requires = [
        'setuptools >= 64',
        'setuptools-scm[toml] ~= 8.0',
        'wheel',
    ]
    build-backend = 'setuptools.build_meta'

    # Tell isort to be compatible with the Black formatting style.
    # This is necessary if you use both tools.
    [tool.isort]
    profile = 'black'

    # Note that there is no section for Black itself. Normally,
    # we don't need to configure a tool just to use it!
    # Setuptools-SCM, however, is a bit quirky. The *presence*
    # of its config block is required to activate it.
    [tool.setuptools_scm]

    # PyTest takes its options in a nested table
    # called `ini_options`. Here, we tell it to also run
    # doctests, not just unit tests.
    [tool.pytest.ini_options]
    addopts = '--doctest-modules'

    # PyLint splits its configuration across multiple tables.
    # Here, we disable one warning and minimize the size of
    # its report.
    [tool.pylint.reports]
    reports = false
    score = false

    # Note how we quote 'messages control' because it contains
    # a space character.
    [tool.pylint.'messages control']
    disable = ['similarities']

Further reading:

- `What the heck is pyproject.toml?`__
- `PEP 518 introducing pyproject.toml`__
- `Awesome Pyproject.toml`__

__ https://snarky.ca/what-the-heck-is-pyproject-toml/
__ https://www.python.org/dev/peps/pep-0518/
__ https://github.com/carlosperate/awesome-pyproject

Adding Dependencies
-------------------

Once this is done, we can edit the :file:`setup.py` file created for us and
fill in the blanks. This is what the new requirements look like:

.. code-block:: python

    # setup.py

    REQUIREMENTS: dict = {
        "core": [
            "cernml-coi ~= 0.9.0",
            "gymnasium >= 0.29",
            "matplotlib ~= 3.0",
            "numpy ~= 1.0",
            "pyjapc ~= 2.0",
        ],
        "test": [
            "pytest",
        ],
    }

And this is the new ``setup()`` call:

.. code-block:: python

    # setup.py (cont.)

    setup(
        name="coi-example",
        version="0.0.1.dev0",
        author="Your Name",
        author_email="your.name@cern.ch",
        description="An example for how to use the cernml-coi package",
        long_description=LONG_DESCRIPTION,
        long_description_content_type="text/markdown",
        packages=find_packages(),
        python_requires=">=3.9",
        classifiers=[
            "Programming Language :: Python :: 3",
            "Intended Audience :: Science/Research",
            "Natural Language :: English",
            "Operating System :: OS Independent",
            "Programming Language :: Python :: 3 :: Only",
            "Programming Language :: Python :: 3.9",
            "Programming Language :: Python :: 3.10",
            "Programming Language :: Python :: 3.11",
            "Topic :: Scientific/Engineering :: Artificial Intelligence",
            "Topic :: Scientific/Engineering :: Physics",
        ],
        # Rest as before …
    )

Of all these changes, only the *description* and the *requirements* were
really necessary. Things like classifiers are nice-to-have metadata that we
could technically also live without.
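For completeness, here is how such a ``REQUIREMENTS`` dict is typically wired
into the ``setup()`` call. This is a sketch of the convention used by the
generated :file:`setup.py`; the exact template may differ slightly:

.. code-block:: python

    # setup.py (cont.)

    setup(
        # ... metadata as shown above ...
        # Dependencies that are always installed:
        install_requires=REQUIREMENTS["core"],
        # Optional dependency groups, installed on request,
        # e.g. via `pip install '.[test]'`:
        extras_require={
            "test": REQUIREMENTS["test"],
        },
    )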
Further reading:

- `Packaging of your module`__ in the Acc-Py Wiki
- `Setuptools Quickstart`__
- `Dependency management in Setuptools`__
- `Setuptools keywords`__

__ https://wikis.cern.ch/display/ACCPY/Development+Guidelines#DevelopmentGuidelines-Packagingofyourmodule
__ https://setuptools.readthedocs.io/en/latest/userguide/quickstart.html
__ https://setuptools.readthedocs.io/en/latest/userguide/dependency_management.html
__ https://setuptools.readthedocs.io/en/latest/references/keywords.html

Version Requirements (Digression)
---------------------------------

.. note::
    This section is purely informative. If it bores you, feel free to skip
    ahead to :ref:`tutorials/packaging:Test run`.

When specifying your requirements, you should make sure to put in a
*reasonable* version range, for two simple reasons:

- Being **too lax** with your requirements means that a package that you use
  might change something and your code suddenly breaks without warning.
- Being **too strict** with your requirements means that other people will
  have a hard time making your package work in conjunction with theirs, even
  though all the code is correct.

There are two common ways to specify version ranges:

- ``~= 0.4.2`` means: “I am compatible with version :samp:`0.4.2` and
  higher, but **not** with any version :samp:`0.5.{X}`.” This is a good
  choice if the target adheres to `Semantic Versioning`_. (Not all packages
  do! NumPy doesn't, for example!)
- ``>=1.23, <1.49`` means: “I am compatible with version ``1.23`` and
  higher, but not with version ``1.49`` and beyond.” This is a reasonable
  choice if you know a version of the target that works for you and a
  version that doesn't.

.. _Semantic Versioning: https://semver.org/

:pep:`Other version specifiers <440#version-specifiers>` mainly exist for
strange edge cases. Only use them if you know what you're doing.
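If you're ever unsure which versions a given specifier accepts, you can check
it programmatically with the packaging_ library (``pip install packaging``).
A small sketch:

.. _packaging: https://packaging.pypa.io/

.. code-block:: python

    from packaging.specifiers import SpecifierSet

    # "Compatible release": >= 0.4.2, but still 0.4.x.
    compatible = SpecifierSet("~= 0.4.2")
    print("0.4.9" in compatible)  # True
    print("0.5.0" in compatible)  # False

    # An explicit version window.
    window = SpecifierSet(">= 1.23, < 1.49")
    print("1.24" in window)  # True
    print("1.49" in window)  # False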
Further reading:

- `Dependency and release management`__ in the Acc-Py Wiki

__ https://wikis.cern.ch/display/ACCPY/Dependency+and+release+management

Test Run
--------

With this minimum in place, your package can already be installed via Pip!
Give it a try:

.. code-block:: shell-session

    $ pip install .  # "." means "the current directory".

Once this is done, your package is installed in your venv and can be
imported by other packages *without* any path hackery:

.. code-block:: python

    >>> import coi_example
    >>> coi_example.__version__
    '0.0.1.dev0'
    >>> import pkg_resources
    >>> pkg_resources.get_distribution('coi-example')
    coi-example 0.0.1.dev0 (/opt/home/.../venvs/coi-example/lib/python3.9/site-packages)

Of course, you can always remove your package again:

.. code-block:: shell-session

    $ pip uninstall coi-example

.. warning::
    Installation puts a **copy** of your package into your venv. This means
    that every time you change the code, you have to reinstall it for the
    changes to become visible.

There is also the option to symlink from your venv to your source directory.
In this case, all changes to the source code become visible *immediately*.
This is bad for a production release, but extremely useful during
development. This feature is called an *editable install*:

.. code-block:: shell-session

    $ pip install --editable .  # or `-e .` for short

Further reading:

- `When would the -e, --editable option be useful with pip install?`__

__ https://stackoverflow.com/questions/35064426

SDists and Wheels (Digression)
------------------------------

.. note::
    This section is purely informative. If it bores you, feel free to skip
    ahead to :ref:`tutorials/packaging:Continuous Integration`.

The act of bringing Python code into a publishable format has a lot of
historical baggage. This section skips most of the history and explains the
terms that are most relevant today.

Python is an interpreted language. As such, one *could* think that there is
no compilation step, and that the source code of a program is enough in
order to run it. However, this assumption is wrong for a number of reasons:

- :doc:`some libraries ` contain extension code written in C or FORTRAN that
  must be compiled before it can be used;
- `some libraries <PyTZ_>`__ generate their own Python code during
  installation;
- *all* libraries must provide :pep:`their metadata <345>` in a certain,
  standardized format.

.. _PyTZ: https://launchpad.net/pytz

As such, even Python packages must be built to some extent before
publication. The publishable result of the build process is a
:term:`pkg:distribution package` (confusingly often called *distribution* or
*package* for short). There are several historical kinds of distribution
packages, but only two remain relevant today: sdists and wheels.

:term:`Sdists ` contain only the above-mentioned metadata and all relevant
source files. They do not contain project files that are not packaged by the
author (e.g. :file:`.gitignore` or :file:`pyproject.toml`). Because an sdist
contains source code, any C extensions must be compiled during installation.
For this reason, installation is a bit slower and may run arbitrary code.

:term:`Wheels ` are a binary distribution format. Under the hood, they are
zip files with a certain directory layout and file name. They come fully
built and any C extensions are already compiled. This makes them faster and
safer to install than sdists. The disadvantage is that *if* your project
contains C extensions, you have to provide one wheel for each supported
platform.
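Since a wheel is just a zip file, you can look inside one with nothing but
the standard library. A small sketch – the file name below is hypothetical
and depends on your package and version:

.. code-block:: python

    from zipfile import ZipFile

    # E.g. obtained via `pip download coi-example --no-deps`.
    with ZipFile("coi_example-0.0.1.dev0-py3-none-any.whl") as wheel:
        for name in wheel.namelist():
            print(name)
    # Prints something like:
    # coi_example/__init__.py
    # coi_example-0.0.1.dev0.dist-info/METADATA
    # ...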
Given that most projects will be written purely in Python, wheels are the
preferred distribution format. Depending on circumstances, it may make sense
to publish an sdist in addition.

The way to manually create and upload a distribution to the package
repository is `described elsewhere <Acc-Py package upload_>`__. See
:ref:`tutorials/packaging:Releasing a Package via CI` for the preferred and
supported method at CERN.

.. _Acc-Py package upload: https://wikis.cern.ch/display/ACCPY/Development+Guidelines#DevelopmentGuidelines-CreationandUploadofyourpackage

Further reading:

- `What are Python wheels and why should you care?`__
- `Building wheels for Python packages`__ on the Acc-Py Wiki
- :doc:`Python packaging user guides `
- `Twisted history of Python packaging`__ (2012)

__ https://realpython.com/python-wheels/
__ https://wikis.cern.ch/display/ACCPY/Building+wheels+for+Python+packages
__ https://www.youtube.com/watch?v=lpBaZKSODFA

Continuous Integration
----------------------

`Continuous integration`_ is a strategy that prefers to merge features into
the main development branch frequently and early. This ensures that
different branches never diverge too much from each other. To facilitate
this, websites like Gitlab offer `CI pipelines`_ that build and test code on
each push *automatically*.

.. _Continuous integration: https://en.wikipedia.org/wiki/Continuous_integration
.. _CI pipelines: https://gitlab.cern.ch/help/ci/quick_start/index.md

`Continuous delivery`_ takes this a step further and also automates the
release of software. When people talk about “CI/CD”, they usually refer to
having an automated pipeline of tests and releases.

.. _Continuous delivery: https://en.wikipedia.org/wiki/Continuous_delivery

Why do we care about all of this? Because Gitlab's CI/CD pipeline is the
*only* supported way to put our Python package on the `Acc-Py package
index`_.

You configure the pipeline with a file called :file:`.gitlab-ci.yml` at the
root of your project. Run the command :command:`acc-py init-ci` to have a
template of this file generated in your project directory. It should look
somewhat like this:

.. code-block:: yaml

    # Use the acc-py CI templates documented at
    # https://acc-py.web.cern.ch/gitlab-mono/acc-co/devops/python/acc-py-gitlab-ci-templates/docs/templates/master/
    include:
      - project: acc-co/devops/python/acc-py-gitlab-ci-templates
        file: v2/python.gitlab-ci.yml

    variables:
      project_name: coi_example
      # The PY_VERSION and ACC_PY_BASE_IMAGE_TAG variables control the
      # default Python and Acc-Py versions used by Acc-Py jobs. It is
      # recommended to keep the two values consistent. More details:
      # https://acc-py.web.cern.ch/gitlab-mono/acc-co/devops/python/acc-py-gitlab-ci-templates/docs/templates/master/generated/v2.html#global-variables.
      PY_VERSION: '3.9'
      ACC_PY_BASE_IMAGE_TAG: '2021.12'

    # Build a source distribution for foo.
    build_sdist:
      extends: .acc_py_build_sdist

    # Build a wheel for foo.
    build_wheel:
      extends: .acc_py_build_wheel

    # A development installation of foo tested with pytest.
    test_dev:
      extends: .acc_py_dev_test

    # A full installation of foo (as a wheel) tested with pytest on an
    # Acc-Py image.
    test_wheel:
      extends: .acc_py_wheel_test

    # Release the source distribution and the wheel to the acc-py
    # package index, only on git tag.
    publish:
      extends: .acc_py_publish

Let's see what these pieces do.

``include``
    The first block makes a number of `Acc-Py CI templates`_ available to
    you. These templates are a pre-bundled set of configurations that will
    make it easier to define our pipeline in a moment. You can distinguish
    job templates from regular jobs because their names `start with a period
    <hidden jobs_>`__ (``.``).

    .. _Acc-Py CI templates: https://acc-py.web.cern.ch/gitlab-mono/acc-co/devops/python/acc-py-gitlab-ci-templates/docs/templates/master/
    .. _hidden jobs: https://gitlab.cern.ch/help/ci/jobs/index.md#hide-jobs

``variables``
    The next block defines a set of variables that we can use in our job
    definitions with the syntax :samp:`${variable-name}`. The variables
    defined here are not special on their own, but the `Acc-Py CI
    templates`_ happen to use them to fill in some blanks, such as which
    Python version you want to use.

``build_sdist``
    This is our first **job definition**. The name has no special meaning;
    in principle, you can name your jobs whatever you want (though you
    should obviously pick something descriptive).

    Each job has a **trigger**, i.e. the conditions under which it runs.
    Examples are: on every push to the server, on every pushed Git tag, on
    every push to the ``master`` branch, or only when triggered manually.

    Each job also has a **stage** that determines at which point in the
    pipeline it will run. Though you can define and order stages as you
    like, the default is: build → test → deploy. Whenever a trigger fires,
    all relevant jobs are collected into a pipeline and run, one stage after
    the other.

    In our case, each job contains only one line; it tells us that our job
    **extends** a template. This means that it takes over all properties
    from that template. If you define any further keys for this job, they
    will generally override the same keys of the template.
See this `CI job code example`_ for what these templates look like. It gives
you an idea of the keys you can and might want to override. Note that a job
can extend multiple other jobs; the `merge details`_ for how this works are
documented on Gitlab.

.. _CI job code example: https://gitlab.cern.ch/acc-co/devops/python/acc-py-gitlab-ci-templates/-/blob/d515d27c/v2/python.gitlab-ci.yml#L156-177
.. _merge details: https://gitlab.cern.ch/help/ci/yaml/yaml_optimization.md#merge-details

Further reading:

- `Get started with GitLab CI/CD`__
- `Keyword reference for the .gitlab-ci.yml file`__

__ https://gitlab.cern.ch/help/ci/quick_start/index.md
__ https://gitlab.cern.ch/help/ci/yaml/index.md

Testing Your Package
--------------------

As you might have noticed, the :command:`acc-py init` call created a
sub-package of your package called “tests”. This package is meant for *unit
tests*: small functions that you write to ensure that the data
transformation logic you wrote does what you think it does.

Acc-Py initializes your :file:`.gitlab-ci.yml` file with two jobs for
testing:

- a `dev test`_ that runs the tests directly in your source directory,
- a `wheel test`_ that installs your package and runs the tests in the
  installed copy. This is particularly important, as it ensures that your
  package will work not just for you, but also for your users.

.. _dev test: https://acc-py.web.cern.ch/gitlab-mono/acc-co/devops/python/acc-py-gitlab-ci-templates/docs/templates/master/generated/v2.html#acc-py-dev-test
.. _wheel test: https://acc-py.web.cern.ch/gitlab-mono/acc-co/devops/python/acc-py-gitlab-ci-templates/docs/templates/master/generated/v2.html#acc-py-wheel-test

Both use the same program, PyTest_, to discover and run your unit tests. The
way PyTest does this is simple: it searches for files that match the pattern
:file:`test_*.py` and, inside them, searches for functions whose names match
``test_*``. All functions that it finds are run without arguments. As long
as they don't raise an exception, PyTest assumes they succeeded.
:ref:`std:assert` should be used liberally in your unit tests to verify your
assumptions.

.. _PyTest: https://pytest.org/

If you have any non-trivial logic in your code – anything beyond getting and
setting parameters – we *strongly* recommend putting it into separate
functions. These functions should depend only on their parameters, not on
global state. This way, it becomes *much* easier to write unit tests that
ensure they work as expected – and, most importantly, that future changes
you make won't silently break them! (See the sketch at the end of this
section.)

If you're writing a COI optimization problem that does not depend on JAPC or
LSA, there is one easy test case you can always add: run the COI checker on
your class to catch some common pitfalls:

.. code-block:: python

    # coi_example/tests/test_coi_example.py

    from cernml import coi

    def test_checker():
        env = coi.make("YourEnv-v0")
        coi.check(env, warn=True, headless=True)

If your program is in a very strange niche where it is impossible to test it
reliably, you can also remove the testing code: remove the “tests” package,
and delete the two test jobs from your :file:`.gitlab-ci.yml` file.
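To make the earlier advice about separate, state-free functions concrete,
here is a sketch of such a function and its unit tests. The function
``rescale()`` is hypothetical; the point is that pure data-transformation
logic is trivial to test:

.. code-block:: python

    # coi_example/tests/test_rescale.py
    # (In a real project, `rescale()` would live in the package
    # itself and merely be imported here.)

    import numpy as np

    def rescale(values, low, high):
        """Linearly map `values` from [low, high] onto [-1, +1]."""
        return 2.0 * (np.asarray(values) - low) / (high - low) - 1.0

    def test_rescale_endpoints():
        assert rescale(3.0, low=3.0, high=5.0) == -1.0
        assert rescale(5.0, low=3.0, high=5.0) == +1.0

    def test_rescale_midpoint():
        assert rescale(4.0, low=3.0, high=5.0) == 0.0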
Further reading:

- :mod:`std:unittest.mock` standard library module
- :mod:`std:doctest` standard library module
- `Tests as part of application code`__ in the PyTest documentation
- `GUI testing`__ on the Acc-Py Wiki
- `PAPC – a pure Python PyJapc offline simulator`__ on the Acc-Py Wiki
- `Example CI setup to test projects that rely on Java`__

__ https://docs.pytest.org/en/latest/explanation/goodpractices.html#tests-as-part-of-application-code
__ https://wikis.cern.ch/display/ACCPY/GUI+Testing
__ https://wikis.cern.ch/display/ACCPY/papc+-+a+pure+Python+PyJapc+offline+simulator
__ https://gitlab.cern.ch/scripting-tools/pyjapc/-/blob/master/.gitlab-ci.yml

Releasing a Package via CI
--------------------------

Once CI has been set up and tests have been written (or disabled), your
package is ready for publication! Outside of CERN, Twine_ is the tool of
choice to upload a package to PyPI_, but Acc-Py already does this job for
us.

.. _Twine: https://twine.readthedocs.io/en/latest/
.. _PyPI: https://pypi.org/

.. warning::
    Publishing a package is **permanent**! Once your code has been uploaded
    to the index, you *cannot* remove it again. And once a project name has
    been claimed, it usually cannot be transferred to another project. Be
    doubly and triply sure that everything is correct before following the
    next steps!

If your project is not in a Git repository yet, this is the time to check it
in:

.. code-block:: shell-session

    $ git init
    $ git add --all
    $ git commit --message="Initial commit."
    $ git remote add origin ...  # The clone URL of your Gitlab repo
    $ git push --set-upstream origin master

Then, all that is necessary to publish the next (or first) version of your
package is to create a `Git tag`_ and upload it to Gitlab.

.. _Git tag: https://git-scm.com/book/en/v2/Git-Basics-Tagging

.. code-block:: shell-session

    $ # The tag name doesn't actually matter,
    $ # but let's stay consistent.
    $ git tag v0.0.1.dev0
    $ git push --tags

This will trigger a CI pipeline that builds, tests and eventually `releases
<upload on tag_>`__ your code. Once this pipeline has finished successfully
(which includes running your tests), your package is published and
immediately available anywhere inside CERN:

.. _upload on tag: https://gitlab.cern.ch/acc-co/devops/python/acc-py-devtools/-/blob/master/acc_py_devtools/templates/gitlab-ci/upload_on_tag.yml

.. code-block:: shell-session

    $ cd ~
    $ pip install coi-example

.. warning::
    The **version of your package** is determined by :file:`setup.py`, *not*
    by the **tag name** you choose! If you tag another commit but don't
    update the version number, and you push this tag, your pipeline will
    kick off, run through to the deploy stage and then fail due to the
    version conflict.

Further reading:

- `Python package index <Acc-Py Package Index_>`__ on the Acc-Py Wiki

Extra Credit
------------

You are done! The following sections give only a little bit more background
information on Python packaging, but they are not necessary for you to get
off the ground. Especially if you're a beginner, feel free to stop here and
maybe return later.

Getting Rid of :file:`setup.py`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While it is the standard that Acc-Py generates for us, there are several
problems with putting all your project metadata into :file:`setup.py`:

- No tools other than Setuptools can read the format.
- It's impossible to extract metadata without executing arbitrary, possibly
  untrusted Python code.
- The logic before the ``setup()`` call quickly becomes hard to read.
- Most projects don't need the full flexibility of arbitrary Python to
  declare their metadata.

For this reason, Setuptools recommends putting all your metadata into
:file:`pyproject.toml`, like you already do for most other Python tools. The
most important programming patterns you know from :file:`setup.py` can
easily be replicated there using dedicated keys or values.

Take, for example, this setup script:

.. code-block:: python

    # setup.py

    from pathlib import Path

    from setuptools import find_packages, setup

    # Find the source code of our package.
    PROJECT_ROOT = Path(__file__).parent.absolute()
    PKG_DIR = PROJECT_ROOT / "my_package"

    # Find the version string without actually executing our package.
    with open(PKG_DIR / "__init__.py", encoding="utf-8") as infile:
        for line in infile:
            name, equals, version = line.partition("=")
            name = name.strip()
            version = version.strip()
            if name == "VERSION" and version[0] == version[-1] == '"':
                version = version[1:-1]
                break
        else:
            raise ValueError("no version number found")

    # Read our long description out of the README file.
    with open(PROJECT_ROOT / "README.rst", encoding="utf-8") as infile:
        readme = infile.read()

    setup(
        name="my_package",
        version=version,
        author="My Name",
        author_email="my.name@cern.ch",
        long_description=readme,
        packages=find_packages(),
        install_requires=[
            "requests",
            'importlib_metadata; python_version < "3.8"',
        ],
        extras_require={
            "pdf": ["ReportLab>=1.2", "RXP"],
            "rest": ["docutils>=0.3", "pack == 1.1, == 1.3"],
        },
    )

It does the same as this configuration file:

.. code-block:: toml

    # pyproject.toml

    [build-system]
    requires = ['setuptools']
    build-backend = 'setuptools.build_meta'
    # ^^^ same as before ^^^

    [project]
    name = 'my_package'
    readme = { file = 'README.rst', content-type = 'text/x-rst' }
    dynamic = ['version']
    authors = [
        { name = 'My Name', email = 'my.name@cern.ch' },
        # More than one author is supported now!
    ]
    dependencies = [
        'requests',
        'importlib_metadata; python_version < "3.8"',  # String inside string!
    ]

    [project.optional-dependencies]
    pdf = ['ReportLab>=1.2', 'RXP']
    rest = ['docutils>=0.3', 'pack ==1.1, ==1.3']

    [tool.setuptools.dynamic]
    version = { attr = 'my_package.VERSION' }

    # [tool.setuptools.packages.find]
    # ^^^ Not needed, Setuptools does the right thing automatically!

With Setuptools version 40.9 or higher (released in 2019), you can
completely remove the :file:`setup.py` file after this change. With older
versions, you would still need this stub file:

.. code-block:: python

    # setup.py
    from setuptools import setup

    setup()

Further reading:

- :doc:`Setuptools quickstart `
- :doc:`setuptools:userguide/pyproject_config`
- `Why you shouldn't invoke setup.py directly`__

__ https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html

Single-Sourcing Your Version Number
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Over time, it becomes annoying to increase your version number every time
you release a new version of your package. On top of that, Acc-Py
:ref:`requires us to use Git tags to publish our package
<tutorials/packaging:Releasing a Package via CI>`, but doesn't actually use
the name of the tag at all. It would be nice if we could just make the tag
name our version number and read that into our project metadata.

`Setuptools-SCM`_ is a plugin for Setuptools that does precisely that. It
generates your version number automatically based on your Git tags and feeds
it directly into Setuptools. The minimal setup looks as follows:

.. _Setuptools-SCM: https://github.com/pypa/setuptools_scm
.. code-block:: toml

    # pyproject.toml

    [build-system]
    requires = [
        'setuptools>=45',
        'setuptools_scm[toml]>=6.2',
    ]
    build-backend = 'setuptools.build_meta'

    # Warn Setuptools that the version key is
    # generated dynamically.
    [project]
    dynamic = ['version']

    # This section is ALWAYS necessary, even
    # if it's empty.
    [tool.setuptools_scm]

You can also add a key ``write_to`` to the configuration section in
:file:`pyproject.toml` to automatically generate – *during installation!* –
a source file in your package that contains the version number:

.. code-block:: toml

    # pyproject.toml
    [tool.setuptools_scm]
    write_to = 'my_package/version.py'

.. code-block:: python

    # my_package/__init__.py
    from .version import version as __version__

    ...

.. warning::
    Don't do this! Adding a ``__version__`` variable to your package is
    :pep:`deprecated <396#pep-rejection>`. If you need to gather a package's
    version programmatically, do this:

    .. code-block:: python

        # Use the backport on older Python versions.
        try:
            from importlib import metadata
        except ImportError:
            import importlib_metadata as metadata

        version = metadata.version("name-you-gave-to-pip-install")

    This is provided by the :mod:`std:importlib.metadata` standard library
    package (Python 3.8+) or its :doc:`backport ` (Python 3.6+).

Here are some very clever solutions that people come up with every now and
then, all of which are broken for one reason or another:

Passing :samp:`{my_package}.__version__` to ``setup()`` in :file:`setup.py`
    This requires you to import your own package while you're trying to
    install it. As soon as you try to import one of your dependencies, this
    will break, because Pip hasn't had *a chance* to install your
    dependencies yet.

Specifying :samp:`version = attr: {my_package}.__version__` in :file:`setup.cfg`
    On Setuptools before version 46.4, this does the same as the first
    option: it unconditionally attempts to import the package before it is
    installed. Thus, it also has the same problems. (If you don't know what
    :file:`setup.cfg` is, don't worry about it; it was an intermediate
    format before :file:`pyproject.toml` became popular.)

As above, *but* requiring ``setuptools>=46.4`` in :file:`pyproject.toml`
    New versions of Setuptools textually analyze your code and try to find
    ``__version__`` without executing any of it. If this fails, however,
    Setuptools still falls back to importing your package – and breaks
    again.

Specifying :samp:`attr = '{my_package}.__version__'` in :file:`pyproject.toml`
    This is in fact identical to the previous approach.
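If you want to preview the version number that Setuptools-SCM would derive
from your current Git checkout, you can ask it directly. A minimal sketch,
assuming Setuptools-SCM is installed in your venv and you run it from the
repository root; the exact output depends on your tags and commits:

.. code-block:: python

    >>> from setuptools_scm import get_version
    >>> get_version()
    '0.0.2.dev1+g1337abc'  # hypothetical output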
Further reading:

- :doc:`pkg:guides/single-sourcing-package-version` in the Python Packaging
  User Guide
- `Zest.releaser `_

Automatic Code Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^

Although a lot of programmers have needlessly strong opinions on it,
consistent code formatting has two undeniable advantages:

1. it makes it easier to spot typos and related bugs;
2. it makes it easier for other people to read your code.

At the same time, it requires a lot of pointless effort to pick, follow and
enforce a particular style guide. Ideally, code formatting would be
consistent, automatic and require as little human input as possible.
Luckily, :doc:`Black ` does all of this:

- It is :doc:`automatic `. You write your code as messily as you want. You
  simply run ``black .`` at the end and it adjusts your files in-place to be
  formatted completely uniformly.
- :doc:`Editor support <black:integrations/editors>` for it is almost
  universal. No matter which IDE you use, you can configure it such that
  Black runs every time you save your file or make a Git commit. This way,
  you can stop thinking about formatting entirely.
- :doc:`black:the_black_code_style/current_style` has little
  configurability. This obviates pointless style discussions as they are
  known in the C++ world and allows people to focus on the discussions that
  matter.

On top of that, you may also want to run ISort_ so that your import
statements are always grouped correctly and cleaned up. Like Black, it is
supported by `a large number of editors <ISort Plugins_>`__. To make it
compatible with Black, add these lines to your configuration:

.. code-block:: toml

    # pyproject.toml
    [tool.isort]
    profile = "black"

.. _ISort: https://pycqa.github.io/isort/
.. _ISort Plugins: https://github.com/pycqa/isort/wiki/isort-Plugins

Linting
^^^^^^^

With Python being the dynamically typed scripting language that it is, it is
much easier than in statically typed languages to put accidental bugs into
your code. Just a small typo, and you can spend half an hour wondering why a
variable doesn't get updated.

Static analysis tools that scan your code for bugs and anti-patterns are
often called *linters*, as they work like the lint trap in a clothes dryer.
For Python beginners, the most comprehensive choice is Pylint_. It's a
general-purpose linter that catches, among other things:

- style issues (line too long),
- excessive complexity (too many lines per function),
- suspicious patterns (unused variables),
- outright bugs (undefined variables).

.. _Pylint: http://pylint.pycqa.org/

In contrast to :ref:`Black <tutorials/packaging:Automatic Code Formatting>`,
Pylint is *extremely* configurable and encourages users to enable or disable
lints as necessary. Here is an example configuration:

.. code-block:: toml

    # pyproject.toml

    [tool.pylint.format]
    # Compatibility with Black.
    max-line-length = 88
    # Lines with URLs shouldn't be marked as too long.
    ignore-long-lines = '<?https?://\S+>?$'

    [tool.pylint.reports]
    # Don't show a summary, just print the errors.
    reports = false
    score = false

    # TOML quirk: because of the space in "messages control",
    # we need quotes here.
    [tool.pylint.'messages control']
    # Every Pylint warning has a name that you can put in this
    # list to turn it off for the entire package.
    disable = [
        'duplicate-code',
        'unbalanced-tuple-unpacking',
    ]

Sometimes, Pylint gives you a warning that you find *generally* useful, but
*just this time*, you think it shouldn't apply and the code is actually
correct. In this case, you can add a comment like this to suppress the
warning:

.. code-block:: python

    # pylint: disable = unused-import

These comments respect scoping. If you put one inside a function, it applies
only to that function. If you put one at the end of a line, it applies only
to that line.

You can prevent bugs from silently sneaking into your code by running Pylint
in your :ref:`CI/CD pipeline <tutorials/packaging:Continuous Integration>`
every time you push code to Gitlab:

.. code-block:: yaml

    # .gitlab-ci.yml
    test_lint:
      extends: .acc_py_base
      stage: test
      before_script:
        - python -m pip install pylint black isort
        - python -m pip install -e .
      script:
        # Run each linter, but don't abort on error. Only abort
        # at the end if any linter failed. This way, you get all
        # warnings at once.
        - pylint ${project_name} || pylint_exit=$?
        - black --check . || black_exit=$?
        - isort --check . || isort_exit=$?
        - if [[ pylint_exit+black_exit+isort_exit -gt 0 ]]; then false; fi

If you write Python code that is used by other people, you might also want
to add :pep:`type annotations <483>` and use a type checker like MyPy_ or
PyRight_.

.. _MyPy: https://mypy.readthedocs.io/en/latest/getting_started.html
.. _PyRight: https://github.com/microsoft/pyright/blob/master/docs/getting-started.md
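As a small illustration of what a type checker buys you – the function below
is hypothetical – MyPy flags the bad call without running the code at all,
whereas plain Python would only crash at runtime:

.. code-block:: python

    def mean(values: list[float]) -> float:
        """Average of a list of numbers."""
        return sum(values) / len(values)

    # MyPy rejects this call: a str is not a list of floats.
    # At runtime, it would only fail once sum() chokes on the
    # string characters.
    mean("12345")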
Further reading:

- `Python static code analysis tools`__

__ https://pawamoy.github.io/posts/python-static-code-analysis-tools/