Friday, May 26, 2023

Packaging code for development and deployment


In any programming environment, the application or library you are building will have dependencies. Different libraries and components are used to fulfill the responsibilities of the application.
This article is a comparison of how different languages and environments package their dependencies so the software can run on the target machine, or in the environment it's intended to be hosted in.

When developing an application, it can be an easy win to include a library through your IDE or to install it globally on your machine. This pattern all but guarantees that the app will only work as expected on your machine, rather than being portable enough to work consistently in QA, staging and production. Save yourself the hassle and set up package management so that you have a good foundation that will work in different environments.

Goal: portable, consistent builds and deployments

The way to do this is to make your workflow deterministic. This means that an operation has the same result when run many times and in many different places. For package management in your build, this means ensuring that the versions of the dependent libraries are the same everywhere. You need to lock them down so you can guarantee consistency.
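As a rough illustration of what "locked down" means for a pip-style requirements file, here is a minimal sketch that flags any line not pinned to one exact version. The function name find_unpinned and the pattern are my own assumptions, not part of any tool, and the check is deliberately simplistic (it does not understand environment markers or hashes).

```python
import re

# A requirement counts as pinned only when it uses an exact "==" version
# with no wildcard, e.g. "requests==2.31.0". This pattern is a simplification.
PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+==[^*\s]+$")


def find_unpinned(requirement_lines):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned
```

Running something like this in continuous integration is one way to catch a floating version before it causes a build that differs from yours.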

So, for any language and environment, the goals are similar:
  1. Make the build deterministic.
  2. Keep the development environment setup straightforward.
  3. Align the versions of the dependent packages.
  4. Enable portability and consistency between environments.
    1. Your IDE can help, but don't stop there. Make sure it works in a different environment before calling it done. Continuous integration tools are great for this.

So much of this depends on the environment. Is this an interpreted language that runs on a virtual machine, or one compiled to object code that is specific to the machine? With these concepts in mind, let us use some different programming environments to see it all in action. We will compare Python, Java, Node and C++.

Different environments don't need to be that different

Local machine, dev, production.

For portability and maintainability, the code from a single developer's machine must work consistently in a development environment and integrate with the code from other team members. The easiest path to this collaboration is to take a pragmatic approach to code.

Maintainable code with tooling

To keep velocity up and reduce changes in the review cycle, the code from the developer's machine should align with the common standards of the product. This can be a manual effort, but it can be automated with source control and integration actions. For this example, we will use Python and GitHub (or GitLab) actions to run the checks.

If the quality checks happen in the integration environment, how can a developer ensure this process happens smoothly? By running the same quality checks locally.

The easiest way to do this is to run ‘pre-commit’ actions locally and run those same checks when merging to the main branch.

pre-commit - run black, flake8, mypy, bandit and vulture, and add coverage to the pytest run

Run locally first
- pre-commit run --all-files
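If you want one entry point that both your machine and the integration environment can call, a small runner script is one option. This is only a sketch: the tool names come from the list above, but the exact arguments in CHECKS are assumptions and should match your own configuration, and run_checks is a hypothetical helper.

```python
import subprocess

# Hypothetical check list: the tools are the ones named in the text, but the
# arguments are assumptions and should mirror your project's configuration.
CHECKS = [
    ["black", "--check", "."],
    ["flake8", "."],
    ["mypy", "."],
    ["bandit", "-r", "."],
    ["vulture", "."],
    ["pytest", "--cov"],  # needs the pytest-cov plugin
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run every check and return the names of the ones that failed."""
    failed = []
    for command in checks:
        try:
            ok = runner(command).returncode == 0
        except FileNotFoundError:  # the tool is not installed here
            ok = False
        if not ok:
            failed.append(command[0])
    return failed
```

Calling run_checks() locally and again in the merge pipeline gives the same verdict in both places, which is the whole point.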

Python

With Python, dependent libraries are installed by a program called pip, and the requirements.txt file lists the packages that the application requires. You also have a global environment, and these can work together, but you have to keep tabs on what is deployed in which environment. This gets dicey quickly.
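One way to keep tabs on an environment is to compare what is actually installed against the versions you expect. The sketch below uses the standard library's importlib.metadata for the lookup; find_mismatches and the shape of the pins mapping are my own assumptions for illustration.

```python
from importlib import metadata


def find_mismatches(pins, get_version=metadata.version):
    """Compare pinned versions with what is installed in this environment.

    `pins` maps a package name to the exact version expected.
    Returns {name: (expected, installed)}; installed is None when missing.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches
```

An empty result means the environment matches the pins you passed in; anything else tells you which package drifted and how.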

Packaging for portability. It's a good idea to experience the non-deterministic nature of building projects with one requirements file with wildcard versions. You learn there is a better way, and with Pipenv and Poetry the locking of dependency versions and packaging is now largely taken care of. There are benefits and drawbacks to both, so it is a good plan to try them both and see.
  • gold: Poetry and pyenv
  • silver: Pipenv and pyenv
  • bronze: requirements.txt with wildcard

Poetry

Poetry and Pipenv are essentially the same at the functionality level, but Poetry is faster and better maintained.
Poetry uses the pyproject.toml file, so your packages and other tool settings are all in one file. Clean!

When using Poetry, it's a good idea to stick with the install and update commands. Using add can get the cache into a weird state. Poetry keeps cached results for application code and PyPI dependencies, and you may need to clear these periodically and reset the lock file.

Pipenv

Pipenv combines the virtualenv and pip tools. It organizes your packages and variables in one single environment, and also takes a snapshot of all this in one file: Pipfile.lock. When this file is present with all the settings contained, you are able to deploy to another environment using Pipenv. This is a great piece of software, and comes from the same author as Requests. He knows his stuff! (https://www.kennethreitz.org/projects)
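Pipfile.lock is plain JSON, so it is easy to inspect what was actually locked. Here is a minimal sketch, assuming the usual layout with "default" and "develop" sections whose entries carry a "version" string such as "==2.31.0"; locked_versions is a hypothetical helper, not a Pipenv API.

```python
import json


def locked_versions(lock_text):
    """Return {package: version} from the text of a Pipfile.lock file."""
    lock = json.loads(lock_text)
    versions = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            # Entries installed from a VCS or a local path may have no "version" key.
            versions[name] = entry.get("version", "unversioned").lstrip("=")
    return versions
```

Comparing this output between two environments is a quick sanity check that they were built from the same snapshot.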

pyenv

pyenv allows you to set a particular Python version globally on your system, or for a specific local directory. This is handy when working on different projects that are based on different versions.
Key takeaway: do not set and use a global Python version. Let the OS manage its own Python and let pyenv manage the use of different versions.
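pyenv records a directory's version in a .python-version file, so a project can verify at startup that it is running on the interpreter it expects. The sketch below assumes the file holds a plain numeric CPython version such as "3.11.4" or "3.11"; matches_pinned_version is a hypothetical helper of my own.

```python
import sys


def matches_pinned_version(pinned, running=None):
    """Check the running interpreter against a pinned version string.

    `pinned` is the text of a .python-version file, such as "3.11.4" or "3.11".
    Only the components present in `pinned` are compared.
    """
    if running is None:
        running = tuple(sys.version_info[:3])
    wanted = tuple(int(part) for part in pinned.strip().split("."))
    return tuple(running[: len(wanted)]) == wanted
```

Comparing only the components that are present lets a project pin a minor version without caring about the patch level.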

virtualenv

To isolate the packages just used by your application and to be sure you are just using a specific set of packages, a virtual environment is a great tool. This basically makes a sandbox on your machine and only uses the packages contained therein.
With Pipenv and Poetry, this is taken care of. It's key to isolate any changes to the local scope.
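If you want a script to refuse to run outside its sandbox, you can check for one. A minimal sketch, assuming environments created by the standard venv module or a recent virtualenv, where sys.prefix differs from sys.base_prefix; the function name is my own.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a virtual environment, sys.prefix points at the environment
    # directory while sys.base_prefix still points at the base installation.
    return prefix != base_prefix
```

A guard like this at the top of an install or build script helps prevent accidental changes to the global environment.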

environment variables

So, now that the packages are contained, the runtime behaviour of the application will need to use environment variables. This introduces the same problem as the dependent packages.

So, with these moving parts it is pretty easy to see how things can get out of hand. It would be great to have a tool that manages both packages and environment variables so you have a consistent experience. Good practice will get you there. Make sure any secrets are not in the code, but in the environment.
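One piece of that good practice is to read every setting from the environment in a single place and fail fast when something is missing. A minimal sketch follows; the variable names DATABASE_URL and API_TOKEN are hypothetical and only there for illustration.

```python
import os

# Hypothetical variable names, used only for illustration.
REQUIRED = ("DATABASE_URL", "API_TOKEN")


def load_settings(environ=None, required=REQUIRED):
    """Read required settings from the environment, failing fast if any are missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}
```

Because the secrets only ever live in the environment, the same code runs unchanged on a laptop, in staging and in production.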

Java

In Java the environment is the Java virtual machine, so settings specific to the hardware aren't really something you need to care about. Packaging dependencies and their versions are, and there are two tools in regular use for package management: Maven and Gradle.

Building/compiling

Since you are compiling to bytecode, the results are class files, and those are zipped up into a jar file (or not) so the virtual machine can run the portable bytecode on the target machine.
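To make the "zipped up" part concrete: a jar is a zip archive, so any zip library can look inside one. The sketch below stays in Python like the other examples in this post and builds a tiny stand-in archive in memory; the path com/example/App.class and its placeholder contents are made up, not real bytecode.

```python
import io
import zipfile


def class_entries(jar_bytes):
    """List the .class files stored in a jar, given the jar's raw bytes."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(name for name in jar.namelist() if name.endswith(".class"))


def build_demo_jar():
    """Build a tiny in-memory archive laid out like a jar (placeholder contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        jar.writestr("com/example/App.class", b"placeholder, not real bytecode")
    return buffer.getvalue()
```

Listing the entries of a real jar the same way is a handy check on what actually got packaged.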
maven

Packaging

Maven uses an XML-based declarative format for dependency and build management.

Gradle was created to use a nicer language (Groovy), and it builds a better dependency graph, while Maven is really just a set of jobs that you need to put in the right order. If you use Maven and your Maven file is bigger than 500 lines, it might be a good idea to try Gradle. Gradle is also a nice intro to the Groovy language, and with different tools (Jenkins) using Groovy, it's worth the time to learn it.


C++

C, D, and C++ compile to native code for the actual target machine, so your dependency management will change depending on the OS you are using. Libraries are packaged as DLL files on Windows and .so files on Linux.
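The OS dependence shows up even in something as small as the file name of a library. Here is a sketch of the common naming conventions, again in Python for consistency with the rest of the post; shared_library_filename is a hypothetical helper, and the macOS .dylib case is my addition beyond the two systems named above.

```python
import sys


def shared_library_filename(name, platform=None):
    """Return the conventional shared-library file name for a platform.

    `platform` uses the same values as sys.platform ("win32", "linux", "darwin").
    """
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return f"{name}.dll"
    if platform == "darwin":
        return f"lib{name}.dylib"
    return f"lib{name}.so"  # Linux and most other Unix-like systems
```

A build that hard-codes one of these names will only ever work on one OS, which is why native projects lean on their build tooling to resolve them.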

Node

Have you ever grown a garden? If the garden grows, you get some vegetables and some weeds. It happens, but when you pick your vegetables do you take the weeds out, or just make a new garden bed and double down on your fun? If you like option #2 then you will love npm!

Kidding aside, npm is just a victim of its own success. The growth of JavaScript on the server side came on so quickly that, without a strict overseer of the project, it just became what it is today.

Building

Yarn is a more stable binary in my experience (at the time of writing).

Packaging

Use package-lock.json to set your versions appropriately. If not, you just get whatever versions happen to match at install time on the host machine, and things get wild pretty quickly. To get even better results try the shrinkwrap command:
https://docs.npmjs.com/cli/shrinkwrap
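To see how much of a project depends on the lock file, you can list the dependencies in package.json whose version spec is a range rather than an exact version. This is a rough sketch in Python to match the other examples; floating_dependencies and the EXACT pattern are my own assumptions, and the pattern treats anything that is not a plain x.y.z version as floating.

```python
import json
import re

# An exact version looks like "1.2.3", optionally with a prerelease/build suffix.
EXACT = re.compile(r"^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.+-]+)?$")


def floating_dependencies(package_json_text):
    """Return {name: spec} for dependencies whose version spec is not exact."""
    manifest = json.loads(package_json_text)
    floating = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                floating[name] = spec
    return floating
```

Every entry this returns is a version that can change between installs unless a lock file holds it in place.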

