Friday, May 26, 2023

Packaging code for development and deployment

Packaging application code for deployment


In any programming environment, the application or library you are building will have dependencies. Different libraries and components each fulfill part of the application's functionality.
This article compares how different languages and environments package those dependencies so the software can run on the target machine, or in whatever environment it's intended to be hosted in.

When developing the application, it can feel like an easy win to add a library through your IDE, or to install it globally on your machine. That pattern all but guarantees the app will only work as expected on your machine, rather than being portable enough to work consistently in QA, staging and production. Save yourself the hassle and set up package management early, so you have a good foundation that works across environments.

Goal: portable, consistent builds and deployments

The way to do this is to make your workflow deterministic. This means an operation has the same result no matter how many times it runs or how many different places it runs in. For package management in your build, this means ensuring that the versions of the dependent libraries are the same everywhere. You need to lock them down so you can guarantee consistency.

So, for any language and environment the goals are similar:
  1. Make the build deterministic.
  2. Keep the environment setup for development straightforward.
  3. Align the versions of the dependent packages.
  4. Enable portability and consistency between environments.
    1. Your IDE can help, but don't stop there. Make sure it works in a different environment before you call it done. Continuous integration tools are great for this.

So much of this depends on the environment. Is this an interpreted language that runs on a virtual machine, or is it compiled to object code that is specific to the machine? With these concepts in mind, let us use some different programming environments to see it all in action. We will compare Python, Java, Node and C++.

Python


With Python, dependent libraries are installed by a program called pip, and the requirements.txt file lists the packages the application requires. You also have a global environment. The two can work together, but you have to keep tabs on what is installed in which environment. This gets dicey quickly.
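
One way to keep tabs on a requirements.txt file is to check that every entry pins an exact version. The following is a minimal sketch, not a full requirements parser: the function name and the sample packages are made up for illustration, and it ignores things like environment markers and URLs.

```python
import re

# Matches "name==version", optionally with extras such as "requests[socks]==2.31.0".
# Wildcards like "==2.*" are deliberately not treated as pinned.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s*]+$")


def unpinned_requirements(text):
    """Return the requirement lines that do not pin an exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Illustrative file contents only.
example = """
requests==2.31.0
flask>=2.0      # a range, so it can resolve differently over time
numpy
"""
print(unpinned_requirements(example))
```

Anything this reports is a dependency whose version can drift between your machine and the next environment.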

pyenv

Pyenv lets you set a particular Python version globally on your system, or for a specific local directory. This is handy when working on different projects that are based on different versions.

virtualenv

To isolate the packages used by your application, and to be sure you are using only that specific set of packages, a virtual environment is a great tool. It basically makes a sandbox on your machine, and the application only uses the packages contained in it.
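
To make the sandbox idea concrete, here is a small sketch. Note that it uses venv, the standard-library relative of virtualenv, rather than virtualenv itself, so it runs without installing anything. The helper name and the directory name are made up for the example.

```python
import tempfile
import venv
from pathlib import Path


def create_sandbox(target):
    """Create a virtual environment in `target` and return its config file path."""
    # with_pip=False keeps the example fast and offline; pass True to bootstrap pip.
    venv.create(target, with_pip=False)
    return Path(target) / "pyvenv.cfg"


# Build a throwaway environment in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_sandbox(Path(tmp) / "demo-env")
    print(cfg.read_text())
```

The pyvenv.cfg file it prints is what marks the directory as its own environment, separate from the global one.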

environment variables

So, now that the packages are contained, the runtime behavior of the application may still depend on environment variables. This introduces the same problem as the dependent packages: the values have to be present and consistent in every environment.
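
A simple guard is to check for the variables your application needs at startup and fail early when one is missing. This is only a sketch: the function name and the variable names are hypothetical, so substitute whatever your application actually reads.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Hypothetical variable names, for illustration only.
REQUIRED = ["DATABASE_URL", "API_TOKEN"]

# A stand-in for os.environ so the example behaves the same everywhere.
fake_env = {"DATABASE_URL": "postgres://localhost/app"}
print(missing_env_vars(REQUIRED, fake_env))
```

Passing the environment in as a parameter keeps the check easy to test; in real use you would call it with no second argument so it reads os.environ.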

So, with these moving parts it's pretty easy to see how things can get out of hand. It would be great to have a tool that manages both packages and environment variables, so you have a consistent experience.

Pipenv 

It turns out there is a great tool for this: Pipenv. Pipenv organizes your packages and a virtual environment in one place, loads variables from a .env file when you run commands through it, and takes a snapshot of the exact package versions in one file: Pipfile.lock. When this file is present, you can reproduce the same set of packages in another environment using Pipenv. This is a great piece of software, and it comes from the same author as Requests. He knows his stuff! (https://www.kennethreitz.org/projects)
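
Pipfile.lock is plain JSON, so it is easy to inspect what has been locked. The sketch below assumes the usual layout with "default" and "develop" sections; the function name and the sample contents are made up, and real lock files also carry hashes that are left out here.

```python
import json


def locked_versions(lock_text):
    """Map each package in a Pipfile.lock to its pinned version string."""
    lock = json.loads(lock_text)
    versions = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            # Entries installed from version control or a local path may lack "version".
            versions[name] = entry.get("version", "unpinned")
    return versions


# Trimmed-down, illustrative lock file contents.
sample = """
{
  "_meta": {"requires": {"python_version": "3.11"}},
  "default": {"requests": {"version": "==2.31.0"}},
  "develop": {"pytest": {"version": "==7.3.1"}}
}
"""
print(locked_versions(sample))
```

Every entry carries an exact "==" version, which is what makes the install repeatable.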

Java

In Java the environment is the Java virtual machine, so settings specific to the hardware aren't really something you need to care about. Dependencies and their versions are, and there are two tools in regular use for package management: Maven and Gradle.

Building/compiling

Since you are compiling to bytecode, the results are class files. Those are usually zipped up into a JAR file (though they don't have to be) so the virtual machine can run the portable bytecode on the target machine.

Packaging

Maven uses a declarative XML file (pom.xml) for dependency and build management.

Gradle was created to use a nicer language (Groovy), and it builds a better dependency graph, while Maven is really just a set of jobs that you need to put in the right order. If you use Maven and your Maven file is bigger than 500 lines, it might be a good idea to try Gradle. Gradle is also a nice intro to the Groovy language, and since other tools (such as Jenkins) use Groovy too, it's worth the time to learn it.


C++

C, D, and C++ compile to native code for the actual target machine, so your dependency management will change depending on the OS you are using. Libraries are packaged as DLL files on Windows and as .so files on Linux.
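
To show how the OS leaks into the packaging, here is a small sketch that maps a library name to the conventional shared-library file name per platform. The function name is made up, and the macOS entry is my addition since it follows the same idea.

```python
import platform

# Conventional shared-library file name patterns per operating system.
PATTERNS = {
    "Windows": "{name}.dll",
    "Linux": "lib{name}.so",
    "Darwin": "lib{name}.dylib",  # macOS, added for completeness
}


def shared_library_filename(name, system=None):
    """Return the conventional shared-library file name for `name` on `system`."""
    system = system or platform.system()  # default to the machine running this
    try:
        return PATTERNS[system].format(name=name)
    except KeyError:
        raise ValueError(f"no naming convention recorded for {system!r}") from None


for target in ("Windows", "Linux"):
    print(target, shared_library_filename("foo", target))
```

The same library ends up as a different artifact on each platform, so the build has to produce and ship one per target.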

Node

Have you ever grown a garden? If the garden grows, you get some vegetables and some weeds. It happens, but when you pick your vegetables do you take the weeds out, or just make a new garden bed and double down on your fun? If you like option #2 then you will love npm!

Kidding aside, npm is just a victim of its own success. Server-side JavaScript grew so quickly that, without a strict overseer of the project, it just became what it is today.

Building

Yarn, an alternative client for the npm registry, has been the more stable binary in my experience (at the time of writing).

Packaging

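Use package-lock.json to lock your versions. Without it, each install resolves whatever versions currently satisfy the ranges in package.json, and things get wild pretty quickly. Running npm ci installs exactly what the lock file records. For a lock file that is also honored when your package is published, look at the npm shrinkwrap command: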
https://docs.npmjs.com/cli/shrinkwrap
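
To see what a lock file actually pins, you can read it like any other JSON file. This sketch (in Python, to match the other examples here) assumes the "packages" layout used by newer lock files (lockfileVersion 2 and 3); the function name and the sample contents are made up for illustration.

```python
import json


def locked_node_packages(lock_text):
    """Map installed package paths to versions from a package-lock.json."""
    lock = json.loads(lock_text)
    packages = lock.get("packages", {})
    return {
        path: entry["version"]
        for path, entry in packages.items()
        if path and "version" in entry  # the "" key is the root project itself
    }


# Trimmed-down, illustrative lock file contents.
sample = """
{
  "name": "demo-app",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo-app", "version": "1.0.0"},
    "node_modules/left-pad": {"version": "1.3.0"}
  }
}
"""
print(locked_node_packages(sample))
```

Each installed path maps to one exact version, not a range, which is the whole point of the lock file.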

