Cheat sheet

Modified

September 16, 2024

Collection of useful commands for package and environment management.

Environment descriptors

  • Git: git log -1 and git status -u. In python, use the following command for a specific module version: .version.git_revision) or .__git_version__.
  • R: sessionInfo() or devtools::session_info(): In R, these functions provide detailed information about the current session and list all dependencies.
  • Python: the best way seems to import sys, and then, __version__, __file__ which will display package versions and their location without having to list all packages using pip. It’s essential to load the package first and then use the following code to print all its dependencies:

SHELL

# finds the location of a command by searching through the PATH environment variable
which <program>

# lists all occurrences of a command found in the PATH
which -a <program> 

# shows the shared libraries required by a specific program
ldd <program> 

# provides detailed system information, including machine name and kernel version
uname -a 

# displays operating system identification data (such as Debian, Ubuntu, etc.)
cat /etc/os-release 

# provides operating system identification data and is recommended if available. This command might not be installed by default but is part of the lsb-release package on Debian-based systems
lsb_release -a 

# Environmental variables for locations 
$HOME # home directory
$PYTHONPATH # empty by default
$PYTHONHOME # python libraries
$RHOME # R libraries
$LD_LIBRARY_PATH # dynamic loader for shared libraries when running a program

Understanding PYTHONPATH

  • PYTHONPATH: you can set the $PYTHONPATH environment variable to include additional directories where Python will look for modules and packages. This allows you to extend the search path beyond the default locations and load packages that has been installed in a different directory. It is highly discouraged to mix different versions of libraries and interpreters! as some libraries are complex packages with dependencies.
export PYTHONPATH=/path/to/packages/
# unset the variable 
unset PYTHONPATH

PIP

pip3 show <module>
pip3 install <module>      # install the latest version (and dependencies)
pip3 uninstall <module>    # remove
pip3 freeze                # output installed packages in requirements.txt format (similar to pip3 list) which can conveniently be used with: pip3 install -r requirements.txt

We highly recommend to avoid using pip and start using Python virtualenv management tools, pipenv.

  • Advantages: it will generates and checks file hashes for locked dependencies when installing from Pipfile.lock and it creates a virtualenv in a standard customizable location.

PYTHON

# version info on installed packages

def print_imported_modules():
  import sys
  for name, val in sorted(sys.modules.items()):
      if(hasattr(val, '__version__')): 
          print(val.__name__, val.__version__)
      else:
          print(val.__name__, "(unknown version)")

print("==== Package list after loading pandas ====");
import <module>
print_imported_modules()

R

install.packages()     # install a given package
library()              # loads a given package
remove.packages()      # install a given package

We highly recommend using renv.

CONDA

conda -V
conda list 
conda env list 

DPKG

Low-level tool for package management on Debian-based systems

dpkg -S package_name # Seach   
dpkg -I package.deb  # --info
dpkg -L package_name # --list files installed by a package
dpkg -i package.deb  # --install   [requires sudo]
dpkg -r   # --remove:  Remove debian_package        [requires sudo]
dpkg --get-selections    # List all the packages known by dpkg and whether they are installed or not
dpkg --set-selections    # Set which package should be installed     [requires sudo]
dpkg-query -W -f='${Package} == ${Version}\n' # package version 

APT

High-level tool for package management on Debian-based systems. It also handles package dependencies and repositories.

apt
apt update                      # Update the package list                  [sudo]
apt search <package_name>
apt show <package_name>
apt install <package_name>      # Install a debian package                 [sudo]
apt upgrade                     # Upgrade all your packages                [sudo]
apt clean                       # Delete all the .deb you've downloaded (save some space)
apt remove --purge <package_name> # Remove a debian package                  [sudo]
apt-cache search keyword # Search for packages containing a keyword.

Docker

Useful commands to build and deploy a Docker image.

docker pull         # download image from a registry e.g. Docker Hub 
docker images       # list all Docker images on your local machine
docker run -it <image_name>  # creates and starts a new container from the specified Docker image, -it flag means interactive virtual machine which is very useful during the development-phase for testing the container 
docker build -t <my-app>  # build a Docker image form a Dockerfile and a context 
docker tag <my-app> <myrepo/my-app:v1.0> # creates an new alias for an existing Docker image. Useful for versioning
docker push         # upload image from local machine to Docker registry 
docker login        # logs you into a Docker register (after pull and push), username and pw needed

Other common commands use in Dockerfiles to clean up the image and reduce its size:

Dockerfile
RUN apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/* && \
    rm -rf /var/cache/apt/archives/*deb