In a research project, there are multiple programming, simulation and writing tasks. Therefore, it is necessary to streamline the workflow to accelerate my research.
Python is my primary programming language for analysis code and quick model implementation. For my typical workload, lightweight Visual Studio Code is a decent choice of development environment.
It is always a good idea to isolate Python environment where thirdparty packages are installed. Several options are available, however, I prefer to use the built-in venv module due to its simplicity. On macOS with brand new Apple Silicon, numpy has kind of performance issue since the absence of MKL library and a possible fix is by using conda and leveraging Apple's Accelerate library (see issue). However, after my own experiment, the performance different is not significant enough and I will stay with venv. It is pretty straightforward to install packages inside a virtual environment via pip but the only two remarks are made here:
-
use module method instead of the command itself, i.e.,
python -m pip install <package name>instead of
pip install <package name> -
upgrade pip first:
python -m pip install --upgrade pip
A environment is portable by creating a requirements.txt
python -m pip freeze > requirements.txt
python -m pip install -r requirements.txt
Since I use only a small collection of packages,
- numpy
- scipy
- pandas
- scikit-learn
- matplotlib
- notebook
- pytest
This file is provided in this template repo for quick start.
I have been a big fan of conda for years. However, I switch to venv recently. The python is installed by apt and python3.8-venv must be installed.
sudo apt install python3.8 python3.8-venvFor individual project, a virtual environment must be created, activated and upgraded.
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pipand select the proper interpreter in vscode by Python: Select Interpreter. Since venv hardcodes the shebang line, the virtual environment is not portable across different folder and platform. A requirements.txt will be used to reproduced the developement environment by
python -m pip freeze > requirements.txt # create
python -m pip install -r requirements.txt # setupCommon functions are organized in packages located in ${workspaceFolder}/src for convenient reuse. However, it is nasty to configure Python search path in visual studio code. I find this discussion pretty useful. To make start debugging work, add the following line to ${workspaceFolder}/.vscode/launch.json
"env": {"PYTHONPATH": "${workspaceFolder}/src"}. This will actually append PYTHONPATH with specified value, therefore, the search configured before entering vscode will be untouched. Two remarks must be made here,
- Shortcuts
F5(Start Debugging) andCtrl+F5(Run Without Debugging) is functional keys on my current keyboard and I also need to holdfnkey. - The environment variable will not have an affect on
Run Python FileandDebug Python File, i.e., they will not work at all. To make life easier, I will insist on the former way at least for code development. For jupyer notebooks, an ugly way is adopted now, i.e., add this cell at the beginning of every notebook
%matplotlib inline
# thirdparty packages
import matplotlib.pyplot as plt
import numpy as np
# local development
import sys
sys.path.append('<absolute path to package root (src)>')
from package_a import fooObsidian to take notes
Sometimes, I want to test a new software relying on multiple thirdparty packages Use Guest Additions CD image to solve the small screen problem