WIP: pre spellcheck
Hello
After setting up or reinstalling a couple of machines from scratch in the last few months, I decided once and for all to document my default data science settings and the tools I typically use.
A pro tip: avoid dropping a cup of coffee on your machine.
That includes installing programming languages such as Python and R, setting up the terminal and git, and installing supporting tools such as iTerm2, oh-my-zsh, Docker, etc.
Last Update: January 1st, 2025
Update: These settings are up to date with macOS Sequoia. However, most of the tools in this document should be OS agnostic (e.g., Windows, Linux, etc.) with some minor modifications.
This document covers the following:
- Set Up Git and SSH
- Command Line Tools
- Install Docker
- Set Terminal Tools
- Set VScode
- Set Python
- Install R and Positron
- Install Postgres
- Miscellaneous
- License
This section focuses on the core git settings, such as global definitions and setting up SSH with your GitHub account.
All the settings in the sections are done through the command line (unless mentioned otherwise).
Let's start by checking the git version by running the following:
git --version
If this is a new computer or you have not set it up before, a window should pop up asking whether you want to install the command line developer tools:
The command line developer tools are required to run git commands. Once they are installed, we can go back to the terminal and set the global git settings.
Git enables setting both local and global options. The global options are used as the default settings any time a new repository is created with the git init command. You can override the global settings on a specific repo by using local settings. Below, we will define the following global settings:
- Git user name
- Git user email
- Default branch name
- Global git ignore file
- Default editor (for merging comments)
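To make the precedence between local and global settings concrete, here is a minimal sketch, assuming git is installed. It runs inside a throwaway home folder so your real settings are not touched; the sandbox folder, demo-repo, GLOBAL_NAME, and LOCAL_NAME are placeholders made up for this illustration:

```shell
# Throwaway folder that stands in for the home directory (illustrative only)
sandbox="$(mktemp -d)"

resolved="$(
  # Everything in this subshell sees the sandbox as HOME
  export HOME="$sandbox"
  unset XDG_CONFIG_HOME

  # Global default, written to $HOME/.gitconfig
  git config --global user.name "GLOBAL_NAME"

  # New repository with a local override of the same option
  git init -q "$sandbox/demo-repo"
  cd "$sandbox/demo-repo" || exit 1
  git config user.name "LOCAL_NAME"

  # Ask git which value applies inside this repository
  git config user.name
)"

echo "Name used inside demo-repo: $resolved"
```

Inside the repository the local value is the one git reports, while any other repository without a local override would still fall back to the global value.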
Set the global user name and email with the config --global command:
git config --global user.name "USER_NAME"
git config --global user.email "[email protected]"
Next, let's set the default branch name to main using the init.defaultBranch option:
git config --global init.defaultBranch main
The global .gitignore file enables you to set general ignore rules that apply automatically to all repositories on your machine. This is useful when there are files you repeatedly wish to ignore by default. A good example on Mac is the system file .DS_Store, which is auto-generated in each folder and which you probably do not want to commit. First, let's create the global .gitignore file using the touch command:
touch ~/.gitignore
Next, let's define this file as global:
git config --global core.excludesFile ~/.gitignore
Once the global ignore file is set, we can start adding the files we want git to ignore systematically. For example, let's add .DS_Store to the global ignore file:
echo .DS_Store >> ~/.gitignore
Note: Be careful about the files you add to the global ignore file. Unless a rule applies to all cases, such as the .DS_Store example, do not add it to the global settings; define it locally instead to avoid a git disaster.
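Running the echo command more than once adds the same pattern again each time. As a possible refinement, the sketch below appends a pattern only when it is not already listed. The add_ignore_pattern helper is hypothetical (it is not part of git), and the demo writes to a scratch file rather than to ~/.gitignore:

```shell
# Hypothetical helper: append a pattern to an ignore file only if it is missing
add_ignore_pattern() {
  pattern="$1"
  file="$2"
  touch "$file"
  # -x matches the whole line, -F treats the pattern as a fixed string
  if ! grep -qxF -- "$pattern" "$file"; then
    printf '%s\n' "$pattern" >> "$file"
  fi
}

# Demo against a scratch file; pass ~/.gitignore instead for real use
demo_ignore="$(mktemp)"
add_ignore_pattern ".DS_Store" "$demo_ignore"
add_ignore_pattern ".DS_Store" "$demo_ignore"   # second call changes nothing
cat "$demo_ignore"
```

The scratch file ends up with a single .DS_Store line no matter how many times the helper is called.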
Git enables you to set the default command line editor for creating and editing your commit messages with the core.editor option. Git supports the main command line editors such as vim, emacs, nano, etc. I set the default CLI editor to vim:
git config --global core.editor "vim"
By default, all the global settings are saved to the .gitconfig file in your home folder. You can review the saved settings and modify them manually by editing this file:
vim ~/.gitconfig
Setting up an SSH key is required to sync your local git repositories with the origin over SSH. By default, ssh-keygen offers to save the key files under the .ssh folder in your home directory. It is cleaner to keep them there, so my settings below assume this folder exists.
Let's start by creating the .ssh folder (the -p flag avoids an error if it already exists):
mkdir -p ~/.ssh
The ssh-keygen command creates the SSH key files:
ssh-keygen -t ed25519 -C "[email protected]"
Note: The -t argument defines the algorithm type for the authentication key (I used ed25519), and the -C argument adds a comment, in this case the user email, for reference.
After running the ssh-keygen command, it will prompt you for a file name and an optional passphrase. If you type a bare file name at the prompt, the key is saved in the current working directory, so enter the full path (e.g., ~/.ssh/your_ssh_key) to keep it under the .ssh folder.
Note: This process will generate two files:
- your_ssh_key is the private key. You should not expose it
- your_ssh_key.pub is the public key that will be used to set up SSH on GitHub
The next step is to register the key on your GitHub account. On your account main page, go to the Settings menu, select SSH and GPG keys from the main menu (purple rectangle), and click New SSH key (yellow rectangle):
Next, set the key name in the title text box (purple rectangle), and paste your public key into the key box (turquoise rectangle):
Note: I set the machine nickname (e.g., MacBook Pro 2017, Mac Pro, etc.) as the key title to easily identify the relevant key in the future.
The next step is to update the config file in the ~/.ssh folder. You can edit the config file with vim:
vim ~/.ssh/config
And add the following somewhere in the file:
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/your_ssh_key
Where your_ssh_key is the private key file name.
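If you prefer to script this step instead of editing the file by hand, the sketch below shows one way to do it. It writes to a scratch folder that stands in for ~/.ssh, and key_name is a placeholder. The chmod calls are my own addition, based on the assumption that you want the folder and config readable only by your user, since OpenSSH can reject a config file with loose permissions:

```shell
ssh_dir="$(mktemp -d)"            # stand-in for ~/.ssh in this demo
config_file="$ssh_dir/config"
key_name="your_ssh_key"           # placeholder for the private key file name

# Append the Host block only if this key is not referenced yet
if ! grep -qsF "IdentityFile ~/.ssh/$key_name" "$config_file"; then
  cat >> "$config_file" <<EOF
Host *
  AddKeysToAgent yes
  UseKeychain yes
  IdentityFile ~/.ssh/$key_name
EOF
fi

# Restrict access to the current user
chmod 700 "$ssh_dir"
chmod 600 "$config_file"

cat "$config_file"
```

Note that UseKeychain is a macOS-specific option, so on other operating systems you would likely drop that line.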
Last, run the following to load the key:
ssh-add --apple-use-keychain ~/.ssh/your_ssh_key
- Github documentation - https://docs.github.com/en/[email protected]/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account
- ssh-keygen arguments - https://www.ssh.com/academy/ssh/keygen
- A great video tutorial about setting SSH: https://www.youtube.com/watch?v=RGOj5yH7evk&t=1230s&ab_channel=freeCodeCamp.org
- Setting Git ignore - https://www.atlassian.com/git/tutorials/saving-changes/gitignore
- Initial Git setup - https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup
This section covers core command line tools.
Homebrew (or brew) enables you to install command line packages and tools on Mac. To install brew, run the following from the terminal:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
After finishing the installation, you may need to run the following commands (follow the instructions at the end of the installation):
(echo; echo 'eval "$(/opt/homebrew/bin/brew shellenv)"') >> /Users/USER_NAME/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
More info is available at https://brew.sh/
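After an installation like this one, it can be handy to confirm that the new command is actually reachable from your shell. The sketch below uses a hypothetical check_tool helper (not part of Homebrew) built on the standard command -v lookup. The tool names in the loop are simply the ones this document installs:

```shell
# Hypothetical helper: report whether a command is available on the PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check a few of the tools covered in this document
for tool in git brew jq docker zsh; do
  check_tool "$tool" || true
done
```

A tool reported as missing right after installation usually means the terminal needs to be restarted or the PATH has not been updated yet.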
jq is a lightweight and flexible command-line JSON processor. You can install it with brew:
brew install jq
To spin up a VM locally to run Docker, we will set up Docker Desktop.
Go to the Docker website and follow the installation instructions according to your OS:
Note: Docker Desktop may require a license when used in enterprise settings.
This section focuses on installing and setting tools for working on the terminal.
Terminal is the built-in terminal emulator on Mac. I personally love to work with iTerm2, as it provides additional functionality and customization options. iTerm2 is available only for Mac and can be installed directly from the iTerm2 website or via Homebrew:
> brew install --cask iterm2
.
.
.
==> Installing Cask iterm2
==> Moving App 'iTerm.app' to '/Applications/iTerm.app'
🍺 iterm2 was successfully installed!
The next step is to install Z shell, or zsh. zsh is a Unix shell that extends the Bourne shell and incorporates features from bash, ksh, and tcsh, providing a variety of add-on tools for the terminal. We will use Homebrew again to install zsh:
> brew install zsh
.
.
.
==> Installing zsh
==> Pouring zsh--5.8_1.monterey.bottle.tar.gz
🍺 /usr/local/Cellar/zsh/5.8_1: 1,531 files, 14.7MB
After installing zsh, we will install oh-my-zsh, an open-source framework for managing zsh configuration. We will install it with the curl command:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
You should notice that your terminal view has changed (you may need to restart your terminal to see the changes), and the default command line prompt looks like this:
➜ ~
The default settings of Oh My Zsh are stored in ~/.zshrc, and you can modify the default theme by editing the file:
vim ~/.zshrc
I use the powerlevel10k theme, which can be installed by cloning the GitHub repository (for oh-my-zsh):
git clone --depth=1 https://github.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k
Then change the theme setting in ~/.zshrc to ZSH_THEME="powerlevel10k/powerlevel10k". After restarting the terminal, you will see a sequence of questions that lets you configure the theme:
Install Meslo Nerd Font?
(y) Yes (recommended).
(n) No. Use the current font.
(q) Quit and do nothing.
Choice [ynq]:
Note: The Meslo Nerd Font is required to display the symbols used by the powerlevel10k theme.
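The ZSH_THEME change described above can also be scripted rather than made by hand in an editor. The following is a minimal sketch that works on a scratch copy standing in for ~/.zshrc. The starting content of that file is an assumption (robbyrussell is the theme Oh My Zsh ships as its default):

```shell
zshrc="$(mktemp)"     # stand-in for ~/.zshrc in this demo
printf '%s\n' 'export ZSH="$HOME/.oh-my-zsh"' 'ZSH_THEME="robbyrussell"' > "$zshrc"

# Rewrite the ZSH_THEME line. Writing to a temporary file and moving it back
# sidesteps the differences between the BSD (macOS) and GNU versions of sed -i
sed 's|^ZSH_THEME=.*|ZSH_THEME="powerlevel10k/powerlevel10k"|' "$zshrc" > "$zshrc.tmp" \
  && mv "$zshrc.tmp" "$zshrc"

# Show the resulting theme line
grep '^ZSH_THEME=' "$zshrc"
```

For real use you would point the zshrc variable at your own ~/.zshrc and skip the printf line, ideally after making a backup copy of the file.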
You can always modify your selection by using:
p10k configure
The terminal after adding the powerlevel10k theme looks like this:
Install zsh-syntax-highlighting to add syntax highlighting to the terminal:
brew install zsh-syntax-highlighting
After the installation is done, you will need to clone the source code. I set the destination to a hidden folder under the home folder:
git clone https://github.com/zsh-users/zsh-syntax-highlighting.git $HOME/.zsh-syntax-highlighting
echo "source $HOME/.zsh-syntax-highlighting/zsh-syntax-highlighting.zsh" >> ${ZDOTDIR:-$HOME}/.zshrc
After you restart your terminal, you should be able to see the syntax highlighting in green (in my case):
- iTerm2 - https://iterm2.com/index.html
- oh my zsh - https://ohmyz.sh/
- freeCodeCamp blog post - https://www.freecodecamp.org/news/how-to-configure-your-macos-terminal-with-zsh-like-a-pro-c0ab3f3c1156/
- powerlevel10k theme - https://github.com/romkatv/powerlevel10k
- zsh-syntax-highlighting - https://github.com/zsh-users/zsh-syntax-highlighting/blob/master/INSTALL.md#in-your-zshrc
VScode is a general-purpose IDE and my favorite development environment. VScode supports multiple platforms, such as Linux, macOS, Windows, and Raspberry Pi.
Installing VScode is straightforward: go to the VScode website https://code.visualstudio.com/ and click on the Download button (purple rectangle):
Download the installation file and follow the instructions.
This section focuses on setting up tools for working with Python locally (without Docker container) with UV and miniconda. If you are interested in setting up a dockerized Python/R development environment with VScode, Docker, and the Dev Containers extension, please check out the following tutorials:
Also, you can leverage the following VScode templates:
- Python (using venv) - https://github.com/RamiKrispin/vscode-python-template
- Python (using uv) - https://github.com/RamiKrispin/vscode-python-uv-template
- R - https://github.com/RamiKrispin/vscode-r-template
UV is an extremely fast Python package and project manager written in Rust. Installing UV is straightforward, and I recommend checking the project documentation.
On Mac and Linux, you can use curl:
curl -LsSf https://astral.sh/uv/install.sh | sh
or with wget:
wget -qO- https://astral.sh/uv/install.sh | sh
On Windows, using PowerShell:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Miniconda is an alternative tool for setting up local Python environments. Go to the Miniconda installer page and download the most recent installer package for your operating system and Python version. Once Miniconda is installed, you can install Python libraries with conda:
conda install pandas
Likewise, you can use conda to create an environment:
conda create -n myenv python
Get a list of environments:
conda info --envs
Create an environment and set the Python version:
conda create --name myenv python=3.9
Get the available versions of a library:
conda search pandas
Activate an environment:
conda activate myenv
Get a list of installed packages in the environment:
conda list
Deactivate the environment:
conda deactivate
Ruff is an extremely fast Python linter and code formatter, written in Rust.
You can install Ruff directly from PyPI using pip:
pip install ruff
On Mac and Linux, using curl:
curl -LsSf https://astral.sh/ruff/install.sh | sh
Likewise, on Windows, using PowerShell:
powershell -c "irm https://astral.sh/ruff/install.ps1 | iex"
- UV documentation - https://docs.astral.sh/uv/
- Miniconda - https://docs.anaconda.com/miniconda/
- Ruff documentation - https://docs.astral.sh/ruff/
To set up R and Positron on your machine, you should start by installing R from CRAN. Go to https://cran.r-project.org/ and select the relevant OS:
Note: For macOS, there are two versions, depending on the type of your machine CPU - one for Apple silicon arm64 and a second for Intel 64-bit.
Once you finish downloading the build, open the pkg file and start to install it:
Note: Older releases are available in the CRAN Archive.
Once R is installed, you can install Positron. Go to https://positron.posit.co/download.html, select the relevant OS version and download it:
After downloading it, move the application into the Applications folder (on Mac).
PostgreSQL supports most common operating systems, such as Windows, macOS, Linux, etc.
To download it, go to the Postgres project website, navigate to the Download tab, and select your OS. This takes you to the OS download page, where you can follow the instructions:
On Mac, I highly recommend installing PostgreSQL through the Postgres.app:
When opening the app, you should have a default server set to port 5432 (make sure that this port is available):
To launch the server, click on the start button:
By default, the server will create three databases - postgres, YOUR_USER_NAME, and template1. You can add additional servers (or remove them) by clicking the + or - symbols at the bottom left.
To run Postgres from the terminal, you will have to add the path of the app to your .zshrc file (on Mac) by adding the following line:
export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/
Where /Applications/Postgres.app/Contents/Versions/14/bin/ is the local path on my machine.
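When a line like this is appended to a file from the terminal with echo or printf, the choice of quotes decides what ends up in the file. The sketch below illustrates the difference using a scratch file in place of ~/.zshrc (the demo_rc and pg_bin names are placeholders for this illustration):

```shell
demo_rc="$(mktemp)"     # stand-in for ~/.zshrc in this demo
pg_bin="/Applications/Postgres.app/Contents/Versions/14/bin/"

# Double quotes: $PATH is expanded right now, so the current value of PATH
# is frozen into the file
printf '%s\n' "export PATH=$PATH:$pg_bin" >> "$demo_rc"

# Single quotes: the text $PATH is written literally and is only expanded
# when the shell reads the file at startup
printf '%s\n' 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' >> "$demo_rc"

cat "$demo_rc"
```

The second form is usually the one you want in a startup file, because it keeps extending whatever PATH the shell has at that moment instead of a snapshot taken on the day you ran the command.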
Alternatively, you can add this line from the terminal by running the following (the single quotes keep $PATH unexpanded until the shell starts):
echo 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/14/bin/' >> ${ZDOTDIR:-$HOME}/.zshrc
If the port you set for the Postgres server is in use, you should expect to get the following message when trying to start the server:
This means that the port is used either by another Postgres server or by another application. To check which ports are in use and by which applications, you can use the lsof command in the terminal:
sudo lsof -i :5432
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 124 postgres 7u IPv6 0xc250a5ea155736fb 0t0 TCP *:postgresql (LISTEN)
postgres 124 postgres 8u IPv4 0xc250a5ea164aa3b3 0t0 TCP *:postgresql (LISTEN)
The -i argument enables searching by port number, 5432 in the example above. As can be seen from the output, the port is used by another Postgres server. You can clear the port by using the pkill command:
sudo pkill -u postgres
Where the -u argument selects the processes to stop by the USER field, in this case postgres.
Note: Before you clear the port, make sure you do not need the applications on that port.
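Since pkill -u stops every process owned by that user, a narrower option is to identify the specific process IDs first. As a rough sketch, the snippet below extracts the PID column from the sample lsof output shown above (the text is pasted into a variable here so the example runs without a Postgres server):

```shell
# Sample output copied from the lsof example above
lsof_output='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postgres 124 postgres 7u IPv6 0xc250a5ea155736fb 0t0 TCP *:postgresql (LISTEN)
postgres 124 postgres 8u IPv4 0xc250a5ea164aa3b3 0t0 TCP *:postgresql (LISTEN)'

# Skip the header row, take the PID column, and drop duplicates
pids="$(printf '%s\n' "$lsof_output" | awk 'NR > 1 { print $2 }' | sort -u)"

echo "PIDs listening on the port: $pids"
```

On a live system, lsof can print just the PIDs when given the -t flag together with -i, and each PID can then be stopped individually with kill, which leaves other processes owned by the same user running.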
- Tutorial - https://www.youtube.com/watch?v=qw--VYLpxG4&t=1073s&ab_channel=freeCodeCamp.org
- PostgreSQL - https://en.wikipedia.org/wiki/PostgreSQL
- Documentation - https://www.postgresql.org/docs/
Stats is a macOS system monitor in your menu bar. You can download it directly from the project repo, or use brew:
brew install stats
htop is an interactive cross-platform command line process viewer. On Mac, install htop with brew:
brew install htop
For other operating systems, follow the instructions on the project download page.
XQuartz is an open-source project that provides the required graphic applications (X11) for macOS (similar to the X.Org X Window System functionality). To install it, go to https://www.xquartz.org/, then download and install it.
Rectangle is a free and open-source tool for moving and resizing windows on Mac with keyboard shortcuts. To install it, go to https://rectangleapp.com and download it. Once installed, you can modify the default settings:
Note: This functionality is built-in with macOS Sequoia, and it may be redundant to install Rectangle
- Change language - if you are using more than one language, you can add a keyboard shortcut to switch between them. Go to System Preferences... -> Keyboard and select the Shortcuts tab. Under Input Sources, tick the Select the previous input source option:
Note: You can modify the keyboard shortcut by clicking the shortcut definition in that row
The drawio-desktop is a desktop version of the diagrams app for creating diagrams and workflow charts. The desktop version, per the project repository, is designed to be completely isolated from the Internet, apart from the update process.
Image credit: https://www.diagrams.net/
To install the desktop version, go to the project repository and select the version you wish to install under the releases section:
For macOS users, once you download the dmg file and open it, move the build to the Applications folder:
- Draw.io documentation - https://www.diagrams.net/
- drawio-desktop repository - https://github.com/jgraph/drawio-desktop
- Online version - https://app.diagrams.net/
This tutorial is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.