Setting up Python

Here are a few notes on installing a user-specific Python distribution:

Get Miniconda

  wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
  chmod +x Miniconda3-latest-Linux-x86_64.sh
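  ./Miniconda3-latest-Linux-x86_64.sh # run the installer; restart the shell afterwards if conda is not found
  conda install pip # better to use the pip in the base conda env than the system pip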
  • The difference between conda and pip: pip is a package manager specifically for Python, whereas conda is a package manager for multiple languages and also an environment manager. Python's venv module is a Python-specific environment manager.
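
One quick way to see which pip and python the shell will actually use is to ask where they resolve on PATH. This is only a sketch: the variable names are made up for illustration, and the expected location assumes the installer's default directory (typically ~/miniconda3), which may differ on your machine.

```shell
# Resolve the first pip and python on PATH; fall back to a marker string
# when the command is missing (the markers are illustrative, not conda output).
PIP_PATH="$(command -v pip || echo 'pip-not-found')"
PYTHON_PATH="$(command -v python || echo 'python-not-found')"

echo "pip:    $PIP_PATH"
echo "python: $PYTHON_PATH"
```

If the base conda environment is active, both paths would normally sit under the Miniconda install directory rather than /usr/bin.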

Set up a conda environment and activate it

conda create --name datasci-env python #or
conda create -n dataeng-env python jupyter pandas numpy matplotlib #or
conda create -n datasci-env scipy=0.15.0 #or
conda env create -f environment.yml

conda activate datasci-env
  • You don’t have to name an environment: you can instead give a prefix (a path) where the env is saved. You can also create an env with specific packages, from an explicit spec file exported from a previous conda environment, from a YAML file, or by cloning or updating an existing one. See the conda documentation on managing environments for more information.

  • Specifying a path to a subdirectory of your project directory when creating an environment keeps everything self-contained within the project.

  • To deactivate this environment, use conda deactivate (it takes no environment name).
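
As a rough sketch of the prefix and YAML options together, the snippet below writes a small environment file and creates the environment inside the project directory. The directory name my-project, the variable names, and the package list are all hypothetical; swap in your own.

```shell
# Hypothetical project directory; adjust to your own layout.
PROJECT_DIR="${PROJECT_DIR:-./my-project}"
ENV_PREFIX="$PROJECT_DIR/env"
mkdir -p "$PROJECT_DIR"

# Minimal environment file; the package list is illustrative only.
cat > "$PROJECT_DIR/environment.yml" <<'EOF'
channels:
  - defaults
dependencies:
  - python
  - numpy
  - pandas
EOF

if command -v conda >/dev/null 2>&1; then
    # --prefix stores the environment at the given path instead of under a name.
    if conda env create --prefix "$ENV_PREFIX" --file "$PROJECT_DIR/environment.yml"; then
        echo "Activate with: conda activate $ENV_PREFIX"
    else
        echo "conda env create failed; see the messages above" >&2
    fi
else
    echo "conda not found on PATH; install Miniconda first" >&2
fi
```

An environment created this way is activated by its path, not by a name, and deleting the project directory removes the environment with it.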

Install Jupyter and PyTorch (and TensorFlow, Keras, scikit-learn similarly) in a specific environment

conda install jupyter
conda install pytorch torchvision cpuonly -c pytorch # https://pytorch.org/
  • Change the PyTorch installation command if you intend to use GPUs. In particular, install CUDA from conda after installing the latest NVIDIA drivers on the instance.
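
After installing, a quick sanity check can show whether the driver and PyTorch are visible from the active environment. This is a minimal sketch: the status strings and variable names are invented for illustration, and it assumes python on PATH is the one from the environment you installed into.

```shell
# nvidia-smi ships with the NVIDIA driver, so its presence is a rough
# indicator that the driver is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    DRIVER_STATUS="driver-found"
else
    DRIVER_STATUS="driver-not-found"
fi

# Ask PyTorch for its version and whether it can see a CUDA device; fall back
# to a marker string when torch (or python) is missing from the environment.
TORCH_STATUS="$(python -c 'import torch; print(torch.__version__, torch.cuda.is_available())' 2>/dev/null \
    || echo 'torch-not-importable')"

echo "NVIDIA driver: $DRIVER_STATUS"
echo "PyTorch:       $TORCH_STATUS"
```

With the cpuonly install above, the second value printed for PyTorch would be expected to be False even on a machine that has a GPU.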