Anaconda Introduction
Robert Petkus, 6/27/23
- Introduction to Anaconda
- Why Anaconda is useful
- Using Anaconda
- Installing Anaconda
- Anaconda Basics
- Tips and tricks
- Exercise
- Anaconda vs. Miniconda
1. Introduction to Anaconda
Anaconda is a popular Python and R distribution that simplifies package management and deployment. It is designed to make it easy to install and manage data science packages, tools, and environments. It comes with a powerful package and environment manager called Conda, which helps manage dependencies and create isolated environments for different projects.
2. Why Anaconda is useful
- Simplifies package management and installation
- Supports multiple programming languages (Python and R)
- Provides a large number of pre-built packages for data science and machine learning
- Allows for easy creation and management of isolated environments for different projects
- Cross-platform compatibility (Windows, macOS, and Linux)
3. Using Anaconda
On a personal system (workstation, laptop), you will need to download and install Anaconda.
On Elzar, you can use the version of Anaconda provided by EasyBuild or download and install a version in your home directory.
EasyBuild:
> module load EBModules
> module load Anaconda3
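After loading the modules, it can help to confirm that the conda command is actually on your PATH. The following is a minimal sketch; the helper name conda_available is hypothetical, and the module names are the ones shown above.

```shell
# Hypothetical helper: succeed if the conda command is on PATH.
conda_available() {
    command -v conda >/dev/null 2>&1
}

if conda_available; then
    echo "conda found: $(conda --version)"
else
    echo "conda not found; try: module load EBModules && module load Anaconda3"
fi
```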
4. (optional) Installing Anaconda safely in your home directory
👉 You can skip to Step 5 if you are using Anaconda as a module
To install Anaconda in your home directory, follow these steps:
- Download the latest version of Anaconda from the official website.
- Open a terminal and navigate to the directory where the downloaded file is located.
- Run the following command to start the installation:
> bash Anaconda3-2021.05-Linux-x86_64.sh
Notes:
The current version of Anaconda may differ from the example above.
The latest version of Anaconda requires Python 3.10.
On Elzar (bamdev1/2), load Python 3.10 or later before proceeding (e.g., module load Python/3.10.8-GCCcore-12.2.0).
Follow the on-screen prompts to complete the installation. Make sure to choose a custom installation location within your home directory.
Once the installation is complete, close and reopen your terminal. Verify that Anaconda is installed by running:
> conda --version
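Beyond the version check, you may want to confirm that the installation really landed inside your home directory. A minimal sketch, assuming conda is on your PATH; "conda info --base" prints the base installation directory, and the helper name is_under is hypothetical. If you are using the module version instead, the base will be reported as outside your home directory.

```shell
# Hypothetical helper: succeed if path $1 lies inside directory $2.
is_under() {
    case "$1" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v conda >/dev/null 2>&1; then
    base_dir=$(conda info --base)
    if is_under "$base_dir" "$HOME"; then
        echo "Conda base is inside your home directory: $base_dir"
    else
        echo "Conda base is outside your home directory: $base_dir"
    fi
else
    echo "conda is not on PATH yet; reopen the terminal or check the installation"
fi
```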
5. Anaconda Basics
Create a new environment:
> conda create --name myenv
Example: Create a new environment named "mytest" and install Python version 3.7:
> conda create --name mytest python=3.7
Activate an environment:
> conda activate mytest
👉 When activating an environment for the first time, you will need to initialize conda, then log out and log in again:
> conda init bash
Deactivate an environment:
> conda deactivate
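To check which environment is currently active, one option is to read the CONDA_DEFAULT_ENV variable, which "conda activate" sets. A minimal sketch; the function name active_env is hypothetical.

```shell
# Hypothetical helper: print the active Conda environment name,
# or "none" when no environment is active.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active environment: $(active_env)"
```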
List all environments:
> conda env list
Remove an environment:
> conda env remove --name mytest
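Before creating or removing an environment, it can be useful to check whether it already exists. The sketch below scans the output of "conda env list" for a name in the first column; the helper name env_in_list is hypothetical, and "mytest" is the example environment from above.

```shell
# Hypothetical helper: read "conda env list" output on stdin and
# succeed if the first column of any line equals the given name.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list mytest; then
        echo "Environment mytest already exists"
    else
        echo "Environment mytest does not exist yet"
    fi
else
    echo "conda is not available in this shell"
fi
```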
Update Conda itself (in the base environment) to the latest version:
> conda update -n base conda
Install packages (after activating environment):
> conda install package_name
Update packages:
> conda update package_name
Remove packages:
> conda remove package_name
List installed packages in a particular environment:
> conda list -n mytest
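To look up the installed version of a single package, you can filter the "conda list" output, whose first two columns are the package name and version. A minimal sketch; the helper name pkg_version is hypothetical, and numpy is only an example package.

```shell
# Hypothetical helper: read "conda list" output on stdin and print
# the version (second column) of the package named in $1.
pkg_version() {
    awk -v pkg="$1" '$1 == pkg { print $2 }'
}

if command -v conda >/dev/null 2>&1; then
    version=$(conda list -n mytest 2>/dev/null | pkg_version numpy)
    echo "numpy in mytest: ${version:-not installed}"
else
    echo "conda is not available in this shell"
fi
```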
6. Tips and Tricks
Keep your base environment clean by always working in a dedicated environment for each project.
Channels
Configure channels to use by default:
> conda config --add channels conda-forge
List currently configured channels:
> conda config --show channels
> conda config --show default_channels channels
Search
Use the conda-forge channel to install packages that are not available in the default channels:
> conda install -c conda-forge package_name
Export and Sharing
Export an environment for sharing:
> conda env export > my_env.yml
Create an environment from an existing yml file:
> conda env create -f my_env.yml
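An exported file records the environment name on a "name:" line, and that name is reused when the environment is recreated. The sketch below reads it back so you know what will be created; the helper name env_name_from_yaml and the placeholder name mytest_copy are hypothetical.

```shell
# Hypothetical helper: print the environment name recorded in an
# exported YAML file.
env_name_from_yaml() {
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Example usage, assuming my_env.yml was produced by "conda env export".
if [ -f my_env.yml ]; then
    echo "my_env.yml describes environment: $(env_name_from_yaml my_env.yml)"
    # To recreate it under a different name (mytest_copy is a placeholder):
    #   conda env create -f my_env.yml -n mytest_copy
fi
```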
7. Exercise
Objective: gain hands-on experience creating, managing, and exporting Conda environments
- Create a new Conda environment named "exercise" and activate it.
- Install the following packages in the "exercise" environment: numpy, pandas, matplotlib, and scikit-learn.
- Export the environment to a file named "exercise_environment.yml".
- Deactivate and remove the "exercise" environment.
- Create a new environment named "exercise_recreated" using the "exercise_environment.yml" file and verify that the required packages are installed.
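One possible walk-through of the exercise is sketched below; try the steps yourself first. It is not the only solution: it targets environments with -n instead of activating them, uses -y to skip confirmation prompts, and writes a second export to recreated.yml for the final check. The names missing_packages, run_exercise, and recreated.yml are hypothetical.

```shell
# Hypothetical helper: print each required package that is absent from an
# exported environment file (dependency lines look like "  - numpy=1.24.3=...").
missing_packages() {
    yaml_file=$1
    shift
    for pkg in "$@"; do
        grep -Eq "^[[:space:]]*-[[:space:]]+${pkg}(=|[[:space:]]*\$)" "$yaml_file" || echo "$pkg"
    done
}

# Sketch of the exercise steps; prints nothing at the end if all four
# packages are present in the recreated environment.
run_exercise() {
    conda create -n exercise -y &&
    conda install -n exercise -y numpy pandas matplotlib scikit-learn &&
    conda env export -n exercise > exercise_environment.yml &&
    conda env remove -n exercise -y &&
    conda env create -n exercise_recreated -f exercise_environment.yml &&
    conda env export -n exercise_recreated > recreated.yml &&
    missing_packages recreated.yml numpy pandas matplotlib scikit-learn
}

# Nothing runs on its own; call run_exercise manually once conda is available.
```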
8. Supplemental: Anaconda vs. Miniconda
Miniconda is a minimal version of Anaconda that includes only Conda and Python. It does not come with any pre-installed packages, so users install only the packages they require.
Advantages
- Smaller in size and thus faster to download and install
- Allows users to install only the packages they require, reducing potential conflicts and ensuring a clean environment
- Suitable for experienced users who prefer to have full control over their package installations
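Disadvantages
- No packages are pre-installed, so anything beyond Conda and Python must be installed manually
In short, Anaconda is a better choice for users who want a comprehensive, ready-to-use solution, while Miniconda is more suitable for users who prefer a minimal, customizable setup.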