## Environment Management

![](../img/env_image.png)

### virtualenv-clone: Create a Copy of a Virtual Environment

Sometimes you might want to use the same virtual environment for two different directories. To create a copy of an existing virtual environment, use virtualenv-clone.

The code below shows how to use virtualenv-clone.

```bash
$ pip install virtualenv-clone
$ virtualenv-clone old_venv/ new_venv/

$ source new_venv/bin/activate
```

[Link to virtualenv-clone](https://github.com/edwardgeorge/virtualenv-clone).

### pip-autoremove: Remove a Package and Its Unused Dependencies

`pip uninstall` removes only the package you name; the dependencies that were installed with it stay in your environment.

In [None]:
!pip install -U pandas-profiling[notebook]

```bash
$ pip uninstall pandas-profiling[notebook] -y
```

In [None]:
!pip uninstall pandas-profiling[notebook] -y

Wouldn't it be nice if you could uninstall a package together with its unused dependencies? That is when `pip-autoremove` comes in handy.

In [None]:
!pip install pip-autoremove

In [None]:
!pip install -U pandas-profiling[notebook]

```bash
$ pip-autoremove pandas-profiling[notebook] -y
```

In [5]:
!pip-autoremove pandas-profiling[notebook] -y

Jinja2 3.0.1 is installed but jinja2~=2.11.2 is required
Redoing requirement with just package name...
spacy 3.1.2 is installed but spacy<3.0.0 is required
Redoing requirement with just package name...
markdown-it-py 0.6.2 is installed but markdown-it-py~=1.0 is required
Redoing requirement with just package name...
attrs 21.2.0 is installed but attrs<21,>=19 is required
Redoing requirement with just package name...
typer 0.3.2 is installed but typer[all]>=0.4 is required
Redoing requirement with just package name...
fsspec 0.8.7 is installed but fsspec[http]>=2021.8.1 is required
Redoing requirement with just package name...
pandas-profiling 3.1.0 (/home/khuyen/book/venv/lib/python3.8/site-packages)
    seaborn 0.11.2 (/home/khuyen/book/venv/lib/python3.8/site-packages)
    htmlmin 0.1.12 (/home/khuyen/book/venv/lib/python3.8/site-packages)
    phik 0.12.0 (/home/khuyen/book/venv/lib/python3.8/site-packages)
    multimethod 1.6 (/home/khuyen/book/venv/lib/python3.8/site-packages)
    

With pip-autoremove, pandas-profiling and its unused dependencies are removed!
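
To make the idea of "unused dependencies" concrete, here is a minimal sketch of the underlying logic. It is not pip-autoremove's actual implementation: the `removable_packages` helper and the dependency graph are hypothetical and exist only for illustration. A dependency counts as unused once every package that requires it is itself being removed.

```python
def removable_packages(target, requires):
    """Return `target` plus every dependency that nothing else still needs."""
    # Build the reverse graph: package -> set of packages that depend on it.
    required_by = {name: set() for name in requires}
    for package, deps in requires.items():
        for dep in deps:
            required_by.setdefault(dep, set()).add(package)

    to_remove = {target}
    changed = True
    while changed:
        changed = False
        for package in list(to_remove):
            for dep in requires.get(package, ()):
                # A dependency is unused once all of its dependents are being removed.
                if dep not in to_remove and required_by[dep] <= to_remove:
                    to_remove.add(dep)
                    changed = True
    return to_remove


# Hypothetical dependency graph (not read from a real environment).
requires = {
    "pandas-profiling": ["seaborn", "htmlmin", "phik", "pandas"],
    "seaborn": ["pandas"],
    "htmlmin": [],
    "phik": ["pandas"],
    "pandas": [],
    "my-app": ["pandas"],  # still installed, so pandas must stay
}

print(sorted(removable_packages("pandas-profiling", requires)))
# ['htmlmin', 'pandas-profiling', 'phik', 'seaborn']
```

In this made-up graph, `pandas` is kept because `my-app` still depends on it.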

[Link to pip-autoremove](https://github.com/invl/pip-autoremove).

### pipreqs: Generate requirements.txt File for Any Project Based on Imports

In [None]:
!pip install pipreqs

`pip freeze` lists every package installed in the environment, including ones that your current project doesn't use. To generate a `requirements.txt` based only on the imports in your code, use pipreqs.

For example, to write the packages imported by your current project to a `requirements.txt` file, run:
```bash
$ pipreqs . 
```

In [2]:
!pipreqs . 

INFO: Successfully saved requirements file in ./requirements.txt


Your `requirements.txt` should look like this:
```txt
numpy==1.21.4
pandas==1.3.4
pyinstrument==4.0.3
typer==0.4.0
```
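
To illustrate what "based on imports" means, the following is a rough sketch of import scanning with the standard-library `ast` module. It is an assumption about the general approach, not pipreqs's own code: the helper names and the example source are hypothetical, and the sketch does not map import names to PyPI package names or look up versions.

```python
import ast
import sys


def top_level_imports(source):
    """Collect the top-level names of absolute imports in Python source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import utils`
            names.add(node.module.split(".")[0])
    return names


def third_party_imports(source):
    """Drop standard-library modules (sys.stdlib_module_names needs Python 3.10+)."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(top_level_imports(source) - set(stdlib))


# Hypothetical project file used only for illustration.
example = """
import os
import numpy as np
import pandas.testing
from typer import Typer
from . import utils
"""

print(third_party_imports(example))
```

On Python 3.10 or later, `os` is filtered out as a standard-library module, leaving only the third-party names.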

Usage of pipreqs:

```bash
Usage:
    pipreqs [options] [<path>]

Arguments:
    <path>                The path to the directory containing the application files for which a requirements file
                          should be generated (defaults to the current working directory)

Options:
    --use-local           Use ONLY local package info instead of querying PyPI
    --pypi-server <url>   Use custom PyPi server
    --proxy <url>         Use Proxy, parameter will be passed to requests library. You can also just set the
                          environments parameter in your terminal:
                          $ export HTTP_PROXY="http://10.10.1.10:3128"
                          $ export HTTPS_PROXY="https://10.10.1.10:1080"
    --debug               Print debug information
    --ignore <dirs>...    Ignore extra directories, each separated by a comma
    --no-follow-links     Do not follow symbolic links in the project
    --encoding <charset>  Use encoding parameter for file open
    --savepath <file>     Save the list of requirements in the given file
    --print               Output the list of requirements in the standard output
    --force               Overwrite existing requirements.txt
    --diff <file>         Compare modules in requirements.txt to project imports
    --clean <file>        Clean up requirements.txt by removing modules that are not imported in project
    --mode <scheme>       Enables dynamic versioning with <compat>, <gt> or <no-pin> schemes
                          <compat> | e.g. Flask~=1.1.2
                          <gt>     | e.g. Flask>=1.1.2
                          <no-pin> | e.g. Flask
```
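
The effect of the `--mode` schemes can be sketched in a few lines. The `format_requirement` helper below is hypothetical and only mimics the formatting shown in the help text above; it is not part of pipreqs.

```python
# Operators for the three --mode schemes listed in the help text above.
MODE_OPERATORS = {"compat": "~=", "gt": ">=", "no-pin": None}


def format_requirement(name, version, mode=None):
    """Format one requirement line; with no mode, pin exactly using `==`."""
    if mode is None:
        return f"{name}=={version}"
    if mode not in MODE_OPERATORS:
        raise ValueError(f"unknown mode: {mode!r}")
    operator = MODE_OPERATORS[mode]
    return name if operator is None else f"{name}{operator}{version}"


for mode in (None, "compat", "gt", "no-pin"):
    print(format_requirement("Flask", "1.1.2", mode))
# Flask==1.1.2
# Flask~=1.1.2
# Flask>=1.1.2
# Flask
```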

[Link to pipreqs](https://github.com/bndr/pipreqs/).

### pydeps: Python Module Dependency Visualization

If you want to generate a graph showing the dependencies between your Python modules, try pydeps.

For example, to generate the dependency graph for files in the folder [top_github_scraper](https://github.com/khuyentran1401/top-github-scraper/tree/master/top_github_scraper), I type:

```bash
$ pydeps top_github_scraper
```
The image below is the output of the command:

![image](../img/top_github_scraper.png)

The folder structure of top_github_scraper looks like this:

```bash
top_github_scraper
├── __init__.py
├── scrape_repo.py
├── scrape_user.py
└── utils.py
```
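
The kind of information such a graph encodes can be sketched with the standard-library `ast` module. This is not pydeps's implementation, and the file contents below are hypothetical stand-ins (the real modules may import different things). The `local_dependencies` helper is likewise made up for illustration.

```python
import ast


def local_dependencies(package, sources):
    """Map each module in `sources` to the sibling modules it imports."""
    prefix = package + "."
    graph = {}
    for module, source in sources.items():
        deps = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.ImportFrom):
                if node.level == 1:
                    # `from .utils import x` or `from . import utils`
                    if node.module:
                        targets = [node.module.split(".")[0]]
                    else:
                        targets = [alias.name for alias in node.names]
                elif node.level == 0 and node.module and node.module.startswith(prefix):
                    # `from package.utils import x`
                    targets = [node.module.split(".")[1]]
                else:
                    targets = []
            elif isinstance(node, ast.Import):
                # `import package.utils`
                targets = [alias.name.split(".")[1] for alias in node.names
                           if alias.name.startswith(prefix)]
            else:
                continue
            deps.update(t for t in targets if t in sources and t != module)
        graph[module] = sorted(deps)
    return graph


# Hypothetical file contents, keyed by module name.
sources = {
    "utils": "import json\n",
    "scrape_repo": "from .utils import save\nimport requests\n",
    "scrape_user": "from top_github_scraper.utils import save\nfrom . import scrape_repo\n",
}

print(local_dependencies("top_github_scraper", sources))
# {'utils': [], 'scrape_repo': ['utils'], 'scrape_user': ['scrape_repo', 'utils']}
```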

[Link to pydeps](https://github.com/thebjorn/pydeps).

### Compare Dependencies of Two Requirements Files

In [None]:
!pip install compare-requirements

Comparing the dependencies in two requirements files by hand is cumbersome, especially when each file lists many dependencies. To automate the comparison, use `compare-requirements`.

For example, if your reqs1.txt looks like this:

In [1]:
%%writefile reqs1.txt
numpy==1.19.5
datacommons-pandas==0.0.3
pandas==1.3.3

Writing reqs1.txt


and your reqs2.txt looks like this:

In [2]:
%%writefile reqs2.txt
numpy==1.19.5
datacommons-pandas==0.0.3
pandas==1.3.4
pandas-datareader==0.10.0

Writing reqs2.txt


Running 
```bash
$ cmpreqs reqs1.txt reqs2.txt
```
will output:

In [2]:
!cmpreqs reqs1.txt reqs2.txt 


Different dependencies
Name    reqs1.txt  reqs2.txt
------  ---------  ---------
pandas  1.3.3      1.3.4    

Equal dependencies
Name                Version
------------------  -------
numpy               1.19.5 
datacommons-pandas  0.0.3  

Only available on reqs2.txt
Name               Version
-----------------  -------
pandas-datareader  0.10.0 

Only available on reqs1.txt
Name  Version
----  -------
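
The four groups in this report can be reproduced with plain dictionary and set operations. The sketch below is an illustration of the comparison logic, not the code of compare-requirements. The `parse_requirements` and `compare` helpers are hypothetical and handle only simple `name==version` lines.

```python
def parse_requirements(text):
    """Parse `name==version` lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def compare(reqs_a, reqs_b):
    """Group packages by how their pins differ between two requirements texts."""
    a, b = parse_requirements(reqs_a), parse_requirements(reqs_b)
    shared = a.keys() & b.keys()
    return {
        "different": {n: (a[n], b[n]) for n in shared if a[n] != b[n]},
        "equal": {n: a[n] for n in shared if a[n] == b[n]},
        "only_in_first": {n: a[n] for n in a.keys() - b.keys()},
        "only_in_second": {n: b[n] for n in b.keys() - a.keys()},
    }


# Same contents as reqs1.txt and reqs2.txt above.
reqs1 = "numpy==1.19.5\ndatacommons-pandas==0.0.3\npandas==1.3.3\n"
reqs2 = "numpy==1.19.5\ndatacommons-pandas==0.0.3\npandas==1.3.4\npandas-datareader==0.10.0\n"

result = compare(reqs1, reqs2)
print(result["different"])       # {'pandas': ('1.3.3', '1.3.4')}
print(result["only_in_second"])  # {'pandas-datareader': '0.10.0'}
```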


[Link to compare-requirements](https://github.com/alsur/compare-requirements).

### Poetry: Python Tool for Dependency Management and Packaging 

Have you ever updated a dependency of your project to a new version, only for your code to suddenly break? The new version may be incompatible with your other dependencies. Wouldn't it be nice if you could check the compatibility between dependencies before installing new ones? That is when Poetry comes in handy.

To understand how Poetry works, start with initializing Poetry:

```bash
$ poetry init
```

In [None]:
!poetry init

Next, install the latest versions of pandas and NumPy using:
```bash
$ poetry add pandas numpy
```

In [7]:
!poetry add pandas numpy

Using version ^1.4.1 for pandas
Using version ^1.22.2 for numpy

Updating dependencies
Resolving dependencies... (0.3s)

Writing lock file

Package operations: 5 installs, 0 updates, 0 removals

  • Installing six (1.16.0): Pending...
  • Installing six (1.16.0): Installing...
  • Installing six (1.16.0)
  • Installing numpy (1.22.2): Pending...
  • Installing python-dateutil (2.8.2): Pending...
  • Installing pytz (2021.3): Pending...
  • Installing

Now your `pyproject.toml` file should look like this:

```toml
# pyproject.toml
[tool.poetry.dependencies]
python = "^3.8"
pandas = "^1.4.1"
numpy = "^1.22.2"
```

Suppose you then decide to use an earlier version of NumPy, so you run:

```bash
$ poetry add 'numpy<1.18'
```

Since pandas 1.4.1 requires numpy>=1.18.5, Poetry refuses to install numpy<1.18 and reports a version-solving error. Thus, you avoid installing dependencies that are incompatible with your current dependencies.
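
The conflict itself is easy to check by hand. The sketch below is a simplified illustration, not Poetry's resolver: the helper names are hypothetical, the list of NumPy releases is a small hand-picked sample, and versions are compared as plain integer tuples rather than with full PEP 440 rules.

```python
import operator

OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
             ">=": operator.ge, "==": operator.eq}


def parse_version(text):
    """Turn '1.18.5' into (1, 18, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraint):
    """Check one constraint such as '>=1.18.5' against a version string."""
    # Try two-character operators first so '>=' is not read as '>'.
    for symbol in ("<=", ">=", "==", "<", ">"):
        if constraint.startswith(symbol):
            bound = parse_version(constraint[len(symbol):])
            return OPERATORS[symbol](parse_version(version), bound)
    raise ValueError(f"unsupported constraint: {constraint!r}")


def compatible_versions(candidates, constraints):
    """Return the candidates that satisfy every constraint."""
    return [v for v in candidates if all(satisfies(v, c) for c in constraints)]


# Hand-picked sample of NumPy releases, for illustration only.
numpy_versions = ["1.17.5", "1.18.5", "1.21.0", "1.22.2"]

# pandas 1.4.1 needs numpy>=1.18.5, but the request was numpy<1.18.
print(compatible_versions(numpy_versions, [">=1.18.5", "<1.18"]))  # []
print(compatible_versions(numpy_versions, [">=1.18.5", "<1.22"]))  # ['1.18.5', '1.21.0']
```

No version can satisfy both `>=1.18.5` and `<1.18`, which is why the request cannot be resolved.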

In [9]:
!poetry add 'numpy<1.18'


Updating dependencies
Resolving dependencies... (0.1s)

  SolverProblemError

  Because pandas (1.4.1) depends on numpy (>=1.18.5)
   and no versions of pandas match >1.4.1,<2.0.0, pandas (>=1.4.1,<2.0.0) requires numpy (>=1.18.5).
  So, because chapter6 depends on both pandas (^1.4.1) and numpy (<1.18), version solving failed.

  at ~/.poetry/lib/poetry/puzzle/solver.py:241 in _solve
      237│             packages = result.packages
      238│         except OverrideNeeded as e:
      239│             return self.solve_in_compatibility_mode(e.overrides, use_latest=use_latest)
      240

To view the sub-dependencies of a dependency, type:

```bash
$ poetry show pandas  
```

In [11]:
!poetry show pandas

name         : pandas
version      : 1.4.1
description  : Powerful data structures for data analysis, time series, and
            statistics

dependencies
 - numpy >=1.18.5
 - numpy >=1.19.2
 - numpy >=1.20.0
 - numpy >=1.21.0
 - python-dateutil >=2.8.1
 - pytz >=2020.1


Another cool thing about Poetry is that when you remove a dependency, it also removes sub-dependencies that are no longer needed in your project.

```bash
$ poetry remove pandas 
```

In [13]:
!poetry remove pandas

Updating dependencies
Resolving dependencies... (0.1s)

Writing lock file

Package operations: 0 installs, 0 updates, 4 removals

  • Removing pandas (1.4.1): Pending...
  • Removing pandas (1.4.1): Removing...
  • Removing pandas (1.4.1)
  • Removing python-dateutil (2.8.2): Pending...
  • Removing python-dateutil (2.8.2): Removing...
  • Removing python-dateutil (2.8.2)
  • Removing pytz (2021.3): Pending...
  • Re

[Link to Poetry](https://python-poetry.org/docs).

[My full article on how to publish your Python package to PyPI using Poetry](https://towardsdatascience.com/how-to-effortlessly-publish-your-python-package-to-pypi-using-poetry-44b305362f9f?gi=f5490e76b74)

### PyInstaller: Bundle a Python Application Into a Single Executable

In [None]:
!pip install pyinstaller

To package a Python application along with its dependencies into a single executable, use PyInstaller. With PyInstaller, users can run the packaged app without installing a Python interpreter or any modules.  

To see how PyInstaller works, let's start with creating a `main.py` script that depends on another Python script and various Python modules.

In [5]:
%%writefile get_data.py
import pandas as pd  
import numpy as np  

def get_data():
    return pd.DataFrame(np.random.randn(10, 2), columns=['A', 'B'])

Writing get_data.py


In [10]:
%%writefile main.py
from get_data import get_data

df = get_data()
print(f'Dataframe:\n{df}')

Overwriting main.py


Next, run PyInstaller against the `main.py` script, passing the `--onefile` option to bundle the application into a single file.

```bash
$ pyinstaller main.py --onefile
```

In [None]:
!pyinstaller main.py --onefile

After the command completes, your directory structure will look like this:
```bash
.
├── build/
├── dist/
│   └── main
├── main.spec
├── main.py
└── get_data.py
```

The `dist/main` file is a single executable that contains your application and all of its dependencies.

Running `dist/main` executes the application.

```bash
$ ./dist/main
```

In [18]:
!./dist/main

Dataframe:
          A         B
0  0.255826 -1.038615
1 -0.850358  0.318558
2  1.255311  0.618789
3  1.434642  0.474813
4  0.676099  1.662942
5  2.314174 -0.142569
6 -0.704812 -0.095609
7 -0.156275 -0.999871
8  0.839902  0.366550
9 -1.818387 -1.512015


You can conveniently share this file with your colleagues, allowing them to run the application without any additional setup or installations.

[Link to PyInstaller](https://github.com/pyinstaller/pyinstaller)