Determine Installed Python Packages and Their Disk Usage

Useful for Disk Cleanup

When working with Python, it's common to install numerous packages over time. Some of these packages might no longer be needed, and identifying them can help free up valuable disk space. In this blog post, we'll explore a practical way to list all installed Python packages, locate their directories, and estimate their size on disk.

The Command Breakdown

The following command pipeline is a powerful one-liner that accomplishes our goal:

pip list |
tail -n +3 |
awk '{print $1}' |
xargs pip show |
grep -E 'Location:|Name:' |
cut -d ' ' -f 2 |
paste -d ' ' - - |
awk '{print $2 "/" tolower($1)}' |
xargs du -sh 2> /dev/null |
sort -hr

Let's break this command into digestible pieces:

  1. pip list: Lists all installed Python packages and their versions.

  2. tail -n +3: Starts the output at the third line, dropping the column header and the dashed separator beneath it, to leave only the package names and versions.

  3. awk '{print $1}': Extracts the first column, which contains the package names.

  4. xargs pip show: Feeds the package names to the pip show command to retrieve details about each package.

  5. grep -E 'Location:|Name:': Filters the output to include only the Location and Name fields.

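  6. cut -d ' ' -f 2: Splits each line on spaces and extracts the second field, which is the value of the Name or Location field. A location path that contains spaces would be cut off at the first space.

  7. paste -d ' ' - -: Joins each pair of consecutive lines (a Name followed by its Location) into a single line per package.

  8. awk '{print $2 "/" tolower($1)}': Constructs a likely path to each package by appending the lowercased package name to its location. This assumes the package directory is named after the package, which is not always true (for example, PyYAML installs into a directory named yaml).

  9. xargs du -sh 2> /dev/null: Calculates the disk usage of each package directory and suppresses error messages (e.g., for paths that do not exist or cannot be read).

  10. sort -hr: Sorts the results by human-readable size (-h) in descending order (-r), so the largest packages come first.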
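
To see what steps 5 through 8 do without touching a real environment, here is a minimal sketch that runs those same four stages on canned pip show output. The function names pip_show_to_paths and sample_pip_show, along with the sample paths and versions, are made up for illustration.

```shell
# Hypothetical helper: read `pip show` output from stdin and print one
# candidate directory path per package, using the stages from steps 5-8.
pip_show_to_paths() {
  grep -E 'Location:|Name:' |
    cut -d ' ' -f 2 |
    paste -d ' ' - - |
    awk '{print $2 "/" tolower($1)}'
}

# Canned text resembling `pip show` output for two packages.
# The location and version values are invented for this example.
sample_pip_show() {
  cat <<'EOF'
Name: numpy
Version: 1.26.4
Summary: Fundamental package for array computing in Python
Location: /opt/venv/lib/python3.11/site-packages
Requires:
Required-by: pandas
---
Name: PyYAML
Version: 6.0.1
Summary: YAML parser and emitter for Python
Location: /opt/venv/lib/python3.11/site-packages
Requires:
Required-by:
EOF
}

# Prints one path per package: .../site-packages/numpy and .../site-packages/pyyaml
sample_pip_show | pip_show_to_paths
```

The second path ends in pyyaml, but PyYAML actually installs its code in a directory named yaml. On a real system du would fail on that path, and the 2> /dev/null in step 9 would hide the error, so packages like this are silently missing from the final list.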

Example Output

Running this command produces output similar to:

12M /path/to/python/site-packages/numpy
8.5M /path/to/python/site-packages/pandas
3.4M /path/to/python/site-packages/scipy
...

This shows the size of each package whose directory was found, helping you identify large ones that might no longer be necessary.

Use Cases

  1. Disk Space Cleanup: Remove large, unused packages to free up space.

  2. Environment Management: Understand which packages are installed and ensure that only necessary ones are present in your environment.

Pro Tip: Automating Cleanup

Once you identify packages you no longer need, you can remove them using:

pip uninstall <package_name>
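
If you have several packages to remove, one cautious approach is to generate the commands first and review them before running anything. The sketch below assumes a hypothetical helper, print_uninstall_commands, that reads package names from standard input. The -y flag tells pip uninstall to skip its confirmation prompt.

```shell
# Hypothetical helper: read package names from stdin (one per line) and
# print the matching uninstall command for each. Nothing is uninstalled.
print_uninstall_commands() {
  # Drop blank lines and lines that start with '#', then build the commands.
  grep -v -E '^[[:space:]]*(#|$)' |
    awk '{print "pip uninstall -y " $1}'
}

# Example with an inline list; the package names are placeholders.
printf '%s\n' '# packages to remove' 'scipy' '' 'pandas' | print_uninstall_commands
```

Once the printed commands look right, you could pipe them to sh. Alternatively, put the names in a file and pass it to pip uninstall with the -r option.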

Considerations

  • Virtual Environments: Always run this command within the virtual environment you want to inspect to avoid confusion with global packages.
  • Dependencies: Be cautious when uninstalling packages as they may be dependencies for others.
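
Both considerations can be checked from the shell. As a rough sketch: the Required-by field in pip show output lists installed packages that depend on a given package, and the standard activate scripts set the VIRTUAL_ENV variable. The helper names required_by and active_env are hypothetical, and the sample input is canned rather than real pip output.

```shell
# Hypothetical helper: read `pip show` output for one package from stdin
# and print the names in its "Required-by:" field, one per line.
required_by() {
  sed -n 's/^Required-by: *//p' |
    tr ',' '\n' |
    sed -e 's/^ *//' -e '/^$/d'
}

# Hypothetical helper: report which environment pip would inspect,
# based on the VIRTUAL_ENV variable set by venv/virtualenv activation.
active_env() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "virtualenv: $VIRTUAL_ENV"
  else
    echo "no virtualenv active"
  fi
}

# Canned example: a package that two other packages depend on.
printf '%s\n' 'Name: numpy' 'Requires:' 'Required-by: pandas, scipy' | required_by
active_env
```

In practice you would run something like pip show numpy | required_by before uninstalling numpy. An empty result suggests that nothing else installed in that environment declares a dependency on it.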

By understanding your Python environment, you can keep it clean, efficient, and ready for action. Try out the command and see how much disk space you can reclaim!