Package Management with Pip

As mentioned, Python has a rich ecosystem of third-party, open source libraries called packages, which provide ready-made capabilities that make our lives easier.

If we want to use one of these packages, we must first install it. After installation, we can import its functionality and use it in our program.

In Google Colab, many of the most popular Python packages come pre-installed, so there is no need to install them again.

It is also very common to want other packages from the Python ecosystem, whether shared on GitHub or hosted more officially on the Python Package Index (PyPI), the official central repository for third-party Python packages.

To manage the Python package installation process, we use Python’s sidekick, a command-line tool called Pip.

Note

In Google Colab, when we use terminal commands (such as pip commands), we prefix them with an exclamation point (!), to differentiate them from Python code.

!pip --version
pip 23.1.2 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10)
#!pip --help

Listing Installed Packages

We use the pip list command to list all packages installed in the current environment (which in this case is the Google Colab environment).

!pip list

Wow, we see there are dozens of packages already installed.

If you are looking for a specific package, you can filter the displayed output by piping it to a grep command that searches for the package name (e.g. pandas):

!pip list | grep pandas
geopandas                        0.13.2
pandas                           2.0.3
pandas-datareader                0.10.0
pandas-gbq                       0.19.2
pandas-stubs                     2.0.3.230814
sklearn-pandas                   2.2.0
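The same kind of search can also be done from Python code rather than the shell. The following is a minimal sketch that uses the standard library's importlib.metadata module; the helper name find_installed is our own invention for illustration, not part of pip or Colab.

```python
from importlib.metadata import distributions


def find_installed(substring):
    """Return sorted (name, version) pairs for installed packages
    whose name contains the given substring (case-insensitive).

    find_installed is a hypothetical helper written for this example.
    """
    matches = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        # Skip any distribution with missing or broken metadata.
        if name and substring.lower() in name.lower():
            matches.add((name, dist.version or ""))
    return sorted(matches)


# Roughly comparable to: pip list | grep pandas
for name, version in find_installed("pandas"):
    print(f"{name:<32} {version}")
```

The exact results depend on what is installed in the current environment, so the output may differ from the pip list output shown above.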

Installing Packages

We use the pip install command to install a package, supplying the name of the package we want to install.

For example, to install the pandas package, we would run pip install pandas (although the output shows it is already installed in Colab):

!pip install pandas
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (2.0.3)
Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2023.4)
Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.1)
Requirement already satisfied: numpy>=1.21.0 in /usr/local/lib/python3.10/dist-packages (from pandas) (1.25.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
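If we would rather skip the install step entirely when a package is already available, one option is to check from Python first. This is only a sketch under our own assumptions: ensure_installed is a hypothetical helper, and it assumes pip is available to the Python interpreter that is running the notebook.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(package_name, import_name=None):
    """Install package_name with pip only if it cannot already be imported.

    ensure_installed is a hypothetical helper written for this example.
    Some packages are imported under a different name than the one used
    to install them, so import_name can be given separately.
    Returns True if an install was attempted, False otherwise.
    """
    import_name = import_name or package_name
    if importlib.util.find_spec(import_name) is not None:
        return False  # already importable, nothing to do
    # Run pip with the same interpreter that is running this code.
    subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])
    return True


# Example call (left commented out so nothing is installed by accident):
# ensure_installed("pandas")
```

In a notebook, the !pip install approach shown above is usually simpler; a helper like this is mainly useful when the check needs to happen inside a larger script.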

Let’s try installing a package for real, such as the yahooquery package, which provides access to financial data:

%%capture
!pip install yahooquery

Pro Tip

Adding the optional “capture magic” (%%capture) as the first line of a cell suppresses the cell’s output, which keeps the lengthy installation log out of the notebook.

When we install a package, Pip downloads it from PyPI (typically as a pre-built wheel, otherwise as a source archive) and installs it into the current development environment.

!pip list | grep yahooquery
yahooquery                       2.3.7
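We can also confirm the installed version of a single package from Python code. The sketch below uses importlib.metadata.version from the standard library; the wrapper function installed_version is a hypothetical name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if not installed.

    installed_version is a hypothetical helper written for this example.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if yahooquery is installed, otherwise None.
print(installed_version("yahooquery"))
```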

Using Installed Packages

After installing a package, we can import its functionality and use it. Generally this involves reading the package documentation to see what capabilities it provides. For example, the yahooquery package documentation says we can get started like this:

# https://yahooquery.dpguthrie.com/
# https://pypi.org/project/yahooquery/

from yahooquery import Ticker

t = Ticker('aapl')
t.summary_detail
{'aapl': {'maxAge': 1,
  'priceHint': 2,
  'previousClose': 208.14,
  'open': 209.08,
  'dayLow': 208.61,
  'dayHigh': 210.77,
  'regularMarketPreviousClose': 208.14,
  'regularMarketOpen': 209.08,
  'regularMarketDayLow': 208.61,
  'regularMarketDayHigh': 210.77,
  'dividendRate': 1.0,
  'dividendYield': 0.0047999998,
  'exDividendDate': '2024-05-10 00:00:00',
  'payoutRatio': 0.14930001,
  'fiveYearAvgDividendYield': 0.71,
  'beta': 1.25,
  'trailingPE': 32.734837,
  'forwardPE': 28.912773,
  'volume': 4248669,
  'regularMarketVolume': 4248669,
  'averageVolume': 68294093,
  'averageVolume10days': 122265030,
  'averageDailyVolume10Day': 122265030,
  'bid': 210.52,
  'ask': 210.4,
  'bidSize': 300,
  'askSize': 400,
  'marketCap': 3227597930496,
  'fiftyTwoWeekLow': 164.08,
  'fiftyTwoWeekHigh': 220.2,
  'priceToSalesTrailing12Months': 8.457556,
  'fiftyDayAverage': 187.4738,
  'twoHundredDayAverage': 183.01765,
  'trailingAnnualDividendRate': 0.96,
  'trailingAnnualDividendYield': 0.00461228,
  'currency': 'USD',
  'fromCurrency': None,
  'toCurrency': None,
  'lastMarket': None,
  'coinMarketCapLink': None,
  'algorithm': None,
  'tradeable': False}}
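The result is a nested dictionary, keyed first by ticker symbol and then by field name, so we can pull out individual values with ordinary dictionary access. The sketch below hard-codes a few values copied from the output above instead of calling the API, so it runs without a network connection; the variable names and the two derived figures are our own illustration.

```python
# Sample values copied from the summary_detail output shown above.
# In a live notebook this dictionary would come from t.summary_detail.
summary_detail = {
    "aapl": {
        "previousClose": 208.14,
        "fiftyTwoWeekLow": 164.08,
        "fiftyTwoWeekHigh": 220.2,
        "marketCap": 3227597930496,
        "currency": "USD",
    }
}

# First level: ticker symbol. Second level: individual fields.
details = summary_detail["aapl"]

# Express the market cap in trillions for readability.
market_cap_trillions = details["marketCap"] / 1e12

# Where the previous close sits within the 52-week range
# (0.0 means at the low, 1.0 means at the high).
low = details["fiftyTwoWeekLow"]
high = details["fiftyTwoWeekHigh"]
range_position = (details["previousClose"] - low) / (high - low)

print(f"Market cap: {market_cap_trillions:.2f} trillion {details['currency']}")
print(f"Previous close is {range_position:.0%} of the way up the 52-week range")
```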

Reading the package documentation is essential for learning what a package can do. After further consulting the yahooquery package documentation, we find additional capabilities:

# https://yahooquery.dpguthrie.com/guide/ticker/intro/

t.summary_profile
{'aapl': {'address1': 'One Apple Park Way',
  'city': 'Cupertino',
  'state': 'CA',
  'zip': '95014',
  'country': 'United States',
  'phone': '408 996 1010',
  'website': 'https://www.apple.com',
  'industry': 'Consumer Electronics',
  'industryKey': 'consumer-electronics',
  'industryDisp': 'Consumer Electronics',
  'sector': 'Technology',
  'sectorKey': 'technology',
  'sectorDisp': 'Technology',
  'longBusinessSummary': 'Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, and HomePod. It also provides AppleCare support and cloud services; and operates various platforms, including the App Store that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. In addition, the company offers various services, such as Apple Arcade, a game subscription service; Apple Fitness+, a personalized fitness service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It distributes third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. was founded in 1976 and is headquartered in Cupertino, California.',
  'fullTimeEmployees': 150000,
  'companyOfficers': [],
  'irWebsite': 'http://investor.apple.com/',
  'maxAge': 86400}}

We can continue to explore the documentation to learn about additional functionality.

By installing and using packages, we are harnessing the power of the open source Python ecosystem.