38  Google Sheets Integrations

Google Colab notebooks can easily integrate with other Google products and services

In this notebook we will interface with Google Sheets to programmatically create sheets, and read and write data.

38.1 Document Setup

We first need to create a Google Sheets document, for instance this example document. If anyone else besides you will be running this notebook, also share the document with them.

Before proceeding, we observe the identifier of the Google Sheets document from the URL (between the “/d/” and “/edit”):

# URL: https://docs.google.com/spreadsheets/d/DOCUMENT_ID/edit?usp=sharing

DOCUMENT_ID = "1Y5JD6sLakLmcr723HA4pUBnqGQgzxW3q_hb2ERe_4us"

After noting the document identifier, we can use some boilerplate code to authenticate and access the document.

38.2 Authorization

Authorizing Google Colab to authenticate on behalf of the currently logged in user:

from google.colab import auth

auth.authenticate_user()

This process asks you to allow the notebook to access your credentials.

Screenshot of login screen that asks if its ok to “Allow this notebook to access your Google credentials?”

Getting credentials for the logged in user:

from google.auth import default

creds, _ = default()
creds
<google.auth.compute_engine.credentials.Credentials at 0x7c8615edcf50>

38.3 Connecting to Google Sheets

To interface with the Google Sheets API, we will be using the gspread package.

Verifying the package is installed:

!pip show gspread
Name: gspread
Version: 6.2.1
Summary: Google Spreadsheets Python API
Home-page: 
Author: 
Author-email: Anton Burnashev <fuss.here@gmail.com>
License: 
Location: /usr/local/lib/python3.12/dist-packages
Requires: google-auth, google-auth-oauthlib
Required-by: gspread-dataframe

Creating a client object that can make requests to the API:

import gspread

client = gspread.authorize(creds)
client
<gspread.client.Client at 0x7ea375575d00>

Accessing the document (referencing the document identifier from the “Document Setup” step):

doc = client.open_by_key(DOCUMENT_ID)
print("DOC:", type(doc))
DOC: <class 'gspread.spreadsheet.Spreadsheet'>

Accessing metadata about the document, such as it’s title:

print(doc.title)
Google Sheets Integration Demo

38.3.1 Listing Sheets

Listing worksheets in the document:

doc.worksheets()
[<Worksheet 'daily-prices-nflx' id:0>,
 <Worksheet 'daily-prices-nflx-copy' id:193822370>]

38.3.2 Accessing a Given Sheet

Accessing a specific sheet, referencing the sheet name:

sheet = doc.worksheet("daily-prices-nflx")
print("SHEET:", type(sheet))
SHEET: <class 'gspread.worksheet.Worksheet'>

Accessing metadata about the sheet, such as the title:

sheet.title
'daily-prices-nflx'

38.3.3 Reading Data

Reading data from a given sheet, using the sheet object’s get_all_records method:

records = sheet.get_all_records()
print("ROWS:", len(records))
ROWS: 5555

This returns a list of dictionaries, where the dictionary’s keys correspond with the column names.

Inspecting the first row (for example):

records[0]
{'timestamp': '2024-06-17',
 'open': 669.11,
 'high': 682.7099,
 'low': 665.1101,
 'close': 675.83,
 'adjusted_close': 675.83,
 'volume': 3631184,
 'dividend_amount': 0,
 'split_coefficient': 1}

Converting records to a dataframe, as desired:

from pandas import DataFrame

df = DataFrame(records)
df.head()
timestamp open high low close adjusted_close volume dividend_amount split_coefficient
0 2024-06-17 669.11 682.7099 665.1101 675.83 675.83 3631184 0 1.0
1 2024-06-14 655.05 675.5800 652.6400 669.38 669.38 4447116 0 1.0
2 2024-06-13 644.00 655.2500 642.3500 653.26 653.26 1863587 0 1.0
3 2024-06-12 652.21 655.7800 643.1100 650.06 650.06 2094381 0 1.0
4 2024-06-11 640.72 650.1900 640.5200 648.55 648.55 2167417 0 1.0

38.4 Helper Functions

Here are some functions to help you more easily interface with the gspread package.

Feel free to copy and adapt these helper functions to suit your own needs, as desired.

38.4.1 Find or Create Sheet

Helper function to find or create a given sheet:

Code
from gspread.exceptions import WorksheetNotFound


def find_or_create_sheet(sheet_name, doc=doc):
    """Access a sheet within the document, or create if not exists.

    Params:
        sheet_name (str): name of the sheet to access or create.

        doc (gspread.Spreadsheet): Google Sheets document object.

    Returns:
        gspread.Worksheet: sheet object
    """
    try:
        sheet = doc.worksheet(sheet_name)
    except WorksheetNotFound:
        print("CREATING NEW SHEET...")
        sheet = doc.add_worksheet(title=sheet_name, rows="10", cols="10")
    return sheet

Using the function to create a new sheet:

teams_sheet = find_or_create_sheet("teams-sheet")
print("SHEET:", type(teams_sheet))
print(teams_sheet.title)
CREATING NEW SHEET...
SHEET: <class 'gspread.worksheet.Worksheet'>
teams-sheet

38.4.2 Writing Data to the Sheet

Helper function for writing data:

Code
def write_data_to_sheet(data, sheet):
    """Writes a list of dictionaries to a given sheet.

    Infers the column names from the dictionary keys.

    Params:
        data (list): list of dictionaries you would like to write to sheet.

        sheet (gspread.Worksheet): sheet object you would like to write to.
    """
    # get column names from the first dictionary:
    header_row = list(data[0].keys())
    # convert dictionaries to lists of just the cell values:
    rows = [list(d.values()) for d in data]
    # assemble a single list of lists:
    all_rows = [header_row] + rows

    # clear sheet contents:
    sheet.clear()
    # write new contents to the sheet:
    sheet.update(all_rows)

Using the function to write data:

teams = [
    {"city": "New York", "name": "Yankees"},
    {"city": "New York", "name": "Mets"},
    {"city": "Boston", "name": "Red Sox"},
    {"city": "New Haven", "name": "Ravens"}
]
write_data_to_sheet(teams, teams_sheet)

Verifying the data has been written:

teams_sheet.get_all_records()
[{'city': 'New York', 'name': 'Yankees'},
 {'city': 'New York', 'name': 'Mets'},
 {'city': 'Boston', 'name': 'Red Sox'},
 {'city': 'New Haven', 'name': 'Ravens'}]

38.4.3 Writing a Dataframe to the Sheet

Helper function for writing a pandas.DataFrame object to a sheet:

Code
def write_dataframe_to_sheet(df, sheet):
    """Write dataframe contents to a given sheet.

    Params:
        df (pandas.DataFrame): dataframe you would like to write to sheet.

        sheet (gspread.Worksheet): sheet object you would like to write to.
    """
    # get column names:
    header_row = df.columns.tolist()
    # get row values:
    rows = df.values.tolist()
    # single object with column names and row values:
    all_rows = [header_row] + rows

    # clear current contents of the sheet:
    sheet.clear()
    # write new contents to the sheet:
    sheet.update(all_rows)

Creating a new dataframe:

from pandas import DataFrame

teams_df = DataFrame(teams)
teams_df.head()
city name
0 New York Yankees
1 New York Mets
2 Boston Red Sox
3 New Haven Ravens

Using the function to write the data:

write_dataframe_to_sheet(teams_df, teams_sheet)

Verifying the data got written:

teams_sheet.get_all_records()
[{'city': 'New York', 'name': 'Yankees'},
 {'city': 'New York', 'name': 'Mets'},
 {'city': 'Boston', 'name': 'Red Sox'},
 {'city': 'New Haven', 'name': 'Ravens'}]