# URL: https://docs.google.com/spreadsheets/d/DOCUMENT_ID/edit?usp=sharing
= "1Y5JD6sLakLmcr723HA4pUBnqGQgzxW3q_hb2ERe_4us" DOCUMENT_ID
38 Google Sheets Integrations
Google Colab notebooks can easily integrate with other Google products and services
In this notebook we will interface with Google Sheets to programmatically create sheets, and read and write data.
38.1 Document Setup
We first need to create a Google Sheets document, for instance this example document. If anyone else besides you will be running this notebook, also share the document with them.
Before proceeding, we observe the identifier of the Google Sheets document from the URL (between the “/d/” and “/edit”):
After noting the document identifier, we can use some boilerplate code to authenticate and access the document.
38.3 Connecting to Google Sheets
To interface with the Google Sheets API, we will be using the gspread
package.
Verifying the package is installed:
!pip show gspread
Name: gspread
Version: 6.2.1
Summary: Google Spreadsheets Python API
Home-page:
Author:
Author-email: Anton Burnashev <fuss.here@gmail.com>
License:
Location: /usr/local/lib/python3.12/dist-packages
Requires: google-auth, google-auth-oauthlib
Required-by: gspread-dataframe
Creating a client object that can make requests to the API:
import gspread
= gspread.authorize(creds)
client client
<gspread.client.Client at 0x7ea375575d00>
Accessing the document (referencing the document identifier from the “Document Setup” step):
= client.open_by_key(DOCUMENT_ID)
doc print("DOC:", type(doc))
DOC: <class 'gspread.spreadsheet.Spreadsheet'>
Accessing metadata about the document, such as it’s title:
print(doc.title)
Google Sheets Integration Demo
38.3.1 Listing Sheets
Listing worksheets in the document:
doc.worksheets()
[<Worksheet 'daily-prices-nflx' id:0>,
<Worksheet 'daily-prices-nflx-copy' id:193822370>]
38.3.2 Accessing a Given Sheet
Accessing a specific sheet, referencing the sheet name:
= doc.worksheet("daily-prices-nflx")
sheet print("SHEET:", type(sheet))
SHEET: <class 'gspread.worksheet.Worksheet'>
Accessing metadata about the sheet, such as the title:
sheet.title
'daily-prices-nflx'
38.3.3 Reading Data
Reading data from a given sheet, using the sheet object’s get_all_records
method:
= sheet.get_all_records()
records print("ROWS:", len(records))
ROWS: 5555
This returns a list of dictionaries, where the dictionary’s keys correspond with the column names.
Inspecting the first row (for example):
0] records[
{'timestamp': '2024-06-17',
'open': 669.11,
'high': 682.7099,
'low': 665.1101,
'close': 675.83,
'adjusted_close': 675.83,
'volume': 3631184,
'dividend_amount': 0,
'split_coefficient': 1}
Converting records to a dataframe, as desired:
from pandas import DataFrame
= DataFrame(records)
df df.head()
timestamp | open | high | low | close | adjusted_close | volume | dividend_amount | split_coefficient | |
---|---|---|---|---|---|---|---|---|---|
0 | 2024-06-17 | 669.11 | 682.7099 | 665.1101 | 675.83 | 675.83 | 3631184 | 0 | 1.0 |
1 | 2024-06-14 | 655.05 | 675.5800 | 652.6400 | 669.38 | 669.38 | 4447116 | 0 | 1.0 |
2 | 2024-06-13 | 644.00 | 655.2500 | 642.3500 | 653.26 | 653.26 | 1863587 | 0 | 1.0 |
3 | 2024-06-12 | 652.21 | 655.7800 | 643.1100 | 650.06 | 650.06 | 2094381 | 0 | 1.0 |
4 | 2024-06-11 | 640.72 | 650.1900 | 640.5200 | 648.55 | 648.55 | 2167417 | 0 | 1.0 |
38.4 Helper Functions
Here are some functions to help you more easily interface with the gspread
package.
Feel free to copy and adapt these helper functions to suit your own needs, as desired.
38.4.1 Find or Create Sheet
Helper function to find or create a given sheet:
Code
from gspread.exceptions import WorksheetNotFound
def find_or_create_sheet(sheet_name, doc=doc):
"""Access a sheet within the document, or create if not exists.
Params:
sheet_name (str): name of the sheet to access or create.
doc (gspread.Spreadsheet): Google Sheets document object.
Returns:
gspread.Worksheet: sheet object
"""
try:
= doc.worksheet(sheet_name)
sheet except WorksheetNotFound:
print("CREATING NEW SHEET...")
= doc.add_worksheet(title=sheet_name, rows="10", cols="10")
sheet return sheet
Using the function to create a new sheet:
= find_or_create_sheet("teams-sheet")
teams_sheet print("SHEET:", type(teams_sheet))
print(teams_sheet.title)
CREATING NEW SHEET...
SHEET: <class 'gspread.worksheet.Worksheet'>
teams-sheet
38.4.2 Writing Data to the Sheet
Helper function for writing data:
Code
def write_data_to_sheet(data, sheet):
"""Writes a list of dictionaries to a given sheet.
Infers the column names from the dictionary keys.
Params:
data (list): list of dictionaries you would like to write to sheet.
sheet (gspread.Worksheet): sheet object you would like to write to.
"""
# get column names from the first dictionary:
= list(data[0].keys())
header_row # convert dictionaries to lists of just the cell values:
= [list(d.values()) for d in data]
rows # assemble a single list of lists:
= [header_row] + rows
all_rows
# clear sheet contents:
sheet.clear()# write new contents to the sheet:
sheet.update(all_rows)
Using the function to write data:
= [
teams "city": "New York", "name": "Yankees"},
{"city": "New York", "name": "Mets"},
{"city": "Boston", "name": "Red Sox"},
{"city": "New Haven", "name": "Ravens"}
{
] write_data_to_sheet(teams, teams_sheet)
Verifying the data has been written:
teams_sheet.get_all_records()
[{'city': 'New York', 'name': 'Yankees'},
{'city': 'New York', 'name': 'Mets'},
{'city': 'Boston', 'name': 'Red Sox'},
{'city': 'New Haven', 'name': 'Ravens'}]
38.4.3 Writing a Dataframe to the Sheet
Helper function for writing a pandas.DataFrame
object to a sheet:
Code
def write_dataframe_to_sheet(df, sheet):
"""Write dataframe contents to a given sheet.
Params:
df (pandas.DataFrame): dataframe you would like to write to sheet.
sheet (gspread.Worksheet): sheet object you would like to write to.
"""
# get column names:
= df.columns.tolist()
header_row # get row values:
= df.values.tolist()
rows # single object with column names and row values:
= [header_row] + rows
all_rows
# clear current contents of the sheet:
sheet.clear()# write new contents to the sheet:
sheet.update(all_rows)
Creating a new dataframe:
from pandas import DataFrame
= DataFrame(teams)
teams_df teams_df.head()
city | name | |
---|---|---|
0 | New York | Yankees |
1 | New York | Mets |
2 | Boston | Red Sox |
3 | New Haven | Ravens |
Using the function to write the data:
write_dataframe_to_sheet(teams_df, teams_sheet)
Verifying the data got written:
teams_sheet.get_all_records()
[{'city': 'New York', 'name': 'Yankees'},
{'city': 'New York', 'name': 'Mets'},
{'city': 'Boston', 'name': 'Red Sox'},
{'city': 'New Haven', 'name': 'Ravens'}]