10  Strings

The string datatype represents textual data. A string can contain any number of characters, including letters, digits, punctuation, whitespace, and other symbols.

10.1 Characteristics

There are three ways to construct a string: by using single quotes, double quotes, or triple quotes (for a multi-line string).

Single quotes (') on the extremities:

message = 'HELLO WORLD'
print(message)
HELLO WORLD

Double quotes (") on the extremities:

message = "HELLO WORLD"
print(message)
HELLO WORLD

Multi-line string, using triple quotes (""") on the extremities:

message = """
This is a menu for our application.

To get started, follow these instructions:

  1. __________
  2. ____________
  3. _____________

"""
print(message)

This is a menu for our application.

To get started, follow these instructions:

  1. __________
  2. ____________
  3. _____________

In practice, we might prefer to use double quotes by default. This allows us to use single quotes (apostrophes) inside the string, for example in contractions. Inside a single-quoted string, an unescaped apostrophe would end the string early and cause a syntax error:

message = "It's awesome!"
print(message)
It's awesome!
#message = 'It's awesome!' # NOT VALID (BREAKS QUOTING LEVEL)
#print(message)
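
If single quotes are still preferred on the extremities, one option is to escape the inner apostrophe with a backslash. The following is a minimal sketch of that alternative, not a recommendation over double quotes. The variable names escaped and mixed are illustrative only:

```python
# Escape the inner single quote with a backslash so it does not end the string.
escaped = 'It\'s awesome!'
print(escaped)  # It's awesome!

# Triple quotes tolerate both kinds of quote characters inside the string.
mixed = """It's "awesome"!"""
print(mixed)  # It's "awesome"!
```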

10.2 Operations

In practice, some common operations we perform with strings include concatenation, case manipulation, and substring checking, among others.

10.2.1 Concatenation

message = "HELLO" + "WORLD"
print(message)

message = "HELLO " + "WORLD"
print(message)

message = "HELLO" + " WORLD"
print(message)

message = "HELLO" + " " + "WORLD"
print(message)
HELLOWORLD
HELLO WORLD
HELLO WORLD
HELLO WORLD

10.2.2 Format Strings

As an alternative to concatenation, we can use a format string to build a string dynamically.

Recall that we are not able to concatenate strings with non-string datatypes such as numbers. To overcome this limitation, we could use a string conversion function such as str, or more commonly, a format string.
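
To make the limitation concrete, the sketch below attempts the invalid concatenation, catches the resulting error, and then uses the built-in str function as the conversion step. The variable names are illustrative:

```python
price = 4.5

# Concatenating a string with a float raises a TypeError.
try:
    message = "PRICE: $" + price
except TypeError as err:
    print("TypeError:", err)

# Converting the number to a string first makes the concatenation valid.
message = "PRICE: $" + str(price)
print(message)  # PRICE: $4.5
```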

The format string allows us to inject one or more variables into a string.

To implement a format string, we place the letter f immediately before the opening quote, and use curly braces ({}) inside the string. Whatever variable we put inside the curly braces gets injected into the string:

price = 4.5
print(f"PRICE: ${price}")
PRICE: $4.5

We can use a formatting directive such as :.2f to round a number to two decimal places, for example to display it as US dollars (USD):

price = 4.5
print(f"PRICE: ${price:.2f}")
PRICE: $4.50

We can use a formatting directive such as :, to add a thousands separator:

shares = 120000
print(f"SHARES: {shares:,}")
SHARES: 120,000

Note that it is possible to inject multiple variables into a format string, and to mix and match formatting directives:

price = 4.5
shares = 120000
stake = price * shares
print(f"PRICE: ${price:.2f} | SHARES: {shares:,} | STAKE: ${stake:,.2f}")
PRICE: $4.50 | SHARES: 120,000 | STAKE: $540,000.00
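
The curly braces are not limited to plain variable names. As a small sketch, the example below places an expression directly inside the braces, and uses the % directive to display a fraction as a percentage. The change variable is a hypothetical value added for illustration:

```python
price = 4.5
shares = 120000
change = 0.034  # hypothetical daily change, expressed as a fraction

# Expressions are evaluated inside the braces before formatting is applied.
line_total = f"STAKE: ${price * shares:,.2f}"
print(line_total)  # STAKE: $540,000.00

# The % directive multiplies by 100 and appends a percent sign.
line_change = f"CHANGE: {change:.1%}"
print(line_change)  # CHANGE: 3.4%
```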

10.2.3 Case Manipulation

Converting to all uppercase:

"hello WorlD".upper()
'HELLO WORLD'

Converting to all lowercase:

"hello WorlD".lower()
'hello world'

Converting to title case, where the first letter of each word is capitalized:

"hello WorlD".title()
'Hello World'
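
Two details are worth keeping in mind when using these methods. The sketch below shows that they return a new string rather than changing the original, and that title capitalizes any letter following a non-letter, which can surprise us with contractions. The variable names are illustrative:

```python
original = "hello WorlD"

# The case methods return a new string; the original is left unchanged.
shouted = original.upper()
print(original)  # hello WorlD
print(shouted)   # HELLO WORLD

# title() treats the letter after an apostrophe as the start of a new word.
contraction = "it's awesome".title()
print(contraction)  # It'S Awesome
```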

10.2.4 Substring Checking

We can use the membership operator (in) to perform substring checking:

"h" in "hello world"
True

Note that this check is case sensitive:

"H" in "hello world"
False
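
When case should not matter, one common approach is to convert both strings to the same case before checking. This is a minimal sketch that combines the lower method with the in operator. The text and query names are illustrative:

```python
text = "hello world"
query = "H"

# A direct check is case sensitive.
print(query in text)  # False

# Lowercasing both sides makes the check case insensitive.
found = query.lower() in text.lower()
print(found)  # True
```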

10.2.5 Substring Replacement

It is possible to replace all instances of a substring within a larger string:

"hello world. as the world turns.".replace("world", "globe")
'hello globe. as the globe turns.'
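
As a brief sketch of two related behaviors: replace returns a new string and leaves the original untouched, and it accepts an optional third argument that limits how many instances are replaced. The variable names are illustrative:

```python
text = "hello world. as the world turns."

# Replace every instance (the default behavior).
replaced_all = text.replace("world", "globe")
print(replaced_all)  # hello globe. as the globe turns.

# Replace only the first instance by passing a count of 1.
replaced_first = text.replace("world", "globe", 1)
print(replaced_first)  # hello globe. as the world turns.

# The original string is unchanged.
print(text)  # hello world. as the world turns.
```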

10.2.6 Length Checking

A string is like a list of individual characters.

Like a list, we can use the len function to count the number of characters in a string:

len("HELLO WORLD")
11
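
To illustrate the comparison with lists, the sketch below converts a string into an actual list of characters using the built-in list function, and confirms that whitespace counts toward the length. The variable names are illustrative:

```python
message = "HELLO WORLD"

# Converting to a list exposes the individual characters.
characters = list(message)
print(characters[:5])  # ['H', 'E', 'L', 'L', 'O']

# The space between the words counts as a character too.
print(len(message), len(characters))  # 11 11

# An empty string has a length of zero.
print(len(""))  # 0
```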

10.2.7 String Indexing and Slicing

Similar to list operations, we can use indices to reference a specific character in the string, or a sequence of characters (i.e. string slicing):

"HELLO WORLD"[0]
'H'
"HELLO WORLD"[0:5]
'HELLO'
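
A few more indexing patterns carry over from lists, with one important difference. The sketch below shows negative indices and open-ended slices, and then shows that, unlike a list, a string cannot be modified in place. The variable names are illustrative:

```python
message = "HELLO WORLD"

# Negative indices count backwards from the end of the string.
last_char = message[-1]
print(last_char)  # D

# Omitting the start or end of a slice runs to that edge of the string.
first_word = message[:5]
second_word = message[6:]
print(first_word, second_word)  # HELLO WORLD

# Unlike lists, strings are immutable, so item assignment raises a TypeError.
try:
    message[0] = "J"
except TypeError:
    print("strings cannot be modified in place")
```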

10.2.8 String Splitting

The split method will split a string on a designated delimiter:

"hello world".split(" ")
['hello', 'world']
"first | second | third".split(" | ")
['first', 'second', 'third']
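
Two related behaviors are sketched below. Calling split with no delimiter splits on any run of whitespace, and the join method performs the reverse operation, combining a list of strings into one string. The variable names are illustrative:

```python
# With no delimiter, split uses any run of whitespace as the separator.
words = "hello   world".split()
print(words)  # ['hello', 'world']

# The join method reverses a split, using the given string as the separator.
parts = "first | second | third".split(" | ")
rejoined = ", ".join(parts)
print(rejoined)  # first, second, third
```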