8  Scatter Plot with Trendlines, Revisited

We previously created scatter plots with trendlines. The same approach works with tabular data stored in a pandas DataFrame.

Constructing a DataFrame from raw data:

from pandas import DataFrame

scatter_data = [
    {"income": 30_000, "life_expectancy": 65.5},
    {"income": 35_000, "life_expectancy": 62.1},
    {"income": 50_000, "life_expectancy": 66.7},
    {"income": 55_000, "life_expectancy": 71.0},
    {"income": 70_000, "life_expectancy": 72.5},
    {"income": 75_000, "life_expectancy": 77.3},
    {"income": 90_000, "life_expectancy": 82.9},
    {"income": 95_000, "life_expectancy": 80.0},
]
df = DataFrame(scatter_data)
df.head()
   income  life_expectancy
0   30000             65.5
1   35000             62.1
2   50000             66.7
3   55000             71.0
4   70000             72.5

Linear trends using the “ols” (ordinary least squares) value of the trendline parameter:

from plotly.express import scatter

fig = scatter(df, x="income", y="life_expectancy", height=350,
              title="Life Expectancy by Income",
              # keys in labels are column names, values are the display text
              labels={"income": "Income", "life_expectancy": "Life Expectancy (years)"},
              trendline="ols", trendline_color_override="red"
)
fig.show()
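
To make the “ols” trendline less of a black box, the sketch below fits the same kind of straight line by hand using the closed-form least-squares formulas. It is only an illustration: the helper name fit_line is hypothetical and not part of plotly or pandas, and the data are copied from the example above. Plotly Express itself relies on the statsmodels package for the fit, so small numerical differences are possible.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration; not part of plotly or pandas.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Same values as the scatter_data records above.
incomes = [30_000, 35_000, 50_000, 55_000, 70_000, 75_000, 90_000, 95_000]
life_expectancies = [65.5, 62.1, 66.7, 71.0, 72.5, 77.3, 82.9, 80.0]

slope, intercept = fit_line(incomes, life_expectancies)
print(f"slope per $1,000 of income: {slope * 1000:.3f} years")
print(f"intercept: {intercept:.2f} years")
```

The slope is the quantity the red line in the chart encodes: the estimated change in life expectancy for each additional dollar of income, assuming the relationship is linear.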

Non-linear trends using the “lowess” (locally weighted scatterplot smoothing) value of the trendline parameter:

fig = scatter(df, x="income", y="life_expectancy", height=350,
              title="Life Expectancy by Income",
              # keys in labels are column names, values are the display text
              labels={"income": "Income", "life_expectancy": "Life Expectancy (years)"},
              trendline="lowess", trendline_color_override="red"
)
fig.show()
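
The idea behind a “lowess” curve can be sketched in plain Python: for each point, fit a small weighted straight line to its nearest neighbours and keep only the fitted value at that point. The code below is a simplified illustration under stated assumptions, not what plotly runs. The function names tricube and local_linear_smooth are hypothetical, the frac default of 2/3 is chosen for the example, and the robustness iterations that full LOWESS implementations typically add are omitted, so the numbers will not match the chart exactly.

```python
import math


def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def local_linear_smooth(xs, ys, frac=2 / 3):
    """Simplified LOWESS-style smoother (hypothetical helper, no robustness step).

    For each x0 in xs, fit a weighted least-squares line to the nearest
    fraction `frac` of the points and evaluate that line at x0.
    """
    n = len(xs)
    k = max(2, math.ceil(frac * n))  # number of neighbours per local fit
    smoothed = []
    for x0 in xs:
        # The distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(x - x0) for x in xs)
        bandwidth = dists[k - 1] or 1.0
        ws = [tricube((x - x0) / bandwidth) for x in xs]
        # Weighted means and weighted least-squares slope.
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        s_xx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        s_xy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = s_xy / s_xx if s_xx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed


# Same values as the scatter_data records above.
incomes = [30_000, 35_000, 50_000, 55_000, 70_000, 75_000, 90_000, 95_000]
life_expectancies = [65.5, 62.1, 66.7, 71.0, 72.5, 77.3, 82.9, 80.0]

for x, y_hat in zip(incomes, local_linear_smooth(incomes, life_expectancies)):
    print(f"{x:>6}: {y_hat:.1f}")
```

Because each fit only sees nearby points, the resulting curve can bend where the data do, which is what distinguishes it from the single global line of the “ols” option. A smaller frac follows the data more closely; a larger one gives a smoother curve.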