Average Number of Years Lost For People Who Died of Coronavirus in France

We illustrate the use of the package by estimating the average number of years by which people’s lives are shortened by COVID-19. Using data from here on the age distribution of people who died from COVID-19 in France, and conservative assumptions (treating everyone who died as male and taking the middle of each age range), we find that people’s lives are shortened by about 9 years on average. The estimate is conservative for one additional reason: people who die of COVID-19 likely had a lower expected longevity than others of the same age. And because the bulk of the deaths are among older people, who tend to be more infirm, the quality-adjusted years lost are likely more modest still. Using the most recent SSA data, we find that number to be about 8 years. Assuming people live till 90, the average number of years lost is 7. With WHO data, the run shown below matches every age group to the same entry (who_age 1), so all three scenarios return about 79 years, which reflects that lookup and not a credible estimate of years lost.
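Each figure quoted above is a death-weighted average of remaining life expectancy. Written out, with symbols that are our own shorthand and not the package's notation: $n_a$ is the number of deaths in age group $a$ and $e_a$ is the remaining life expectancy matched to that group.

```latex
\bar{L} \;=\; \sum_{a} e_a \,\frac{n_a}{\sum_{a'} n_{a'}}
       \;=\; \frac{\sum_{a} n_a\, e_a}{\sum_{a} n_a}
```

Each term of the first sum corresponds to one row of the `years_lost` column computed in the cells below.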

[1]:
import pandas as pd

from lost_years import lost_years_hld, lost_years_ssa, lost_years_who

Prepare example input in DataFrame

Country codes are listed at https://www.lifetable.de/cgi-bin/country_codes.php

[2]:
df = pd.read_csv(
    "data/covid-cedc-quot.csv",
    usecols=["cl_age90", "Dc_Elec_Covid_cum"],
    delimiter=";",
)
df.columns = ["age", "n_deaths"]
# Drop the rows with age code 0 so that only the age-band codes (9, 19, ..., 90) remain
df.drop(df.loc[df.age == 0].index, inplace=True)
df
[2]:
age n_deaths
73 9 0
74 9 0
75 9 0
76 9 0
77 9 0
... ... ...
16055 90 0
16056 90 0
16057 90 0
16058 90 0
16059 90 0

14600 rows × 2 columns

[3]:
gdf = df.groupby("age").agg({"n_deaths": "sum"})
df = gdf.reset_index()
df2 = pd.DataFrame(
    {
        "lowest_age": [0, 10, 20, 30, 40, 50, 60, 70, 80, 90],
        "middle_age": [5, 15, 25, 35, 45, 55, 65, 75, 85, 99],
        "highest_age": [9, 19, 29, 39, 49, 59, 69, 79, 89, 99],
    }
)
df = df.join(df2)
df["year"] = 2020
df["country"] = "FRA"
df["sex"] = "M"
df
[3]:
age n_deaths lowest_age middle_age highest_age year country sex
0 9 11 0 5 9 2020 FRA M
1 19 68 10 15 19 2020 FRA M
2 29 437 20 25 29 2020 FRA M
3 39 1561 30 35 39 2020 FRA M
4 49 3628 40 45 49 2020 FRA M
5 59 14106 50 55 59 2020 FRA M
6 69 36555 60 65 69 2020 FRA M
7 79 76238 70 75 79 2020 FRA M
8 89 145018 80 85 89 2020 FRA M
9 90 92290 90 99 99 2020 FRA M
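The bounds in the table above are attached with a positional `join`, which relies on the grouped rows being in the same order as the hand-written lists. A minimal sketch of an alternative: derive the bounds from the age code itself and merge on it. It assumes, as the notebook does, that code 90 stands for the top group capped at 99; the `bins`, `counts` and `merged` names are illustrative.

```python
import pandas as pd

# Age codes as they appear in the `age` column of the table above.
bins = pd.DataFrame({"age": [9, 19, 29, 39, 49, 59, 69, 79, 89, 90]})

# Assumption: code 90 is the top group, capped at 99 as in the notebook.
is_top = bins["age"] == 90
bins["lowest_age"] = bins["age"].where(is_top, bins["age"] - 9)
bins["middle_age"] = (bins["lowest_age"] + 5).where(~is_top, 99)
bins["highest_age"] = bins["age"].where(~is_top, 99)

# Two rows from the table above, listed out of order on purpose, to show
# that a keyed merge does not depend on row position.
counts = pd.DataFrame({"age": [90, 9], "n_deaths": [92290, 11]})
merged = counts.merge(bins, on="age", how="left")
print(merged)
```

The derived columns reproduce the three lists used in the notebook, including its choice of 99 as the "middle" of the top group.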

Get Human Life Table data columns from HLD dataset

[4]:
# Match on the lowest age of each band, which gives the highest estimate of years lost
highest_ldf = lost_years_hld(
    df, {"age": "lowest_age", "country": "country", "sex": "sex", "year": "year"}
)
highest_ldf.head()
[4]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy
0 9 11 0 5 9 2020 FRA M FRA 0 M 2020 79.18
1 19 68 10 15 19 2020 FRA M FRA 10 M 2020 69.60
2 29 437 20 25 29 2020 FRA M FRA 20 M 2020 59.73
3 39 1561 30 35 39 2020 FRA M FRA 30 M 2020 50.08
4 49 3628 40 45 49 2020 FRA M FRA 40 M 2020 40.55

Note that the year we are matching to (hld_year) is 2020.

Assuming all the people who died were at the bottom of the age ranges

[5]:
highest_ldf["years_lost"] = (
    highest_ldf["hld_life_expectancy"] * highest_ldf["n_deaths"] / highest_ldf["n_deaths"].sum()
)
highest_ldf
[5]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M FRA 0 M 2020 79.18 0.002355
1 19 68 10 15 19 2020 FRA M FRA 10 M 2020 69.60 0.012794
2 29 437 20 25 29 2020 FRA M FRA 20 M 2020 59.73 0.070563
3 39 1561 30 35 39 2020 FRA M FRA 30 M 2020 50.08 0.211334
4 49 3628 40 45 49 2020 FRA M FRA 40 M 2020 40.55 0.397704
5 59 14106 50 55 59 2020 FRA M FRA 50 M 2020 31.35 1.195482
6 69 36555 60 65 69 2020 FRA M FRA 60 M 2020 22.87 2.260032
7 79 76238 70 75 79 2020 FRA M FRA 70 M 2020 15.41 3.175965
8 89 145018 80 85 89 2020 FRA M FRA 80 M 2020 8.84 3.465579
9 90 92290 90 99 99 2020 FRA M FRA 90 M 2020 4.00 0.997967
[6]:
highest_ldf["years_lost"].sum().round()
[6]:
np.float64(12.0)
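Summing the per-row `years_lost` values is the same as taking a weighted mean of life expectancy with the death counts as weights. A short sketch with NumPy, using the `n_deaths` and `hld_life_expectancy` values copied from the table above:

```python
import numpy as np

# Values copied from the bottom-of-range table above.
n_deaths = np.array([11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290])
life_expectancy = np.array(
    [79.18, 69.60, 59.73, 50.08, 40.55, 31.35, 22.87, 15.41, 8.84, 4.00]
)

# Per-row contributions, as in the notebook cell.
per_row = life_expectancy * n_deaths / n_deaths.sum()

# The same quantity in one call.
weighted = np.average(life_expectancy, weights=n_deaths)

print(round(float(per_row.sum()), 2), round(float(weighted), 2))
```

Printing two decimals instead of rounding to a whole number makes it easier to compare scenarios that land close together.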
[7]:
# Match on the highest age of each band, which gives the lowest estimate of years lost
lowest_ldf = lost_years_hld(
    df, {"age": "highest_age", "country": "country", "sex": "sex", "year": "year"}
)
lowest_ldf.head()
[7]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy
0 9 11 0 5 9 2020 FRA M FRA 9 M 2020 70.59
1 19 68 10 15 19 2020 FRA M FRA 19 M 2020 60.70
2 29 437 20 25 29 2020 FRA M FRA 29 M 2020 51.04
3 39 1561 30 35 39 2020 FRA M FRA 39 M 2020 41.49
4 49 3628 40 45 49 2020 FRA M FRA 49 M 2020 32.25

Assuming all the people who died were at the top of the age ranges

[8]:
lowest_ldf["years_lost"] = (
    lowest_ldf["hld_life_expectancy"] * lowest_ldf["n_deaths"] / lowest_ldf["n_deaths"].sum()
)
lowest_ldf
[8]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M FRA 9 M 2020 70.59 0.002099
1 19 68 10 15 19 2020 FRA M FRA 19 M 2020 60.70 0.011158
2 29 437 20 25 29 2020 FRA M FRA 29 M 2020 51.04 0.060297
3 39 1561 30 35 39 2020 FRA M FRA 39 M 2020 41.49 0.175085
4 49 3628 40 45 49 2020 FRA M FRA 49 M 2020 32.25 0.316300
5 59 14106 50 55 59 2020 FRA M FRA 59 M 2020 23.68 0.902999
6 69 36555 60 65 69 2020 FRA M FRA 69 M 2020 16.12 1.592991
7 79 76238 70 75 79 2020 FRA M FRA 79 M 2020 9.44 1.945562
8 89 145018 80 85 89 2020 FRA M FRA 89 M 2020 4.36 1.709267
9 90 92290 90 99 99 2020 FRA M FRA 99 M 2020 2.01 0.501478
[9]:
lowest_ldf["years_lost"].sum().round()
[9]:
np.float64(7.0)
[10]:
middle_ldf = lost_years_hld(
    df, {"age": "middle_age", "country": "country", "sex": "sex", "year": "year"}
)
middle_ldf.head()
[10]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy
0 9 11 0 5 9 2020 FRA M FRA 5 M 2020 74.56
1 19 68 10 15 19 2020 FRA M FRA 15 M 2020 64.63
2 29 437 20 25 29 2020 FRA M FRA 25 M 2020 54.90
3 39 1561 30 35 39 2020 FRA M FRA 35 M 2020 45.29
4 49 3628 40 45 49 2020 FRA M FRA 45 M 2020 35.89

Assuming all the people who died were at the middle of the age ranges

[11]:
middle_ldf["years_lost"] = (
    middle_ldf["hld_life_expectancy"] * middle_ldf["n_deaths"] / middle_ldf["n_deaths"].sum()
)
middle_ldf
[11]:
age n_deaths lowest_age middle_age highest_age year country sex hld_country hld_age hld_sex hld_year hld_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M FRA 5 M 2020 74.56 0.002217
1 19 68 10 15 19 2020 FRA M FRA 15 M 2020 64.63 0.011881
2 29 437 20 25 29 2020 FRA M FRA 25 M 2020 54.90 0.064857
3 39 1561 30 35 39 2020 FRA M FRA 35 M 2020 45.29 0.191120
4 49 3628 40 45 49 2020 FRA M FRA 45 M 2020 35.89 0.352000
5 59 14106 50 55 59 2020 FRA M FRA 55 M 2020 27.01 1.029983
6 69 36555 60 65 69 2020 FRA M FRA 65 M 2020 19.02 1.879572
7 79 76238 70 75 79 2020 FRA M FRA 75 M 2020 11.98 2.469050
8 89 145018 80 85 89 2020 FRA M FRA 85 M 2020 6.10 2.391406
9 90 92290 90 99 99 2020 FRA M FRA 99 M 2020 2.01 0.501478
[12]:
middle_ldf["years_lost"].sum().round()
[12]:
np.float64(9.0)
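The same two steps (compute per-row shares, then sum) recur for every scenario in this notebook. One way to avoid repeating them is a small helper. This is a sketch: `add_years_lost` is a hypothetical name and not part of the `lost_years` package.

```python
import pandas as pd


def add_years_lost(ldf, life_expectancy_col):
    """Return a copy of ldf with each row's contribution to the average years lost."""
    out = ldf.copy()
    share = out["n_deaths"] / out["n_deaths"].sum()
    out["years_lost"] = out[life_expectancy_col] * share
    return out


# n_deaths and hld_life_expectancy copied from the middle-of-range table above.
middle = pd.DataFrame(
    {
        "n_deaths": [11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290],
        "hld_life_expectancy": [
            74.56, 64.63, 54.90, 45.29, 35.89, 27.01, 19.02, 11.98, 6.10, 2.01,
        ],
    }
)
middle_average = add_years_lost(middle, "hld_life_expectancy")["years_lost"].sum()
print(round(float(middle_average), 2))
```

Because the helper takes the life-expectancy column name as an argument, the same call would work on the SSA and WHO outputs with `"ssa_life_expectancy"` or `"who_life_expectancy"`.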

Assume Longevity Is the Same as in the US

[13]:
ssa_middle_ldf = lost_years_ssa(df, {"age": "middle_age", "sex": "sex", "year": "year"})
ssa_middle_ldf.head()
[13]:
age n_deaths lowest_age middle_age highest_age year country sex ssa_age ssa_year ssa_life_expectancy
0 9 11 0 5 9 2020 FRA M 5 2022 70.29
1 19 68 10 15 19 2020 FRA M 15 2022 60.39
2 29 437 20 25 29 2020 FRA M 25 2022 51.03
3 39 1561 30 35 39 2020 FRA M 35 2022 42.08
4 49 3628 40 45 49 2020 FRA M 45 2022 33.32
[14]:
ssa_middle_ldf["years_lost"] = (
    ssa_middle_ldf["ssa_life_expectancy"]
    * ssa_middle_ldf["n_deaths"]
    / ssa_middle_ldf["n_deaths"].sum()
)
ssa_middle_ldf
[14]:
age n_deaths lowest_age middle_age highest_age year country sex ssa_age ssa_year ssa_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M 5 2022 70.29 0.002090
1 19 68 10 15 19 2020 FRA M 15 2022 60.39 0.011101
2 29 437 20 25 29 2020 FRA M 25 2022 51.03 0.060285
3 39 1561 30 35 39 2020 FRA M 35 2022 42.08 0.177574
4 49 3628 40 45 49 2020 FRA M 45 2022 33.32 0.326794
5 59 14106 50 55 59 2020 FRA M 55 2022 24.94 0.951047
6 69 36555 60 65 69 2020 FRA M 65 2022 17.48 1.727388
7 79 76238 70 75 79 2020 FRA M 75 2022 10.92 2.250587
8 89 145018 80 85 89 2020 FRA M 85 2022 5.75 2.254194
9 90 92290 90 99 99 2020 FRA M 99 2022 2.00 0.498984
[15]:
ssa_middle_ldf["years_lost"].sum().round()
[15]:
np.float64(8.0)

Assume Everyone Lives Till 90

[16]:
y90_middle_ldf = df.copy()
y90_middle_ldf["y90_life_expectancy"] = 90 - y90_middle_ldf["middle_age"]
y90_middle_ldf.head()
[16]:
age n_deaths lowest_age middle_age highest_age year country sex y90_life_expectancy
0 9 11 0 5 9 2020 FRA M 85
1 19 68 10 15 19 2020 FRA M 75
2 29 437 20 25 29 2020 FRA M 65
3 39 1561 30 35 39 2020 FRA M 55
4 49 3628 40 45 49 2020 FRA M 45
[17]:
y90_middle_ldf["years_lost"] = (
    y90_middle_ldf["y90_life_expectancy"]
    * y90_middle_ldf["n_deaths"]
    / y90_middle_ldf["n_deaths"].sum()
)
y90_middle_ldf
[17]:
age n_deaths lowest_age middle_age highest_age year country sex y90_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M 85 0.002528
1 19 68 10 15 19 2020 FRA M 75 0.013787
2 29 437 20 25 29 2020 FRA M 65 0.076789
3 39 1561 30 35 39 2020 FRA M 55 0.232096
4 49 3628 40 45 49 2020 FRA M 45 0.441348
5 59 14106 50 55 59 2020 FRA M 35 1.334669
6 69 36555 60 65 69 2020 FRA M 25 2.470520
7 79 76238 70 75 79 2020 FRA M 15 3.091465
8 89 145018 80 85 89 2020 FRA M 5 1.960169
9 90 92290 90 99 99 2020 FRA M -9 -2.245426
[18]:
y90_middle_ldf["years_lost"].sum().round()
[18]:
np.float64(7.0)
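In the table above, the top age group (middle age coded as 99) gets a remaining life expectancy of -9 and contributes about -2.25 years to the total. One possible treatment, sketched here as an assumption and not as the notebook's method, is to floor remaining years at zero. The function name and the `floor_at_zero` flag are illustrative.

```python
# Values copied from the table above.
n_deaths = [11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290]
middle_age = [5, 15, 25, 35, 45, 55, 65, 75, 85, 99]


def average_years_lost_to(horizon, n_deaths, ages, floor_at_zero=False):
    """Death-weighted mean of (horizon - age), optionally floored at zero."""
    remaining = [horizon - a for a in ages]
    if floor_at_zero:
        remaining = [max(r, 0) for r in remaining]
    return sum(n * r for n, r in zip(n_deaths, remaining)) / sum(n_deaths)


as_computed = average_years_lost_to(90, n_deaths, middle_age)
floored = average_years_lost_to(90, n_deaths, middle_age, floor_at_zero=True)
print(round(as_computed, 2), round(floored, 2))
```

With the floor applied, the average for these inputs rises from about 7.4 to about 9.6 years, so the treatment of the top group matters for this scenario.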

Get life expectancy columns from the WHO dataset

[19]:
who_highest_ldf = lost_years_who(
    df, {"age": "lowest_age", "country": "country", "sex": "sex", "year": "year"}
)
who_highest_ldf.head()
[19]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192

Assuming all the people who died were at the bottom of the age ranges

[20]:
who_highest_ldf["years_lost"] = (
    who_highest_ldf["who_life_expectancy"]
    * who_highest_ldf["n_deaths"]
    / who_highest_ldf["n_deaths"].sum()
)
who_highest_ldf
[20]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192 0.002351
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192 0.014534
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192 0.093403
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192 0.333645
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192 0.775441
5 59 14106 50 55 59 2020 FRA M 1 FRA MLE 2020 79.064192 3.014986
6 69 36555 60 65 69 2020 FRA M 1 FRA MLE 2020 79.064192 7.813187
7 79 76238 70 75 79 2020 FRA M 1 FRA MLE 2020 79.064192 16.294945
8 89 145018 80 85 89 2020 FRA M 1 FRA MLE 2020 79.064192 30.995834
9 90 92290 90 99 99 2020 FRA M 1 FRA MLE 2020 79.064192 19.725865
[21]:
who_highest_ldf["years_lost"].sum().round()
[21]:
np.float64(79.0)
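Every row in the WHO output above is matched to `who_age` 1 and the same life expectancy, so the weighted average collapses to about 79 whatever the ages of the dead. A small check of the following kind could flag such a lookup before the result is used. It is a sketch: the helper name is hypothetical, and the two frames contain only the first three rows of the columns shown in the tables above.

```python
import pandas as pd


def matched_ages_look_sane(ldf, requested_col, matched_col):
    """Return True unless the requested ages vary while the matched ages do not."""
    return ldf[requested_col].nunique() == 1 or ldf[matched_col].nunique() > 1


# First three rows of the WHO and HLD outputs, reduced to the age columns.
who_like = pd.DataFrame({"lowest_age": [0, 10, 20], "who_age": [1, 1, 1]})
hld_like = pd.DataFrame({"lowest_age": [0, 10, 20], "hld_age": [0, 10, 20]})

print(matched_ages_look_sane(who_like, "lowest_age", "who_age"))
print(matched_ages_look_sane(hld_like, "lowest_age", "hld_age"))
```

The check passes for the HLD-style frame and fails for the WHO-style one, which matches what the tables show.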

Get life expectancy columns from the WHO dataset

[22]:
who_lowest_ldf = lost_years_who(
    df, {"age": "highest_age", "country": "country", "sex": "sex", "year": "year"}
)
who_lowest_ldf.head()
[22]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192

Assuming all the people who died were at the top of the age ranges

[23]:
who_lowest_ldf["years_lost"] = (
    who_lowest_ldf["who_life_expectancy"]
    * who_lowest_ldf["n_deaths"]
    / who_lowest_ldf["n_deaths"].sum()
)
who_lowest_ldf
[23]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192 0.002351
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192 0.014534
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192 0.093403
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192 0.333645
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192 0.775441
5 59 14106 50 55 59 2020 FRA M 1 FRA MLE 2020 79.064192 3.014986
6 69 36555 60 65 69 2020 FRA M 1 FRA MLE 2020 79.064192 7.813187
7 79 76238 70 75 79 2020 FRA M 1 FRA MLE 2020 79.064192 16.294945
8 89 145018 80 85 89 2020 FRA M 1 FRA MLE 2020 79.064192 30.995834
9 90 92290 90 99 99 2020 FRA M 1 FRA MLE 2020 79.064192 19.725865
[24]:
who_lowest_ldf["years_lost"].sum().round()
[24]:
np.float64(79.0)

Get life expectancy columns from the WHO dataset

[25]:
who_middle_ldf = lost_years_who(
    df, {"age": "middle_age", "country": "country", "sex": "sex", "year": "year"}
)
who_middle_ldf.head()
[25]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192

Assuming all the people who died were at the middle of the age ranges

[26]:
who_middle_ldf["years_lost"] = (
    who_middle_ldf["who_life_expectancy"]
    * who_middle_ldf["n_deaths"]
    / who_middle_ldf["n_deaths"].sum()
)
who_middle_ldf
[26]:
age n_deaths lowest_age middle_age highest_age year country sex who_age who_country who_sex who_year who_life_expectancy years_lost
0 9 11 0 5 9 2020 FRA M 1 FRA MLE 2020 79.064192 0.002351
1 19 68 10 15 19 2020 FRA M 1 FRA MLE 2020 79.064192 0.014534
2 29 437 20 25 29 2020 FRA M 1 FRA MLE 2020 79.064192 0.093403
3 39 1561 30 35 39 2020 FRA M 1 FRA MLE 2020 79.064192 0.333645
4 49 3628 40 45 49 2020 FRA M 1 FRA MLE 2020 79.064192 0.775441
5 59 14106 50 55 59 2020 FRA M 1 FRA MLE 2020 79.064192 3.014986
6 69 36555 60 65 69 2020 FRA M 1 FRA MLE 2020 79.064192 7.813187
7 79 76238 70 75 79 2020 FRA M 1 FRA MLE 2020 79.064192 16.294945
8 89 145018 80 85 89 2020 FRA M 1 FRA MLE 2020 79.064192 30.995834
9 90 92290 90 99 99 2020 FRA M 1 FRA MLE 2020 79.064192 19.725865
[27]:
who_middle_ldf["years_lost"].sum().round()
[27]:
np.float64(79.0)