Average Number of Years Lost For People Who Died of Coronavirus in France¶

We illustrate the use of the package by estimating the average number of years by which people’s lives are shortened due to coronavirus. We use data on the age distribution of people who died from COVID-19 in France (loaded below from data/covid-cedc-quot.csv), together with conservative assumptions: we treat every person who died as male and take the middle of each age range. With life tables from the Human Life-Table Database (HLD), we find that people’s lives are shortened by about 9 years on average. These estimates are conservative for one additional reason: people who die from COVID-19 likely had a lower expected longevity than others of the same age. And because the bulk of the deaths are among older people, who are more likely to be infirm, the quality-adjusted years lost are likely more modest still. Using the most recent SSA data, we find that number to be about 8 years. Assuming people live till 90, the average number of years lost is 7. With the WHO data, the lookup in the run shown below matches every age group to the same life expectancy (79.06 years), so the resulting figure of 79 is not an age-specific estimate.
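
Every scenario in this notebook computes the same quantity: a death-weighted average of remaining life expectancy. A minimal sketch of that arithmetic in plain Python follows. The function name `average_years_lost` and the two-group input are hypothetical and serve only to illustrate the calculation.

```python
def average_years_lost(n_deaths, life_expectancy):
    """Death-weighted mean of remaining life expectancy.

    n_deaths[i] is the number of deaths in age group i and
    life_expectancy[i] is the remaining life expectancy at that age.
    """
    total = sum(n_deaths)
    if total == 0:
        raise ValueError("n_deaths must contain at least one death")
    return sum(n * le for n, le in zip(n_deaths, life_expectancy)) / total


# Hypothetical toy input: 1 death with 40 years remaining, 3 deaths with 10.
toy_estimate = average_years_lost([1, 3], [40.0, 10.0])
print(toy_estimate)  # (1 * 40 + 3 * 10) / 4 = 17.5
```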

[1]:

import pandas as pd

from lost_years import lost_years_hld, lost_years_ssa, lost_years_who

Prepare example input in DataFrame¶

Country codes are listed at https://www.lifetable.de/cgi-bin/country_codes.php

[2]:

df = pd.read_csv(
    "data/covid-cedc-quot.csv",
    usecols=["cl_age90", "Dc_Elec_Covid_cum"],
    delimiter=";",
)
df.columns = ["age", "n_deaths"]
df.drop(df.loc[df.age == 0].index, inplace=True)
df

[2]:

	age	n_deaths
73	9	0
74	9	0
75	9	0
76	9	0
77	9	0
...	...	...
16055	90	0
16056	90	0
16057	90	0
16058	90	0
16059	90	0

14600 rows × 2 columns

[3]:

# Total deaths per age code. Passing the string "sum" avoids the pandas
# FutureWarning raised when the built-in sum is used as the aggregator.
gdf = df.groupby("age").agg({"n_deaths": "sum"})
df = gdf.reset_index()
# Bounds of each age band, listed in the same order as the sorted age codes.
df2 = pd.DataFrame(
    {
        "lowest_age": [0, 10, 20, 30, 40, 50, 60, 70, 80, 90],
        "middle_age": [5, 15, 25, 35, 45, 55, 65, 75, 85, 99],
        "highest_age": [9, 19, 29, 39, 49, 59, 69, 79, 89, 99],
    }
)
# join aligns on the row index, so row i of df2 is attached to row i of df.
df = df.join(df2)
df["year"] = 2020
df["country"] = "FRA"
df["sex"] = "M"
df

[3]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex
0	9	11	0	5	9	2020	FRA	M
1	19	68	10	15	19	2020	FRA	M
2	29	437	20	25	29	2020	FRA	M
3	39	1561	30	35	39	2020	FRA	M
4	49	3628	40	45	49	2020	FRA	M
5	59	14106	50	55	59	2020	FRA	M
6	69	36555	60	65	69	2020	FRA	M
7	79	76238	70	75	79	2020	FRA	M
8	89	145018	80	85	89	2020	FRA	M
9	90	92290	90	99	99	2020	FRA	M
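
The three hard-coded lists of age bounds above could also be derived from the age codes. The sketch below assumes, based on the table above, that each code is the top of a ten-year band, except 90, which is open-ended and is capped at 99 as in the notebook. `band_bounds` is a hypothetical helper, not part of lost_years.

```python
def band_bounds(code, cap=99):
    """Return (lowest, middle, highest) age for an age code.

    Assumes codes 9, 19, ..., 89 are the upper ends of ten-year bands and
    that 90 marks an open-ended band, capped at `cap` like the table above.
    """
    if code == 90:
        # The notebook uses 99 for both the middle and the top of this band.
        return (90, cap, cap)
    lowest = code - 9
    return (lowest, lowest + 5, code)


codes = [9, 19, 29, 39, 49, 59, 69, 79, 89, 90]
bounds = [band_bounds(c) for c in codes]
lowest_age = [b[0] for b in bounds]
middle_age = [b[1] for b in bounds]
highest_age = [b[2] for b in bounds]
print(middle_age)
```

Deriving the bounds from the code keeps each band tied to its own row. The hard-coded lists line up only because `groupby` returns the age codes in sorted order.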

Get Human Life Table data columns from HLD dataset¶

[4]:

highest_ldf = lost_years_hld(
    df, {"age": "lowest_age", "country": "country", "sex": "sex", "year": "year"}
)
highest_ldf.head()

[4]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy
0	9	11	0	5	9	2020	FRA	M	FRA	0	M	2020	79.18
1	19	68	10	15	19	2020	FRA	M	FRA	10	M	2020	69.60
2	29	437	20	25	29	2020	FRA	M	FRA	20	M	2020	59.73
3	39	1561	30	35	39	2020	FRA	M	FRA	30	M	2020	50.08
4	49	3628	40	45	49	2020	FRA	M	FRA	40	M	2020	40.55

Note that the year we are matching to is 2020 (see the hld_year column).

Assuming all the people who died were at the bottom of the age ranges¶

[5]:

highest_ldf["years_lost"] = (
    highest_ldf["hld_life_expectancy"] * highest_ldf["n_deaths"] / highest_ldf["n_deaths"].sum()
)
highest_ldf

[5]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	FRA	0	M	2020	79.18	0.002355
1	19	68	10	15	19	2020	FRA	M	FRA	10	M	2020	69.60	0.012794
2	29	437	20	25	29	2020	FRA	M	FRA	20	M	2020	59.73	0.070563
3	39	1561	30	35	39	2020	FRA	M	FRA	30	M	2020	50.08	0.211334
4	49	3628	40	45	49	2020	FRA	M	FRA	40	M	2020	40.55	0.397704
5	59	14106	50	55	59	2020	FRA	M	FRA	50	M	2020	31.35	1.195482
6	69	36555	60	65	69	2020	FRA	M	FRA	60	M	2020	22.87	2.260032
7	79	76238	70	75	79	2020	FRA	M	FRA	70	M	2020	15.41	3.175965
8	89	145018	80	85	89	2020	FRA	M	FRA	80	M	2020	8.84	3.465579
9	90	92290	90	99	99	2020	FRA	M	FRA	90	M	2020	4.00	0.997967

[6]:

highest_ldf["years_lost"].sum().round()

[6]:

np.float64(12.0)

[7]:

lowest_ldf = lost_years_hld(
    df, {"age": "highest_age", "country": "country", "sex": "sex", "year": "year"}
)
lowest_ldf.head()

[7]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy
0	9	11	0	5	9	2020	FRA	M	FRA	9	M	2020	70.59
1	19	68	10	15	19	2020	FRA	M	FRA	19	M	2020	60.70
2	29	437	20	25	29	2020	FRA	M	FRA	29	M	2020	51.04
3	39	1561	30	35	39	2020	FRA	M	FRA	39	M	2020	41.49
4	49	3628	40	45	49	2020	FRA	M	FRA	49	M	2020	32.25

Assuming all the people who died were at the top of the age ranges¶

[8]:

lowest_ldf["years_lost"] = (
    lowest_ldf["hld_life_expectancy"] * lowest_ldf["n_deaths"] / lowest_ldf["n_deaths"].sum()
)
lowest_ldf

[8]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	FRA	9	M	2020	70.59	0.002099
1	19	68	10	15	19	2020	FRA	M	FRA	19	M	2020	60.70	0.011158
2	29	437	20	25	29	2020	FRA	M	FRA	29	M	2020	51.04	0.060297
3	39	1561	30	35	39	2020	FRA	M	FRA	39	M	2020	41.49	0.175085
4	49	3628	40	45	49	2020	FRA	M	FRA	49	M	2020	32.25	0.316300
5	59	14106	50	55	59	2020	FRA	M	FRA	59	M	2020	23.68	0.902999
6	69	36555	60	65	69	2020	FRA	M	FRA	69	M	2020	16.12	1.592991
7	79	76238	70	75	79	2020	FRA	M	FRA	79	M	2020	9.44	1.945562
8	89	145018	80	85	89	2020	FRA	M	FRA	89	M	2020	4.36	1.709267
9	90	92290	90	99	99	2020	FRA	M	FRA	99	M	2020	2.01	0.501478

[9]:

lowest_ldf["years_lost"].sum().round()

[9]:

np.float64(7.0)

[10]:

middle_ldf = lost_years_hld(
    df, {"age": "middle_age", "country": "country", "sex": "sex", "year": "year"}
)
middle_ldf.head()

[10]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy
0	9	11	0	5	9	2020	FRA	M	FRA	5	M	2020	74.56
1	19	68	10	15	19	2020	FRA	M	FRA	15	M	2020	64.63
2	29	437	20	25	29	2020	FRA	M	FRA	25	M	2020	54.90
3	39	1561	30	35	39	2020	FRA	M	FRA	35	M	2020	45.29
4	49	3628	40	45	49	2020	FRA	M	FRA	45	M	2020	35.89

Assuming all the people who died were at the middle of the age ranges¶

[11]:

middle_ldf["years_lost"] = (
    middle_ldf["hld_life_expectancy"] * middle_ldf["n_deaths"] / middle_ldf["n_deaths"].sum()
)
middle_ldf

[11]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	hld_country	hld_age	hld_sex	hld_year	hld_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	FRA	5	M	2020	74.56	0.002217
1	19	68	10	15	19	2020	FRA	M	FRA	15	M	2020	64.63	0.011881
2	29	437	20	25	29	2020	FRA	M	FRA	25	M	2020	54.90	0.064857
3	39	1561	30	35	39	2020	FRA	M	FRA	35	M	2020	45.29	0.191120
4	49	3628	40	45	49	2020	FRA	M	FRA	45	M	2020	35.89	0.352000
5	59	14106	50	55	59	2020	FRA	M	FRA	55	M	2020	27.01	1.029983
6	69	36555	60	65	69	2020	FRA	M	FRA	65	M	2020	19.02	1.879572
7	79	76238	70	75	79	2020	FRA	M	FRA	75	M	2020	11.98	2.469050
8	89	145018	80	85	89	2020	FRA	M	FRA	85	M	2020	6.10	2.391406
9	90	92290	90	99	99	2020	FRA	M	FRA	99	M	2020	2.01	0.501478

[12]:

middle_ldf["years_lost"].sum().round()

[12]:

np.float64(9.0)
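
The bottom-of-range and top-of-range scenarios bracket the middle-of-range estimate. As a cross-check, the sketch below recomputes all three from the death counts and HLD life expectancies printed in the tables above, using plain Python instead of pandas. The variable names are illustrative.

```python
# Death counts per age band, copied from the tables above.
n_deaths = [11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290]

# HLD remaining life expectancy (France, male, 2020) at each matched age.
hld_life_expectancy = {
    "bottom of range": [79.18, 69.60, 59.73, 50.08, 40.55, 31.35, 22.87, 15.41, 8.84, 4.00],
    "middle of range": [74.56, 64.63, 54.90, 45.29, 35.89, 27.01, 19.02, 11.98, 6.10, 2.01],
    "top of range": [70.59, 60.70, 51.04, 41.49, 32.25, 23.68, 16.12, 9.44, 4.36, 2.01],
}

total_deaths = sum(n_deaths)
# Death-weighted mean of remaining life expectancy for each scenario.
estimates = {
    scenario: sum(n * le for n, le in zip(n_deaths, values)) / total_deaths
    for scenario, values in hld_life_expectancy.items()
}
for scenario, years in estimates.items():
    print(f"{scenario}: {years:.2f} years lost on average")
```

Rounded to whole years, these reproduce the 12, 9, and 7 reported in the cells above.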

Assume Longevity Is the Same as in the US¶

[13]:

ssa_middle_ldf = lost_years_ssa(df, {"age": "middle_age", "sex": "sex", "year": "year"})
ssa_middle_ldf.head()

[13]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	ssa_age	ssa_year	ssa_life_expectancy
0	9	11	0	5	9	2020	FRA	M	5	2022	70.29
1	19	68	10	15	19	2020	FRA	M	15	2022	60.39
2	29	437	20	25	29	2020	FRA	M	25	2022	51.03
3	39	1561	30	35	39	2020	FRA	M	35	2022	42.08
4	49	3628	40	45	49	2020	FRA	M	45	2022	33.32

[14]:

ssa_middle_ldf["years_lost"] = (
    ssa_middle_ldf["ssa_life_expectancy"]
    * ssa_middle_ldf["n_deaths"]
    / ssa_middle_ldf["n_deaths"].sum()
)
ssa_middle_ldf

[14]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	ssa_age	ssa_year	ssa_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	5	2022	70.29	0.002090
1	19	68	10	15	19	2020	FRA	M	15	2022	60.39	0.011101
2	29	437	20	25	29	2020	FRA	M	25	2022	51.03	0.060285
3	39	1561	30	35	39	2020	FRA	M	35	2022	42.08	0.177574
4	49	3628	40	45	49	2020	FRA	M	45	2022	33.32	0.326794
5	59	14106	50	55	59	2020	FRA	M	55	2022	24.94	0.951047
6	69	36555	60	65	69	2020	FRA	M	65	2022	17.48	1.727388
7	79	76238	70	75	79	2020	FRA	M	75	2022	10.92	2.250587
8	89	145018	80	85	89	2020	FRA	M	85	2022	5.75	2.254194
9	90	92290	90	99	99	2020	FRA	M	99	2022	2.00	0.498984

[15]:

ssa_middle_ldf["years_lost"].sum().round()

[15]:

np.float64(8.0)

Assume Everyone Lives Till 90¶

[16]:

y90_middle_ldf = df.copy()
y90_middle_ldf["y90_life_expectancy"] = 90 - y90_middle_ldf["middle_age"]
y90_middle_ldf.head()

[16]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	y90_life_expectancy
0	9	11	0	5	9	2020	FRA	M	85
1	19	68	10	15	19	2020	FRA	M	75
2	29	437	20	25	29	2020	FRA	M	65
3	39	1561	30	35	39	2020	FRA	M	55
4	49	3628	40	45	49	2020	FRA	M	45

[17]:

y90_middle_ldf["years_lost"] = (
    y90_middle_ldf["y90_life_expectancy"]
    * y90_middle_ldf["n_deaths"]
    / y90_middle_ldf["n_deaths"].sum()
)
y90_middle_ldf

[17]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	y90_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	85	0.002528
1	19	68	10	15	19	2020	FRA	M	75	0.013787
2	29	437	20	25	29	2020	FRA	M	65	0.076789
3	39	1561	30	35	39	2020	FRA	M	55	0.232096
4	49	3628	40	45	49	2020	FRA	M	45	0.441348
5	59	14106	50	55	59	2020	FRA	M	35	1.334669
6	69	36555	60	65	69	2020	FRA	M	25	2.470520
7	79	76238	70	75	79	2020	FRA	M	15	3.091465
8	89	145018	80	85	89	2020	FRA	M	5	1.960169
9	90	92290	90	99	99	2020	FRA	M	-9	-2.245426

[18]:

y90_middle_ldf["years_lost"].sum().round()

[18]:

np.float64(7.0)
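
In the table above the 90+ band is assigned a middle age of 99, so `90 - middle_age` is negative and that band subtracts from the total. Whether to floor remaining years at zero is a modelling choice. The sketch below, in plain Python with illustrative names, shows how much that choice changes the figure.

```python
# Death counts and middle ages per band, copied from the tables above.
n_deaths = [11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290]
middle_age = [5, 15, 25, 35, 45, 55, 65, 75, 85, 99]
HORIZON = 90  # assumed age to which everyone would otherwise have lived


def mean_years_lost(floor_at_zero):
    """Death-weighted mean of (HORIZON - age), optionally floored at zero."""
    total = sum(n_deaths)
    lost = 0.0
    for n, age in zip(n_deaths, middle_age):
        remaining = HORIZON - age
        if floor_at_zero:
            remaining = max(remaining, 0)
        lost += n * remaining
    return lost / total


unfloored = mean_years_lost(floor_at_zero=False)
floored = mean_years_lost(floor_at_zero=True)
print(f"as computed above: {unfloored:.2f}")
print(f"negative values floored at zero: {floored:.2f}")
```

With these inputs the floored variant comes out near 9.6 years, compared with about 7.4 when the negative term is kept.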

Get Human Life Table data columns from WHO dataset¶

[19]:

who_highest_ldf = lost_years_who(
    df, {"age": "lowest_age", "country": "country", "sex": "sex", "year": "year"}
)
who_highest_ldf.head()

[19]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192

Assuming all the people who died were at the bottom of the age ranges¶

[20]:

who_highest_ldf["years_lost"] = (
    who_highest_ldf["who_life_expectancy"]
    * who_highest_ldf["n_deaths"]
    / who_highest_ldf["n_deaths"].sum()
)
who_highest_ldf

[20]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.002351
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.014534
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.093403
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.333645
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.775441
5	59	14106	50	55	59	2020	FRA	M	1	FRA	MLE	2020	79.064192	3.014986
6	69	36555	60	65	69	2020	FRA	M	1	FRA	MLE	2020	79.064192	7.813187
7	79	76238	70	75	79	2020	FRA	M	1	FRA	MLE	2020	79.064192	16.294945
8	89	145018	80	85	89	2020	FRA	M	1	FRA	MLE	2020	79.064192	30.995834
9	90	92290	90	99	99	2020	FRA	M	1	FRA	MLE	2020	79.064192	19.725865

[21]:

who_highest_ldf["years_lost"].sum().round()

[21]:

np.float64(79.0)
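
In the output above every row is matched to who_age 1 and the same who_life_expectancy, so the weighted average simply returns that constant. A small guard like the following sketch flags such a result before it is read as an estimate. It is plain Python, and `varies_by_age` is a hypothetical helper name.

```python
def varies_by_age(life_expectancy, tolerance=1e-9):
    """Return True if the matched life expectancies differ across age groups."""
    return max(life_expectancy) - min(life_expectancy) > tolerance


# Death counts per age band, copied from the tables above.
n_deaths = [11, 68, 437, 1561, 3628, 14106, 36555, 76238, 145018, 92290]
# Values copied from the WHO table above: identical for every age band.
who_life_expectancy = [79.064192] * len(n_deaths)
# HLD middle-of-range values from earlier in the notebook, for contrast.
hld_life_expectancy = [74.56, 64.63, 54.90, 45.29, 35.89, 27.01, 19.02, 11.98, 6.10, 2.01]

who_average = sum(n * le for n, le in zip(n_deaths, who_life_expectancy)) / sum(n_deaths)
print(varies_by_age(who_life_expectancy))  # False: the match did not vary with age
print(varies_by_age(hld_life_expectancy))  # True
print(round(who_average, 2))  # a weighted mean of a constant is that constant
```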

Get Human Life Table data columns from WHO dataset¶

[22]:

who_lowest_ldf = lost_years_who(
    df, {"age": "highest_age", "country": "country", "sex": "sex", "year": "year"}
)
who_lowest_ldf.head()

[22]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192

Assuming all the people who died were at the top of the age ranges¶

[23]:

who_lowest_ldf["years_lost"] = (
    who_lowest_ldf["who_life_expectancy"]
    * who_lowest_ldf["n_deaths"]
    / who_lowest_ldf["n_deaths"].sum()
)
who_lowest_ldf

[23]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.002351
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.014534
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.093403
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.333645
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.775441
5	59	14106	50	55	59	2020	FRA	M	1	FRA	MLE	2020	79.064192	3.014986
6	69	36555	60	65	69	2020	FRA	M	1	FRA	MLE	2020	79.064192	7.813187
7	79	76238	70	75	79	2020	FRA	M	1	FRA	MLE	2020	79.064192	16.294945
8	89	145018	80	85	89	2020	FRA	M	1	FRA	MLE	2020	79.064192	30.995834
9	90	92290	90	99	99	2020	FRA	M	1	FRA	MLE	2020	79.064192	19.725865

[24]:

who_lowest_ldf["years_lost"].sum().round()

[24]:

np.float64(79.0)

Get Human Life Table data columns from WHO dataset¶

[25]:

who_middle_ldf = lost_years_who(
    df, {"age": "middle_age", "country": "country", "sex": "sex", "year": "year"}
)
who_middle_ldf.head()

[25]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192

Assuming all the people who died were at the middle of the age ranges¶

[26]:

who_middle_ldf["years_lost"] = (
    who_middle_ldf["who_life_expectancy"]
    * who_middle_ldf["n_deaths"]
    / who_middle_ldf["n_deaths"].sum()
)
who_middle_ldf

[26]:

	age	n_deaths	lowest_age	middle_age	highest_age	year	country	sex	who_age	who_country	who_sex	who_year	who_life_expectancy	years_lost
0	9	11	0	5	9	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.002351
1	19	68	10	15	19	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.014534
2	29	437	20	25	29	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.093403
3	39	1561	30	35	39	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.333645
4	49	3628	40	45	49	2020	FRA	M	1	FRA	MLE	2020	79.064192	0.775441
5	59	14106	50	55	59	2020	FRA	M	1	FRA	MLE	2020	79.064192	3.014986
6	69	36555	60	65	69	2020	FRA	M	1	FRA	MLE	2020	79.064192	7.813187
7	79	76238	70	75	79	2020	FRA	M	1	FRA	MLE	2020	79.064192	16.294945
8	89	145018	80	85	89	2020	FRA	M	1	FRA	MLE	2020	79.064192	30.995834
9	90	92290	90	99	99	2020	FRA	M	1	FRA	MLE	2020	79.064192	19.725865

[27]:

who_middle_ldf["years_lost"].sum().round()

[27]:

np.float64(79.0)