Úvod do GeoPandas¶
Nahrání dat¶
[1]:
filename = 'demo_r_pjangrp3_linear.csv.gz'
import pandas as pd
nuts_data = pd.read_csv(filename, compression='gzip')
nuts_data
[1]:
DATAFLOW | LAST UPDATE | freq | sex | unit | age | geo | TIME_PERIOD | OBS_VALUE | OBS_FLAG | |
---|---|---|---|---|---|---|---|---|---|---|
0 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | F | NR | TOTAL | AL | 2014 | 1430827 | NaN |
1 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | F | NR | TOTAL | AL | 2015 | 1424597 | NaN |
2 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | F | NR | TOTAL | AL | 2016 | 1417141 | NaN |
3 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | F | NR | TOTAL | AL | 2017 | 1423050 | NaN |
4 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | F | NR | TOTAL | AL | 2018 | 1431715 | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1125890 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | T | NR | Y_LT5 | UKN16 | 2015 | 7764 | NaN |
1125891 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | T | NR | Y_LT5 | UKN16 | 2016 | 7720 | NaN |
1125892 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | T | NR | Y_LT5 | UKN16 | 2017 | 7712 | NaN |
1125893 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | T | NR | Y_LT5 | UKN16 | 2018 | 7659 | NaN |
1125894 | ESTAT:DEMO_R_PJANGRP3(1.0) | 28/09/23 23:00:00 | A | T | NR | Y_LT5 | UKN16 | 2019 | 7654 | NaN |
1125895 rows × 10 columns
V tabulce chybí geografická část popisu NUTS jednotech. Pro další práci zvolíme data ve formátu Esri Shapefile.
[2]:
import geopandas as gpd
nuts = gpd.read_file("NUTS_RG_20M_2021_3035.shp")
nuts
[2]:
NUTS_ID | LEVL_CODE | CNTR_CODE | NAME_LATN | NUTS_NAME | MOUNT_TYPE | URBN_TYPE | COAST_TYPE | FID | geometry | |
---|---|---|---|---|---|---|---|---|---|---|
0 | FR | 0 | FR | France | France | 0.0 | 0 | 0 | FR | MULTIPOLYGON (((9954236.116 -3059379.316, 9961... |
1 | HR | 0 | HR | Hrvatska | Hrvatska | 0.0 | 0 | 0 | HR | MULTIPOLYGON (((4827385.889 2618351.326, 48483... |
2 | HU | 0 | HU | Magyarország | Magyarország | 0.0 | 0 | 0 | HU | POLYGON ((5214660.069 2880853.832, 5216710.220... |
3 | AL | 0 | AL | Shqipëria | Shqipëria | 0.0 | 0 | 0 | AL | POLYGON ((5129579.170 2204098.752, 5148385.473... |
4 | AT | 0 | AT | Österreich | Österreich | 0.0 | 0 | 0 | AT | POLYGON ((4742889.368 2876362.725, 4783217.798... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2005 | TRC21 | 3 | TR | Şanlıurfa | Şanlıurfa | 4.0 | 2 | 3 | TRC21 | POLYGON ((6904684.585 2120354.802, 6938677.828... |
2006 | TRC22 | 3 | TR | Diyarbakır | Diyarbakır | 4.0 | 2 | 3 | TRC22 | POLYGON ((6989716.599 2273670.524, 6982786.486... |
2007 | NO0B2 | 3 | NO | Svalbard | Svalbard | 3.0 | 3 | 1 | NO0B2 | MULTIPOLYGON (((4754167.335 6382461.409, 47465... |
2008 | NO0B | 2 | NO | Jan Mayen and Svalbard | Jan Mayen and Svalbard | NaN | 0 | 0 | NO0B | MULTIPOLYGON (((4754167.335 6382461.409, 47465... |
2009 | NO0B1 | 3 | NO | Jan Mayen | Jan Mayen | 3.0 | 3 | 1 | NO0B1 | POLYGON ((3642785.362 5404637.884, 3624291.510... |
2010 rows × 10 columns
Vybere prvky na NUTS urovni 0.
[3]:
nuts0 = nuts[nuts["LEVL_CODE"] == 0]
nuts0.head()
[3]:
NUTS_ID | LEVL_CODE | CNTR_CODE | NAME_LATN | NUTS_NAME | MOUNT_TYPE | URBN_TYPE | COAST_TYPE | FID | geometry | |
---|---|---|---|---|---|---|---|---|---|---|
0 | FR | 0 | FR | France | France | 0.0 | 0 | 0 | FR | MULTIPOLYGON (((9954236.116 -3059379.316, 9961... |
1 | HR | 0 | HR | Hrvatska | Hrvatska | 0.0 | 0 | 0 | HR | MULTIPOLYGON (((4827385.889 2618351.326, 48483... |
2 | HU | 0 | HU | Magyarország | Magyarország | 0.0 | 0 | 0 | HU | POLYGON ((5214660.069 2880853.832, 5216710.220... |
3 | AL | 0 | AL | Shqipëria | Shqipëria | 0.0 | 0 | 0 | AL | POLYGON ((5129579.170 2204098.752, 5148385.473... |
4 | AT | 0 | AT | Österreich | Österreich | 0.0 | 0 | 0 | AT | POLYGON ((4742889.368 2876362.725, 4783217.798... |
Vybrané prvky zobrazíme v interaktivní mapě.
[4]:
nuts0.explore()
/usr/lib/python3/dist-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.17.3 and <1.25.0 is required for this version of SciPy (detected version 1.26.0
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
[4]:
Make this Notebook Trusted to load map: File -> Trust Notebook
Přidání informací o obyvatelstvu¶
[5]:
def population_by_nuts(df, year, level=0, sex='T', nuts=None):
data = df[(df["TIME_PERIOD"] == year) & (df["sex"] == sex) & (df["age"] == 'TOTAL') & (df["geo"].apply(lambda x: len(x) == level+2))]
if nuts is not None:
data = data[data["geo"].str.match(nuts)]
return data[["geo", "OBS_VALUE"]]
population_by_nuts(nuts_data, 2022).head()
[5]:
geo | OBS_VALUE | |
---|---|---|
752124 | AL | 2793592 |
752286 | AT | 8978929 |
752718 | BE | 11617623 |
753305 | BG | 6838937 |
753638 | CH | 8738791 |
[6]:
popu = population_by_nuts(nuts_data, 2022)
nuts0_popu = pd.merge(nuts0, popu, how='left', left_on='NUTS_ID', right_on='geo')
nuts0_popu.head()
[6]:
NUTS_ID | LEVL_CODE | CNTR_CODE | NAME_LATN | NUTS_NAME | MOUNT_TYPE | URBN_TYPE | COAST_TYPE | FID | geometry | geo | OBS_VALUE | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FR | 0 | FR | France | France | 0.0 | 0 | 0 | FR | MULTIPOLYGON (((9954236.116 -3059379.316, 9961... | FR | 67871925.0 |
1 | HR | 0 | HR | Hrvatska | Hrvatska | 0.0 | 0 | 0 | HR | MULTIPOLYGON (((4827385.889 2618351.326, 48483... | HR | 3862305.0 |
2 | HU | 0 | HU | Magyarország | Magyarország | 0.0 | 0 | 0 | HU | POLYGON ((5214660.069 2880853.832, 5216710.220... | HU | 9689010.0 |
3 | AL | 0 | AL | Shqipëria | Shqipëria | 0.0 | 0 | 0 | AL | POLYGON ((5129579.170 2204098.752, 5148385.473... | AL | 2793592.0 |
4 | AT | 0 | AT | Österreich | Österreich | 0.0 | 0 | 0 | AT | POLYGON ((4742889.368 2876362.725, 4783217.798... | AT | 8978929.0 |
Zobrazíme prvky znázorňující množství obyvatel.
[7]:
nuts0_popu.explore(column='OBS_VALUE', legend=False, cmap='OrRd')
[7]:
Make this Notebook Trusted to load map: File -> Trust Notebook
Definujeme novou funkci, která výpočet automatizuje.
[8]:
def nuts_population(df_geo, df_data, year, level):
nuts_level = df_geo[df_geo["LEVL_CODE"] == level]
popu = population_by_nuts(df_data, year, level)
return pd.merge(nuts_level, popu, how='left', left_on='NUTS_ID', right_on='geo')
nuts_popu = nuts_population(nuts, nuts_data, 2022, 1)
nuts_popu
nuts_popu.explore(column='OBS_VALUE', legend=False, cmap='OrRd')
[8]:
Make this Notebook Trusted to load map: File -> Trust Notebook
[9]:
nuts_population(nuts, nuts_data, 2014, 2).explore(column='OBS_VALUE', legend=False, cmap='OrRd')
[9]:
Make this Notebook Trusted to load map: File -> Trust Notebook
Hustota obyvatelstva¶
[10]:
nuts0_popu.area.head()
[10]:
0 6.387388e+11
1 5.658665e+10
2 9.296786e+10
3 2.866265e+10
4 8.385113e+10
dtype: float64
Vypočteme hustotu obyvatelstva (počet osob/km2).
[11]:
(nuts0_popu["OBS_VALUE"] / (nuts0_popu.area / 1e6)).head(10)
[11]:
0 106.259284
1 68.254707
2 104.218925
3 97.464550
4 107.081788
5 377.748627
6 61.396334
7 212.851778
8 97.901453
9 133.296882
dtype: float64
[12]:
nuts0_popu["density"] = nuts0_popu["OBS_VALUE"] / (nuts0_popu.area / 1e6)
nuts0_popu.head()
[12]:
NUTS_ID | LEVL_CODE | CNTR_CODE | NAME_LATN | NUTS_NAME | MOUNT_TYPE | URBN_TYPE | COAST_TYPE | FID | geometry | geo | OBS_VALUE | density | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FR | 0 | FR | France | France | 0.0 | 0 | 0 | FR | MULTIPOLYGON (((9954236.116 -3059379.316, 9961... | FR | 67871925.0 | 106.259284 |
1 | HR | 0 | HR | Hrvatska | Hrvatska | 0.0 | 0 | 0 | HR | MULTIPOLYGON (((4827385.889 2618351.326, 48483... | HR | 3862305.0 | 68.254707 |
2 | HU | 0 | HU | Magyarország | Magyarország | 0.0 | 0 | 0 | HU | POLYGON ((5214660.069 2880853.832, 5216710.220... | HU | 9689010.0 | 104.218925 |
3 | AL | 0 | AL | Shqipëria | Shqipëria | 0.0 | 0 | 0 | AL | POLYGON ((5129579.170 2204098.752, 5148385.473... | AL | 2793592.0 | 97.464550 |
4 | AT | 0 | AT | Österreich | Österreich | 0.0 | 0 | 0 | AT | POLYGON ((4742889.368 2876362.725, 4783217.798... | AT | 8978929.0 | 107.081788 |
[13]:
nuts0_popu.dropna(subset=["OBS_VALUE"]).explore(column='density', legend=True, cmap='OrRd')
[13]:
Make this Notebook Trusted to load map: File -> Trust Notebook
[14]:
mean = nuts0_popu["density"].mean()
nuts0_popu.explore(column='density', legend=True, cmap='OrRd', vmax=mean)
[14]:
Make this Notebook Trusted to load map: File -> Trust Notebook
Časová řada¶
[15]:
import pandas as pd
pd.set_option('mode.chained_assignment', None)
nuts_dens = nuts[nuts["LEVL_CODE"] == 0]
years = nuts_data["TIME_PERIOD"].unique()
for year in years:
print(f"{year}...")
nuts_popu_year = nuts_population(nuts, nuts_data, year, 0)
nuts_dens[f"density_{year}"] = nuts_popu_year["OBS_VALUE"] / (nuts_popu_year.area / 1e6)
2014...
2015...
2016...
2017...
2018...
2019...
2020...
2021...
2022...
[16]:
def show_density(df, year, base="density"):
column = f"{base}_{year}"
return df.explore(column, legend=True, cmap='OrRd')
show_density(nuts_dens, 2022)
[16]:
Make this Notebook Trusted to load map: File -> Trust Notebook
[17]:
for year in years[1:]:
print(f"{year}...")
prev_year = year - 1
nuts_dens[f"density_trend_{year}"] = nuts_dens[f"density_{year}"] - nuts_dens[f"density_{prev_year}"]
nuts_dens.head()
2015...
2016...
2017...
2018...
2019...
2020...
2021...
2022...
[17]:
NUTS_ID | LEVL_CODE | CNTR_CODE | NAME_LATN | NUTS_NAME | MOUNT_TYPE | URBN_TYPE | COAST_TYPE | FID | geometry | ... | density_2021 | density_2022 | density_trend_2015 | density_trend_2016 | density_trend_2017 | density_trend_2018 | density_trend_2019 | density_trend_2020 | density_trend_2021 | density_trend_2022 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FR | 0 | FR | France | France | 0.0 | 0 | 0 | FR | MULTIPOLYGON (((9954236.116 -3059379.316, 9961... | ... | 105.922303 | 106.259284 | 0.457422 | 0.282178 | 0.268380 | 0.338805 | 0.237048 | 0.223221 | 0.526766 | 0.336981 |
1 | HR | 0 | HR | Hrvatska | Hrvatska | 0.0 | 0 | 0 | HR | MULTIPOLYGON (((4827385.889 2618351.326, 48483... | ... | 71.330521 | 68.254707 | -0.379825 | -0.612282 | -0.644251 | -0.860981 | -0.516853 | -0.319528 | -0.385427 | -3.075814 |
2 | HU | 0 | HU | Magyarország | Magyarország | 0.0 | 0 | 0 | HU | POLYGON ((5214660.069 2880853.832, 5216710.220... | ... | 104.668134 | 104.218925 | -0.234425 | -0.269835 | -0.354144 | -0.206415 | -0.060397 | -0.034743 | -0.416854 | -0.449209 |
3 | AL | 0 | AL | Shqipëria | Shqipëria | 0.0 | 0 | 0 | AL | POLYGON ((5129579.170 2204098.752, 5148385.473... | ... | 98.725738 | 97.464550 | -0.230195 | -0.356003 | 0.034854 | -0.218647 | -0.275515 | -0.574685 | -0.565684 | -1.261188 |
4 | AT | 0 | AT | Österreich | Österreich | 0.0 | 0 | 0 | AT | POLYGON ((4742889.368 2876362.725, 4783217.798... | ... | 106.530037 | 107.081788 | 0.919964 | 1.377978 | 0.863363 | 0.589163 | 0.435391 | 0.504334 | 0.376858 | 0.551752 |
5 rows × 27 columns
[18]:
show_density(nuts_dens, 2022, "density_trend")
[18]:
Make this Notebook Trusted to load map: File -> Trust Notebook
[19]:
def show_density_diff(df, start, end):
column = f"density_trend_{end}_{start}"
df[column] = df[f"density_{end}"] - df[f"density_{start}"]
return df.explore(column, legend=True, cmap='OrRd')
show_density_diff(nuts_dens, 2014, 2022)
[19]:
Make this Notebook Trusted to load map: File -> Trust Notebook