environmental_justice_calcs.py
The environmental_justice_calcs script file contains a number of functions that help calculate exposure metrics for environmental justice analyses.
create_exposure_df
Creates a dataframe ready for exposure calculations
- Inputs:
  - conc: concentration object from concentration.py
  - isrm_pop_alloc: population object (from population.py) re-allocated to the ISRM grid cell geometry
  - verbose: a Boolean indicating whether or not detailed logging statements should be printed
  - debug_mode: a Boolean indicating whether or not to output debug statements
- Outputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group
- Methodology:
  - Pulls the total concentration from the concentration object.
  - Grabs the population by racial/ethnic group from the population object.
  - Merges the concentration and population data based on the ISRM ID.
  - Adds the intermediate population-weighted mean (PWM) column for each group to the geodataframe using add_pwm_col.
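The merge step can be sketched with plain pandas as below. This is a minimal illustration rather than the script's own code: the column names ISRM_ID, CONC, and TOTAL, the left join, and the toy values are all assumptions, and the real function works on geodataframes built from the concentration and population objects.

```python
import pandas as pd

# Hypothetical stand-in for the total concentration pulled from the
# concentration object (one row per ISRM grid cell).
conc_df = pd.DataFrame({
    "ISRM_ID": [0, 1, 2],
    "CONC": [4.0, 10.0, 6.0],
})

# Hypothetical stand-in for the population re-allocated to the ISRM grid.
# Rows are deliberately in a different order than conc_df.
pop_df = pd.DataFrame({
    "ISRM_ID": [2, 0, 1],
    "TOTAL": [50.0, 100.0, 50.0],
    "HISLA": [0.0, 20.0, 40.0],
})

# Merge on the shared grid cell ID. The left join (an assumption) keeps
# every concentration cell and preserves the row order of conc_df.
exposure_df = conc_df.merge(pop_df, on="ISRM_ID", how="left")
```

Because the join is keyed on the grid cell ID, the row order of the two inputs does not need to match.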
add_pwm_col
Adds an intermediate column that multiplies population by exposure concentration
- Inputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group
  - group: the racial/ethnic group name
- Outputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group, now with PWM column
- Methodology:
  - Creates a column called group + '_PWM'.
  - Multiplies exposure concentration by group population.
  - Returns the new dataframe.
- Important Notes:
  - The new column is not actually a population-weighted mean; it is just an intermediate for calculating PWM in the next step.
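A minimal sketch of the intermediate column, assuming a hypothetical concentration column named CONC and toy values. The helper name add_pwm_col_sketch is illustrative only, and copying the input is a choice made for this sketch, not something the source specifies.

```python
import pandas as pd

def add_pwm_col_sketch(exposure_df, group, conc_col="CONC"):
    """Return a copy with a group + '_PWM' column (population x concentration)."""
    out = exposure_df.copy()
    out[group + "_PWM"] = out[conc_col] * out[group]
    return out

# Two toy grid cells: concentration and HISLA population in each.
cells = pd.DataFrame({"CONC": [4.0, 10.0], "HISLA": [20.0, 40.0]})
with_pwm = add_pwm_col_sketch(cells, "HISLA")
```

The values in HISLA_PWM here are per-cell products (population times concentration), which is why the column only becomes a mean after it is summed and divided by the group's total population.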
get_pwm
Estimates the population-weighted mean exposure for a given group
- Inputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group
  - group: the racial/ethnic group name
- Outputs:
  - PWM_group: the group-level population-weighted mean exposure concentration (float)
- Methodology:
  - Creates a variable for the group PWM column (as created in add_pwm_col).
  - Estimates PWM by adding across the group_PWM column and dividing by the total group population.
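The two steps above amount to a sum divided by a sum. The sketch below assumes the intermediate column already exists; the function name get_pwm_sketch and the toy values are illustrative.

```python
import pandas as pd

def get_pwm_sketch(exposure_df, group):
    """Sum the intermediate column and divide by the group's total population."""
    pwm_col = group + "_PWM"
    return float(exposure_df[pwm_col].sum() / exposure_df[group].sum())

# Toy data: three grid cells with concentrations of 4, 10, and 6.
cells = pd.DataFrame({
    "HISLA": [20.0, 40.0, 0.0],
    "HISLA_PWM": [80.0, 400.0, 0.0],   # population x concentration per cell
})
pwm_hisla = get_pwm_sketch(cells, "HISLA")   # (80 + 400 + 0) / 60
```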
get_overall_disparity
Returns a table of overall disparity metrics by racial/ethnic group
- Inputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group
- Outputs:
  - pwm_df: a dataframe containing the PWM, absolute disparity, and relative disparity of each group
- Methodology:
  - Creates an empty dataframe with the groups as rows.
  - Estimates the group population-weighted mean using the get_pwm function.
  - Estimates the absolute disparity as Group_PWM - Total_PWM.
  - Estimates the relative disparity as Absolute Disparity / Total_PWM.
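As a worked example of the two disparity formulas, the sketch below starts from made-up group PWM values. The group labels other than HISLA, the column names, and the numbers are assumptions for illustration.

```python
import pandas as pd

# Hypothetical PWM values per group; TOTAL is the whole-population PWM.
group_pwm = {"TOTAL": 6.0, "HISLA": 8.0, "WHITE": 5.0}

# One row per group, as in the disparity table described above.
pwm_df = pd.DataFrame({"Group_PWM": pd.Series(group_pwm)})
total_pwm = pwm_df.loc["TOTAL", "Group_PWM"]

# Absolute disparity: group PWM minus the total-population PWM.
pwm_df["Absolute Disparity"] = pwm_df["Group_PWM"] - total_pwm

# Relative disparity: absolute disparity as a fraction of the total PWM.
pwm_df["Relative Disparity"] = pwm_df["Absolute Disparity"] / total_pwm
```

A positive value means the group is exposed to more than the population as a whole; the TOTAL row is zero by construction.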
estimate_exposure_percentile
Creates a dataframe of exposure percentiles for plotting
- Inputs:
  - exposure_gdf: a geodataframe with the exposure concentrations and allocated population by racial group
  - verbose: a Boolean indicating whether or not detailed logging statements should be printed
- Outputs:
  - df_pctl: a dataframe of exposure concentrations by percentile of population exposed by group
- Methodology:
  - Creates a copy of the exposure_gdf dataframe to prevent writing over the original.
  - Sorts the dataframe by PM2.5 concentration and resets the index.
  - Iterates through each racial/ethnic group, performing the following:
    - Creates a small slice of the dataframe that is only the exposure concentration and the group.
    - Estimates the cumulative sum of population in the sorted dataframe.
    - Estimates the total population of the group.
    - Estimates percentile as the cumulative population up to that grid cell divided by the total population of the group.
    - Adds the percentile column into the main dataframe.
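The percentile calculation for a single group can be sketched as follows. The column names CONC and HISLA_PCTL and the toy values are assumptions; the sketch covers one group, whereas the function loops over all of them.

```python
import pandas as pd

# Toy grid cells, unsorted: concentration and HISLA population per cell.
cells = pd.DataFrame({
    "CONC": [10.0, 4.0, 6.0],
    "HISLA": [40.0, 20.0, 40.0],
})

# Work on a sorted copy so the original frame is left untouched.
df_pctl = cells.copy().sort_values(by="CONC").reset_index(drop=True)

# Cumulative population from the cleanest cell to the most polluted one.
cumulative_pop = df_pctl["HISLA"].cumsum()
total_pop = df_pctl["HISLA"].sum()

# Share of the group's population exposed at or below each concentration.
df_pctl["HISLA_PCTL"] = cumulative_pop / total_pop
```

The last row always reaches 1.0, since the whole group is exposed at or below the highest concentration.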
run_exposure_calcs
Calls the other environmental justice exposure functions in order
- Inputs:
  - conc: concentration object from concentration.py
  - isrm_pop_alloc: population object (from population.py) re-allocated to the ISRM grid cell geometry
  - verbose: a Boolean indicating whether or not detailed logging statements should be printed
  - debug_mode: a Boolean indicating whether or not to output debug statements
- Outputs:
  - exposure_gdf: a dataframe containing the exposure concentrations and population estimates for each group
  - exposure_pctl: a dataframe of exposure concentrations by percentile of population exposed by group
  - exposure_disparity: a dataframe containing the PWM, absolute disparity, and relative disparity of each group
- Methodology:
  - Calls the create_exposure_df function.
  - Calls the get_overall_disparity function.
  - Calls the estimate_exposure_percentile function.
export_exposure_gdf
Exports the exposure concentrations and population estimates as a shapefile
- Inputs:
  - exposure_gdf: a dataframe containing the exposure concentrations and population estimates for each group
  - shape_out: a filepath string of the location of the shapefile output directory
  - f_out: the name of the file output category (will append additional information)
- Outputs:
  - A shapefile will be output into the shape_out directory.
  - The function returns fname as a surrogate for completion (otherwise irrelevant).
- Methodology:
  - Creates a filename and path for the export.
  - Updates the columns slightly for shapefile naming.
  - Exports the shapefile.
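The column update is needed because shapefile attribute names are limited to 10 characters. The source does not list the actual renames, so the sketch below uses a hypothetical mapping and a hypothetical helper, shapefile_safe_names, that checks the renamed columns before export.

```python
def shapefile_safe_names(columns, rename_map, max_len=10):
    """Apply rename_map and verify the results fit the shapefile field-name limit."""
    new_names = [rename_map.get(col, col) for col in columns]

    # Field names longer than the limit would be truncated on export.
    too_long = [name for name in new_names if len(name) > max_len]
    if too_long:
        raise ValueError(f"Names longer than {max_len} characters: {too_long}")

    # Duplicate names would make two columns indistinguishable.
    if len(set(new_names)) != len(new_names):
        raise ValueError("Renaming produced duplicate column names")
    return new_names

# Hypothetical column names; only the long one needs shortening.
columns = ["ISRM_ID", "TOTAL_CONC_UG/M3", "HISLA", "geometry"]
rename_map = {"TOTAL_CONC_UG/M3": "PM25_UG_M3"}
safe = shapefile_safe_names(columns, rename_map)
```

Checking names up front avoids silently truncated or colliding field names in the exported file.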
export_exposure_csv
Exports the exposure concentrations and population estimates as a CSV file
- Inputs:
  - exposure_gdf: a dataframe containing the exposure concentrations and population estimates for each group
  - output_dir: a filepath string of the location of the output directory
  - f_out: the name of the file output category (will append additional information)
- Outputs:
  - A CSV file will be output into the output_dir.
  - The function returns fname as a surrogate for completion (otherwise irrelevant).
- Methodology:
  - Creates a filename and path for the export.
  - Updates the column names for more straightforward interpretation.
  - Exports the results as a comma-separated value (CSV) file.
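A minimal sketch of the three steps, writing to a temporary directory. The filename suffix, the input column names, and the readable replacements are all hypothetical; the source only says that additional information is appended to f_out and that columns are renamed.

```python
import os
import tempfile

import pandas as pd

def export_csv_sketch(exposure_df, output_dir, f_out):
    """Build the path, rename columns, write the CSV, and return the filename."""
    fname = f_out + "_exposure_concentrations.csv"   # hypothetical suffix
    fpath = os.path.join(output_dir, fname)

    # Hypothetical renames toward more readable headers.
    readable = exposure_df.rename(columns={
        "CONC": "PM25_UG_M3",
        "HISLA": "Pop_Hispanic_Latino",
    })
    readable.to_csv(fpath, index=False)
    return fname

cells = pd.DataFrame({"ISRM_ID": [0, 1], "CONC": [4.0, 10.0], "HISLA": [20.0, 40.0]})
with tempfile.TemporaryDirectory() as tmp_dir:
    fname = export_csv_sketch(cells, tmp_dir, "demo_run")
    written = pd.read_csv(os.path.join(tmp_dir, fname))
```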
export_exposure_disparity
Exports the population-weighted mean exposure concentrations and disparity metrics as a CSV file
- Inputs:
  - exposure_disparity: a dataframe containing the population-weighted mean exposure concentrations for each group
  - output_dir: a filepath string of the location of the output directory
  - f_out: the name of the file output category (will append additional information)
- Outputs:
  - A CSV file will be output into the output_dir.
  - The function returns fname as a surrogate for completion (otherwise irrelevant).
- Methodology:
  - Creates a filename and path for the export.
  - Updates the columns and values slightly for more straightforward interpretation.
  - Exports the results as a comma-separated value (CSV) file.
plot_percentile_exposure
Creates a plot of exposure concentration by percentile of each group’s population
- Inputs:
  - output_dir: a filepath string of the location of the output directory
  - f_out: the name of the file output category (will append additional information)
  - exposure_pctl: a dataframe of exposure concentrations by percentile of population exposed by group
  - verbose: a Boolean indicating whether or not detailed logging statements should be printed
  - debug_mode: a Boolean indicating whether or not to output debug statements
- Outputs:
  - The function does not return anything, but a lineplot image (PNG) will be output into the output_dir.
- Methodology:
  - Creates a melted (un-pivoted) version of the percentiles dataframe.
  - Multiplies the percentile by 100 to span 0-100 instead of 0-1.
  - Maps the racial/ethnic group names to better formatted names (e.g., "HISLA" -> "Hispanic/Latino").
  - Draws the figure using the seaborn library's lineplot function.
  - Saves the file as f_out + '_PM25_Exposure_Percentiles.png' into the output_dir.
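The data preparation before plotting can be sketched with pandas alone. The column names CONC, Group, and Percentile, the WHITE group label, and the values are assumptions; only the HISLA to Hispanic/Latino mapping comes from the text above. The drawing and saving steps are omitted.

```python
import pandas as pd

# Hypothetical wide table: one percentile column per group (0-1 scale).
exposure_pctl = pd.DataFrame({
    "CONC": [4.0, 6.0, 10.0],
    "HISLA": [0.2, 0.6, 1.0],
    "WHITE": [0.5, 0.8, 1.0],
})

# Melt to long format: one row per (concentration, group) pair.
pctl_melt = exposure_pctl.melt(
    id_vars="CONC", var_name="Group", value_name="Percentile"
)

# Rescale from 0-1 to 0-100.
pctl_melt["Percentile"] = pctl_melt["Percentile"] * 100

# Replace group codes with display names for the legend.
pretty_names = {"HISLA": "Hispanic/Latino", "WHITE": "White"}
pctl_melt["Group"] = pctl_melt["Group"].map(pretty_names)
```

The long format is what lets a single lineplot call draw one line per group, with the Group column supplying the color.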
export_exposure
Calls each of the exposure output functions in parallel
- Inputs:
  - exposure_gdf: a dataframe containing the exposure concentrations and population estimates for each group
  - exposure_disparity: a dataframe containing the population-weighted mean exposure concentrations for each group
  - exposure_pctl: a dataframe of exposure concentrations by percentile of population exposed by group
  - shape_out: a filepath string of the location of the shapefile output directory
  - output_dir: a filepath string of the location of the output directory
  - f_out: the name of the file output category (will append additional information)
  - verbose: a Boolean indicating whether or not detailed logging statements should be printed
  - debug_mode: a Boolean indicating whether or not to output debug statements
- Outputs:
  - The function does not return anything, but the exposure output files will be written into shape_out and output_dir.
- Methodology:
  - Calls each of the exposure output functions (export_exposure_gdf, export_exposure_csv, export_exposure_disparity, and plot_percentile_exposure) in parallel.
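The source says the output functions are called in parallel but does not say how. One plausible pattern, shown here purely as an assumption, uses a thread pool from the standard library. The three stand-in functions are hypothetical and only return a filename, mirroring how the export functions return fname as a completion signal.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real export functions.
def fake_export_shapefile(f_out):
    return f_out + "_shapefile"

def fake_export_csv(f_out):
    return f_out + "_csv"

def fake_export_disparity(f_out):
    return f_out + "_disparity"

def export_all_sketch(f_out):
    """Submit every export task to a thread pool and wait for all of them."""
    tasks = [fake_export_shapefile, fake_export_csv, fake_export_disparity]
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task, f_out) for task in tasks]
        # result() blocks until the task finishes and re-raises any error.
        return [future.result() for future in futures]

finished = export_all_sketch("demo_run")
```

Collecting each returned name gives the caller a simple way to confirm that every export finished, which may be why the export functions return fname at all.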
region_pwm_helper
Estimates population-weighted mean for a subset of the full_dataset.
- Inputs:
  - name: the specific name of the region type (e.g., SF BAY AREA)
  - group: the racial/ethnic group of interest
  - full_dataset: a dataframe containing all of the concentration and population intersection objects with regions assigned
- Outputs:
  - pwm: the population-weighted mean concentration of PM2.5
- Methodology:
  - Slices the relevant part of the full dataset using the NAME column.
  - Estimates the population-weighted mean for that geographic area only.
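A sketch of the slice-then-average logic, assuming a hypothetical concentration column CONC alongside the NAME column mentioned above. The function name region_pwm_sketch, the second region name, and the values are illustrative.

```python
import pandas as pd

def region_pwm_sketch(name, group, full_dataset, conc_col="CONC"):
    """Population-weighted mean concentration for one named region."""
    # Keep only the rows assigned to the requested region.
    subset = full_dataset[full_dataset["NAME"] == name]
    weighted_sum = (subset[conc_col] * subset[group]).sum()
    return float(weighted_sum / subset[group].sum())

full_dataset = pd.DataFrame({
    "NAME": ["SF BAY AREA", "SF BAY AREA", "SAN DIEGO"],
    "CONC": [4.0, 10.0, 6.0],
    "HISLA": [20.0, 40.0, 10.0],
})
pwm_bay = region_pwm_sketch("SF BAY AREA", "HISLA", full_dataset)  # (80 + 400) / 60
```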
export_pwm_map
Creates the exports for the population-weighted products requested when the user inputs an output resolution larger than the ISRM grid
- Inputs:
  - pop_exp: a dataframe containing the population information without age-resolution
  - conc: a concentration object
  - output_dir: a filepath string of the location of the output directory
  - output_region: the geometry of the desired output region
  - f_out: the name of the file output category (will append additional information)
  - ca_shp_path: a filepath string of the location of the California boundary shapefile
  - shape_out: a filepath string of the location of the shapefile output directory
- Outputs: None
- Methodology:
  - Combines the concentration data, geographic areas data, and the population data by intersecting all three together.
  - Estimates the population counts for each group in each of these intersected areas.
  - Estimates the population-weighted mean concentration for each group for each geographic subarea.
  - Plots this data on a choropleth map using the visualize_pwm_conc function.
  - Outputs this summary data as a shapefile and as a CSV.
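The population and PWM steps can be illustrated on a table of already-intersected pieces. The source does not say how population is split across intersections, so the area-fraction allocation below is an assumption, as are the column names and values. The geometric intersection itself is not shown.

```python
import pandas as pd

# Hypothetical intersection pieces: each row is one overlap of a grid cell,
# a population polygon, and an output region.
pieces = pd.DataFrame({
    "NAME": ["A", "A", "B"],
    "CONC": [4.0, 10.0, 10.0],
    "AREA_FRAC": [1.0, 0.5, 0.5],   # share of the source polygon in the piece
    "HISLA": [20.0, 80.0, 80.0],    # population of the whole source polygon
})

# Allocate population to each piece in proportion to area (assumption).
pieces["HISLA_ALLOC"] = pieces["HISLA"] * pieces["AREA_FRAC"]

# Population x concentration per piece, then sum within each region.
pieces["HISLA_X_CONC"] = pieces["HISLA_ALLOC"] * pieces["CONC"]
sums = pieces.groupby("NAME")[["HISLA_X_CONC", "HISLA_ALLOC"]].sum()

# Region-level population-weighted mean concentration.
region_pwm = sums["HISLA_X_CONC"] / sums["HISLA_ALLOC"]
```

Summing before dividing matters: averaging the per-piece concentrations directly would ignore how many people live in each piece.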
visualize_pwm_conc
Creates a map of PWM concentrations using a simple choropleth.
- Inputs:
  - output_res_geo: a dataframe containing the population-weighted mean concentrations for each output resolution
  - output_region: the geometry of the desired output region
  - output_dir: a filepath string of the location of the output directory
  - f_out: the name of the file output category (will append additional information)
  - ca_shp_path: a filepath string of the location of the California boundary shapefile
- Outputs: None
- Methodology:
  - Reads in the California boundary file and projects it to the matching coordinate reference system.
  - Creates a map matching the one created in concentration.visualize_concentrations().
create_rename_dict
Makes a global rename code dictionary for easier updating
- Inputs: None
- Outputs:
  - logging_code: a dictionary that maps endpoint names to log statement codes
- Methodology:
  - Defines a dictionary and returns it.