Data Dictionary

Silver

Published

March 22, 2025

Modified

May 17, 2025

LDI


Creating the silver dataset

There are several input or raw data files:

  • two USA Spend files
    • Assistance/ Grants and Contracts
  • a file provided by ALI that is similar to the web-scraped results
  • a CSV file for state populations

The first step in is to join the ies short data with the usa spend assistance/grant data.

Next, the ies short data is joined with the usa spend contract data.

Next, the ies short data is joined with the ali data.

The next step is a bit more involved. Thus far three separate data sources have been matched to ies short using award_number_clean. The results of these joins are then used to create a lookup table based on awardees and recipient geographic info such as state, county, zip, etc. Some awardees in this lookup table have multiple geographies. Only awardees with distinct or singular geographic data are retained in the lookup table. The awardee lookup table is then joined to ies short on awardee. We call the results of this join awardee geo lookup. For example, let’s say that in the first three steps Florida State University (FSU) was identified as an awardee and it had distinct geo data - that is, all awards with FSU as the awardee have the same state code, county FIPS, zip code, etc. Then, using this info, all unmatched awards where the awardee is FSU are collected and added to the final silver data.

To create the silver dataset all of the aforementioned results are combined into a single file.

There are also some post-processing steps. Annual state population is added as a new field. Several lookup tables are created using the silver data to populate missing recipient geography fields. These lookup tables use zip code, state code, and county FIPS as the primary keys.

File and output structure

flowchart TB

    %% command center
    A["main"]
    
    %% pipelines
    B["pipeline: usa_ass"]
    C["pipeline: usa_con"]
    D["pipeline: ali"]
    E["pipeline: awardee_lookup"]

    %% output
    AA["file: grants-assistance.parquet"]
    AB["file: contracts.parquet"]
    AC["file: ali.parquet"]
    AD["file: awardee_lookups.parquet.parquet"]
    AE["file: silver.parquet"]
    

    %% flow
    A --> B --> AA --> AE
    A --> C --> AB --> AE
    A --> D --> AC --> AE
    A --> E --> AD --> AE

Schema overview

flowchart TB
    
    %% Databases
    bronze[(Database Bronze)]
    silver[(Database Silver)]

    %% Data
    B["
    USA Spend
    Assistance/Grants & Contracts
    "]
    C[ALI]
    D[Awardee Lookup]
    E[State populations]
    F[Awardee lookup table]
    G[Matched awards]
         
    %% Silver
    bronze --> B --> G
    bronze --> C --> G
    bronze --> D --> G
    bronze --> E --> G
    bronze --> F --> G
    
    G --> silver

Details

### Totals ###
IES Short awards: 2,996
- Grants/ assistance: 2,509 awards (84%) of total
- Contracts: 487 awards (16%) of total
USA Spend Grants/ Assistance: 1,592 awards
USA Spend Contracts: 1,069 awards
ALI DB: 1,125 awards

### Matches ###
Matched awards between IES Short and USA Spend Grants/ Assistance: 1,425 awards
- 57% of IES Short grants (1,425 out of 2,509)
- 90% of USA Spend Grants/ Assistance (1,425 out of 1,592)

Matched awards between IES Short and USA Spend Contracts: 182 awards
- 37% of IES Short contracts (182 out of 487)
- 17% of USA Spend Contracts (182 out of 1,069)

Matched awards between IES Short and ALI DB: 10 awards
- 0% of IES Short (10 out of 2,996)
- 1% of ALI DB (10 out of 1,125)

Awards identified using awardees with unique geo info in previously matched IES Short awards: 828
- 60% of unmatched IES Short (828 out of 1,379)

### silver ###
There are 2,445 awards in the final results
- 82% of IES Short (2,445 out of 2,996)

### Unmatched ###
Unmatched IES Short awards:
- 18% (551 out of 2,996)

Unmatched USA Spend Grants/ Assistance awards:
- 43% of IES Short grant (1,084 out of 2509)
- 10% of USA Spend Grants/ Assistance (167 out of 1592)

Unmatched USA Spend Contracts awards:
- 63% of IES Short contract (305 out of 487)
- 83% of USA Spend Contract (887 out of 1592)

Schemas

USA spend assistance/ grants

  • primary key: award_number
ibis.Schema {
  award_number                                   string
  award_number_clean                             string
  status                                         string
  award_type                                     string
  name                                           string
  amount                                         int64
  award_year                                     int64
  award_date_range_start                         date
  award_date_range_end                           date
  awardee                                        string
  project_type                                   string
  principal_investigator                         string
  office                                         array<string>
  program                                        array<string>
  program_topics                                 array<string>
  evaluation_topics                              array<string>
  as_of                                          date
  awarding_agency_code                           string
  awarding_agency_name                           string
  awarding_sub_agency_code                       string
  awarding_sub_agency_name                       string
  recipient_uei                                  string
  recipient_name                                 string
  recipient_parent_uei                           string
  recipient_parent_name                          string
  recipient_country_code                         string
  recipient_country_name                         string
  recipient_county_name                          string
  recipient_city_name                            string
  recipient_state_code                           string
  recipient_state_name                           string
  recipient_zip_code                             string
  primary_place_of_performance_country_code      string
  primary_place_of_performance_country_name      string
  primary_place_of_performance_city_name         string
  primary_place_of_performance_county_name       string
  primary_place_of_performance_state_name        string
  primary_place_of_performance_zip_code          string
  recipient_county_fips_code                     string
  recipient_state_fips_code                      string
  primary_place_of_performance_county_fips_code  string
  primary_place_of_performance_state_fips_code   string
  usaspending_permalink                          string
  parent_award_id_piid                           string
}

USA spend contracts

  • primary key: award_number
ibis.Schema {
  award_number                                   string
  award_number_clean                             string
  status                                         string
  award_type                                     string
  name                                           string
  amount                                         int64
  award_year                                     int64
  award_date_range_start                         date
  award_date_range_end                           date
  awardee                                        string
  project_type                                   string
  principal_investigator                         string
  office                                         array<string>
  program                                        array<string>
  program_topics                                 array<string>
  evaluation_topics                              array<string>
  as_of                                          date
  parent_award_id_piid                           string
  awarding_agency_code                           string
  awarding_agency_name                           string
  awarding_sub_agency_code                       string
  awarding_sub_agency_name                       string
  recipient_uei                                  string
  recipient_name                                 string
  recipient_parent_uei                           string
  recipient_parent_name                          string
  recipient_country_code                         string
  recipient_country_name                         string
  recipient_county_name                          string
  recipient_city_name                            string
  recipient_state_code                           string
  recipient_state_name                           string
  recipient_zip_code                             string
  primary_place_of_performance_country_code      string
  primary_place_of_performance_country_name      string
  primary_place_of_performance_city_name         string
  primary_place_of_performance_county_name       string
  primary_place_of_performance_state_name        string
  primary_place_of_performance_zip_code          string
  recipient_county_fips_code                     string
  recipient_state_fips_code                      string
  primary_place_of_performance_county_fips_code  string
  primary_place_of_performance_state_fips_code   string
  usaspending_permalink                          string
}

ALI

  • primary key: award_number
ibis.Schema {
  award_number                                   string
  award_number_clean                             string
  status                                         string
  award_type                                     string
  name                                           string
  amount                                         int64
  award_year                                     int64
  award_date_range_start                         date
  award_date_range_end                           date
  awardee                                        string
  project_type                                   string
  principal_investigator                         string
  office                                         array<string>
  program                                        array<string>
  program_topics                                 array<string>
  evaluation_topics                              array<string>
  as_of                                          date
  awarding_agency_code                           string
  awarding_agency_name                           string
  awarding_sub_agency_code                       string
  awarding_sub_agency_name                       string
  recipient_uei                                  string
  recipient_name                                 string
  recipient_parent_uei                           string
  recipient_parent_name                          string
  recipient_country_code                         string
  recipient_country_name                         string
  recipient_county_name                          string
  recipient_city_name                            string
  recipient_state_code                           string
  recipient_state_name                           string
  recipient_zip_code                             string
  primary_place_of_performance_country_code      string
  primary_place_of_performance_country_name      string
  primary_place_of_performance_city_name         string
  primary_place_of_performance_county_name       string
  primary_place_of_performance_state_name        string
  primary_place_of_performance_zip_code          string
  recipient_county_fips_code                     string
  recipient_state_fips_code                      string
  primary_place_of_performance_county_fips_code  string
  primary_place_of_performance_state_fips_code   string
  usaspending_permalink                          string
  parent_award_id_piid                           string
}

Awardee lookup

  • primary key: awardee
ibis.Schema {
  award_number                                   string
  award_number_clean                             string
  status                                         string
  award_type                                     string
  name                                           string
  amount                                         int64
  award_year                                     int64
  award_date_range_start                         date
  award_date_range_end                           date
  awardee                                        string
  project_type                                   string
  principal_investigator                         string
  office                                         array<string>
  program                                        array<string>
  program_topics                                 array<string>
  evaluation_topics                              array<string>
  as_of                                          date
  awarding_agency_code                           string
  awarding_agency_name                           string
  awarding_sub_agency_code                       string
  awarding_sub_agency_name                       string
  recipient_uei                                  string
  recipient_name                                 string
  recipient_parent_uei                           string
  recipient_parent_name                          string
  recipient_country_code                         string
  recipient_country_name                         string
  recipient_county_name                          string
  recipient_city_name                            string
  recipient_state_code                           string
  recipient_state_name                           string
  recipient_zip_code                             string
  primary_place_of_performance_country_code      string
  primary_place_of_performance_country_name      string
  primary_place_of_performance_city_name         string
  primary_place_of_performance_county_name       string
  primary_place_of_performance_state_name        string
  primary_place_of_performance_zip_code          string
  recipient_county_fips_code                     string
  recipient_state_fips_code                      string
  primary_place_of_performance_county_fips_code  string
  primary_place_of_performance_state_fips_code   string
  usaspending_permalink                          string
  parent_award_id_piid                           string
}