Exploring IES short and ALI DB

Published

April 3, 2025

Modified

May 17, 2025

LDI


ALI DB

Exploring award number character lengths

┏━━━━━━━┳━━━━━━━┳━━━━━━━━━━┓
┃ len   ┃ n     ┃ rel_freq ┃
┡━━━━━━━╇━━━━━━━╇━━━━━━━━━━┩
│ int32 │ int64 │ float64  │
├───────┼───────┼──────────┤
│    11 │   935 │ 0.831111 │
│    14 │     1 │ 0.000889 │
│    13 │   172 │ 0.152889 │
│    12 │    17 │ 0.015111 │
└───────┴───────┴──────────┘
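
A length table like the one above can be produced by counting string lengths and dividing by the total. The following is a minimal sketch in plain Python, not the code behind the report; the function name `length_frequencies` and the sample award numbers are hypothetical.

```python
from collections import Counter


def length_frequencies(award_numbers):
    """Return (len, n, rel_freq) rows, most frequent length first."""
    counts = Counter(len(a) for a in award_numbers)
    total = sum(counts.values())
    return [(length, n, n / total) for length, n in counts.most_common()]


# Hypothetical award numbers, for illustration only.
sample = ["R305A120001", "R305A120002", "R305A1200031", "R324A11000"]
rows = length_frequencies(sample)
for length, n, rel_freq in rows:
    print(length, n, round(rel_freq, 6))
```

Applied to the real column, the `n` values should sum to the row count (1,125 in the table above) and `rel_freq` should sum to 1.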

Joins

  • Left anti join to find unmatched IDs in IES short
  • Left join of the unmatched records to ALI DB

Of the 2,996 records in the IES short data, 1,686 (56.28%) are unmatched.
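
The two join steps can be sketched in plain Python as follows. This is an illustration under assumptions, not the report's actual code: the table the anti join runs against is not named in this report, so `reference` is a hypothetical stand-in, and the `award_id` column name and sample rows are likewise invented.

```python
def left_anti(left, right, key):
    """Return the rows of `left` whose `key` value never appears in `right`."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] not in right_keys]


def left_join(left, right, key):
    """Keep every row of `left`, paired with each matching `right` row or None."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], [None]):
            joined.append((row, match))
    return joined


# Hypothetical rows, for illustration only.
ies_short = [{"award_id": "A1"}, {"award_id": "A2"}, {"award_id": "A3"}]
reference = [{"award_id": "A1"}]  # assumed table the anti join runs against
ali_db = [{"award_id": "A3", "title": "Example"}]

unmatched = left_anti(ies_short, reference, "award_id")
joined = left_join(unmatched, ali_db, "award_id")
n_joined = sum(1 for _, match in joined if match is not None)

share = 100 * len(unmatched) / len(ies_short)
print(f"{len(unmatched)} unmatched out of {len(ies_short)} ({share:.2f}%)")
print(f"{n_joined} records joined out of {len(unmatched)}")
```

Swapping the `key` argument (clean award ID, original award ID, title) gives the per-key join counts reported in the sections that follow.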

on award ID clean


0 records joined via award id clean out of 1,686


Because no records joined, there are no values on key dimensions (award number, year, type) to compare.

on award ID


0 records joined via award id out of 1,686


Because no records joined, there are no values on key dimensions (award number, year, type) to compare.

Better to use original award number.

on title


2 records joined via title out of 1,686


Both of these records have different values on key dimensions such as award number, year, and type, which suggests the title matches are mismatches.

Amounts and years don’t match on a few IDs.
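
The key-dimension check on joined records can be sketched as a field-by-field comparison of each matched pair. This is a minimal illustration, not the report's code; the column names in `KEY_DIMENSIONS` and the sample pairs are assumptions.

```python
KEY_DIMENSIONS = ("award_number", "year", "type")  # assumed column names


def differing_fields(left_row, right_row, fields=KEY_DIMENSIONS):
    """List the fields on which two joined rows disagree."""
    return [f for f in fields if left_row.get(f) != right_row.get(f)]


# Hypothetical title-matched pairs, for illustration only.
pairs = [
    ({"award_number": "X1", "year": 2010, "type": "grant"},
     {"award_number": "X9", "year": 2012, "type": "grant"}),
    ({"award_number": "Y1", "year": 2015, "type": "grant"},
     {"award_number": "Y1", "year": 2015, "type": "grant"}),
]

mismatched = [(left, right) for left, right in pairs
              if differing_fields(left, right)]
print(f"Of the {len(pairs)} records, {len(mismatched)} differ on key dimensions")
```

Returning the list of differing fields, rather than a single flag, shows which dimension disagrees for each pair.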

prod

Unmatched columns that will be populated with NULLs.
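
One way to find such columns and fill them is to compare the source columns against the target schema and emit `None` (NULL) for anything missing. The sketch below is an assumption-laden illustration: the target schema is not listed in this report, so every column name here is hypothetical.

```python
def unmatched_columns(source_columns, target_columns):
    """Target columns with no counterpart in the source, in target order."""
    source = set(source_columns)
    return [col for col in target_columns if col not in source]


def align_to_schema(row, target_columns):
    """Project a row onto the target schema, filling missing columns with None."""
    return {col: row.get(col) for col in target_columns}


# Hypothetical schemas and row, for illustration only.
target_columns = ["award_number", "title", "year", "amount", "program"]
source_row = {"award_number": "X1", "title": "Example", "year": 2010}

missing = unmatched_columns(source_row.keys(), target_columns)
aligned = align_to_schema(source_row, target_columns)
print(missing)
print(aligned)
```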