Rural Speed Safety Project for USDOT SDI

Ohio Data Conflation Work






September 27, 2018


Outline

  • Data Conflation of Ohio

Folder Organization

Ohio Rural NHS Crashes (2015)

Process 2

  • Step 0: Use the NPMRDS Link file and the NPMRDS Lookup Table (LUT) to create an NPMRDS TMC Shapefile. This step needs to be repeated for each direction. (ArcGIS)
  • Step 1: Filter the HSIS crash Shapefile to include only those crashes that are both on the NHS and in rural areas. (ArcGIS)
  • Step 2: Spatially join the TMC attributes to the HSIS crash (accbased) Shapefile, recording the resulting join distance. In this step, each crash is assigned to the closest TMC segment in each direction. (ArcGIS)
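
The nearest-segment assignment in Step 2 was done in ArcGIS; the following is only a minimal Python sketch of the underlying idea, assuming crash points and TMC polylines are already in a projected (planar) coordinate system. The function names and the TMC identifiers are hypothetical and do not come from the project files.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both vertices coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_tmc(crash_xy, tmc_lines):
    """Return (tmc_id, distance) of the closest TMC polyline.

    tmc_lines maps a TMC id to its list of (x, y) vertices.
    """
    best_id, best_dist = None, math.inf
    for tmc_id, vertices in tmc_lines.items():
        for a, b in zip(vertices, vertices[1:]):
            d = point_segment_distance(crash_xy, a, b)
            if d < best_dist:
                best_id, best_dist = tmc_id, d
    return best_id, best_dist

# Hypothetical TMC ids, one dictionary per direction as in Step 0.
tmcs_by_direction = {
    "EASTBOUND": {"TMC_E1": [(0.0, 0.0), (10.0, 0.0)]},
    "WESTBOUND": {"TMC_W1": [(0.0, 5.0), (10.0, 5.0)]},
}
crash = (5.0, 1.0)
# One candidate per direction, each with its join distance.
candidates = {d: nearest_tmc(crash, lines) for d, lines in tmcs_by_direction.items()}
```

Running the search once per directional layer gives each crash one candidate TMC per direction, which is the input the later direction-matching steps need.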

Process 2

  • Step 3: For each TMC append the direction of the TMC from the NPMRDS monthly static file. (R)
  • Step 4: For each directional set of TMCs, compare the direction of the crash (taken from the accident or vehicle file, as required) with the direction of the associated TMC. Annotate each pair as "Yes Match", "No Match", or "Distance Based Match" according to how it matched. (R)
  • Step 5: Combine the two directional data sets and reduce to the pairs of TMCs and crashes that have matching directions. (R)
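
Steps 4 and 5 can be illustrated with a short sketch. The original work was done in R; this Python version is illustrative only. It assumes, which the slides do not state explicitly, that "Distance Based Match" is used for crashes with no recorded direction, and that in that case the closer of the two directional candidates is kept. The field names are hypothetical.

```python
def label_match(crash_dir, tmc_dir):
    """Label one crash/TMC pair.

    Assumption: a crash with no recorded direction cannot be compared,
    so it is labeled "Distance Based Match".
    """
    if not crash_dir:
        return "Distance Based Match"
    same = crash_dir.strip().upper() == (tmc_dir or "").strip().upper()
    return "Yes Match" if same else "No Match"

def combine_directions(candidates):
    """Merge both directional sets and keep one TMC per crash.

    candidates: iterable of dicts with keys crash_id, crash_dir, tmc,
    tmc_dir, distance (hypothetical field names).
    Pairs labeled "No Match" are dropped; if several pairs remain for a
    crash, the one with the smallest join distance is kept.
    """
    kept = {}
    for row in candidates:
        label = label_match(row["crash_dir"], row["tmc_dir"])
        if label == "No Match":
            continue
        current = kept.get(row["crash_id"])
        if current is None or row["distance"] < current["distance"]:
            kept[row["crash_id"]] = {**row, "match": label}
    return kept

rows = [
    {"crash_id": 1, "crash_dir": "N", "tmc": "TMC_N1", "tmc_dir": "N", "distance": 3.0},
    {"crash_id": 1, "crash_dir": "N", "tmc": "TMC_S1", "tmc_dir": "S", "distance": 2.0},
    {"crash_id": 2, "crash_dir": None, "tmc": "TMC_N1", "tmc_dir": "N", "distance": 6.0},
    {"crash_id": 2, "crash_dir": None, "tmc": "TMC_S1", "tmc_dir": "S", "distance": 4.0},
]
matched = combine_directions(rows)
```

In this toy input, crash 1 keeps the northbound TMC even though the southbound one is closer, while crash 2, which has no direction, falls back to the closer candidate.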

Crashes with No Reference Direction

Process 2

  • Step 6: Generate the crash-level output CSV file. In this file, each row represents a crash, which is assigned to a TMC where applicable. (R)
  • Step 7: Produce the TMC-level output CSV file. This step incorporates the results of the linear conflation of HSIS and TMC segments (i.e., Process 1). In this file, each row represents one directional TMC, with its roadway characteristics and crash counts by severity level. (R)
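
The move from the crash-level table (Step 6) to the TMC-level table (Step 7) is essentially a group-and-count. The sketch below is illustrative Python rather than the project's R code. The KABCO severity codes and the field names are assumptions; the actual HSIS severity coding may differ.

```python
# Assumed severity codes (KABCO scale); the HSIS coding may differ.
SEVERITIES = ("K", "A", "B", "C", "O")

def tmc_crash_counts(crash_rows, tmc_ids):
    """Count crashes per TMC and severity.

    crash_rows: iterable of dicts with keys "tmc" and "severity"
    (hypothetical names). Crashes without an assigned TMC, or with an
    unknown severity code, are skipped. Every TMC in tmc_ids appears in
    the output, including those with zero crashes.
    """
    counts = {tmc: dict.fromkeys(SEVERITIES, 0) for tmc in tmc_ids}
    for row in crash_rows:
        tmc, sev = row.get("tmc"), row.get("severity")
        if tmc in counts and sev in SEVERITIES:
            counts[tmc][sev] += 1
    return counts

crashes = [
    {"crash_id": 1, "tmc": "TMC_N1", "severity": "O"},
    {"crash_id": 2, "tmc": "TMC_N1", "severity": "K"},
    {"crash_id": 3, "tmc": None, "severity": "O"},  # no TMC assigned
    {"crash_id": 4, "tmc": "TMC_N1", "severity": "O"},
]
per_tmc = tmc_crash_counts(crashes, ["TMC_N1", "TMC_S1"])
```

Starting from the full list of TMCs, rather than from the crashes, keeps segments with zero crashes in the output, which matters if the TMC-level file is later used for crash frequency modeling.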

Process 3

  • Step 0: Load NPMRDS data by month (by state). (R)
  • Step 1: The NPMRDS data contain epoch-level travel times where available. Two datasets will be prepared, one at 5-minute and one at 15-minute epochs, with missing data points retained. For the 15-minute data, average the 5-minute values within each larger bin, ignoring missing values as long as at least one 5-minute value is present. If all 5-minute values within an aggregated bin are missing, mark that bin as 'missing.' (R)
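
The 15-minute aggregation rule in Step 1 can be sketched as follows. This is illustrative Python, not the project's R code. It assumes that the travel times for one TMC and one day are held in a list indexed by 5-minute epoch, with `None` marking a missing epoch.

```python
def aggregate_to_15min(tt_5min):
    """Average 5-minute travel times into 15-minute bins.

    tt_5min: list of travel times indexed by 5-minute epoch, with None
    for missing epochs. Each output bin is the mean of the non-missing
    values among its three 5-minute epochs, or None if all three are
    missing.
    """
    out = []
    for start in range(0, len(tt_5min), 3):
        present = [v for v in tt_5min[start:start + 3] if v is not None]
        out.append(sum(present) / len(present) if present else None)
    return out

# Nine 5-minute epochs -> three 15-minute bins.
five_min = [60.0, None, 66.0, None, None, None, 70.0, 71.0, 72.0]
fifteen_min = aggregate_to_15min(five_min)
```

The first bin averages the two available values, the second is marked missing because nothing was observed, and the third averages all three.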

Process 3

  • Step 2: Convert the date/bin format so that it can be joined to the date/time format of HSIS crash events (e.g., YEAR_DATE_BIN). (R)
  • Step 3: To provide TMC-level travel time information, convert the monthly data files from long to wide format, with each row representing a TMC and each column a time bin. This dataset will be prepared for the 15-minute bin data only. (R)
  • Step 4: Develop code to convert 5-minute and 15-minute epochs to timestamps for time series analysis. (R)
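
Steps 2 through 4 amount to a few small conversions, sketched here in Python for illustration (the project code is in R). The sketch assumes that epochs are numbered from 0 at midnight. The slides give only "YEAR_DATE_BIN" as the key pattern, so the exact layout used below (four-digit year, month and day as MMDD, then the bin number) is an assumption, as are all function names.

```python
from datetime import date, datetime, timedelta

def epoch_to_timestamp(day, epoch, minutes_per_epoch):
    """Start time of an epoch, assuming epoch 0 begins at midnight."""
    midnight = datetime.combine(day, datetime.min.time())
    return midnight + timedelta(minutes=epoch * minutes_per_epoch)

def timestamp_to_key(ts, minutes_per_epoch):
    """Build a YEAR_DATE_BIN style join key (assumed layout) for a timestamp."""
    epoch = (ts.hour * 60 + ts.minute) // minutes_per_epoch
    return f"{ts.year}_{ts:%m%d}_{epoch}"

def long_to_wide(records):
    """Pivot (tmc, bin_key, travel_time) records to one row per TMC.

    Returns {tmc: {bin_key: travel_time}}; bins absent from the input
    are simply absent from the row.
    """
    wide = {}
    for tmc, bin_key, travel_time in records:
        wide.setdefault(tmc, {})[bin_key] = travel_time
    return wide

# A crash at 14:20 on 2015-03-14 falls in 15-minute bin 57.
crash_time = datetime(2015, 3, 14, 14, 20)
crash_key = timestamp_to_key(crash_time, 15)
bin_start = epoch_to_timestamp(date(2015, 3, 14), 57, 15)

wide = long_to_wide([
    ("TMC_N1", "2015_0314_56", 61.0),
    ("TMC_N1", "2015_0314_57", 64.5),
    ("TMC_S1", "2015_0314_57", 58.0),
])
```

Using the same key function for crash times and for travel time bins means the two tables can be joined on a single column.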

Questions?