WORK ORDER NUMBER BAT-02-006

 

 

TRAFFIC DATA QUALITY WORKSHOP
PROCEEDINGS AND ACTION PLAN

 

 

Final Report

 

 

 

to

 

Office of Policy

Federal Highway Administration

Washington, D.C.

 

 

 

 

 

505 King Avenue

Columbus, Ohio 43201

 

In Association with

 

Cambridge Systematics, Inc.

Texas Transportation Institute

 

 

September 25, 2003

 

Prepared for

 

Office of Highway Policy Information

Federal Highway Administration

Washington, D.C.

 

 

 

 

 

Principal Authors

 

Dr. Edward Fekpe, PEng.

Mr. Deepak Gopalakrishna

 

 

 

 

 




ACKNOWLEDGMENTS

The authors gratefully acknowledge the support and guidance of Mr. Ralph Gillmann of the Federal Highway Administration Office of Policy and Mr. James Pol of the Intelligent Transportation Systems Joint Program Office throughout this project.

 

The authors also acknowledge the support of the Ohio and Utah Departments of Transportation in hosting the regional workshops.  Mr. David Gardner of Ohio DOT and Ms. Dian Williams of Utah DOT deserve recognition for their roles in organizing these regional workshops.  Ms. Tami Hannahs and Ms. Lynn Price of Battelle also provided valuable logistical assistance in organizing the workshop in Columbus, Ohio.  The authors acknowledge the valuable inputs provided by state and local agency officials during the interview process and by all the workshop participants.

 

The authors also acknowledge the valuable inputs provided by the project team, particularly in developing the white papers and conducting the regional workshops.  The project team members are:

 

Dr. Edward Fekpe, Principal Investigator, (Battelle)

Mr. Deepak Gopalakrishna (Battelle)

Ms. Mala Raman (Battelle)

Dr. Rich Margiotta (Cambridge Systematics Inc.)

Dr. Dan Middleton (Texas Transportation Institute)

Mr. Shawn Turner (Texas Transportation Institute).


Table of Contents

ACKNOWLEDGMENTS

List of Acronyms

EXECUTIVE SUMMARY
Introduction
Research Approach
Action Plan
Action Plan Implementation and Work Items
Research Studies
Workshops
Case Studies and Clearinghouse

1.0       INTRODUCTION
1.1       Background
1.2       Project Objectives and Scope
1.3       Organization of Report

2.0       RESEARCH APPROACH
2.1       Traffic Data Quality Issues
2.2       Data Collection – Interviews
2.3       Development of White Papers
2.4       Regional Workshops
2.5       Action Plan Development
2.6       Additional Traffic Data Quality Literature

3.0       WORKSHOP PROCEEDINGS
3.1       Introduction
3.2       Session 1 – Defining and Measuring Traffic Data Quality
3.2.1    Defining Data Quality
3.2.2    Measuring Data Quality
3.2.3    Discussion Points
3.2.3.1  Discussions – Ohio Workshop
3.2.3.2  Discussions – Utah Workshop
3.3       Session 2 – State of the Practice in Traffic Data Quality
3.3.1    Types and Applications for Traffic Data
3.3.2    Traffic Data Quality:  Characteristics
3.3.3    Quality Issues for Using ITS-Generated Data for Traditional Uses
3.3.4    Recommendations:  Possible Solutions
3.3.5    Discussion Points
3.3.5.1  Discussions – Ohio Workshop
3.3.5.2  Discussions – Utah Workshop
3.4       Session 3 – Advances in Traffic Data Collection and Management
3.4.1    Introduction
3.4.2    Innovative Contracting Methods
3.4.3    Standards
3.4.4    Training for Data Collection
3.4.5    Data Sharing Between Agencies and States
3.4.6    Advanced Traffic Detection Techniques
3.4.7    Discussion Points
3.4.7.1  Discussions – Ohio Workshop
3.4.7.2  Discussions – Utah Workshop
3.5       Action Plan Discussion
3.5.1    Defining and Measuring Traffic Data Quality
3.5.1.1  Ohio Workshop
3.5.1.2  Utah Workshop
3.5.2    State of the Practice
3.5.2.1  Ohio Workshop
3.5.2.2  Utah Workshop
3.5.3    Innovative Approaches
3.5.3.1  Ohio Workshop
3.5.3.2  Utah Workshop
3.5.4    Responsibilities and Timeline

4.0       ACTION PLAN FOR IMPROVING TRAFFIC DATA QUALITY
4.1       Introduction
4.2       Partnerships and Coordination
4.3       Action Items
4.3.1    Guidelines and Standards for Calculating Data Quality Measures
4.3.2    Compilation of Business Rules/Data Validity Checks and Quality Control Procedures
4.3.3    Best Practices for Equipment Installation and Maintenance
4.3.4    Clearinghouse for Vehicle Detector Information
4.3.5    Sensitivity Studies to Demonstrate “Value of Data”
4.3.6    Guidelines for Sharing Resources
4.3.7    Life-cycle Costs of Detection Equipment
4.3.8    Improved Contracting Approaches
4.3.9    Case Study or Pilot Tests
4.3.10  Guidance on Technologies and Applications
4.4       Implementation and Work Items
4.4.1    Research Studies
4.4.2    Workshops
4.4.3    Case Studies and Clearinghouse

5.0       CONCLUDING REMARKS

REFERENCES

List of Appendices

APPENDIX A:  WHITE PAPERS
APPENDIX B:  INTERVIEWEE CONTACT LIST AND INTERVIEW GUIDE
APPENDIX C:  REGIONAL WORKSHOP ATTENDEES
APPENDIX D:  RELEVANT TRAFFIC DATA QUALITY LITERATURE

List of Figures

Figure 1.  Traffic Data Quality Research Approach


List of Acronyms

AADT         Annual Average Daily Traffic

AASHTO    American Association of State Highway and Transportation Officials

ADUS         Archived Data User Service

AMATS      Akron Metropolitan Area Transportation Study

ARTIMIS    Advanced Regional Traffic Interactive Management and Information System

ASTM         American Society for Testing and Materials

ATIS           Advanced Traveler Information Systems

ATMS         Advanced Traffic Management Systems

ATR            Automatic Traffic Recorder

COTR         Contracting Officer’s Technical Representative

DOT            Department(s) of Transportation

EDL            Electronic Document Library

ESAL          Equivalent Single Axle Loads

FHWA        Federal Highway Administration

FOT            Field Operational Test

GIS             Geographic Information System

ITS JPO      Intelligent Transportation Systems – Joint Program Office

ITS              Intelligent Transportation Systems

MAG           Maricopa Association of Governments

NOACA     Northeast Ohio Areawide Coordinating Agency

ODOT         Ohio Department of Transportation

OKI            Ohio-Kentucky-Indiana Regional Council of Governments

ROW          Right-of-Way

RTMS         Remote Traffic Microwave Sensor

TMCs          Traffic Management Centers

TMG           Traffic Monitoring Guide

TRB            Transportation Research Board

TTI              Texas Transportation Institute

UDOT         Utah Department of Transportation

VDC           Vehicle Detector Clearinghouse

VDOT         Virginia Department of Transportation

WIM           Weigh-in-Motion

WSDOT      Washington State DOT


Executive Summary

Introduction

Recent research and analysis have identified several issues regarding the quality of traffic data available from Intelligent Transportation Systems (ITS) for transportation operations, planning, or other functions.  Since Federal agencies use and disseminate traffic data from state and local agencies, the quality of the data becomes even more critical.  The quality of the traffic data and of the information produced from the data are critical factors that affect the abilities of transportation agencies to ensure the security of transportation and the management of the nation’s transportation resources.  The focus of data quality is on establishing a consistent methodology for ensuring that data are managed so that a measure of reliability is sustained.  The primary objective of this project is to define an action plan to address traffic data quality issues.  Such an action plan should include work items that can be executed through the U.S. Department of Transportation (DOT), stakeholder organizations (e.g., American Association of State Highway and Transportation Officials [AASHTO], ITS America), and state DOTs.

Research Approach

The development of the action plan involved several steps.  First, the issues associated with traffic data quality were reviewed.  Second, three white papers were developed whose themes were based on the issues identified.  The white papers were developed from information gathered from published literature and through interviews with state and local agencies involved with traffic data collection, use, and management.  The white papers are designed to explore the issues and current practices for ensuring data quality.  The scopes of the three white papers and the issues addressed are outlined below.

 

Theme #1:  Defining and Measuring Traffic Data Quality (EDL # 13767).

This white paper defines the measures and methods for quantifying traffic data quality.  Issues considered include definition of traffic data quality for different users and for different applications; data quality metrics or measures; methodology for assessing traffic data quality; and acceptable levels of quality.

 

Theme #2:  State-of-the-Practice in Traffic Data Quality (EDL # 13768).

This white paper documents issues, measures, and approaches for assessing, using, and accommodating traffic data quality in various applications.  Issues considered include types and applications of traffic data being used by the states; how data quality problems are handled in various applications; methods used or studies conducted by states to ensure data quality; and institutional issues, data sharing issues and funding constraints.

 

Theme #3:  Advances in Traffic Data Collection and Management (EDL # 13766).

This white paper identifies innovative approaches for improving data quality.  These include innovative technologies in traffic data collection, new contracting methods, standards, training for data collection, and data sharing between agencies and states.  The issues addressed in this white paper include loop detectors versus non-intrusive data collection devices; lack of field staff for proper maintenance of monitoring devices; innovative approaches to data collection; effects of contracting approach on data quality; and the need for new contracting methods, more coordination, standards, and training.

 

Following the development of the white papers, two regional workshops on traffic data quality were conducted.  The three white papers were used to stimulate discussions and obtain inputs from the workshop participants to develop an action plan that addresses traffic data quality issues.  The workshops, sponsored by FHWA Office of Policy, the ITS Joint Program Office (JPO), Ohio Department of Transportation (ODOT), and Utah Department of Transportation (UDOT), were held on March 11, 2003, in Columbus, Ohio, and on March 13, 2003, in Salt Lake City, Utah.

 

The workshop attendees included data providers and users as well as those who influence data collection activities.  In attendance were private sector travel information providers and representatives from 10 state DOTs (Ohio, Delaware, Indiana, Kentucky, Pennsylvania, Utah, Idaho, Texas, Washington, and California).  Also in attendance were representatives from Advanced Regional Traffic Interactive Management and Information System (ARTIMIS) in Cincinnati, Ohio; Maricopa Association of Governments (MAG) in Arizona; Northeast Ohio Areawide Coordinating Agency (NOACA); Ohio-Kentucky-Indiana (OKI) Regional Council of Governments; and Akron Metropolitan Area Transportation Study (AMATS).

Action Plan

The action plan builds upon the findings in the white papers and inputs obtained from the regional workshops.  The action plan provides a blueprint for specific actions to address traffic data quality issues.  Implementation of the plan will require collaboration among both public and private partners with the FHWA and state DOTs playing leading roles.  The plan identifies the following 10 priority action items based on those identified at the regional workshops. 

 

1.                  Develop guidelines and standards for calculating traffic data quality measures.  The guidelines and standards are expected to contain methods to calculate and report the data quality measures for various applications and levels of aggregation. 

Coordinators:  FHWA or AASHTO

 

2.                  Synthesize validation procedures and rules used by various states and other agencies for traffic monitoring devices.  The synthesis document should include quality control procedures for all types of applications and data management methods for maintaining high quality data.

Coordinators:  FHWA, states

 

3.                  Develop a synthesis of best practices for installation and maintenance of traffic monitoring devices.  This document should include guidance for establishing quality; standard test methods for determining accuracy and other data quality measures; “triggers” for conducting maintenance; and guidance for selecting strategic traffic monitoring device locations.

Coordinators:  FHWA, states

 

4.                  Establish a clearinghouse for vehicle detector information.  Establish an independent testing entity to conduct periodic tests and verify claims of the new and emerging traffic detection devices on the market.  Store results of tests in a clearinghouse that can be accessed by all potential users.

Coordinators:  FHWA, Vehicle Detector Clearinghouse (VDC), states

 

5.                  Conduct sensitivity analyses and document the results to illustrate the implications of data quality on user applications.  Based on the results of the sensitivity analyses, develop data quality “targets” or “benchmarks” for each application.  The results of the sensitivity analyses would also be used to provide guidance or procedures for imputing missing data points.

Coordinators:  FHWA, states

 

6.                  Develop guidelines for sharing resources for traffic monitoring activities.  The guidelines should contain information on shared equipment, personnel, funding, and cooperation among different agencies and departments.  The guidelines should also include public-private collaboration approaches and practices that establish trust in private sources of data.

Coordinators:  FHWA, states

 

7.                  Develop a methodology for calculating life-cycle costs.  The methodology would enable states and other agencies to investigate alternative data collection technologies; develop quality levels as a function of investment in installation and maintenance; and coordinate or leverage operations and other activities in more than one location or jurisdiction.

Coordinators:  FHWA, states

 

8.                  Develop guidelines for innovative contracting approaches for traffic data collection.  The guidelines should include information on performance-based contracting and management, task-order-type contracts and cooperative agreements for equipment installation and maintenance, and life-cycle-cost based bidding.

Coordinators:  FHWA, states

 

9.                  Conduct a case study or a pilot test.  The goal is to observe state DOT and TMCs working to improve data quality and evaluate the return on investment from the improved data quality.

Coordinators:  FHWA, states

 

10.              Provide guidance on technologies and applications.  This action item is in two parts:  (i) provide guidance on the data elements to measure and report, since this dictates the type of device procured by the agency, and (ii) provide guidance on the innovative and emerging uses of loops and existing technologies.

Coordinators:  FHWA, states
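As an illustration of action items 1 and 2, the sketch below shows one way validity checks and two data quality measures might be computed for detector records.  It is a minimal Python sketch and not a procedure taken from the white papers or the workshops:  the record layout, the thresholds, the rule set, and the definitions of "completeness" and "validity" used here are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DetectorRecord:
    """One detector observation for a fixed interval (hypothetical layout)."""
    volume: int           # vehicles counted in the interval
    speed_mph: float      # average speed over the interval
    occupancy_pct: float  # percent of time the detector was occupied


# Illustrative thresholds only; an agency would set its own business rules.
MAX_VOLUME_PER_INTERVAL = 250
MAX_SPEED_MPH = 100.0


def passes_validity_checks(rec: DetectorRecord) -> bool:
    """Apply simple range and consistency rules to a single record."""
    if not 0 <= rec.volume <= MAX_VOLUME_PER_INTERVAL:
        return False
    if not 0.0 <= rec.speed_mph <= MAX_SPEED_MPH:
        return False
    if not 0.0 <= rec.occupancy_pct <= 100.0:
        return False
    # Consistency rule: vehicles were counted but a zero speed was reported.
    if rec.volume > 0 and rec.speed_mph == 0.0:
        return False
    return True


def quality_measures(records: List[DetectorRecord],
                     expected_count: int) -> Dict[str, float]:
    """Return completeness and validity as percentages.

    completeness: records received as a share of records expected.
    validity: received records that pass every check, as a share of
              records received.
    """
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    valid = sum(1 for r in records if passes_validity_checks(r))
    completeness = 100.0 * len(records) / expected_count
    validity = 100.0 * valid / len(records) if records else 0.0
    return {"completeness_pct": completeness, "validity_pct": validity}
```

Calling `quality_measures(records, expected_count)` with the records received from one detector and the number of intervals it should have reported gives two percentages that could be tracked over time or compared against an application-specific target of the kind described in action item 5.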

 

Action Plan Implementation and Work Items

FHWA would play a leading role in the overall implementation of the action plan.  Following are the three potential groups of activities or work items to implement the action plan.

Research Studies

The majority of the action items relate to the development of guidelines, which are best implemented through research studies.  Action items in this category include the following:

 

·        Guidelines and standards for calculating data quality measures (#1)

Workshops

Some of the action items could be implemented through regional workshops.  Action items in this category are those that require sharing of experiences and success stories.  The following are action items in this category:

 

Case Studies and Clearinghouse

Action items in this category require establishing or identifying an independent entity and conducting case studies.  The following are the action items in this category:

 


1.0    INTRODUCTION

1.1       Background

Recent research and analysis have identified several issues regarding the quality of traffic data available from Intelligent Transportation Systems (ITS) for transportation operations, planning, or other functions.  For example, the Advanced Traveler Information Systems (ATIS) Data Gaps Workshop in 2000 identified information accuracy, reliability, and timeliness as critical to ATIS.  The key findings of the workshop, which are included in a document titled “Closing the Data Gap:  Guidelines for Quality Advanced Traveler Information System (ATIS) Data” (U.S.DOT, 2000), are the following:

 

·        Guidelines for quality data go beyond ATIS.

 

A recent report, “Sharing Data for Traveler Information:  Practices and Policies of Public Agencies” (Battelle, 2001), issued in January 2002, examines policies aimed at facilitating data sharing and ultimately improving the quality and quantity of information that reaches travelers.

 

The ITS Archived Data User Service (ADUS) promotes reuse of traffic data collected for real-time operations.  The ATIS and Advanced Traffic Management Systems (ATMS) are generating large amounts of traffic data that could be used in other applications, such as performance monitoring.  However, initial experience with ITS traffic data has identified serious data gaps and data quality deficiencies.  Data can be edited after the fact to remove errors but the problem still remains at the source.  The need for guidelines for sharing traffic data among various agencies and users has been recognized.

 

Section 515 of the Treasury and General Government Appropriations Act for Fiscal Year 2001 (Public Law 106-554; H.R. 5658) directs the Office of Management and Budget to issue government-wide guidelines that provide policy and procedural guidance to Federal agencies for ensuring and maximizing the quality, objectivity, utility, and integrity of information (including statistical information) disseminated by Federal agencies.  Since Federal agencies use and disseminate traffic data from State and local agencies, the quality of the data will become even more critical.

 

It is also recognized that the quality of the traffic data and the information produced from the data are critical factors that affect the abilities of transportation agencies to ensure the security of transportation and the management of the nation’s transportation resources.  Data reliability requires that the INFOstructure consistently produce output that the public sector and the private sector can accept without skepticism or distrust.  Effective data quality methods and tools are critical for ensuring the success of INFOstructure applications.

 

The focus of data quality is on establishing a consistent methodology for ensuring that data are managed so that a measure of reliability is sustained.  Several factors affect data quality, including addressing “data gaps” to rectify coverage deficiencies as well as data compatibility across different software/hardware platforms; ensuring that data elements are efficiently matched with coordinated location and time elements; and resolving conflicts among data formats so that data are manipulated to satisfy information and presentation needs.

1.2       Project Objectives and Scope

The primary objective of this project is to define an action plan with work items that can be executed through the U.S. Department of Transportation (DOT), stakeholder organizations (e.g., American Association of State Highway and Transportation Officials [AASHTO], ITS America), State agencies, and private industry.  It is anticipated that this effort will establish a multi-year program that will reinforce and sustain the value of INFOstructure applications.  Specifically, this project will:

 

(1)         Develop white papers that explore the issues and current practices for ensuring quality, focusing on transportation but also considering how data quality is addressed in other industries

 

(2)         Develop a draft action plan and timeline for U.S. DOT and others to pursue that will develop metrics, tools, and recommended practices to ensure that data quality is effectively attained

 

(3)         Assemble a workshop that includes the co-sponsorship of relevant stakeholder organizations to address the issues and to validate and revise the action plan and timeline

 

(4)         Prepare proceedings and a compendium of the workshop along with an analysis of the validated action plan.

1.3       Organization of Report

The remainder of this report is divided into several chapters: 

 

Chapter 2 presents an overview of the research approach.  It also describes the major issues associated with traffic data quality.

 

Chapter 3 presents the proceedings of the two regional workshops.  This chapter includes summaries of the white papers, workshop discussions, and action items identified at the workshops.

 

Chapter 4 presents the action plan for addressing the traffic data quality issues.  The action plan describes the action items and identifies the responsible agencies for implementing the action items.

Chapter 5 presents the concluding remarks and recommendations.

 

The detailed white papers and list of workshop participants are included as appendices to the report.  Other relevant literature on traffic data quality is also included in the appendices.


2.0    RESEARCH APPROACH

The research approach adopted for the project comprises a number of steps as summarized in Figure 1.  These steps are discussed below.

 

 

Figure 1.  Traffic Data Quality Research Approach

2.1       Traffic Data Quality Issues

As a first step, a kick-off meeting was held at the start of the project with the primary objectives to (i) review the traffic data quality issues, (ii) discuss the themes for the white papers, and (iii) review the strategy for conducting the research.  Several issues associated with traffic data were identified that are common to various applications.  These issues must be addressed to ensure better quality traffic data for ATIS, ATMS, and ITS data archiving and re-use.  These issues can be grouped in different categories, as shown below:

 

Definition and Measurement Issues

·        Defining data quality attributes, including accuracy, consistency, reliability

·        Identifying differences in quality perceived by public and private sector data collectors and users

·        Quality of data as a function of its intended use

·        Measuring and ensuring quality data

·        Quantitative and qualitative metrics/levels

·        Identifying minimum acceptable levels of data quality for different applications

·        Quality control (fixing the problem at the source)

·        Lack of understanding of the full scope of the issue

·        Lack of a consistent approach for ensuring consistent quality

 

 

Equipment Installation and Maintenance Issues

·        Subcontractors install loops carelessly

·        Power and communications disruptions

·        Mix of technology introduces inherent data discrepancies

·        Innovative approaches to data collection

·        Loop detectors versus non-intrusive data collection devices

·        Those who maintain detectors may be different from those who install them

·        Effects of contracting approach on data quality

·        Relationship between data collection device and quality

·        Loops get torn out by third parties

 

Coverage Issues

·        Share traffic data or collect it yourself

·        Better quality with less coverage or lower quality with more coverage

·        Better definition of depth of coverage

·        Coverage of detectors seems to focus on traffic monitoring, but what about forecasting

 

Resource Issues

·        Budget limitations for traffic data collection

·        Lack of field staff for proper maintenance of monitoring devices

·        Lack of expertise in data management issues

·        The implications of funding levels on quality of data collected


Institutional Issues

·        Institutional issues relating to data collection and sharing

·        Regional or state versus national level interests and perspectives of data quality

 

These issues were used to scope three white paper themes.  Each white paper addresses a set of issues and includes a summary of previous literature, innovative practices, and barriers in transportation operations that prevent data quality metrics, tools, and methodologies from being established.  To obtain more current information regarding practices, tools, and methodologies, a few states and other users of traffic data were interviewed.

 

It was also decided at the kick-off meeting that two or more regional workshops be conducted rather than the originally planned single national workshop.  The regional workshops were expected to provide the opportunity to share experiences and gather inputs from a wider range of traffic data users.

2.2       Data Collection – Interviews

In developing the white papers, officials from state DOTs and ITS groups were contacted and interviewed.  Representatives from seven states were interviewed:  Arizona, Minnesota, Ohio, Kentucky, Pennsylvania, Utah, and Virginia.  A structured interview guide was developed and used in conducting the interviews.  The contact list and interview guide are included as Appendix B of this report.  Information gathered from the interviews was incorporated into the white papers.

2.3       Development of White Papers

As noted above, the white papers were developed from literature review and information gathered through the interviews.  The draft white papers were revised based on review comments from the FHWA.  Full versions of the revised white papers are provided in Appendix A to this report.  Chapter 3 of this report presents summaries of each white paper and discussions on the findings of the regional workshops.  The following are the three white papers that were developed by the project team. 

 

White Paper #1:  Defining and Measuring Traffic Data Quality (EDL # 13767)

 

Scope:  This white paper defines measures and methods for quantifying traffic data quality.  Issues considered include:

 

 


White Paper #2:  State of the Practice for Traffic Data Quality (EDL # 13768)

 

Scope:  This white paper documents the issues, measures, and approaches for assessing, using, and accommodating traffic data quality in various applications.  Issues considered include:

 

White Paper #3:  Advances in Traffic Data Collection and Management (EDL #13766)

 

Scope:  This white paper identifies innovative approaches for improving data quality.  This includes new contracting methods, business models, standards, training for data collection, and data sharing between agencies and states.  Consideration was also given to public-private partnerships, advanced traffic detection techniques (intrusive versus non-intrusive), and data archiving and use.  The issues addressed in this white paper include:

 

Full versions of the revised white papers are also available as stand-alone documents on the ITS Electronic Document Library at http://www.its.dot.gov/itsweb/welcome.htm

2.4       Regional Workshops

Two regional workshops were conducted with the primary objective of obtaining inputs from participants in developing an action plan to address traffic data quality issues.  The goal was to define an action plan with work items that can be executed by the U.S. Department of Transportation (DOT), stakeholder organizations (e.g., American Association of State Highway and Transportation Officials [AASHTO], ITS America), state agencies, and private industry.

 

The regional workshops were sponsored by FHWA Office of Policy, the ITS Joint Program Office (JPO), Ohio Department of Transportation (ODOT), and Utah Department of Transportation (UDOT).  The workshops were held on March 11, 2003 in Columbus, Ohio and on March 13, 2003 in Salt Lake City, Utah.  The revised white papers were distributed to the attendees about two weeks in advance of the workshops, giving them the opportunity to read and be familiar with the concepts and material to be discussed.  The white papers served as inputs to stimulate discussions at the regional workshops.

 

The workshops were intended for state DOT professionals responsible for collecting and using traffic detector data for any application including representatives from traffic management centers (TMCs), traffic operations, traffic monitoring, and planning divisions.  The workshop attendees included data providers and users as well as those who influence data collection activities.  This group includes officials, administrators, or managers involved in budgeting and funding as well as contractors who provide and install data collection devices.  In attendance were private sector travel information providers and representatives from 10 state DOTs (Ohio, Delaware, Indiana, Kentucky, Pennsylvania, Utah, Idaho, Texas, Washington, and California).  Also in attendance were representatives from Advanced Regional Traffic Interactive Management and Information System (ARTIMIS) in Cincinnati, Ohio; Maricopa Association of Governments (MAG) in Arizona; Northeast Ohio Areawide Coordinating Agency (NOACA); Ohio-Kentucky-Indiana (OKI) Regional Council of Governments; and Akron Metropolitan Area Transportation Study (AMATS).  The list of workshop attendees is provided in Appendix C of this report.

 

The draft proceedings of the two regional workshops were prepared and circulated among the workshop attendees for review and comment.  The workshop proceedings included summaries of the white papers, the discussions, and the action items.  The combined proceedings from the two workshops are presented in Chapter 3 of this report.

2.5       Action Plan Development

Several action items were identified and prioritized at the two regional workshops.  The action plan described in Chapter 4 of this report builds upon the findings in the white papers and the inputs obtained from the regional workshops, and reflects a broadly based consensus of the workshop participants.

2.6       Additional Traffic Data Quality Literature

Additional relevant literature on traffic data quality issues is compiled and presented in Appendix D of this report.  Specifically, the literature pertains to data sharing, institutional issues, vehicle classification, and loop detector failures.  These documents are intended to provide more detail on some of the major issues discussed at the regional workshops and in the white papers.

3.0    WORKSHOP PROCEEDINGS

3.1       Introduction

This chapter presents the combined proceedings of the two regional traffic data quality workshops.  Dr. Edward Fekpe, the principal investigator of the project, opened each workshop by welcoming all participants and providing a concise overview of the traffic data quality project.  He also provided a description of the approach used in developing an action plan to address the various issues relating to traffic data quality.

 

At the regional workshop in Columbus, Ohio (March 11, 2003), Dr. Fekpe reviewed the agenda for the workshop and then introduced the Contracting Officer’s Technical Representative (COTR) for the project, Mr. Ralph Gillmann, to discuss the objectives of the workshop.  Mr. Gillmann outlined the objectives of the project and the expectations for the one-day workshop.  He gave a background of recent efforts including workshops and studies that addressed issues of ITS-generated data.  The most recent activities that were highlighted include:

 

 

Mr. Gillmann also distinguished between real-time and archived data with respect to their uses and the quality requirements for each type.  Finally, Mr. Gillmann outlined the objectives of the workshop, which included agreeing upon the institutional and technical traffic data quality issues.  The primary goal of the workshop was to define an action plan that includes successful practices, new solutions, and priorities.  Mr. Gillmann also emphasized that data from traffic detectors were the main focus, although other traffic data would not be excluded.

 

At the regional workshop in Salt Lake City, Utah (March 13, 2003), Mr. James Pol presented objectives of the meeting and the expectations from the one-day workshop.  Mr. Pol gave a background of recent efforts including workshops and studies to address issues of ITS-generated data.  He outlined the objectives of the workshop, which included agreeing on technical and institutional traffic data quality issues.  He also mentioned the added importance of traffic data quality with new INFOstructure and integration strategies being proposed for ITS.  As at the Ohio workshop, the primary goal of the Utah workshop was to define an action plan that includes successful practices, new solutions, and priorities.

 

The three white papers were presented at each workshop, followed by a detailed discussion of the issues raised.  The remainder of each workshop was devoted to discussions to obtain inputs and ideas for the development of the action plan.  Various traffic data quality action items were identified and discussed.  The following sub-sections present summaries of the white papers, detailed discussions, and action items.

 

 

3.2       Session 1 – Defining and Measuring Traffic Data Quality

The white paper titled “Defining and Measuring Traffic Data Quality” was written by Mr. Shawn Turner (TTI) for this project.  The complete version of the white paper is provided in Appendix A.  In developing this white paper, current and advanced practices for addressing data quality were reviewed for three types of user communities:  1) real-time traffic data collection and dissemination; 2) historical traffic data collection and monitoring; and 3) other industries such as data warehousing, management information systems, and geospatial data sharing.  The recommendations in this paper follow from this review.

3.2.1   Defining Data Quality

The literature contains two similar definitions for data quality.  Strong, Lee, and Wang (1997) define information quality as “fit for use by an information consumer” and indicate that this is a widely adopted criterion for data quality.  English (1999A) further clarifies this widely adopted definition by suggesting that information quality is “fitness for all purposes in the enterprise processes that require it.” English emphasizes that it is the “phenomenon of fitness for ‘my’ purpose that is the curse of every enterprise-wide data warehouse project and every data conversion project.”  English (1999B) defines information quality as “consistently meeting knowledge worker and end-customer expectations.” It is clear from these definitions that data quality is a relative concept that could have different meanings to different consumers.  For example, data considered to have acceptable quality by one consumer may be of unacceptable quality to another consumer with more stringent use requirements.  Thus it is important to consider and understand all intended uses of data before attempting to measure or prescribe data quality levels.

 

The recommended definition for traffic data quality is as follows:

 

“Data quality is the fitness of data for all purposes that require it.  Measuring data quality requires an understanding of all intended purposes for that data.”

3.2.2   Measuring Data Quality

Based upon the review, the following data quality measures are recommended:

 

 

 

 

 

 

 

 

There are several other data quality measures that could be appropriate for specific traffic data applications.  The six measures presented above, however, are fundamental measures that should be universally considered for measuring data quality in traffic data applications.

 

At this time, it is recommended that goals or target values for these traffic data quality measures be established at the jurisdictional or program level, based on a clearer understanding of all intended uses of traffic data.  Data consumers’ needs and expectations, as well as available resources, vary significantly by implementation program, urban area, and state, and this variation precludes the recommendation of a universal goal or standard for these traffic data quality measures.

 

It is also recommended that if data quality is measured, a data quality report be included in metadata that is made available with the actual dataset.  The practice of requiring a data quality report using standardized reporting is common in the GIS and other data communities.  In fact, several metadata standards already exist (FGDC-STD-001-1998 and ISO DIS 19115) for standardized reporting of data quality in datasets.  Until a formal traffic data archive metadata standard is approved, the traffic data community should create metadata based upon the core elements (i.e., mandatory metadata items) required in these two other geospatial metadata standards.
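
As an illustration of the recommendation above, the following minimal sketch attaches a data quality report to the metadata that travels with a dataset.  The class name, the field names, and the `data_quality` key are hypothetical and are not taken from FGDC-STD-001-1998 or ISO DIS 19115; only measures named in this chapter are shown.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DataQualityReport:
    """Hypothetical quality summary for one archived dataset."""
    dataset_id: str
    period_start: str            # ISO 8601 date, e.g. "2003-01-01"
    period_end: str
    completeness_percent: float
    validity_percent: float
    coverage_percent: float
    notes: str = ""


def attach_quality_report(metadata: dict, report: DataQualityReport) -> dict:
    """Return a copy of the metadata with the quality report embedded."""
    merged = dict(metadata)
    merged["data_quality"] = asdict(report)
    return merged


def to_json(metadata: dict) -> str:
    """Serialize the metadata so it can be distributed with the data."""
    return json.dumps(metadata, indent=2, sort_keys=True)
```

Keeping the report inside the same metadata record as the dataset description means that a user who receives the data also receives the quality information, which is the intent of the recommendation.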

3.2.3   Discussion Points

The following points were suggested as discussion items at the end of the presentation:

 

  1. Agreement with the data quality measures?
  2. What are the technical or institutional barriers to measuring traffic data quality and providing data quality information with the data itself?
  3. Is there a need to provide guidelines on calculating data quality measures given typical traffic data?
  4. Is there a need for an official standard on defining or calculating these measures?
  5. What are the minimum acceptable levels of data quality for different applications?
  6. Is there a need for national benchmarks or standards for traffic data quality levels?
  7. Given that different applications and users of traffic data require different quality levels, how do public agencies reconcile these differences in quality requirements?  Particularly in cases where “non-paying” users want higher data quality than the group/agency whose budget maintains traffic data sensors?

3.2.3.1    Discussions – Ohio Workshop

Shawn Turner (Texas Transportation Institute) initiated the discussions by asking the workshop participants about their reactions to the data quality measures.  While there was overall agreement that the data quality measures are adequate, there was discussion about some of the measures.

 

The completeness measure was acknowledged as a good measure.  There was some concern that reporting this measure could be embarrassing for state agencies.  None of the state agencies currently report it.  Rob Bostrom from the Division of Planning, Kentucky Transportation Cabinet, stated that their Automatic Traffic Recorder (ATR) data do not contain data for 365 days.  He also stated that data completeness is important for applications like k-factor calculations (30th highest hour) that are used in highway design and capacity analysis.  He also stated that with the existing errors in data collection, the use of the 50th highest hour might not be very different from the 30th hour and that this might be a future research need.  Also some applications such as calculating Equivalent Single Axle Loads (ESALs) from WIM data require that all days are represented.
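
The dependence of the k-factor on complete data can be seen in a minimal sketch of the calculation, given here as the ratio of the n-th highest hourly volume to AADT.  The function name and input layout are assumptions for illustration, and AADT is approximated as the mean daily total of the days supplied; this is not the procedure of any particular agency.

```python
from typing import Sequence


def k_factor(hourly_volumes: Sequence[float], rank: int = 30) -> float:
    """Ratio of the rank-th highest hourly volume to AADT.

    Assumes one count per hour for whole days (length is a multiple
    of 24). AADT is approximated as the mean daily total.
    """
    if len(hourly_volumes) < 24 or len(hourly_volumes) % 24 != 0:
        raise ValueError("expected whole days of hourly counts")
    if not 1 <= rank <= len(hourly_volumes):
        raise ValueError("rank out of range")
    days = len(hourly_volumes) // 24
    aadt = sum(hourly_volumes) / days
    # Sort from highest to lowest and pick the rank-th value (1-based).
    nth_highest = sorted(hourly_volumes, reverse=True)[rank - 1]
    return nth_highest / aadt
```

If days are missing from the record, both the ranked hours and the AADT estimate shift.  Calling the function with `rank=30` and `rank=50` on the same record is one way to examine the suggestion that the two values may not differ greatly.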

 

It was also suggested that the data quality measures in the white paper need to be customized by application and region.  Greg Oliver from Delaware DOT mentioned that summer periods are critical for traffic data collection in the state because of the increased flow of traffic during these months.  It is important that the data quality measure reflect this temporal component.

 

David Gardner, ODOT, questioned the usefulness of the data quality measures especially to the final user.  Most users of ODOT data expect a certain quality level to be met and do not necessarily need all the details regarding quality.  A suggestion was to have tiers of users and applications with different data quality documentation needs.

 

Andrew Pierson, URS, mentioned that it is often difficult to go back and verify data collection efforts especially since a consultant is unable to obtain the ground truth.  Data from the states typically lack metadata or the discussion of the context in which the data are produced.

 

Steve Jessberger from ODOT raised a question about the validity measure of data.  Specifically, what should be done with data collected during snow or construction?  Should agencies use the “real” but atypical data or try to collect only typical data?  Ralph Gillmann, FHWA, replied that FHWA would like to know why the data are abnormal and that while atypical conditions are not good for some applications like average annual daily traffic (AADT), metadata (data about data) for such cases would be helpful.  Metadata are not required by FHWA at this time.  None of the workshop participants indicated that the state agencies were collecting and reporting metadata.

On the question of metadata and its value, it was noted that agencies are unable to communicate effectively about data quality because there is usually no historical information or metadata that can be used for comparison; that is, there is no quality information associated with existing data. Some participants noted that their existing traffic analysis software or databases did not support the storage of metadata associated with traffic data.

 

On the issue of minimum acceptable data quality standards, the workshop participants suggested that the minimum acceptable standards vary by state, type of application, and data collection device.  Some minimum requirements are already in use by states for automated traffic recorder (ATR) data.  Ohio, Kentucky, and Indiana, for example, require two weeks of data per month from the ATRs.  Indiana also requires at least two days from each day of the week, per month.  There was no consensus as to whether it is necessary or feasible to set minimum acceptable data quality standards.
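
The state requirements cited above can be expressed as a simple check on the days of valid data received from one ATR in one month.  This is a minimal sketch: the function name and parameters are hypothetical, "two weeks" is interpreted as 14 days, and the caller is assumed to pass only the days of a single month.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def meets_monthly_minimums(valid_days: Iterable[date],
                           min_days: int = 14,
                           min_per_weekday: int = 2) -> bool:
    """Check one month of ATR data against two minimum rules.

    Rule 1: at least min_days distinct days of valid data.
    Rule 2: at least min_per_weekday days for each day of the week.
    """
    days = set(valid_days)
    per_weekday = Counter(d.weekday() for d in days)
    enough_days = len(days) >= min_days
    enough_each = all(per_weekday.get(wd, 0) >= min_per_weekday
                      for wd in range(7))
    return enough_days and enough_each
```

Setting `min_per_weekday=0` reduces the check to the two-weeks-per-month rule alone.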

 

It was noted that the purposes of the traditional traffic monitoring groups and the ITS groups are different and that this affects their data collection and management philosophy.  Scott Evans from ARTIMIS stated that the cameras and the changeable message signs were their priority for their Traffic Management Center (TMC), and they were interested only in the change in traffic volumes.

 

Several participants expressed concerns about ITS data, including the following:

 

 

The planning division in Pennsylvania DOT has been trying to use TMC data and has encountered some challenges in educating the TMC of their data requirements.  It was also suggested that additional research be conducted to understand the value of ITS data.

 

Several traffic monitoring personnel stated that there was significant overhead involved in using ITS data including the pre-processing of data.  Ohio and Kentucky have a good relationship with ARTIMIS (the TMC in Cincinnati), and data sharing does exist between the TMC and the traffic monitoring groups.  The TMC is able to provide data to the traffic monitoring group at ODOT in a compatible Traffic Monitoring Guide (TMG) format.  While ITS groups require dense coverage, the traffic monitoring groups require coverage for a much larger area.  Dave Gardner, ODOT, cautioned that the availability of ITS data can sometimes overwhelm the resources of the traffic monitoring group in terms of the post-processing requirements.

 

All the participants agreed that guidelines are needed to explain the calculation of the suggested data quality measures.  The following observations were made regarding the need and usefulness of guidelines:


 

It was suggested that these guidelines should be similar to what is being done by ASTM (formerly American Society for Testing and Materials) for archived data.  It was also noted that standards about data quality might be useful and could be included in the AASHTO guidelines for data monitoring programs.

 

National benchmarks for data quality were also strongly encouraged.  It was noted that the concept of INFOstructure should be used in integrating all transportation-related data.  There should be greater emphasis on sharing and integrating data systems at state, local, and regional levels.  At minimum, these benchmarks should be set for loop-based detection systems.  These benchmarks also should be set based on the type of application.

3.2.3.2    Discussions – Utah Workshop

There was general agreement that the six fundamental measures of traffic data quality adequately describe all aspects.  Dr. Mark Hallenbeck of the University of Washington added that the measures presented are the right set of quality measures.

 

The workshop participants noted that the completeness measure was difficult to define as it may differ based on the application.  The assumptions and definitions for this measure also need to be explicit.  For example, 100 percent complete data for freeways is only a partial representation if the arterial system is also considered.  It was felt that the data quality measures need to be specified differently for different applications and the uses of data should decide the nature and necessity of quality measures.  It was suggested that data quality measures need to be fluid and flexible.  One of the participants requested additional clarification on the differences between completeness and coverage.  Shawn Turner explained that “completeness” refers to the temporal aspect and “coverage” refers to the spatial aspect of traffic monitoring.  As far as data quality is concerned, it was noted that there is a lack of guidance for deploying sensors, and they are deployed ad hoc based on operational needs.

 

Note: After considering post-workshop comments, the research team agrees that completeness can represent more than just the temporal aspects of missing data.  “Completeness” can refer to both the temporal and spatial aspects of data quality, in the sense that completeness measures how much data is available compared to how much data should be available.  The “coverage” measure is most often used to refer to “how much data should be available” in terms of the extent of the transportation network.  For example, the “coverage” of a dataset could be 98 percent of the freeway system within an urban area with continuous data collection (24 hours per day, 365 days per year).  However, sensor downtime at a few locations and system downtime for a major system software failure might result in a completeness value of 75 percent, in which case the archive contains 75 percent of the data that should be available from the given coverage of 98 percent of the freeway system.
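
The distinction in the note can be sketched as two separate ratios.  The function names, the use of route miles for coverage, and the five-minute reporting interval in the usage note are assumptions made for illustration only.

```python
def coverage_percent(monitored_miles: float, network_miles: float) -> float:
    """Share of the network that is instrumented (spatial extent)."""
    if network_miles <= 0:
        raise ValueError("network_miles must be positive")
    return 100.0 * monitored_miles / network_miles


def expected_records(n_sensors: int, days: int, intervals_per_day: int) -> int:
    """Records that should exist for the instrumented part of the network."""
    return n_sensors * days * intervals_per_day


def completeness_percent(records_received: int, records_expected: int) -> float:
    """Share of the expected records actually present in the archive."""
    if records_expected <= 0:
        raise ValueError("records_expected must be positive")
    return 100.0 * records_received / records_expected
```

For example, 10 hypothetical sensors reporting every five minutes (288 intervals per day) for 365 days should yield 1,051,200 records.  An archive holding 788,400 of them is 75 percent complete, regardless of whether the sensors cover 98 percent or some other share of the freeway system.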

Qing Xia of Maricopa Association of Governments in Arizona raised a question about the weighting or ranking of the data quality measures.  Shawn Turner noted that there are no rankings or weights associated with these measures, although that is an idea for future research.

 

Peter Martin from the University of Utah suggested adding two sub-measures for the accessibility measure of data quality.  The first sub-measure suggested was “portability” to indicate the number of different formats in which the data were available to the user.  The second sub-measure would provide information on the level of manipulation and the type of manipulation used on the data.  Researchers from the University would like information on whether the data are raw or processed and how to access and reformat the data.  Mark Hallenbeck indicated that the TMC in Seattle has status flags for its detector data that indicate problems and applied solutions at different levels of data aggregation.

 

Martin Knopp, Utah DOT, agreed with the data quality measures and noted that the accessibility measure could place unusual demands on the states to provide data in formats to satisfy all users.  It was suggested that this measure be stated as a philosophy instead.  If all users can be defined then their accessibility also can be defined.  The problem is that some uses for data may not be immediately known—future potential uses of data may have different requirements.

 

Meeting the quality goals of non-paying users is difficult for two reasons:  (i) the provider may have different perspectives on data quality and (ii) the requirements of the non-paying user may not be clearly defined in the budget.  It was felt that if all parties (potential users or beneficiaries) pool resources to secure sufficient funds, it may be possible to meet the data quality requirements of all users.

 

In response to a question about the institutional and technical barriers involved in calculating and reporting these measures, it was noted that cost and time are the two most important issues.  There could be a significant cost to modify software to report the quality measures.  Some participants would like information on the return on investment obtained by reporting these quality measures.  Raelene Viste (Idaho Transportation Department) commented that these measures could be very useful within the transportation group itself to monitor their performance even if the external users do not need these measures.  Texas DOT feels that there is a good return on investment if these measures are followed.

 

Institutional issues arise because different departments have different data needs, operating rules, and budgets.  There is no existing mechanism for effective communication and exchange of views relating to traffic data and its quality. 

 

It was suggested that guidelines and baseline instructions could be helpful in allowing the agencies to calculate and report data quality measures.  It was also suggested that these guidelines be provisional, which will give the impetus for the agencies to start collecting quality data, allow them to start reporting data in a certain way, and provide them time to overcome the institutional barriers.  Creating a traffic monitoring master plan was suggested to describe how different components work and how they coordinate within agencies.  Caltrans indicated that they have already started work in this area.  These guidelines should take into consideration that most agencies have legacy systems, which often can be problematic.  Another idea to formalize the data quality process was to include data quality requirements in the regional ITS architectures along with data flows.  The visibility and the relevance of data collection programs can benefit greatly from data quality reporting.

 

For a particular goal or program, there is the need for a minimum set of measures to assess the quality of the data.  However, while there was no consensus on the minimum set of standards among the participants for all the applications of traffic data, it was suggested that state DOTs need to start with provisional standards that include performance statistics that have visibility within the department. 

 

There was no general agreement on the need to establish national data quality benchmarks.  Some participants felt that there is no need for a national benchmark; others thought that perhaps “national benchmark” is too strong, suggesting the use of “national goal” instead.  National goals could be set for different uses of data.  It was agreed that normalizing or leveling the playing field may be difficult given the diverse application types and needs.  However, it was also noted that such goals could lead to uniformity in data quality reporting.  Caltrans indicated that it operates according to a performance level but sees some value in having a national goal.  Such national goals also would be helpful for vendors.  Another view indicated that each state could define its own use and its own goal and standard instead of adhering to an established national goal, which may be more difficult to set and achieve.  In this way goals would be defined and met at the state level.  States that do a good job in maintaining data quality should be recognized and rewarded.

3.3       Session 2 – State of the Practice in Traffic Data Quality

The white paper titled “State of the Practice in Traffic Data Quality” was written by Dr. Rich Margiotta (Cambridge Systematics) for this project.  The complete version of the white paper is provided in Appendix A.

3.3.1   Types and Applications for Traffic Data

Several types of traffic data are collected by both “traditional” and ITS means.  Where there is overlap between the two realms, the basic nature and definitions of the data collected are the same.  However, there are subtle differences in data collection methodologies that may lead to problems with data sharing and quality.  Among these are the polling rate and vehicle classification “bins”.

3.3.2   Traffic Data Quality:  Characteristics

What Causes “Bad” Traffic Data:  Several sources contribute to inaccuracies in traffic data.  These relate to the nuances of specific equipment and how data are collected and transmitted from the field:

 

 

Detection of “Bad” Data:  The white paper, “Defining and Measuring Traffic Data Quality”, presents a full discussion of how questionable/inaccurate data are identified after they are collected from the field.  A variety of methods are used, including internal range checks, cross-checks, time series patterns, comparison to theory, and historical patterns.
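
Two of the methods named above, internal range checks and cross-checks, can be sketched as follows.  The record layout, the check names, and the threshold values are hypothetical and would need to be set for the equipment and reporting interval in use; the sketch does not reproduce any agency's validation rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectorRecord:
    volume: int        # vehicles counted in the interval
    occupancy: float   # percent of time the detector was occupied
    speed: float       # average speed in mph


def failed_checks(rec: DetectorRecord,
                  max_volume: int = 250,
                  max_speed: float = 100.0) -> List[str]:
    """Return the names of the checks that a record fails."""
    failures = []
    # Internal range checks: each value must be physically plausible.
    if not 0 <= rec.volume <= max_volume:
        failures.append("volume_range")
    if not 0.0 <= rec.occupancy <= 100.0:
        failures.append("occupancy_range")
    if not 0.0 <= rec.speed <= max_speed:
        failures.append("speed_range")
    # Cross-check: a speed cannot be measured if no vehicle was counted.
    if rec.volume == 0 and rec.speed > 0:
        failures.append("speed_without_volume")
    return failures
```

Returning the names of the failed checks, rather than a single pass or fail value, allows a record to be flagged with the reason it is suspect.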

 

Correction of “Bad” Data:  Once suspect data are identified, the question then is what to do about them.  Most applications flag the records failing quality control or set the measurement values to missing or other special codes.  Editing the measurement values is far less common, although some experimentation with “imputing” values has taken place.  Imputation appears to be most applicable where small intermittent gaps appear in the data rather than large portions of time with missing or suspect data.  A variety of techniques have been explored including time series smoothing and historical growth rates by location and day and week.  However, there is little consensus in the profession on which techniques should be used, or whether imputation should be done at all.
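
The idea of imputing only small intermittent gaps can be sketched as follows.  Linear interpolation between the neighboring valid values is used here purely as a simple stand-in; it is not one of the techniques named above, and the function name and the `max_gap` parameter are assumptions for illustration.

```python
from typing import List, Optional


def impute_short_gaps(values: List[Optional[float]],
                      max_gap: int = 2) -> List[Optional[float]]:
    """Fill runs of missing values no longer than max_gap.

    Missing values are None. Gaps longer than max_gap, and gaps at
    the start or end of the series, are left missing.
    """
    result = list(values)
    n = len(result)
    i = 0
    while i < n:
        if result[i] is not None:
            i += 1
            continue
        start = i
        while i < n and result[i] is None:
            i += 1
        gap = i - start  # i is now the first index after the gap
        if start > 0 and i < n and gap <= max_gap:
            left, right = result[start - 1], result[i]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                result[start + k] = left + step * (k + 1)
    return result
```

In practice the imputed values would also be flagged, so that users can distinguish them from measured values.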

3.3.3   Quality Issues for Using ITS-Generated Data for Traditional Uses

The applications that traffic data support, in both operational and traditional uses of ITS-generated traffic data, as well as the nuances of data collection in each case, can affect data quality.  Several differences exist based on these points:

 

3.3.4   Recommendations:  Possible Solutions

Sampling of ITS Locations and Data Streams:  The selection of certain strategic locations where both ITS and traffic monitoring groups can concentrate their efforts to correctly install, inspect and maintain these locations.

 

Shared Resources:  The sharing of expertise and resources among the various agencies within the state DOTs to ensure that they benefit from their strengths and help overcome weaknesses.

 

Maintenance, Calibration, and Performance Standards:  Undertaking formal studies of data quality by setting maintenance and calibration standards and goals for traffic monitoring devices.

 

Contractual Arrangements:  New and emerging business models such as outsourcing and use of private contractors for collecting and archiving data.

 

More Sophisticated Operations Applications as a Data Quality Leader:  The current generation of operational strategies does not require extremely accurate data – operators typically need to know where the big problems are and their responses are geared to this.  New and emerging operations applications may drive the need for high quality data.

 

New Technologies:  The use of new technologies including non-intrusive devices and probe vehicles combined with innovative uses of existing inductive loop technologies.

3.3.5   Discussion Points

The possible solutions and recommendations (section 3.3.4) served as the main points for the session’s discussions.

3.3.5.1    Discussions – Ohio Workshop

Rich Margiotta initiated the discussion by asking the participants what they thought of the potential solutions listed in the white paper.  The participants agreed that sharing resources between the ITS and traffic monitoring groups is a good idea.  The Division of Planning in Kentucky described an example of shared resources.  The Division of Planning invested in equipment they like and trust and ARTIMIS identified modifications to those devices so that they also can be used for ITS applications by the TMC.  James Pol, ITS/JPO, mentioned that there will be a greater need for sharing data in the future due to scarce resources.

 

On the question of whether there have been any observed cost savings due to data sharing, David Gardner, ODOT, responded that the data sharing with ARTIMIS was very recent and no cost information was available.  Indiana DOT commented that there should be some expected savings from a safety standpoint as they no longer have to place road tubes on the roadway.  It was suggested that TMCs start using ITS data only from select locations.  It was noted that the TMC in Cleveland is beginning to consider the use of ATR data for their operations.

 

One of the major themes of the discussion was the problems encountered during installation of traffic monitoring devices.  Installation of equipment is the most critical aspect to ensure that high quality data are obtained from the device.  It was noted that the use of pre-qualification of contractors for installing loops and piezo-based detectors was not the usual practice.  Ohio does not have any pre-qualification standards for installation and contractors install devices based on manufacturer’s instructions.  Indiana DOT calibrates their devices annually but does not have any standards for installation.  Pennsylvania DOT uses manual counts as the standard to assess the accuracy of ATR counts.  It is recognized, however, that manual counts also can be in error depending on the volume of traffic and thus may not be the most effective measure of ATR count accuracy.

 

David Gardner, ODOT, mentioned that Ohio DOT is working on a contract to maintain ATRs.  The contract would be a task order in which the successful contractor would be given maintenance tasks as needed.  ODOT hopes that such a contract would save time in fixing maintenance problems by having a contractor in place.

 

The overall consensus was that there is some existing information about installation and maintenance of equipment but more guidelines and standards are needed.

 

Quicker notification of sensor problems was discussed.  Today, in some cases, a problem might not be known for a period of four to six weeks (during data processing).  While in some instances it is possible to poll the devices daily (Kentucky polls its 77 sites daily), states with more sites usually poll less frequently.

 

On the question of whether the quality assurance software used by the traffic monitoring groups can be shared with the ITS groups, various states expressed an interest in the data validation rules used to check traffic data.  It was noted that state agencies had developed in-house software to validate traffic data using specific validation checks.  A synthesis of the data validation checks was suggested as a very important and desired research need.

 

It was also noted that some equipment does not have sufficient level of accuracy and it was recognized that vendors need to test the equipment better and make it more robust.  State DOTs also do not have information on the lifecycle cost of the equipment.  The participants also noted that the value of data to the customers was not clear.  In other words, what benefit would an increase in data quality provide to the customers?

3.3.5.2    Discussions – Utah Workshop

The participants felt that designating strategic ITS detector locations, at which the traffic monitoring groups and the ITS groups share resources and devices, was a good idea.  Washington DOT already has started using a similar concept in which certain detectors are more important than others.  However, it was felt that these priority locations are politically driven and land-use factors can change the priority very quickly.  It is essential to include the planning groups in selecting the locations and to reevaluate priorities periodically.

 

The participants also agreed that sharing resources is a good idea.  However, doing it well requires understanding what is possible and what is practical.  It is necessary to define the types of data needed and collected by all the agencies sharing the data and equipment.  Vehicle classification was discussed as an example.  The 13 vehicle classes used by FHWA are required by very few analysis procedures but are required to be collected and reported by the traffic monitoring agencies.  However, ITS groups do not have the equipment to collect such detailed classification.  Some other groups within the DOT require information on body types and commodity hauled.  These discrepancies and specific needs should be understood and resolved to ensure synergies from the shared resources and equipment.

There were some concerns about sharing equipment, as different protocols and storage requirements used by different groups in the same agency make the use of the same devices difficult.

 

States have experienced problems in data collection equipment maintenance, primarily in inspections of installation after construction begins.  Coordinating with construction, planning, and operations groups to ensure proper installation and inspection is often a problem.  Joe Avis from Caltrans commented that devices that have had electrical inspections last longer than those which have not been inspected.  The biggest impediment in performing such inspections is the time and cost.  Sharing resources to achieve this goal is very beneficial to everyone.

 

Various participants noted their frustrations with equipment installation.  Texas DOT is developing procedures for design, installation, and maintenance, and will make these available on the Internet so that contractors can access them.  They are also planning to train all their regional offices on the procedures related to installation and maintenance of traffic data collection devices.

 

The participants expressed interest in quality control and assurance software used by traditional traffic monitoring groups.  The software used by states varies greatly and is typically developed using their respective in-house business rules.  Mark Hallenbeck proposed creating an open-source software model or at least having the documentation of such software available on the web so that a DOT investing in such software knows what other agencies have used.  Martin Knopp (Utah DOT) mentioned a voluntary group of state agencies that encourages informal exchange of information.  Currently, the scope of this group is very limited.  There also has been a pooled fund study to look into the elements of quality assurance software.  There was a consensus that this is an area of great interest to participants.

3.4       Session 3 – Advances in Traffic Data Collection and Management

The third white paper titled Advances in Traffic Data Collection and Management was written by Dr. Dan Middleton (TTI) for this project.  The complete version of the white paper is provided in Appendix A.

3.4.1   Introduction

Without accurate and reliable detectors, traffic management decisions based upon real-time or historical data are compromised.  Many agencies use post processing for quality assurance as opposed to quality control.  Quality assurance attempts to “fix the data” or identify defective data rather than ensuring the accuracy and reliability of the equipment.  Quality control emphasizes good data by ensuring selection of the most accurate detector and then optimizing detector system performance.  This white paper identifies approaches for improving data quality through innovative contracting methods, standards, training for data collection, data sharing between agencies and states, and advanced traffic detection techniques.


3.4.2   Innovative Contracting Methods

A few agencies have already invested resources in developing new contracting methods as a means of ensuring data quality at its source.  Performance criteria in contracts, while not common, are being considered by DOTs as a method to transfer some of the risk and maintenance requirements to contractors. 

 

The Virginia Department of Transportation (VDOT) at the Hampton Roads Traffic Management Center uses contractors for support of its day-to-day operations.  The TMC accomplishes the necessary maintenance on its detection system by hiring contractor personnel who are supervised by VDOT personnel.  VDOT treats contractor personnel as an extension of its own staff, apparently giving the TMC director even more latitude to add or remove contractor personnel compared to VDOT staff.  The second example in Virginia is the VDOT Mobility Management Section (traditional data collection), which leases its traffic counters and modems from Digital Traffic Systems (DTS).  A state inspector checks the equipment once a year, but if there are substantial errors in the data, the contractor has to re-collect the data.  VDOT has established performance-based lease criteria for payment of data collection services.  Contractor compensation is based on the amount of acceptable data submitted by the contractor.
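The performance-based lease idea can be illustrated with a minimal sketch.  It assumes, purely for illustration, that payment scales in direct proportion to the share of submitted records that pass review; the actual VDOT lease criteria are not described in this report, and the function name and parameters below are hypothetical.

```python
def lease_payment(records_submitted, records_accepted, full_payment):
    """Scale a lease payment by the share of submitted records accepted.

    Assumption for illustration only: payment is directly proportional
    to the fraction of acceptable data. Real contract terms may differ.
    """
    if records_submitted <= 0:
        return 0.0
    if not 0 <= records_accepted <= records_submitted:
        raise ValueError("records_accepted must be between 0 and records_submitted")
    return full_payment * records_accepted / records_submitted


# Hypothetical month: 100 records submitted, 90 accepted, full payment of 1000.
payment = lease_payment(100, 90, 1000.0)
```

Under this assumption the contractor bears the cost of rejected data directly, which is the risk transfer that performance criteria are meant to achieve.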

 

Another example of an innovative contracting method is with the Ohio Department of Transportation’s Office of Technical Services, Traffic Monitoring Section.  ODOT is in the process of executing a task-order-type contract for maintenance to have contractors on board for anticipated and unanticipated maintenance requirements of the traditional data collection equipment statewide.  The contract is expected to begin in the summer of 2003. 

3.4.3   Standards

Standards development is another aspect of traffic data quality.  The U.S. DOT ITS Standards Program is working toward the widespread use of standards to encourage the interoperability of ITS systems, including traffic data collection systems.  There is also a draft standard being developed by the ASTM, entitled “Standard Specification and Test Methods for Highway Traffic Monitoring Devices” (ASTM, 2002), which will be available soon.  Standardization has occurred in Germany, the Netherlands, and France, where national standards for data collection equipment have been developed (U.S. DOT, 1997).  The process has increased the quality and accuracy of the data collected, decreased the effort needed to transfer data between agencies or offices, and increased the reliability of field equipment.  However, the initial cost of the equipment is higher than that of non-standard equipment.

3.4.4   Training for Data Collection

Training of personnel on the intricacies of the equipment is an essential part of ensuring data quality.  With improvements in non-intrusive detector hardware and software occurring at a rapid pace, maintenance personnel must be computer literate and must maintain an awareness of the latest changes for a variety of detection systems.  Initial training on new systems is often available through the vendor, but turnover in state DOT maintenance staff and the introduction of new models require an ongoing training program.

3.4.5   Data Sharing Between Agencies and States

Budget cuts are causing agencies to seek alternate means of meeting data supply needs, with one solution being to share data between agencies.  The Hampton Roads TMC currently shares video with the city of Norfolk and plans to share video, voice, and data with six other cities in the immediate area, including Norfolk.  Because Norfolk also has a TMC, there is mutual benefit to sharing each other’s data.  The New England states of Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont have cooperated to help each other and share transportation data.  ARTIMIS supplies data to the following agencies:  planning agencies within the Ohio DOT, the Kentucky Transportation Cabinet, the local MPO (Ohio-Kentucky-Indiana Regional Council of Governments), the City of Cincinnati Traffic Engineering office, local FHWA contacts, and the FHWA Mobility Monitoring project.  The agencies receiving data from ARTIMIS perform their own analysis of data quality.

3.4.6   Advanced Traffic Detection Techniques

Quality control emphasizes data quality by ensuring selection of the most accurate detector and then optimizing detector system performance.  Two of the most recent research efforts focusing on the performance attributes of advanced detection techniques occurred at the Texas Transportation Institute (Middleton et al., 1999, 2000, 2002) and in Phase II of the Minnesota DOT Non-Intrusive Tests (MinnDOT & SRF Consulting, 2002).  Of the detectors recently tested by TTI and MinnDOT, the multi-lane detectors that are most competitive from a cost and accuracy standpoint are Autoscope Solo Pro, Iteris Vantage, RTMS by EIS, SAS-1 by SmarTek, Traficon NV, and 3M Microloops.

3.4.7   Discussion Points

The following points were suggested as discussion items at the end of the presentation:

 

·                     What are the equipment-related impediments to data sharing?

·                     What are the data accuracy concerns for ITS data?

·                     How many detectors can be “out” at any given time?

·                     Standards development takes time.  What do we do in the meantime?  Current standard output is “contact closure.”

·                     How should/will equipment vendors help (training, product consistency, information dissemination, diagnostics)?

3.4.7.1    Discussions – Ohio Workshop

Dan Middleton (Texas Transportation Institute) presented the paper on innovative approaches to traffic data collection management.  European agencies have extensive experience with loop detectors and are satisfied with their performance.  These agencies are careful with installations and have national standards for loop installations.  Dan Middleton remarked that the specifications for the loop detectors themselves are not very different from those currently being followed by Texas DOT (TxDOT), but that there are stricter installation and maintenance standards in Europe.

 

The participants described the perfect detector as one that is easily installed off the road; weather proof; self-diagnostic; and capable of collecting multi-lane volume, speed, and classification data.

 

There was also discussion of the appropriate spacing of detectors.  Participants felt that the current 0.5-mile spacing was driven primarily by ramp-metering applications and the one-mile spacing of urban interchanges.  For current applications at TMCs, 0.5-mile spacing is not required.  However, advanced traffic management applications might need such dense coverage.  Traditional traffic monitoring groups need data from only one location in each segment.  Thus, the spacing is determined by the potential application of the data.

 

In terms of contracting, it was noted that most manufacturers provide a one-year warranty on their equipment and it might be useful if they provided longer warranties (e.g., five years).  Performance-based contracts were viewed as an interesting approach but the participants needed more information on how to set up and manage these contracts.  There were concerns expressed about situations where the contractor and the state do not agree on the quality of the data and the increased costs of these contracts.  Currently, the primary mode of contracting is low-bid. Another idea was to develop an asset management approach for certain devices.  It was noted during the discussions about contracting and business models that universities are now becoming archivists of traffic data.  The field operational test (FOT) being planned in Virginia would provide more information on such a framework and its advantages and disadvantages.

 

The participants also indicated the need for a clearinghouse of traffic detectors.  Ralph Gillmann mentioned the Vehicle Detector Clearinghouse (VDC), a pooled-fund project operated by New Mexico State University.  The clearinghouse has information on traffic detector tests conducted, and offers limited technical assistance.  It was noted that the clearinghouse is not a testing facility.  The need for such a testing facility was also expressed.

 

It was noted that vehicle classification was a problem for most of the detectors.  The 13 vehicle classes required by FHWA restrict the type of traffic detection device that can be used.  Also, length-based detectors have different classification schemes based on the manufacturer.  Ralph Gillmann mentioned that FHWA has worked with Illinois DOT to allow it to report length-based classification data.
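The point that length-based schemes differ by manufacturer can be made concrete with a short sketch.  The two schemes below, including their bin boundaries and labels, are hypothetical and are not taken from any vendor or from FHWA; the sketch only shows how the same measured length can fall into different classes under different schemes.

```python
from bisect import bisect_right

# Hypothetical length boundaries in feet for two vendors' schemes.
# A vehicle whose length equals a boundary falls into the longer bin.
SCHEME_A = {"bounds": [20.0, 40.0], "labels": ["short", "medium", "long"]}
SCHEME_B = {
    "bounds": [25.0, 50.0, 75.0],
    "labels": ["short", "medium", "long", "extra long"],
}


def classify_by_length(length_ft, scheme):
    """Return the label of the length bin that contains length_ft."""
    index = bisect_right(scheme["bounds"], length_ft)
    return scheme["labels"][index]


# The same 22-foot vehicle is classified differently by the two schemes.
class_a = classify_by_length(22.0, SCHEME_A)
class_b = classify_by_length(22.0, SCHEME_B)
```

Data from devices using different schemes cannot be combined directly unless the bins are mapped to a common scheme, which is one reason reporting guidance on length-based classification matters.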

3.4.7.2    Discussions – Utah Workshop

The participants were receptive to newer detection technologies as long as they are cost effective and approach the accuracy of inductive loops.  Participants from traffic monitoring groups indicated that they had tried non-intrusive technologies including remote traffic microwave sensor (RTMS) and video-based detection with varying degrees of success.  In terms of the cost-benefit of using newer detection technologies, it was felt that life-cycle costs for traffic detectors would be very valuable in decision-making; however, cost information is often not available.  It was also noted that while the costs of traffic control and maintenance are reduced in the case of non-intrusive detectors, there are still some costs that need to be considered in the cost-benefit analysis.

 

It is not uncommon for vendors to release new or modified equipment before it has been fully tested and before proper training is provided to the vendor’s own personnel.  A testing institute was suggested as a solution.  The Vehicle Detector Clearinghouse was suggested as a potential candidate to perform such a service.  Currently the clearinghouse provides information about detectors and tests conducted by the states, but it does not conduct independent testing.

 

Installation of devices was discussed again in this session as being critical.  Dan Middleton remarked that the Netherlands scanning tour indicated that the success of the inductive loops greatly depended on their installation.  There needs to be coordination during installation and even afterwards between different divisions of the same agency.  For example, milling operations to smooth the pavement can completely destroy loops, and lane-striping resulting in lane shifts can render the loops ineffective because they are no longer centered in the lanes. 

 

Each detector has its issues and problems related to installation and calibration.  Location and set-up of these devices sometimes is more art than science.  While there are manufacturer’s instructions for set-up and installation, the installer must still use trial-and-error in some installations to achieve optimum performance.  Experience gained over time is helpful in correctly and efficiently setting up these devices.  Also, a compilation of installation and maintenance procedures and best practices would be very useful.

3.5       Action Plan Discussion

This section summarizes the action items identified and prioritized in brainstorming sessions to address the data quality issues discussed in the previous sessions.  The actions are organized by white paper topic.

3.5.1   Defining and Measuring Traffic Data Quality

3.5.1.1    Ohio Workshop

Following are the action items identified to address issues relating to defining and measuring traffic data quality:

 

 

 

 

3.5.1.2   Utah Workshop

Following are the action items identified to address issues relating to defining and measuring traffic data quality:

 

 

 

 

 

 

 

3.5.2   State of the Practice

3.5.2.1    Ohio Workshop

Following are the action items identified to address issues relating to the state of the practice:

 

 

 

 

 

3.5.2.2    Utah Workshop

Following are the action items identified to address issues relating to the state of the practice:

 

 

 

 

 

 

 

 

3.5.3   Innovative Approaches

3.5.3.1    Ohio Workshop

Following are the action items identified to address issues relating to innovative approaches to data quality:

 

 

 

3.5.3.2    Utah Workshop

Following are the action items identified to address issues relating to innovative approaches to data quality:

 

 

 

 

3.5.4   Responsibilities and Timeline

Responsibilities and timelines for implementing the action items were not discussed at the regional workshops.  Although responsibilities as to which agency should perform the action items were not explicitly identified, it was implicit that FHWA and state agencies will be playing leading roles. 

 

 


4.0    Action Plan for Improving Traffic Data Quality

4.1       Introduction

As noted earlier, the primary objective of this project is to define an action plan with work items that can be executed through the U.S. Department of Transportation (DOT), stakeholder organizations (e.g., American Association of State Highway and Transportation Officials [AASHTO], ITS America), state agencies, and private industry.  Several action items were identified and prioritized at the workshops.  The action plan builds upon the findings in the white papers and inputs obtained from the regional workshops.  The action plan provides a blueprint for specific actions to address traffic data quality issues.

4.2       Partnerships and Coordination

Even though the regional workshops were not attended by representatives from every state, the plan is considered to reflect a broadly based consensus of the state DOTs and others involved in traffic monitoring activities on actions to address data quality issues.  Implementation of the plan will require collaboration among both public and private partners, with the FHWA and state DOTs playing leading roles.

 

Coordinators were identified for each action item.  It is expected that the coordinators will assume the primary responsibility for implementing the specified action items.  Although specific agency responsibilities for action items were not explicitly identified, it was implicit that FHWA and state agencies will play leading roles.  For example, FHWA would lead development of data quality assessment guidelines and the states would lead the use of task order contracting approaches.  In other areas, some FHWA assistance may be required in developing general guidance for the states.  States can then customize the approach to suit their individual circumstances.

 

There are three primary organizational units involved in the traffic monitoring activity:  Planning, Design, and Intelligent Transportation Systems (ITS) or Traffic Management Centers (TMC).  The degree of involvement in traffic monitoring activity can vary from conducting simple road tube counts to operating elaborate ITS installations.  Since methods, techniques, and equipment for conducting traffic monitoring activities are similar across the three organizational units, there is significant opportunity for partnering between the units.  These partnerships are critical in implementing some of the action items.

 

The plan identifies 10 priority action items, distilled from the comments and action items raised at the two regional workshops.

 

 

4.3       Action Items

This section describes the ten action items identified for improving traffic data quality from ITS and non-ITS sources.  These action items are presented in descending order of priority.  The plan includes descriptions of the action items and the issues they address.  For each action item, coordinating and collaborating agencies are specified. 

4.3.1   Guidelines and Standards for Calculating Data Quality Measures

Description:  Develop guidelines and standards for calculating traffic data quality measures.  The guidelines and standards are expected to contain methods to calculate and report the data quality measures for various applications and levels of aggregation.  In addition, the guidelines should also include:

 

·                    Examples or case studies of application of data quality methods 

·                    National goals (by application) – these data quality goals represent what state agencies can strive to achieve in their operations

·                    Guidance on how to construct and store quality measures

·                    Specifications and procedures for reporting data quality metadata

·                    Costs to calculate and report quality measures. 
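As an indication of what such calculation methods might look like, the following is a minimal sketch of two candidate measures, completeness and accuracy.  The formulas are common textbook forms chosen for illustration and are assumptions, not the measures the guidelines would adopt; the function names and sample values are hypothetical.

```python
def completeness(values, expected_count):
    """Percent of expected observations that are present (not None)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    present = sum(1 for v in values if v is not None)
    return 100.0 * present / expected_count


def mean_absolute_percent_error(observed, reference):
    """Average of |observed - reference| / reference, as a percent.

    Pairs with a missing observation or a zero reference are skipped.
    """
    errors = [
        abs(o - r) / r
        for o, r in zip(observed, reference)
        if o is not None and r
    ]
    if not errors:
        raise ValueError("no comparable pairs")
    return 100.0 * sum(errors) / len(errors)


# Hypothetical detector counts compared with ground-truth counts.
detector = [90, None, 110, 100]
ground_truth = [100, 100, 100, 100]
pct_complete = completeness(detector, expected_count=4)
pct_error = mean_absolute_percent_error(detector, ground_truth)
```

The same functions can be applied at different levels of aggregation (for example, per detector per day or per corridor per month) by changing which records are passed in.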

 

Issues:  This action item was identified as top priority at the two regional workshops.  The action item addresses the following key issues:

 

·        Defining and measuring traffic data quality

·        Quantitative and qualitative metrics/levels of data quality

·        Acceptable levels of quality

·        Methodology for assessing traffic data quality.

 

Coordinators:  It was suggested that FHWA or AASHTO would be the appropriate agency to develop these guidelines.  A suggestion was to include guidelines for calculating data quality measures in the “AASHTO Guidelines for Traffic Data Programs” publication or in the Traffic Monitoring Guide. 

4.3.2   Compilation of Business Rules/Data Validity Checks and Quality Control Procedures

Description:  Synthesize validation procedures and rules used by various states and other agencies for traffic monitoring devices.  This synthesis report will also serve as a guide to DOTs and other agencies investing in new software for traffic data collection.  The synthesis document should also include quality control procedures for all types of applications and data management methods for maintaining high quality data.

 

The development and adoption of common software was identified as a possible approach to ensure uniformity among state agencies.  Recognizing that software development and testing is expensive and time-intensive, it was suggested that an immediate action would be to share documentation and knowledge of existing software among state agencies.
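The kind of validation rule such a synthesis would document can be sketched as follows.  The three rules and their thresholds are hypothetical examples of range and consistency checks; they are not drawn from any state's software, and actual business rules vary by agency.

```python
def check_record(volume, speed_mph, max_volume=250, max_speed_mph=100.0):
    """Return the names of the rules that a detector record violates.

    The thresholds are illustrative assumptions, not agency standards.
    """
    failures = []
    # Range check: volume per interval must be plausible.
    if volume < 0 or volume > max_volume:
        failures.append("volume_out_of_range")
    # Range check: speed must be plausible.
    if speed_mph < 0 or speed_mph > max_speed_mph:
        failures.append("speed_out_of_range")
    # Consistency check: a speed cannot be reported when no vehicles passed.
    if volume == 0 and speed_mph > 0:
        failures.append("speed_without_volume")
    return failures


# A record with zero volume but a nonzero speed fails the consistency rule.
flags = check_record(volume=0, speed_mph=40.0)
```

Documenting rules in this explicit form, with the threshold values stated, is what would let one agency compare its checks with those used elsewhere.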

 


Issues:  This action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs

4.3.3   Best Practices for Equipment Installation and Maintenance

Description:  Develop a synthesis of best practices of installation and maintenance of traffic monitoring devices.  This document should, among other things, include:

 

 

Issues:  This action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs

4.3.4   Clearinghouse for Vehicle Detector Information

Description:  Establish an independent testing entity to test and verify claims of the new and emerging traffic detection devices on the market.  Such an ongoing program would conduct periodic independent accuracy tests of new equipment.  Results from the independent tests should be stored in a clearinghouse that can be accessed by all potential users.

 

The clearinghouse would also provide technical guidelines on the capabilities of detectors by application and conditions.  The guidelines would enable agencies to select the appropriate devices for their applications, budget, and environmental conditions.

 

It was noted that the capabilities of the existing Vehicle Detector Clearinghouse (VDC), operated by New Mexico State University, could potentially be expanded to serve the needs expressed above.  In the short term, a web-log or a moderated discussion forum needs to be added to the existing Vehicle Detector Clearinghouse to help users share experiences.

 

Issues:  This action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs, and VDC

4.3.5   Sensitivity Studies to Demonstrate “Value of Data”

Description:  Conduct extensive sensitivity analyses and document the results to illustrate the implications of data quality on user applications.  This action item is considered important because it would help document and demonstrate the “value of data” and highlight the effects of poor quality data on various applications.  Such a document would serve as a reference for potential users in deploying data of different levels of quality.  Some applications are extremely sensitive to data quality, whereas others are not.  The documentation should include sensitivity of results for selected applications to variations in data quality measures such as accuracy, coverage (density of detectors), and completeness (missing values).

 

Based on the results of the sensitivity analysis, develop data quality “targets” or “benchmarks” for each application.  Also, the results of the sensitivity analysis would be used to provide guidance or procedures for imputing missing data points.
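A very small example of the kind of sensitivity test envisioned is sketched below.  It assumes a single application (an average of hourly volumes) and a single quality measure (completeness), and it measures how far the average moves when particular hours are missing.  The data, scenario names, and function names are hypothetical.

```python
def mean_of_present(values):
    """Average the values that are not missing (None)."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no values present")
    return sum(present) / len(present)


def percent_change_by_scenario(values, scenarios):
    """Percent change in the average when the listed positions are missing.

    scenarios maps a scenario name to the set of positions to blank out.
    """
    baseline = mean_of_present(values)
    changes = {}
    for name, missing_positions in scenarios.items():
        degraded = [
            None if i in missing_positions else v
            for i, v in enumerate(values)
        ]
        changes[name] = 100.0 * (mean_of_present(degraded) - baseline) / baseline
    return changes


hourly_volumes = [100, 200, 300, 400]  # hypothetical counts
scenarios = {"lowest hour missing": {0}, "highest hour missing": {3}}
changes = percent_change_by_scenario(hourly_volumes, scenarios)
```

In this contrived case, losing one of four hours shifts the average by 20 percent in either direction, which shows why the effect of missing data depends on which values are missing and not only on how many.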

 

Issues:  This action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs

4.3.6   Guidelines for Sharing Resources

Description:  Develop guidelines for sharing resources for traffic monitoring activities including shared equipment, personnel, funding, and cooperation among different agencies and departments.  These should also include guidelines for establishing public-private partnerships for sharing resources as well as guidelines for assessing and validating traffic data collected by the private sector for public-sector use, and vice versa.

 

Information gathered from the regional workshops clearly indicated that budget cuts and financial considerations have forced different groups (within an agency or organization) to look into synergies that would lead to the use of other groups’ resources to meet their data needs.  Identifying opportunities for different groups within and outside state DOTs to work together to meet their data needs was mentioned as critical.  Furthermore, these guidelines will establish trust and confidence in private sources of data for use by the public sector and vice versa.

 

Issues:  This action item addresses the following key issues:

 

 

Coordinators:  State DOTs, FHWA

4.3.7   Life-cycle Costs of Detection Equipment

Description:  Develop a methodology for calculating life-cycle costs to enable states and other agencies to:

 

 

Life-cycle costs include the costs of equipment, installation, training, and maintenance.  The costs of equipment and maintenance impact coverage and other measures of quality.  A better understanding of the life-cycle costs, and guidance on how to estimate these costs, is expected to help in planning and investing in traffic monitoring activities.
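One simple form such a methodology could take is sketched below.  It adds the initial costs named above to annual maintenance over the service life, optionally discounted to present value.  The discounting step, the function name, and all dollar figures are assumptions made for illustration and are not taken from the workshops.

```python
def life_cycle_cost(equipment, installation, training,
                    annual_maintenance, service_years, discount_rate=0.0):
    """Initial costs plus maintenance over the service life.

    Maintenance in each year is discounted to present value; with the
    default discount_rate of 0.0 the result is a plain sum.
    """
    initial = equipment + installation + training
    maintenance = sum(
        annual_maintenance / (1.0 + discount_rate) ** year
        for year in range(1, service_years + 1)
    )
    return initial + maintenance


# Hypothetical detector: 5000 equipment, 2000 installation, 500 training,
# 300 per year maintenance over a 10-year service life.
undiscounted = life_cycle_cost(5000, 2000, 500, 300, 10)
discounted = life_cycle_cost(5000, 2000, 500, 300, 10, discount_rate=0.05)
```

Comparing devices on this basis, instead of on purchase price alone, is what would allow a detector with a higher initial cost but lower maintenance needs to be evaluated fairly.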

 

Issues:  This action item addresses the following key issues:

 

Coordinators:  State DOTs, FHWA

4.3.8   Improved Contracting Approaches

Description:  Develop guidelines for innovative contracting approaches for traffic data collection.  This should include:

 

·                    Information regarding performance-based contracting approach and management, and the associated costs and benefits

·                    Guidance on task-order-type contracts and cooperative agreements for equipment installation and maintenance

·                    Guidance on life-cycle-cost-based bidding approach.

 

The question of the contracting approach for data collection device procurement, installation, and maintenance was identified as one of the key issues impacting traffic data quality.  This action item is intended to address the issue by providing guidelines that would ensure that vendors are held accountable for the performance of their devices.

 

Issues:  The action item addresses the following key issues:

 

 

Coordinators:  State DOTs, FHWA

4.3.9   Case Study or Pilot Tests

Description:  Conduct a case study or a pilot test to observe a state DOT and TMCs working to improve data quality and evaluate the return on investment from the improved data quality.  Information gathered from such a case study is expected to help implement some of the action items outlined above.

 

Issues:  The action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs

4.3.10 Guidance on Technologies and Applications

Description:  Provide guidance on the data elements to measure and report since this dictates the type of device procured by the agency.  For example, the FHWA’s 13 vehicle categories should be revisited and length-based classifications explored.  Similarly, new and emerging applications might have additional data needs, which again influence the type of device.

 

Provide guidance on the innovative uses of loops and existing technologies.  Improvements in inductive loop technologies can expand their capabilities beyond volume and speeds (e.g., approaches to derive vehicle classifications from loop signatures). 

 

Issues:  The action item addresses the following key issues:

 

 

Coordinators:  FHWA, state DOTs

4.4       Implementation and Work Items

As noted earlier in Section 4.2, the coordinators would assume primary responsibility for implementing the specified action items.  FHWA would play a leading role in the overall implementation of the action plan.  State DOT involvement, coordination, and participation are critical for some action items more than others.  Following are the three potential groups of activities or work items to implement the action plan.

4.4.1   Research Studies

The majority of the action items relate to the development of guidelines, which are best implemented through research studies.  The findings of the research effort would then be disseminated to all potential users.  This would be followed by an evaluation to assess the success of implementation and identify limitations and shortcomings.  FHWA would conduct these research activities with support from state DOTs and other agencies and organizations.

 

For action items falling into this category, the first activity would be to develop research topics and statements of work for each action item or combination of action items.  Action items in this category include the following (with report section identified):

 

·         Guidelines and standards for calculating data quality measures (4.3.1)

·         Compilation of business rules/data validity checks and quality control procedures (4.3.2)

·         Best practices for equipment installation and maintenance (4.3.3)

·         Sensitivity studies to demonstrate “value of data” (4.3.5)

·         Guidance on technologies and applications (4.3.10)


4.4.2   Workshops

Some of the action items could be implemented through regional workshops.  Action items in this category are those that require sharing of experiences and success stories, for which a workshop or similar forum provides the best environment.  FHWA would coordinate with the state DOTs to sponsor and organize such workshops.  The following are action items in this category:

 

·         Guidelines for sharing resources (4.3.6)

·         Life-cycle costs of detection equipment (4.3.7)

·         Improved contracting approaches (4.3.8)

4.4.3   Case Studies and Clearinghouse

Action items in this category require establishing or identifying an independent entity and conducting case studies.  These action items can be implemented only after some of those in the other categories have been completed.  It is expected that participation in the case studies would be voluntary.  It is envisaged that FHWA, state DOTs, and other agencies or organizations would work jointly to successfully complete these action items.  The following are the action items in this category:

 

·         Case study or pilot tests (4.3.9)

·         Clearinghouse for vehicle detector information (4.3.4)


5.0    CONCLUDING REMARKS

The action plan was developed based on information from published literature and discussions at two regional workshops.  Ten action items directed at addressing traffic data quality issues were identified.  Coordinators and work items have been suggested for the various action items.  The action items represent the general consensus of the workshop participants regarding the major traffic data quality issues.  Implementation of the action plan is seen as a major step towards enhancing the quality of traffic data and encouraging its use by federal, state, and local agencies and other organizations.

 

The action plan in its current form would serve as input for a national workshop on data quality for review and adoption.


REFERENCES

Battelle Memorial Institute, Sharing Data for Traveler Information:  Practices and Policies of Public Agencies, prepared for U.S. Department of Transportation, July 2001.

 

Closing the Data Gap:  Guidelines for Quality ATIS Data, prepared for ITS America and the U.S. Department of Transportation, April 2000.

 

D. Middleton and R. Parker.  Initial Evaluation of Selected Detectors to Replace Inductive Loops on Freeways, Research Report FHWA/TX1439-7, Texas Transportation Institute, College Station, Texas, April 2000.

 

D. Middleton, D. Jasek, and R. Parker, Evaluation of Some Existing Technologies for Vehicle Detection, Research Report FHWA/TX-00/1715-S, Texas Transportation Institute, College Station, Texas, September 1999.

 

D. Middleton and R. Parker.  Evaluation of Promising Vehicle Detection Systems, Research Report FHWA/TX-03/2119-1, Draft, Texas Transportation Institute, College Station, Texas, October 2002.

 

English, L.P.  7 Deadly Misconceptions about Information Quality.  INFORMATION IMPACT International, Inc., Brentwood, Tennessee, 1999.

 

English, L.P.  Improving Data Warehouse and Business Information Quality.  John Wiley & Sons, Inc., New York, New York, 1999.

 

FHWA Study Tour for European Traffic Monitoring Programs and Technologies, FHWA’s Scanning Program, U.S. Department of Transportation, Federal Highway Administration, Washington D.C., August 1997. 

 

MNDOT and SRF Consulting Group, NIT Phase II:  Evaluation of Non-Intrusive Technologies for Traffic Detection, Final Report, September 2002.

 

Strong, D.M., Y.W. Lee and R.Y. Wang.  10 Potholes in the Road to Information Quality.  Institute of Electrical and Electronics Engineers, August 1997(A), pp. 38-46.

 

Standard Specification and Test Methods for Highway Traffic Monitoring Devices, The American Society for Testing and Materials, Review Copy:  Version C for E17.52, Draft December 2002.

 


APPENDIX A

 

 

 

WHITE PAPERS


“Defining and Measuring Traffic Data Quality”

By Shawn Turner

Introduction

Although not specifically referring to intelligent transportation systems (ITS), a Wall Street Journal article speaks to the related subject of data quality:  “Thanks to computers, huge databases brimming with information are at our fingertips, just waiting to be tapped.  . . .  Just one problem:  Those huge databases may be full of junk.”  (Wand and Wang 1996)  As Alan Pisarski noted in his Transportation Research Board (TRB) Distinguished Lecture in 1999, “we are more and more capable of rapidly transferring and effectively manipulating less and less accurate information” (Pisarski 1999).

 

Recent research and analyses have identified several issues regarding the quality of traffic data available from intelligent transportation systems for transportation operations, planning, or other functions.  The Federal Highway Administration (FHWA) is developing an action plan to assist stakeholders in addressing traffic data quality issues.  Regional stakeholder workshops and white papers will serve as the basis for this action plan. 

 

As one of those white papers, this document presents recommendations for defining and measuring traffic data quality.  This white paper:

 

Recommended Definition for Data Quality

Several terms should be defined at the outset.  Data and information are sometimes used interchangeably.  Data typically refers to information in its earliest stages of collection and processing, and information refers to a product likely to be used by a consumer or stakeholder in making a decision.  For example, traffic volume and speed data may be collected from roadway-based sensors every 20 seconds.  This traffic data is then processed into information for the end consumer, such as travel time reports provided via the Internet or radio.  But the terms are also relative, as one person’s data could be another person’s information.  Throughout this paper the term data quality will be used to refer to both data and information quality.  No attempt is made to delineate the point at which data becomes information (or knowledge or wisdom, for that matter).

 

The literature contains two similar definitions for data quality.  Strong, Lee and Wang (1997A) define information quality as “fit for use by an information consumer” and indicate that this is a widely adopted criterion for data quality.  English (1999A) further clarifies this widely adopted definition by suggesting that information quality is “fitness for all purposes in the enterprise processes that require it.” English emphasizes that it is the “phenomenon of fitness for ‘my’ purpose that is the curse of every enterprise-wide data warehouse project and every data conversion project.”  In his book, English (1999B) defines information quality as “consistently meeting knowledge worker and end-customer expectations.” It is clear from these definitions that data quality is a relative concept that could have different meaning(s) to different consumers. For example, data considered to have acceptable quality by one consumer may be of unacceptable quality to another consumer with more stringent use requirements.  Thus it is important to consider and understand all intended uses of data before attempting to measure or prescribe data quality levels.

 

The recommended definition for traffic data quality is as follows:

 

Data quality is the fitness of data for all purposes that require it.  Measuring data quality requires an understanding of all intended purposes for that data.

Recommended Practices for Measuring Traffic Data Quality

Several data quality measures were consistently found in both current practice and data quality literature.  Based on the findings discussed later in this paper, the following data quality measures are recommended:

 

 

There are several other valid data quality measures presented that could be used for specific traffic data applications in some regions.  The five measures presented above, though, are fundamental measures that should be considered universally for measuring data quality in all traffic data applications.

 

At this time, we recommend that goals or target values for these traffic data quality measures be established at the regional level based on a better understanding of all intended uses of traffic data.  It is clear that data consumers’ needs and expectations, as well as available resources, vary significantly by region and preclude the recommendation for a national goal or standard for these traffic data quality measures.

 

The research team also recommends that if data quality is measured, the results should be made available and accessible with the data as metadata.  This practice of requiring a data quality report using standardized data quality measures is common in the GIS and other data communities.  The American Society for Testing and Materials (ASTM) is developing a data archive metadata standard that could be used to document and describe these data quality measures in sufficient detail for data consumers.  The ASTM metadata standard under development has been adapted from the GIS community’s metadata standards (FGDC-STD-001-1998 and ISO DIS 19115) with their data quality reporting sections intact.

Current Practices in Measuring Traffic Data Quality

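Current practices in measuring traffic data quality are summarized below for three common consumer groups involved in highway transportation:

·        Real-time traffic monitoring and control

·        Operations/ITS data archives

·        Historical/planning-level traffic monitoring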

 

 

Our review of current practice found that, in general, consistent and widespread reporting of traffic data quality measures was not evident in any of these three consumer groups.  Efforts to address data quality were more evident in the latter two groups than with real-time monitoring and control.  A few data quality measures have been suggested or are used in each of these groups.  These data quality measures are discussed in the following paragraphs.

Real-Time Traffic Monitoring and Control

Data consumers in this group are typically engaged in traffic management and control or the provision of traveler information.  Data uses are considered real-time and are generally concerned only with the most recent data available (e.g., typically five to fifteen minutes old). Some agencies are beginning to use historical data to provide additional value to traveler information.  In some cases field data collection hardware and software provide rudimentary data quality checks; in other cases, no data quality checks are made from the field to the application database.  Field hardware and software failures are common.  In some cases, equipment redundancy provides sufficient information to cover gaps in missing data.  In other cases, missing data is simply reported “as is” and decisions are made without this data.

 

Many agencies provide time-stamped traveler information via websites, thus providing an indication of the data timeliness.  Selected examples can be found at Houston TranStar (http://traffic.tamu.edu), WSDOT (http://www.wsdot.wa.gov/PugetSoundTraffic/), and Wisconsin DOT (http://www.dot.wisconsin.gov/travel/milwaukee/index.htm), just to name a few.

 

Several traffic management centers track failed field equipment through maintenance databases and report such things as the average percent of failed sensors.  The Michigan Intelligent Transportation Systems (MITS) Center has defined lane operability as the sensor-minutes of failure, which is a product of the number of failed sensors and the duration of the failure in minutes (Turner et al. 1999).  These measures can be classified as measures of coverage or completeness.
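The sensor-minutes measure described above can be illustrated with a short calculation.  The sketch below is illustrative only: the Outage record, its field names, and the choice to sum the product over several outages are assumptions made for this example and are not taken from the MITS Center's procedures.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One outage event (hypothetical record layout)."""
    failed_sensors: int      # number of sensors down during the outage
    duration_minutes: float  # how long the outage lasted, in minutes


def sensor_minutes_of_failure(outages):
    """Sum (failed sensors x duration in minutes) over all outages."""
    return sum(o.failed_sensors * o.duration_minutes for o in outages)


# Example: 3 sensors down for 120 minutes, then 1 sensor down for 45 minutes.
outages = [Outage(3, 120), Outage(1, 45)]
total = sensor_minutes_of_failure(outages)  # 3*120 + 1*45 = 405
```

Dividing this total by the sensor-minutes that were possible in the reporting period would express the same information as a percentage, which is closer to the "average percent of failed sensors" reported by other centers.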

 

Some traffic management centers evaluate the accuracy of new types of sensors before widespread deployment.  For example, the Arizona DOT traffic operations center in Phoenix used accuracy to measure the data quality from non-intrusive sensors for which they were considering installation (Jonas 2001).  In their evaluation, ADOT compared traffic count and speed data from non-intrusive, passive acoustic detectors to data from calibrated inductance loop detectors under the assumption that the loop detector data represented the most error-free data obtainable.  The measures used in the evaluation were the absolute and percentage differences between the traffic counts and speeds measured with the two sensor types.
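A minimal sketch of this kind of comparison follows.  It assumes the loop detector value is the baseline and keeps the sign of the difference so that over-counting and under-counting can be told apart; the function name and the sample numbers are hypothetical and do not come from the ADOT evaluation.

```python
def sensor_differences(test_value, baseline_value):
    """Return (difference, percentage difference) of a test sensor reading
    relative to a baseline reading such as a calibrated loop detector."""
    if baseline_value == 0:
        raise ValueError("baseline value must be non-zero")
    difference = test_value - baseline_value
    percent_difference = 100.0 * difference / baseline_value
    return difference, percent_difference


# Hypothetical hourly counts: acoustic detector 950, loop detector 1000.
diff, pct = sensor_differences(950, 1000)  # under-count of 50 vehicles, 5 percent
```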

 

ITS America and the U.S. DOT convened numerous stakeholders in 1999 and developed guidelines for quality advanced traveler information system (ATIS) data (ITS America 2000). The guidelines were developed in an effort to support the expansion of traveler information products and services.  One of the explicit purposes of the guidelines was to increase the quality of traffic data being collected.  The ITS America guidelines recommended seven data attributes, six of which can be considered data quality measures:

 

 

The ITS America guidelines further defined quality levels of “good”, “better”, and “best” and provided specific quality level criteria for each attribute.  For example, five to ten percent error in travel times and speeds was classified as a “better” quality level under the Accuracy attribute.

 

In another white paper about data quality requirements for the INFOstructure (i.e., a national network of traffic information and other sensors), Tarnoff (2002) suggests the following data quality measures and possible requirements (Table 1):


Table 1.  Possible INFOstructure Performance Requirements

Measure           Application            Requirement
                                         Local Implementation                 National Implementation
Speed Accuracy    Traffic Management     5-10%                                5-10%
                  Traveler Information   20%                                  20%
Volume Accuracy   Traffic Management     10%                                  N/a
                  Traveler Information   N/a                                  N/a
Timeliness        All                    Delay < 1 minute                     Delay < 5 minutes
Availability      All                    99.9% (approx. 10 hours per year)    99% (approx. 100 hours per year)

Source:  Tarnoff 2002

 

 

Tarnoff presented these data quality requirements as a “starting point for the discussion of these issues” and suggested that there is a tendency in the ITS community to specify performance without a complete understanding of the actual application requirements or cost implications.  Thus Tarnoff suggests that any decisions about data quality requirements be grounded in actual application requirements and cost implications.
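The availability requirements in Table 1 pair a percentage with an approximate number of hours of downtime per year.  The short calculation below shows how the two relate, assuming a 365-day year of continuous operation; the function name is chosen for illustration only.  The exact figures are somewhat lower than the rounded values in the table.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming continuous operation


def downtime_hours_per_year(availability_percent):
    """Hours per year a system may be unavailable at a given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100.0)


local = downtime_hours_per_year(99.9)    # about 8.76 hours per year
national = downtime_hours_per_year(99)   # about 87.6 hours per year
```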

Operations/ITS Data Archives

Data consumers in this group are typically engaged in off-line analytical processing of data generated by traffic operations.  Archived data uses vary widely, from academic research (e.g., traffic flow theory) to traveler information (e.g., “normal” traffic conditions), operations evaluation (e.g., ramp meter algorithms), performance monitoring, and basic planning-level statistics.  Although the operations data in archives are generated in real-time, most of the applications to date have been historical in nature and outside of the traffic operations area.  Data archive applications are still in relative infancy and thus quality assurance procedures are still being established in most areas.  Several data archive managers have voiced concerns about the quality of the data generated by operations groups, presumably because the data archive managers have more stringent data quality requirements for their applications than the operations applications do.  In fact, this concern about archived data quality is part of the genesis for this FHWA-sponsored project.  Most current archived data users recognize these data quality issues but maintain an optimistic attitude of “this is the best data I can get for free” and attempt to use the data for various applications.  However, interviews conducted in this project revealed several potential data archive consumers that were reluctant to use the data because of real or perceived data quality issues.

 

As noted previously, data archive applications are still in relative infancy and thus data quality measures are not extensively or consistently used.  Data completeness, expressed as the number of data samples or the percent of available samples in a summary statistic, is the measure most often used in data archives.  The data completeness measure is used frequently because operations data is often aggregated or summarized when loaded into a data archive.  For example, the ARTIMIS center in Cincinnati, Ohio/Kentucky reports the number of 30-second data samples (the “Samp” column in Table 2) that have been used to compute each 15-minute summary statistic.

 

 

Table 2.  ARTIMIS Reporting of Data Completeness

Data for segment SEGK715001 for 07/15/2001
Number of Lanes: 4

# Time    Samp  Speed    Vol   Occ
00:01:51    30     47    575     6
00:16:51    30     48    503     5
00:31:51    30     48    503     5
00:46:51    30     49    421     4
01:01:52    30     48    274     5
01:16:52    30     42    275    14
...

Source:  ARTIMIS Data Archives
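The sample counts reported by ARTIMIS can be converted to a percent-complete figure.  The sketch below is one way to do so, assuming the expected number of samples is simply the summary interval divided by the sampling interval (30 samples for a 15-minute summary of 30-second data); the function names are hypothetical.

```python
def expected_samples(summary_interval_s, sample_interval_s):
    """Maximum number of samples that fit in one summary interval."""
    return summary_interval_s // sample_interval_s


def completeness_percent(samples_received, samples_expected):
    """Percent of expected samples actually present in a summary statistic."""
    if samples_expected <= 0:
        raise ValueError("samples_expected must be positive")
    return 100.0 * samples_received / samples_expected


# 15-minute summaries built from 30-second samples.
expected = expected_samples(15 * 60, 30)        # 30 samples possible
full = completeness_percent(30, expected)       # every sample present
partial = completeness_percent(27, expected)    # three samples missing
```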

 

 

The Washington State DOT reports data completeness as well as data validity measures for the Seattle data archives that are distributed on CD-ROM (Ishimaru 1998).  In their data archive, they report the number of 20-second data samples in a 5-minute summary statistic (e.g., maximum of 15 data samples possible).  A data validity flag (with values of good, bad, suspect, and disabled loop) is also included in data reports to indicate the validity of 5-minute statistics (Table 3).  Peak hour, peak period, and daily statistics generated by WSDOT’s CDR data extraction program also report data validity and completeness summary measures (Table 4).  The CDR software also has a data quality mapping utility that allows data users to create location-based summaries of data completeness and validity (Ishimaru and Hallenbeck 1999).  This utility is designed for data consumers who would like to analyze the underlying data quality for various purposes.

 

In the FHWA-sponsored Mobility Monitoring Program (http://mobility.tamu.edu/mmp), the Texas Transportation Institute and Cambridge Systematics, Inc. gather archived operations data from numerous traffic management centers nationwide and analyze the archived data to report mobility and reliability trends in the urban areas (Lomax, Turner and Margiotta 2001).  As such, the program is an archived data consumer with the primary application of performance monitoring.

 

The program team performs various data quality checks in the course of processing and analyzing the archived data.  In addition to summary statistics on mobility and reliability, performance reports also include information on the following data quality measures:

 


Table 3.  WSDOT Reporting of Data Validity and Completeness

***********************************
Filename: 5TO15.DAT
Creation Date: 02/2/98 (Wed)
Creation Time: 03:16:59
File Type: SPREADSHEET
***********************************
ES-145D:_MS___1 I-5 Lake City Way 170.80
09/01/97 (Mon)
---Raw Loop Data Listing---
Time   Vol   Occ     Flg  nPds
0:00    49   3.80%     1    15
0:05    37   2.90%     1    15
0:10    38   3.50%     1    15
0:15    34   2.60%     1    15
0:20    48   4.40%     1    15
0:25    44   3.60%     1    15
0:30    35   2.80%     1    15
0:35    33   3.30%     1    15
0:40    28   2.50%     1    15
0:45    30   2.30%     1    15

Source:  Ishimaru and Hallenbeck 1999

 

 

 

Table 4.  WSDOT Reporting of Data Validity and Completeness in Summary Statistics

***********************************
Filename: AADT.MDS
Creation Date: 02/2/98 (Thu)
Creation Time: 10:54:09
File Type: SPREADSHEET
***********************************
ES-145D:_MS___1 I-5 Lake City Way 170.80
Monthly Avg for 1996 Jan (Sun)
---Multi-Day Loop Summary Report---
Summary     Valid   Vol    Occ     G   S  B  D  Val Inv Mis
Daily        VAL   19392   7.50% 1133 18  1  0   4   0   0
AM Peak      VAL    1493   3.50%  142  2  0  0   4   0   0
PM Peak      VAL    5069  15.60%  190  2  0  0   4   0   0
AM Pk Hour   VAL    1381  10.00%   47  1  0  0   4   0   0 10:45 11:45
PM Pk Hour   VAL    1576  11.90%   48  0  0  0   4   0   0 13:45 14:45

Source:  Ishimaru and Hallenbeck 1999

 

 

For example, Figure 1 shows summary information for data validity and data completeness. Significant detail for these data quality measures is also stored in databases.  For example, one could do time-based and location-based analyses of data quality using the full database.

Figure 1.  Data Quality Statistics for 10 Cities in 2000 Mobility Monitoring Program

 

 

 

Historical/Planning-Level Traffic Monitoring

 

Data consumers in this group are typically engaged in mid- to long-range (5 to 20-plus years) traffic planning and analysis.  Data uses are mostly of an historical nature, so in some cases annual average statistics may not be available (or needed) until six or more months after the past year ends.  Thus, this consumer group’s frame of reference for data timeliness differs from the other two groups by several orders of magnitude.  Whereas operations data consumers may consider data older than 5 minutes unacceptable, planning data consumers may consider waiting up to 9 months for annual statistics to be acceptable.  The use of data quality checks or “business rules” for determining the validity of traffic data appears to be fairly common among this group.  In many cases, these planning groups serve as the “official source” of traffic data for a particular jurisdiction.

 

Numerous state departments of transportation (DOTs) use data validation checks or “business rules” when they load traffic data into their information systems.  These data quality checks are typically based upon traffic capacity principles, typical traffic trends or patterns, or simply local traffic experience and insight.  Thus data validity is a common data quality measure used by many historical traffic monitoring groups.  For example, the Texas DOT (TxDOT) plans to use 23 business rules for continuous vehicle counts in their Statewide Traffic Analysis and Reporting System (STARS) (TxDOT 2001).  Once a data record has failed a business rule, that record is flagged as “suspect” and must be reviewed by a traffic data analyst prior to the beginning of the traffic monitoring program’s year-end process.  Additionally, STARS uses data integrity as a data quality measure, running checks on data file and station integrity.
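The flagging process described above can be sketched in a few lines.  The two rules shown here (a zero hourly volume and a volume above a nominal capacity of 2,400 vehicles per hour per lane) are hypothetical examples of capacity-based checks; they are not the TxDOT STARS rules, and the record layout is likewise assumed for illustration.

```python
def zero_volume(record):
    """Hypothetical rule: an hourly volume of zero is suspicious."""
    return record["volume"] == 0


def exceeds_capacity(record, per_lane_capacity=2400):
    """Hypothetical rule: volume above an assumed per-lane hourly capacity."""
    return record["volume"] > per_lane_capacity * record["lanes"]


# Each rule is a (description, predicate) pair; a True result means failure.
RULES = [
    ("zero hourly volume", zero_volume),
    ("volume above capacity", exceeds_capacity),
]


def apply_business_rules(record, rules=RULES):
    """Flag a record as 'suspect' if it fails any rule, for analyst review."""
    failed = [name for name, rule in rules if rule(record)]
    status = "suspect" if failed else "passed"
    return {**record, "status": status, "failed_rules": failed}


checked = apply_business_rules({"volume": 6000, "lanes": 2})
```

Keeping the list of failed rules with the record gives the reviewing analyst the reason for the flag rather than only the flag itself.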


The traffic monitoring group in the Virginia DOT (VDOT) also uses established business rules to perform traffic data validity checks prior to loading the data into their information system.  As with TxDOT’s process, data that fail the business rules are flagged as suspect and must be reviewed by a traffic data analyst.  If the traffic data is deemed erroneous, it will not be loaded into the traffic information system.  VDOT has a unique contracting arrangement in that they lease the traffic data collection equipment from sub-contractors; thus, they pay the sub-contractors lease payments based upon the quality and completeness of the data collected by the sub-contractors’ equipment.  For example, a full monthly payment is made for locations “where 25 or more days of useable (for factor creation) classification and volume traffic information are available during a calendar month”.  A partial lease payment of 50 percent is made “where 15 or more days of useable (for factor creation) volume traffic information, but less than 15 days (useable for factor creation) classification data are available.”  Thus VDOT’s payment for traffic data collection is based on the quality measures of data validity and data completeness.
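The two payment tiers quoted above can be expressed as a simple rule.  The sketch below encodes only those two tiers and returns None for any other combination of days, because the other cases are not described in this paper; the function and parameter names are hypothetical.

```python
def lease_payment_fraction(usable_volume_days, usable_classification_days):
    """Fraction of the monthly lease payment under the two quoted tiers.

    Returns None when neither quoted tier applies, since those cases
    are not covered by the contract language quoted in the text.
    """
    if usable_volume_days >= 25 and usable_classification_days >= 25:
        return 1.0   # full monthly payment
    if usable_volume_days >= 15 and usable_classification_days < 15:
        return 0.5   # partial payment of 50 percent
    return None      # not specified in the quoted language


full = lease_payment_fraction(28, 27)
partial = lease_payment_fraction(20, 10)
unspecified = lease_payment_fraction(20, 20)
```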

 

VDOT also designates quality levels for the traffic data they distribute.  The quality level codes and descriptions are as follows:

 

·        Code 0 - Not Reviewed

·        Code 1 - Acceptable for Nothing

·        Code 2 - Acceptable for Qualified Raw Data Distribution

·        Code 3 - Acceptable for Raw Data Distribution

·        Code 4 - Acceptable for use in AADT Calculation

·        Code 5 - Acceptable for all TMS uses

 

These quality codes are designed to indicate to data consumers what the data producers believe to be the fitness of the data for various purposes.
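One way a data consumer might use such codes is to filter records by the minimum code required for a given purpose.  The sketch below assumes that the codes are cumulative (for example, that data coded 5 is also acceptable for AADT calculation); that ordering is an assumption made for illustration and is not stated in the VDOT descriptions.

```python
from enum import IntEnum


class QualityCode(IntEnum):
    """VDOT quality level codes as listed in the text."""
    NOT_REVIEWED = 0
    ACCEPTABLE_FOR_NOTHING = 1
    QUALIFIED_RAW_DATA_DISTRIBUTION = 2
    RAW_DATA_DISTRIBUTION = 3
    AADT_CALCULATION = 4
    ALL_TMS_USES = 5


def usable_for_aadt(code):
    """True if the code is at or above the AADT level (assumed cumulative)."""
    return QualityCode(code) >= QualityCode.AADT_CALCULATION


# Keep only the records whose code allows use in AADT calculation.
records = [{"station": "A", "code": 5}, {"station": "B", "code": 3}]
aadt_records = [r for r in records if usable_for_aadt(r["code"])]
```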

 

Similar software-based data validity checks are used in several other states.  The Pennsylvania and Ohio DOTs both use data validity checks in their traffic information systems.  These validity checks are performed on a daily basis for all traffic data.  The Michigan DOT uses Traffic Data Quality (TDQ), a software tool developed as a result of a pooled-fund study (Flinner and Horsey, no date).

 

The international experience with traffic data validity checks is comparable to the U.S. experience.  A European scanning tour found that several countries perform an automated validation of traffic data (FHWA 1997).  All ITS systems observed in the tour countries (the Netherlands, Switzerland, Germany, France, and the United Kingdom) perform some type of automated data validation, usually by comparing current data from a particular site with historical data from that same site during a similar time interval.  If an operator identifies questionable data, they use graphic displays to review the data and determine acceptability.

 

Several of the countries have fairly extensive data validation systems, and all of them require manual input.  Most cases involve validation methods based on site-specific development of “rules” based on historical patterns by time of day, day of week, and lane for that site.  Data that fail the validation routines are brought to the attention of system operators, who then decide whether the data are correct.  Operators replace invalid data with data from previous time periods at that site, factoring the data with growth estimates (based on nearby counters that worked properly) when appropriate.  The discussion that follows covers processes used in individual countries.
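The general pattern described above (compare a new value with history from the same site and time interval, then substitute a factored historical value if an operator rejects it) can be sketched as follows.  The three-standard-deviation threshold and the function names are assumptions for illustration; the scanning tour report does not specify how any country sets its limits.

```python
from statistics import mean, stdev


def is_questionable(current, history, num_std=3.0):
    """Flag a value that is far from the historical values for the same
    site and time interval (threshold of num_std standard deviations)."""
    mu = mean(history)
    sigma = stdev(history)
    return abs(current - mu) > num_std * sigma


def replacement_value(previous_value, growth_factor=1.0):
    """Value from a previous period, factored by an estimated growth rate
    (for example, one derived from nearby counters that worked properly)."""
    return previous_value * growth_factor


# Hypothetical volumes for the same site, weekday, and hour in earlier weeks.
history = [1000, 1040, 980, 1020, 960]
flagged = is_questionable(1500, history)     # far outside the usual range
substitute = replacement_value(1000, 1.02)   # previous value with 2% growth
```

In the systems described here the flag only prompts a manual review; the decision to replace the value remains with the operator.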

 

The Netherlands uses a software system called INTENS.  This system collects traffic data from the various traffic-monitoring sites, conducts automated validation checks, facilitates manual review of flagged data, and produces a variety of summary graphics and statistics.  The data validation process consists of a series of parameter checks comparing the data submitted for each site with confidence limits set specifically for that site.  Initial data checks ensure that data are labeled correctly (i.e., belong to a site for which data are expected), have the proper number of lanes, and pass other site identification checks.  The next set of checks, called “primary control”, applies a series of maximum and minimum allowable data ranges for specific variables, based on historical data.
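The two stages described above (site identification checks followed by site-specific range checks) might look like the sketch below.  This is not the INTENS implementation; the site identifier, variable names, and limits are invented for illustration.

```python
# Hypothetical site-specific limits; in practice these would be derived
# from historical data for each site.
SITE_LIMITS = {
    "SITE_A": {"lanes": 2, "hourly_volume": (0, 4000), "speed_kmh": (0, 160)},
}


def check_record(record, site_limits=SITE_LIMITS):
    """Return a list of problems found in one record (empty if none)."""
    limits = site_limits.get(record["site"])
    if limits is None:
        return ["data received for an unexpected site"]
    problems = []
    if record["lanes"] != limits["lanes"]:
        problems.append("unexpected number of lanes")
    # Range checks: each variable must lie within the site's min and max.
    for variable in ("hourly_volume", "speed_kmh"):
        low, high = limits[variable]
        if not low <= record[variable] <= high:
            problems.append(f"{variable} outside allowable range")
    return problems


ok = check_record(
    {"site": "SITE_A", "lanes": 2, "hourly_volume": 1800, "speed_kmh": 95}
)
```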

 

At the national level, Switzerland has two sets of data validation checks.  The first determines if the telemetry system functioned properly.  The second set of validation checks examines the submitted records and identifies those that are questionable based on several criteria.  These include:  zero volumes or other errors in the hourly records; hourly volumes that exceed a maximum percentile; variation in the ratio of 14-hour volumes to 24-hour volumes (14 hours from 6:00 a.m. to 8:00 p.m.) for weekdays; variation in the ratio of 5-hour volumes to 14-hour volumes (5 hours from 3:00 p.m. to 8:00 p.m.) per weekday; and variations in directional distribution.
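The two volume ratios named above can be computed directly from a day of hourly counts.  The sketch below assumes a list of 24 hourly volumes indexed from midnight and uses a simple tolerance around a typical ratio to decide what counts as "variation"; the tolerance approach and the function names are assumptions, not the Swiss procedure.

```python
def volume_ratios(hourly):
    """Return (14-hour / 24-hour, 5-hour / 14-hour) volume ratios.

    hourly is a list of 24 hourly volumes, where index 0 is the hour
    beginning at midnight.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly volumes")
    v24 = sum(hourly)
    v14 = sum(hourly[6:20])    # 6:00 a.m. to 8:00 p.m.
    v5 = sum(hourly[15:20])    # 3:00 p.m. to 8:00 p.m.
    if v24 == 0 or v14 == 0:
        raise ValueError("zero volumes; handled by a separate check")
    return v14 / v24, v5 / v14


def ratio_is_questionable(ratio, typical, tolerance):
    """Flag a ratio that departs from its typical value by more than tolerance."""
    return abs(ratio - typical) > tolerance


# A perfectly flat day: every hour carries the same volume.
r14_24, r5_14 = volume_ratios([100] * 24)
```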

 

Like other countries included in the scan tour, Germany utilizes multiple validation procedures.  The one included here is being developed for an ITS application in Hesse.  The system uses a combined fuzzy logic/expert system approach for data validation.  It is trained on data that are considered “valid” and then reports invalid data for subsequent manual review.  Data determined to be valid are then included in the training of the system, so that other data with those characteristics will be considered valid.

 

France uses a software system called MELODIE, which creates many of the basic reporting statistics needed for later analysis.  There are no specific algorithms within the system itself, but MELODIE generates graphical output that is viewed by an operator who makes decisions pertaining to its validity.  If the operator determines that some data are not valid, the program will use the previous month’s data for replacement.  The MELODIE system keeps track of the fact that invalid data have been replaced.

 

In the United Kingdom, the scan team found multiple validation techniques.  The one covered in this document is the Motorway Incident Detection and Analysis System (MIDAS).  It performs two levels of validation.  In the first level, the system itself has an internal validation method that indicates when the loop system needs recalibration or has failed (other details unavailable).  In the second level of validation, the system plots the volume, speed, or loop occupancy by geographic location and time of day.  The graphic provides an easy-to-use visual reference for detecting specific types of equipment errors.


Current Practices in Measuring Data Quality in Other Disciplines

Data quality literature is readily available in several other disciplines, especially the business management and data warehousing industries.  The research team conducted a literature review and identified at least two dozen resources that related directly to data quality measures.  Selected resources are summarized below with an emphasis on their relevancy to traffic data quality measures.

 

The geographic information systems (GIS) community has developed standards for documenting data quality in their Spatial Data Transfer Standard (SDTS) (O’Looney 2000; ANSI 1998).  The SDTS data quality categories are shown in Table 5.  The purpose of the data quality standard within SDTS is not to require acceptable levels of data quality, but to require a data quality report in all GIS data transfers.  Following are the SDTS standardized definitions and measures that are to be used in describing and documenting GIS data quality.

 

 
Table 5.  Five Categories for Data Quality in the Spatial Data Transfer Standard

Category

Definition

Example

Positional Accuracy

The degree of horizontal and vertical control in the coordinate system.

The available precision or detail of longitude and latitude coordinates.

Attribute Accuracy

The degree of error associated with the way thematic data is categorized.

The degree to which a soil description is likely to vary from a soil measurement taken from the corresponding location.

Completeness

The degree to which data is missing and the method of handling missing data.