Machine Learning for Transportation Research and Applications
By Yinhai Wang, Zhiyong Cui and Ruimin Ke
About this ebook
- Introduces fundamental machine learning theories and methodologies
- Presents state-of-the-art machine learning methodologies and their incorporation into transportation domain knowledge
- Includes case studies or examples in each chapter that illustrate the application of methodologies and techniques for solving transportation problems
- Provides practice questions following each chapter to enhance understanding and learning
- Includes class projects to practice coding and the use of the methods
Yinhai Wang
Yinhai Wang - Ph.D., P.E., Professor, Transportation Engineering, University of Washington, USA. Dr. Yinhai Wang is a fellow of both the IEEE and the American Society of Civil Engineers (ASCE). He also serves as director of the Pacific Northwest Transportation Consortium (PacTrans), the USDOT University Transportation Center for Federal Region 10, and of the Northwestern Tribal Technical Assistance Program (NW TTAP) Center. He earned his Ph.D. in transportation engineering from the University of Tokyo (1998) and a master's degree in computer science from the UW (2002). Dr. Wang’s research interests include traffic sensing, transportation data science, artificial intelligence methods and applications, edge computing, traffic operations and simulation, smart urban mobility, and transportation safety, among others.
Book preview
Machine Learning for Transportation Research and Applications - Yinhai Wang
Chapter 1: Introduction
Abstract
This book is intended to help current and future transportation professionals build their understanding of, and capability to use, machine learning (ML) methods and tools to address transportation challenges. This chapter describes the importance of transportation and the motivation for writing a book that can serve both as a textbook for college or graduate students and as a reference book for working professionals on ML research and applications in transportation engineering. This chapter also briefly reviews the history of ML and introduces promising ML-transportation research and applications, concluding with an overview of the book's organization.
Keywords
Machine learning; transportation research; transportation applications
This book aims to help current and future transportation professionals build their understanding of, and capability to use, machine learning (ML) methods and tools to address transportation challenges. Although this book is designed as an entry level textbook for college or graduate students, it can also serve as a reference book for working professionals on ML research and applications in transportation engineering.
Reading this book does not require any prior experience with, or knowledge of, computer programming, ML concepts, or ML applications in transportation. Readers can gradually build the needed programming skills by reading this book and working on the exercises in each chapter. Considering the breadth of the topics covered in this book and the frequent updates needed for the programming scripts and the supporting packages, the authors chose to provide the computer code for all the exercises online rather than in the book. Also, to make it easier for instructors to teach courses relevant to ML in transportation using this textbook, the authors will share PowerPoint files for each chapter and solutions to the example problems on the companion website of this book.
The remaining part of this chapter introduces the general background of transportation and ML and explains why the authors consider this book to be needed for transportation education, followed by the organization of this book.
1.1 Background
1.1.1 Importance of transportation
Transportation is a means of moving goods, humans, and animals from place to place. It is essential for everyone's daily life. Every aspect of modern economies, and the ways of life they support, can be tied directly or indirectly to transportation. Thus, transportation is very important to the economy. At the microscopic level, a typical household spends about 15% of its income on transportation. Traffic congestion alone costs each American driver $1,377, indicating a total congestion cost of $88 billion in the US in 2019, according to the INRIX 2019 Global Traffic Scorecard. Transportation is the second largest household expenditure category in the US when household spending, such as healthcare benefits, is excluded. The 2020 data show that an average of $9,826, or 16%, was spent on transportation by households in the US. At the macroscopic level, many wealthy countries spend approximately 6% to 12% of the Gross Domestic Product (GDP) on transportation (Rodrigue, 2020). In the US, transportation contributed 8% to the 2020 GDP and is the fourth largest contributor following housing, healthcare, and food.
Transportation plays a critical role in climate change. Transportation consumed 28% of total energy use in the US in 2021. Petroleum is the main source of energy for transportation in the US, accounting for 90% of the total transportation sector energy use. Consequently, transportation has been the largest contributor to US greenhouse gas emissions, accounting for 27% of the total. Between 1990 and 2020, greenhouse gas emissions increased more in the transportation sector than in any other sector in absolute terms.
Transportation also impacts public health significantly. Air pollutants emitted from transportation vehicles lead to poor air quality that has negative impacts on public health. Many health problems, such as respiratory infections, cardiovascular diseases, lung cancers, etc., have been found to be related to traffic pollution. Also, traffic crashes kill approximately 1.35 million people each year globally. Road traffic collisions are the leading cause of death among people aged 5–29 according to the 2018 Global Status Report on Road Safety published by the World Health Organization. Although many new safety solutions have been applied to vehicles and roadways, traffic fatalities are still high in the US, with approximately 40,000 people killed each year. Pedestrian fatalities have also been increasing after hitting the lowest recent number in 2009.
1.1.2 Motivation
Transportation is important. Improvements in the transportation system can generate remarkable benefits for the economy and society. Investment in transportation infrastructure can generate a 5% to 20% annual return (Rodrigue, 2020). However, transportation is a very complicated system-of-systems through which humans, technologies, and infrastructure interact. Our transportation system is about how people connect, build, consume, and work, with many variables involved, including policy, human, geographical, and technical factors, among others. To gain a good understanding of transportation issues and identify effective solutions, quality data are essential.
Over recent decades, transportation agencies have made significant investments in system sensing and data gathering. For example, data exchange and AI/ML are both listed as fundamental elements of intelligent transportation systems (ITS), a system of technologies and operational advancements that improve the capacities of the overall transportation system. The global investment in ITS technologies is projected to grow at a compound annual rate of 8.8%, reaching $49.5 billion by 2026. These new ITS data, third-party data (such as those from navigation apps), and classical traffic sensor and survey data form big data streams that enable system-wide, in-depth analyses of a variety of important transportation issues.
Unfortunately, classical transportation research and practice have been mainly based on the very limited amounts of data collected from periodical surveys or limited sensor locations. Most methods, including the commonly used ones, are based on mathematical assumptions without sufficient validation (Ma et al., 2011) and are not suitable for using such new datasets. Transportation curricula currently used in universities are strong in design, construction, and operation topics, but lack coverage of database, data analytics, and information technologies. Consequently, transportation agencies and companies are facing increasing workforce challenges, particularly for the use of new technologies such as sensing, Internet of Things (IoT), connected and automated vehicles, drones, big data analytics, and artificial intelligence (AI).
To help address this workforce challenge, the authors decided to write this book and focus on ML because it "has the potential to transform ITS at every level of implementation" (Chan-Edmiston et al., 2020). ML has been widely applied in transportation data collection, autonomous driving, transportation asset management, demand forecasting, safety improvement, mobility as a service, etc. It is clearly a critical skill for future transportation professionals. This book provides an introduction to ML methods for those transportation students and working professionals who are interested in studying this subject. Hopefully, this book lays a solid foundation for addressing the current and future workforce challenges in transportation.
1.2 ML is promising for transportation research and applications
1.2.1 A brief history of ML
Before presenting ML, we need to briefly introduce AI, a research field that started at a workshop held on the campus of Dartmouth College, NH, in the summer of 1956 (Kaplan and Haenlein, 2019). The term AI was coined for this workshop by John McCarthy, who later became a professor of computer science at Stanford University. Although there is no universally accepted definition of AI, the key concept is the same: AI leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind, as described on the IBM website at https://www.ibm.com/cloud/learn/what-is-artificial-intelligence. Research into whether machines could substitute for humans in some jobs dates back to 1945, when Vannevar Bush proposed a system to amplify people's own knowledge and understanding (Bush et al., 1945). Machine intelligence was more clearly described in 1950 by Alan Turing, one of the most influential British mathematicians and logicians, who made major contributions to mathematics, cryptanalysis, logic, philosophy, and mathematical biology and also to the new areas later named computer science, cognitive science, artificial intelligence, and artificial life. In his paper (Turing and Haugeland, 1950), Turing proposed a method for answering the question he raised at the beginning of the paper: "Can machines think?" This method, now referred to as the Turing test, can be used to test a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.
Over recent decades, AI has attracted a substantial amount of attention and has been intensively studied. With the ubiquitous use of smartphones and the deployment of the Internet of Things (IoT), data have become increasingly available, which has further stimulated AI research, particularly ML. Although the terms AI and ML are often used interchangeably, they are different concepts: ML is a part of AI. ML is an application of AI that focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy. ML algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so.
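The idea of building a model from training data, rather than hand-coding a rule, can be illustrated with a minimal sketch. The data values, the function names `fit_line` and `predict`, and the choice of a simple least-squares line are all assumptions made for illustration; they are not taken from the book or its companion code.

```python
# Hypothetical training data: (traffic volume in vehicles/hour, travel time in minutes).
training_data = [(200, 5.0), (400, 6.0), (600, 7.0), (800, 8.0)]


def fit_line(samples):
    """Learn y = slope * x + intercept from samples by ordinary least squares."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to an input not seen during training."""
    slope, intercept = model
    return slope * x + intercept


# The relationship is estimated from the data; no travel-time rule is written by hand.
model = fit_line(training_data)
predicted_minutes = predict(model, 1000)
print(predicted_minutes)
```

Nothing in the program states how travel time depends on volume; the two parameters are estimated from the samples, which is the sense in which the model is "not explicitly programmed."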
ML originated from the mathematical modeling of neural networks in 1943. Walter Pitts and Warren McCulloch (McCulloch and Pitts, 1943) attempted to mathematically map out thought processes and decision making in human cognition. Because of emerging data sets and new computing technologies, ML technologies have advanced very quickly, making them significantly more effective than in the past. A remarkable milestone is the development of AlphaGo, a computer program developed by Google DeepMind to play the board game Go. In 2016, AlphaGo became the first computer program to defeat a Go world champion, Lee Sedol, and it is arguably the strongest Go player in history. AlphaGo's algorithm uses a combination of machine learning and tree search techniques, combined with extensive training, both from human and computer play. AlphaGo was trained on thousands of human amateur and professional games to learn how to play Go. Its improved version, AlphaGo Zero, released roughly a year later, skips the training step based on human game data and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed the human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0. This example demonstrates the power of ML.
1.2.2 ML for transportation research and applications
Although the history of using ML in transportation is short, researchers have made remarkable progress demonstrating the high value of the method. From transportation data collection to transportation operations, demand forecasting, and planning, ML methods have proven more effective than conventional methods. Here are some examples.
• Traffic data collection. This is the first and foremost step for ITS. It is the foundation for traffic system control, demand prediction, infrastructure monitoring and management, etc. It is also the most important technology for autonomous vehicles. From the functionality perspective, traffic data collection, or traffic sensing, takes various forms, including infrastructure-based sensing, vehicle onboard sensing, and aerial sensing. Because these sensing technologies have different characteristics, extracting useful traffic data and merging different types of sensing technologies are critical for the future transportation system. ML methods have been used for vehicle detection and classification, road surface condition sensing, pedestrian and bicyclist counting, etc.
• Traffic prediction. Coupled with traffic sensing, traffic state prediction is another motivating application. The traffic state variables include traffic speed, volume, travel time, and other travel demand factors, such as origins and destinations (OD). Road traffic state prediction for a single location, framed as a time series forecasting problem, can be handled by various types of methods. However, when it comes to traffic prediction for large-scale city areas with hundreds or thousands of road segments involved, the common time series forecasting methods may not work. Thus, extracting comprehensive spatial-temporal features from a road network to support traffic prediction and assist further traffic management is a key task for transportation planning and operations. ML methods have been widely used for short- and long-term traffic forecasts.
• Traffic system operations. AI traffic management is poised to revamp urban transportation, relieving bottlenecks and choke-points that routinely snarl our urban traffic. This helps reduce not only congestion and travel time but also vehicle emissions. Traffic congestion mostly occurs when certain factors are neglected, such as the distance maintained between two moving vehicles, traffic lights, and road signs. Congestion leads to higher fuel consumption, increased air pollution, unnecessary waste of time and energy, chronic stress, and other physiological problems, whereas frequent traffic violations are the major cause of road fatalities. Intelligent traffic management systems use AI, machine learning, computer vision, sensors, and data analysis tools to collect and analyze traffic data, generate solutions, and apply them to the traffic infrastructure. AI can use live camera feeds, sensors, and even Google Maps to develop traffic management solutions that feature predictive algorithms to speed up traffic flow. Siemens Mobility recently built an ML-based monitoring system that processes video feeds from traffic cameras. It automatically detects traffic anomalies and alerts traffic management authorities. The system is effective at estimating road traffic density so that traffic signals can be modulated accordingly for smoother movement. ML methods will improve many smart transportation applications, including emergency vehicle preemption, transit signal priority, and pedestrian safety.
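The time series framing mentioned under traffic prediction can be made concrete with a minimal sketch for a single road segment. The moving-average rule used here is a deliberately simple baseline chosen for illustration, not a method from the book; the function name `moving_average_forecast` and the speed values are hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(series) < window:
        raise ValueError("need a positive window and at least `window` observations")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical speeds (mph) on one road segment, recorded every 5 minutes.
speeds_mph = [58.0, 55.0, 52.0, 46.0, 40.0, 37.0]

# One-step-ahead forecast from the three most recent observations.
next_speed = moving_average_forecast(speeds_mph, window=3)
print(next_speed)
```

A baseline like this uses only the history of one segment. It ignores how congestion propagates between neighboring segments, which is the spatial-temporal structure that network-wide prediction needs to capture.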
1.3 Book organization
The chapters in this book are organized based on the optimal learning order of the selected machine learning methods. In each chapter, we first introduce machine learning algorithms and then present related transportation case studies into which those algorithms can be applied. Specifically, the book is organized to include the following chapters:
• Chapter 2 covers the various transportation data and sensing technologies including infrastructure-based, vehicle onboard, and aerial sensing technologies and the corresponding generated datasets. This chapter also introduces existing transportation data and sensing challenges.
• Chapter 3: This chapter introduces a spectrum of key concepts in the field of machine learning, starting with the definition and categories of machine learning, and then covering the basic building blocks of advanced machine learning algorithms. The theory behind the common regressions, including linear regression and logistic regression, gradient descent algorithms, regularization, and other key concepts of machine learning are discussed.
• Chapter 4: This chapter introduces neural network techniques starting from linear regression to the feed-forward neural network (FNN). Basic FNN components, including layers, activation functions, back-propagation algorithms, and training strategies are introduced. Additionally, case studies of representative transportation applications using FNN are presented.
• Chapter 5: This chapter introduces the fundamental mechanism of the convolutional neural network (CNN), which has been widely used to learn from matrix-like data, such as images. The case studies introduced in this chapter include traffic video sensing and spatiotemporal traffic pattern learning.
• Chapter 6: This chapter introduces the fundamental mechanism of recurrent neural network (RNN) and its famous variants, like LSTM and GRU. We also present transportation related case studies adopting RNN including road traffic prediction and traffic time series data imputation.
• Chapter 7: This chapter introduces reinforcement learning (RL), a branch of machine learning that targets sequential decision problems. In this chapter, the basic concepts of RL, such as the Markov decision process (MDP), and value-based and policy-based RL algorithms are introduced in detail. Multi-agent reinforcement learning (MARL), which aims to control a group of objects/agents cooperatively to achieve better system performance, is also briefly introduced in this chapter. We also present typical simulated and real scenarios for applying different RL algorithms in transportation applications, such as traffic signal control and car following problems.
• Chapter 8: This chapter introduces transfer learning and its applications in improving intelligent transportation systems, such as enhancing parking surveillance and traffic volume detection.
• Chapter 9: The graph neural network (GNN), as a building block of deep neural networks, excels at extracting and processing comprehensive features from graph data and enhances the interpretability of neural network models. This chapter describes the basic concepts of GNN and state-of-the-art GNN variants, such as the graph convolutional neural network (GCN). In addition, a variety of transportation applications using GNN, including traffic signal control and road traffic prediction, are introduced.
• Chapter 10: This chapter introduces a representative generative model, i.e., the generative adversarial network (GAN). Generative modeling is an unsupervised learning task that involves automatically mining and learning the patterns or distributions in a dataset to generate new data samples with some variations. We introduce the theoretical background of generative models and the details of the frameworks of GAN and its variants. GAN-related case studies in the transportation domain, such as traffic state estimation, are also briefly introduced in this chapter.
• Chapter 11: Edge computing is crucial for the future ITS applications to enable the interaction between connected traffic participants. This chapter presents representative edge computing scenarios in the ITS field and introduces the background and fundamental concepts of federated learning (FL), which is a method for distributed model training across many edge computing devices.
• Chapter 12: AI will definitely be one of the fundamentals of future transportation systems. This chapter introduces key aspects and directions of AI techniques that will hugely impact and advance urban transportation systems in the future. This chapter also discusses the extension and future plans of this book.
Bibliography
Bush et al., 1945 V. Bush, et al., As we may think, The Atlantic Monthly 1945;176(1):101–108.
Chan-Edmiston et al., 2020 S. Chan-Edmiston, S. Fischer, S. Sloan, M. Wong, et al., Intelligent transportation systems (its) joint program office: strategic plan 2020–2025. [Technical report, United States] Department of Transportation; 2020 Intelligent transportation ….
Kaplan and Haenlein, 2019 A. Kaplan, M. Haenlein, Siri, siri, in my hand: who's the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence, Business Horizons 2019;62(1):15–25.
Ma et al., 2011 X. Ma, Y.-J. Wu, Y. Wang, Drive net: e-science transportation platform for data sharing, visualization, modeling, and analysis, Transportation Research Record 2011;2215(1):37–49.
McCulloch and Pitts, 1943 W.S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics 1943;5(4):115–133.
Rodrigue, 2020 J.-P. Rodrigue, The Geography of Transport Systems. Routledge; 2020.
Turing and Haugeland, 1950 A.M. Turing, J. Haugeland, Computing machinery and intelligence, The Turing Test: Verbal Behavior as the Hallmark of Intelligence. 1950:29–56.
Chapter 2: Transportation data and sensing
Abstract
Data collection and sensing is the first step in most applications of intelligent transportation systems (ITS). Modern ITS applications are data-driven and make high demands on computing services and sophisticated models to process, analyze, and store big data. Transportation data, as a major data source in urban computing, has the properties of being in high volume, highly heterogeneous in data format, and highly variable in data quality. Machine learning has been extensively applied to ITS and greatly leverages the power and potential of ITS data. This chapter introduces ITS data needs, ITS data sensing methods and scenarios, data quality control, and related challenges.
Keywords
Intelligent transportation systems; data collection; sensing; transportation infrastructure; vehicle onboard data; unmanned aerial vehicles
2.1 Data explosion
Urbanization has been posing great opportunities and challenges in multiple areas, including the environment, health care, the economy, housing, and transportation, among others. These opportunities and challenges boost the rapid advances in cyber-physical technologies and bring connected mobile devices into people's daily lives. Nowadays, approximately 90% of people are connected to the internet and have fast access to a wide variety of information. The superb communication infrastructure in urban areas has been attracting more and more residents to cities at an unprecedented scale and speed. In order to efficiently manage the data generated every day and to use them to better allocate urban resources, the smart city concept has emerged. This concept combines sensors, system engineering, artificial intelligence (AI), and information and communication technologies to optimize city services and operations. Naturally, the transportation system that moves goods and people is a critical component of the smart city. Intelligent Transportation Systems (ITS) has likewise emerged as a concept that applies sensing, analysis, control, and communications technologies to ground transportation in order to improve safety, mobility, efficiency, and sustainability.
Modern ITS applications are data-driven and make high demands for computing services and sophisticated models to process, analyze, and store big data. Transportation data, as a major data source in urban computing, is characterized by its high volume, high heterogeneity in data formats, and high variance in data quality. Machine learning has been extensively applied to ITS and greatly leverages the power and potential of ITS data. Machine learning models are often data-hungry and computationally expensive compared to traditional statistical models. The training of a machine learning model, especially deep neural networks (DNNs), may take days or weeks to complete. In most cases, making inferences from machine learning models is also less efficient than traditional methods. Understanding traffic data types and properties is a critical step for comprehending machine learning applications in