Open access peer-reviewed chapter - ONLINE FIRST

Reinforcement Learning for Traffic Control Using Social Preferences

Written By

Orly Barzilai

Submitted: 28 April 2024 Reviewed: 28 April 2024 Published: 21 June 2024

DOI: 10.5772/intechopen.1005530


From the Edited Volume

Recent Topics in Highway Engineering - Up-to-date Overview of Practical Knowledge [Working Title]

Dr. Salvatore Antonio Biancardo


Abstract

Traffic congestion builds up from all directions, particularly during peak hours, and calls for a preference mechanism: designated lanes are set up as fast lanes that prioritize public transportation and ride sharing. Defining a rigid criterion for using the fast lanes can be ineffective if that criterion is unrelated to traffic volume; when fast lanes become overloaded, rigid criteria do not ensure efficient travel. A social preference criterion, similar to those utilized in priority queues found in various service sectors such as government, travel, and cultural events, could be adapted for managing traffic flow and lane prioritization. The social preference criterion would be based on the driver’s characteristics (e.g., a handicapped driver) or on the travel purpose (e.g., a doctor traveling for emergency surgery). To facilitate efficient travel for vehicles utilizing the fast lanes, the implementation of a reinforcement learning (RL) algorithm, specifically the Q-learning algorithm, is proposed. The results indicated that individuals exhibit social preferences for various categories of vehicle passenger characteristics. The Q-learning algorithm regulated traffic flow in a junction simulation, distinguishing between fast lanes and regular lanes based on both social preference and traffic volume. This approach ensured efficient prioritization and allocation of resources.

Keywords

  • reinforcement learning (RL)
  • traffic signal control (TSC)
  • social preference
  • smart social junction (SSJ)
  • fast lane (FL)

1. Introduction

In recent years, the growth of traffic congestion within urban areas has become a prominent and challenging issue, particularly in densely populated cities. Population growth, the immense scale of vehicle trade, and the lack of efficient public transportation systems are the foremost causes of traffic saturation [1]. Traffic congestion is influenced by both spatial and temporal characteristics [2]: different districts show different patterns of congestion, and traffic peaks on workdays typically occur in the morning and afternoon [2, 3, 4, 5]. Peak traffic occurs in the morning between 8 am and 11 am, with another surge between 2 pm and 6 pm during the traditional workweek. The regularity of this pattern on workdays results from individuals collectively commuting to work at similar times throughout the day [3].

One of the primary challenges in populated cities arises at signalized junctions [4]. The predominant traffic control method globally is the fixed-cycle program, primarily due to the lack of real-time data availability. In this approach, the traffic light phases switch at predefined intervals. Nevertheless, the advent of wireless communication in recent years has brought about a significant change: advancements in technology now allow the acquisition of real-time road data. As a result, new traffic signal control (TSC) algorithms have emerged that utilize data from a variety of devices, including sensors, cameras, and monitors [6, 7, 8, 9, 10, 11, 12]. These devices facilitate the measurement, modeling, and interpretation of traffic features, including flow, occupancy, and travel times. This information proves valuable for the development of intelligent transportation systems (ITS) that manage traffic based on traffic density [13]. Among these algorithms, reinforcement learning (RL), a machine learning technique in which an agent learns from interaction with its environment rather than from labeled examples, has been widely applied to optimize TSC [14, 15] and plays a central role in the field of traffic management [16, 17, 18].

Managing traffic signals based on traffic density can help alleviate traffic congestion when there are disparities in the number of vehicles arriving at a junction from various directions and heading different ways. However, during peak hours, when junctions are overwhelmed, it becomes essential to introduce a preference mechanism for distributing the traffic load over time [19]. Currently, a priority mechanism is implemented at junctions and on roads, offering preference to public transportation and high-occupancy vehicles. This is achieved through the establishment of dedicated lanes for these specific vehicle types, commonly referred to as dedicated bus lanes (DBL) and high-occupancy vehicle (HOV) lanes.

Barzilai et al. [20] suggested incorporating an innovative preference system based on the social characteristics of the vehicle driver (e.g., a handicapped person) or on the travel purpose (e.g., a doctor traveling to emergency surgery at a hospital). The concept involves adjusting traffic signals according to both traffic load and social priority. The authors argue that introducing a social preference parameter into the junction traffic management algorithm can reduce driver stress by making traffic congestion at the intersection seem more just, consequently lowering accident rates. Another rationale for incorporating social preference is its integration with traffic load, creating a flexible mechanism for effectively managing traffic on dedicated lanes and preventing issues of both overloading and underloading [19].

In the following sections, we will elaborate on and summarize several papers that enhance the innovative concept of junction management, combining social priority with traffic load through the application of RL algorithms. The review of these papers will be complemented by an examination of the supporting literature.


2. Social preference as a criterion for traffic control

Moral reasoning is defined as “behavior that is subject to (or judged according to) generally accepted moral norms of behavior” [21]. Research on moral dilemma judgments has been fundamentally shaped by the distinction between Utilitarianism and Deontology. Gawronski et al. [22] suggest that the difference between these two concepts is based on sensitivity to moral consequences (Utilitarianism) versus moral norms (Deontology). Lower levels of moral reasoning were found to be related to a higher number of accidents, higher driving speeds, and a higher degree of space-taking behavior [23].

Encountering a traffic jam, whether at a standstill in an intersection or within a lane, can be associated with the broader phenomenon of waiting in a line or queue. Research into the impact of waiting time on consumers highlights both decreased satisfaction [24, 25, 26] and emotional effects such as stress, anxiety, boredom, and a sense of injustice. Moreover, perceived wait times often exceed reality, exacerbating the frustration of those affected [25, 26, 27, 28].

Queues are identified as social structures that uphold specific social norms [29]:

  • The queue visibly indicates the order of arrival.

  • No skipping or cutting in line is allowed.

  • Adherence to the “first come, first served” principle.

  • Everyone must wait for their turn patiently.

Most people agree that deviations from queue norms, even in shorter lines, may be acceptable to accommodate concerns such as health, disability, or childcare, as long as the exception is requested from and granted by fellow queue members. One’s attitudes toward queues may be influenced by social injustice [30]. Queues are identified as social systems in which social justice is an important factor for queue compliance. Social justice is defined by Rawls [31] as “fairness. The way in which social institutions distribute fundamental rights and duties and determine the division of advantages from social cooperation.” On an individual level, social justice relates to “equitable and fair access to resources, and socially valued commodities” [32]. First-order justice, defined as a first-come, first-served (FCFS) process, has been found to be a necessary condition for social justice and positive evaluation [33]. This general principle is applied with modifications [34]. Second-order justice, defined as equal waiting time, has been found to be an additional factor that comes into play only when first-order justice is met. This principle is related to service volume and cost, the consumer’s age, and physical condition [33].

Priority queues are prevalent in service operations, assigning priorities based on customer attributes [35]. Priority queues and their impact have been investigated in airlines, theme parks, nightclubs, hotels, and other service contexts [36]. For example, in COVID-19 testing, priorities can be based on symptoms [37]. In government services, precedence can be granted to individuals who come from distant locations [38]. In numerous other scenarios, such customer characteristics are inaccessible, necessitating self-selection of priorities. This warrants the implementation of a pay-for-priority system, wherein customers who pay a higher fee are afforded greater precedence. Theme parks and ski resorts often allow customers to purchase premium tickets to join express lines. Everything Everywhere (EE), a leading telecommunications company in the United Kingdom, once offered “Priority Answer,” which enabled customers to pay £0.50 to jump the queue for a service call [35].
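Such priority queues, with first-come, first-served tie-breaking within each priority class, can be sketched in a few lines of Python; the customer labels and priority classes below are illustrative:

```python
import heapq
import itertools

class PriorityQueue:
    """Serve customers by priority class; ties broken first-come, first-served."""
    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # preserves FCFS order within a class

    def join(self, customer, priority):
        # Lower number = served earlier (e.g., 0 = paid "express", 1 = regular).
        heapq.heappush(self._heap, (priority, next(self._arrival), customer))

    def serve(self):
        _, _, customer = heapq.heappop(self._heap)
        return customer

q = PriorityQueue()
q.join("regular A", 1)
q.join("express B", 0)   # pays for priority, jumps the queue
q.join("regular C", 1)
print([q.serve() for _ in range(3)])  # → ['express B', 'regular A', 'regular C']
```

The arrival counter is what preserves first-order justice within a class: without it, two customers of equal priority would be ordered arbitrarily.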

Barzilai et al. [39] explored public sentiments related to the notion of prioritizing intersections using a rating scale from 1 (indicating strong opposition) to 5 (indicating strong agreement). Furthermore, insights into social preferences regarding various compositions of vehicle passengers were obtained through participants’ ratings on a scale ranging from 1 (indicating no preference) to 5 (indicating a very strong preference). Social preference cases were categorized based on the distinction between moral principles (Deontology) and moral consequences (Utilitarianism).

Moral principles contained the following categories:

  • Good virtues: people with special needs, elderly (over 75), pregnant women (8 months and above), and patients with active cancer.

  • Benefit for all: public transportation (bus and cab), shared vehicles, and vehicles with four passengers and above.

  • Encouraging community services: medical, repair technicians, security, education, and public services.

  • Keeping traffic rules: normal speed, avoiding phone talk, and no insurance claim.

  • Promotion of national issues: education, security, and health.

Moral consequences contained the following categories:

  • Benefit for the individual: private bus and special cab.

  • Promoting employment: driving to work (governmental and high-tech).

  • Leisure time promotion: driving for leisure and volunteering.

The results revealed that while research participants advocate for equal priority for all vehicles (excluding emergency ones) at traffic junctions, they argue that priority adjustments should consider traffic volume. The participants preferred moral principles over moral consequences. Two categories received high preference. The first category is the benefit for all, which contains public and shared transportation.

In recent years, there has been growing public awareness of the need to reduce road congestion through public and cooperative transportation. Barzilai et al. [39] suggested that people are influenced by this public messaging when selecting social preferences. The second, and most surprising, finding was that the good virtues category was associated with the highest preferences. Participants indicated that vehicles carrying people with special needs and difficulties should have higher priority at a smart junction. The authors concluded that these results suggest people can accept moral norms as a criterion for traffic load management.


3. Traffic control preference via dedicated lanes

Traffic control preference is implemented to prioritize public and ride-sharing transportation. Buses play a crucial role in public transportation. However, as buses share the road with private vehicles, they often encounter heavy traffic congestion and delays. To address this problem, bus priority solutions via bus lanes have evolved into an essential component of the growth and enhancement of urban public transportation networks [40]. Russo et al. [41] describe two implementations of DBL systems. The first involves the creation of extensive bus lanes that traverse the entire city. The second consists of dedicated sections specifically allocated to address severe congestion within the bus network. These bus sections can be relatively short, sometimes spanning just a few hundred meters, with most of the bus route operating on lanes not exclusively designated for buses. Introducing or expanding such short bus sections constitutes a minor, localized modification to the city’s transportation system, leading to minimal implementation costs. Montero-Lamas et al. [42] describe another bus priority solution known as an intermittent bus lane (IBL), where the bus lane’s status is dynamic: the lane reverts to a mixed-traffic lane when no bus is using it or when traffic conditions do not delay the bus. When a typical traffic lane is transformed into a bus lane, it often reduces the capacity of the remaining lanes, increasing traffic congestion and reducing travel speed on these capacity-reduced general lanes [43].

High-occupancy vehicle (HOV) lanes are specialized traffic lanes reserved for vehicles carrying multiple passengers. Depending on the number of occupants, a driver either must use the typically more congested general-purpose lane at a slower pace or may access the faster-moving HOV lane. The fundamental goal of HOV lanes is to raise average vehicle occupancy, thereby mitigating road congestion [44]. In most applications, these lanes require at least one or two passengers to accompany the driver. HOV lanes facilitate more efficient use of roadways, which benefits traffic flow while also providing time savings and enhanced reliability for high-occupancy travel modes [45]. Sometimes, HOV lanes are also made available to other vehicle types, including emergency and law enforcement vehicles, public buses, electric vehicles, or single-occupancy cars that choose to pay a toll [44].

To regulate the number of single-occupancy vehicles using HOV lanes, an effective lane management strategy is high-occupancy-toll lane pricing. HOV lane pricing can take one of three forms: (i) flat rates, which remain constant over time; (ii) scheduled tolls, where fees change according to a predetermined schedule, such as the day of the week and time of day; or (iii) dynamic tolls, which respond in real time to current traffic conditions, ideally adjusted to the prevailing congestion levels [46]. In the case of scheduled tolls, elevated prices are applied during periods of peak demand and at specific travel locations [47]. An example could involve tolls calculated as a function of the maximum traffic density downstream from the entry point [48]. For dynamic tolls, pricing should be responsive to demand. To initiate HOV lane pricing, it is crucial that demand is substantial enough to justify charging single-occupancy vehicles for lane use [49]. Additionally, lane capacity should be considered during the planning process [50]. Numerous strategies have been developed to implement dynamic tolls, including the application of a multi-agent reinforcement learning algorithm [51] and a deep reinforcement learning algorithm [52]. A comprehensive overview of dynamic toll pricing strategies and models is provided by Lombardi et al. [48].
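The three pricing forms can be sketched as a single illustrative function; all rates, peak windows, and density coefficients below are made-up assumptions, not values from the cited studies:

```python
def toll(scheme, hour=None, density=None):
    """Illustrative managed-lane toll pricing (all rates are made up).

    scheme:  'flat' | 'scheduled' | 'dynamic'
    hour:    hour of day, used by scheduled tolls
    density: vehicles/km on the lane, used by dynamic tolls
    """
    if scheme == "flat":
        return 2.0                            # (i) constant over time
    if scheme == "scheduled":                 # (ii) predetermined peak surcharge
        peak = 8 <= hour < 11 or 14 <= hour < 18
        return 5.0 if peak else 2.0
    if scheme == "dynamic":                   # (iii) responsive to congestion
        base, slope, cap = 1.0, 0.1, 10.0
        return min(base + slope * density, cap)
    raise ValueError(f"unknown scheme: {scheme}")

print(toll("flat"))                     # → 2.0
print(toll("scheduled", hour=9))        # → 5.0
print(toll("dynamic", density=50))      # → 6.0
```

The cap on the dynamic toll mirrors the planning constraint mentioned above: pricing must stay within what lane capacity and demand can justify.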


4. Smart social junction with social preferences

Several studies employed a smart social junction (SSJ) algorithm where traffic light timings are adjusted based on both traffic load and societal preferences [19, 20, 39, 53]. To facilitate the implementation of this algorithm, the concept of conflict side (CS) was introduced. A CS encompasses all lanes in all directions of a traffic junction that can move simultaneously. A junction simulation was conducted with four CSs, as illustrated in Figure 1. Vehicles with randomly selected length and velocity parameters were dispersed across lanes. Each vehicle was randomly assigned social preferences ranging from 1 to 10. The social preference value for each CS is the total of all individual vehicle preferences within that CS. Time slots are allocated to all CSs per cycle based on a constant scheduler, ensuring the same order is maintained in each cycle.

Figure 1.

The smart junction simulation’s scheme with four conflict sides colored red, blue, pink, and green.
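The per-CS social preference value described above (the sum of the individual vehicle preferences within each conflict side) can be sketched in a few lines; the vehicle data below are randomly generated for illustration:

```python
import random

random.seed(0)

NUM_CS = 4
# Each vehicle is assigned a conflict side and a random social preference (1-10),
# as in the simulation described above.
vehicles = [(random.randrange(NUM_CS), random.randint(1, 10)) for _ in range(200)]

def cs_preference_values(vehicles, num_cs):
    """Social preference value of a CS = sum of its vehicles' preferences."""
    totals = [0] * num_cs
    for cs, pref in vehicles:
        totals[cs] += pref
    return totals

print(cs_preference_values(vehicles, NUM_CS))
```

A fixed scheduler would then allocate time slots to the CSs in a constant order each cycle, weighted by these totals.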

Figure 2 illustrates three different types of junctions examined in Barzilai et al. [39]: the standard junction, which allocates equal time slots to every CS; the smart junction (SJ), which allocates time slots to each CS based solely on its traffic load; and the SSJ, which allocates time slots based on both traffic load and social preference. The results indicate that shorter light durations were observed for the SSJ, although the timings of the SJ and the SSJ were comparable for three out of four CSs.

Figure 2.

Our model’s timing in comparison with other traffic light timing methods.

Fine et al. [53] elaborated on the SSJ algorithm, in which, at each time slot, the CS with the highest preference value is selected. To prevent situations where certain CSs experience prolonged waits (starvation), additional points were allocated to any CS that had not received a green light within a specified maximum waiting time. The researchers found that the SSJ outperformed the standard (regular) junction during low traffic volume. Figure 3 depicts how the average efficiency of the SSJ evacuation changes in response to different traffic loads. As the traffic load increased, the efficiency declined. At a traffic load of 55%, the efficiency of the two junction types (SSJ and regular) was comparable; when the traffic load surpassed this threshold, the regular junction proved more effective than its smart counterpart.

Figure 3.

Average efficiency of SSJ by traffic load.

The researchers concluded that when all CSs experience the same level of congestion, there is no logical basis for prioritizing one side over another. Furthermore, the prioritization of individual vehicles is impacted by the presence of other vehicles in the same queue. For instance, if a high-priority vehicle is waiting alongside low-priority vehicles, the overall preference of the lane will be diminished, and the high-priority vehicle will not be able to assert its preference.
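The selection rule with an anti-starvation bonus described above can be sketched as follows; the bonus size and maximum-wait threshold are illustrative assumptions, not the published parameters:

```python
def select_cs(preferences, waiting_slots, max_wait=5, bonus=50):
    """Pick the conflict side for the next green slot (sketch of the rule
    described above; bonus and max_wait values are illustrative).

    preferences:   current social preference value per CS
    waiting_slots: slots each CS has waited since its last green light
    """
    scores = [
        pref + (bonus if waited > max_wait else 0)   # anti-starvation points
        for pref, waited in zip(preferences, waiting_slots)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# A CS that has waited past the threshold can win even with a lower
# preference value than the most preferred CS:
print(select_cs([40, 90, 30, 20], [6, 0, 1, 2]))  # → 0
```

This also illustrates the dilution problem noted above: a single high-priority vehicle cannot raise its lane's aggregate score much when surrounded by low-priority vehicles.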

To address these challenges, Barzilai et al. [19] proposed implementing the concept of social preferences within fast lanes (FLs). Presently, FLs are designated for public transportation and are unaffected by traffic load. Consequently, traffic flow in these lanes often falls short of optimal levels, either due to overcrowding or insufficient volume. The proposed solution involves implementing adaptable FLs, where the types of vehicles permitted to use them vary based on prevailing traffic conditions. During periods of congestion and heightened demand, only vehicles with high social priority, such as public transportation, would be granted access to the FL. Conversely, when the FL is underused, vehicles with lower social priority could also utilize it. The optimization of traffic control in a FL based on social preference and traffic volume can be performed using the reinforcement learning (RL) algorithm.


5. Reinforcement learning algorithm for traffic control

Reinforcement learning (RL) is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment [54]. As a learning problem, it refers to learning to control a system so as to optimize a numerical value that represents a long-term objective [55]. This numerical value is produced by a reward function [56]. The reward function defines what is objectively good and bad for the agent. The value function, by contrast, is the agent’s current mapping from the set of possible states (or state-action pairs) to its estimates of the net long-term reward to be expected after visiting a state (or state-action pair), assuming the agent continues to act according to its current policy [56]. Moreover, the learner is not told which actions to take, as in many other forms of machine learning, but must instead discover which actions yield the highest reward by trying them out [56].

The Q-learning algorithm is a model-free RL algorithm: it does not rely on an explicit model of the environment. In model-free reinforcement learning, the agent learns to make decisions and improve its behavior through trial and error, without building an internal representation of how the environment works [17]. In Q-learning, the agent selects the action with the maximal estimated long-term reward (Q-value) among the actions available in the current state. Miletić et al. [17] introduced a schema for applying the Q-learning algorithm to traffic management at junctions. This schema employs a two-dimensional Q-table, where one dimension represents the states of the junction and the other corresponds to the actions taken at the junction.
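A minimal sketch of the tabular Q-learning update over such a two-dimensional Q-table follows; the toy state and action spaces, learning rate, and discount factor are illustrative assumptions:

```python
import random

def q_learning_step(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update on a 2-D Q-table Q[state][action]."""
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

def choose_action(Q, state, epsilon=0.1):
    """Epsilon-greedy: mostly exploit the best known action, sometimes explore."""
    if random.random() < epsilon:
        return random.randrange(len(Q[state]))
    return max(range(len(Q[state])), key=Q[state].__getitem__)

# Toy table: 3 junction states x 2 actions (e.g., which lane gets green).
Q = [[0.0, 0.0] for _ in range(3)]
q_learning_step(Q, state=0, action=1, reward=5.0, next_state=2)
print(Q[0][1])  # → 0.5
```

The epsilon-greedy policy is what lets the agent "discover which actions yield the highest reward by trying them out," as noted above.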

The utilization of RL for traffic signal control (TSC) has gained popularity due to its capacity to learn and adapt while actively engaging with the environment. This inherent capability empowers the system to effectively respond to new patterns of traffic congestion as it encounters them [17]. In RL-TSC, each smart junction is typically controlled by a single agent [13]. Each agent is responsible for determining the light-switching sequence at its assigned junction. Many different objectives have been considered by authors when defining the reward function used by RL-TSC agents [13]. These may include average trip waiting time, trip delay, average trip time, average junction waiting time, junction throughput/flow rate, achieving green waves, accident avoidance, speed restriction, fuel conservation, and average number of trip stops.


6. Reinforcement learning algorithm for smart social junction

Barzilai et al. [19] utilized a simplified junction model consisting of two lanes: one for regular vehicles and another for priority vehicles, designated as the FL. To manage the distribution of green light time between these two lanes effectively, ensuring priority for the FL without neglecting the regular lane, an RL algorithm, specifically the Q-learning algorithm, was employed. The Q-learning algorithm pseudo-code is presented in Figure 4. Their study used randomly generated data, and the results demonstrated the success of the Q-learning approach. The reward function contained positive factors for vehicles that crossed the junction or advanced their position and a negative factor for vehicles that remained in place; in addition, a weight value for high-priority vehicles was part of the equation. By implementing this algorithm, vehicles traveling in the FL were able to cross the junction more quickly than those in the regular lane, thus optimizing traffic flow and prioritizing vehicles based on social considerations.

Figure 4.

Pseudo-code of the Q-learning algorithm.
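The reward structure described above (positive terms for crossing or advancing, a negative term for standing still, and a weight for high-priority vehicles) can be sketched as follows; all coefficient values and the priority cutoff are illustrative assumptions, not the published ones:

```python
def reward(vehicles, w_priority=2.0, r_cross=3.0, r_advance=1.0, r_stuck=-1.0):
    """Reward after one green phase, following the structure described above
    (coefficient values are illustrative, not the published ones).

    vehicles: list of (status, priority) where status is
              'crossed', 'advanced', or 'stayed'; priority is 1-10.
    """
    total = 0.0
    for status, priority in vehicles:
        weight = w_priority if priority >= 8 else 1.0   # boost high-priority vehicles
        if status == "crossed":
            total += weight * r_cross
        elif status == "advanced":
            total += weight * r_advance
        else:                                           # vehicle did not move
            total += r_stuck
    return total

print(reward([("crossed", 9), ("advanced", 2), ("stayed", 5)]))  # → 6.0
```

With this shaping, a phase that moves the FL earns more reward than one that moves the regular lane, while the negative term keeps the regular lane from being starved entirely.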

In a recent study, Barzilai et al. [57] extended the simplified model of SSJ with FL managed by RL to a more realistic and practical solution. To achieve this, the junction simulation was constructed using actual data obtained from surveillance cameras situated at a complex, real-world junction. The surveillance camera data was extracted from “Netivei Israel” (the Israeli National Company for Transport Infrastructures) website for the “Ahisemech” junction (presented in Figure 5) which is located in the central region of Israel.

Figure 5.

Ahisemech junction photograph.

During the peak traffic hours between 8:00 am and 5:00 pm, a total of 40 minutes of video footage was captured, encompassing 2177 vehicles across 14 complete cycles of green light distribution in all directions. The researchers employed the You Only Look Once (YOLO) framework, specifically version 8, to conduct the vehicle counts. YOLOv8 is a vision model used for object detection, classification, and segmentation tasks, and it has been applied to real traffic videos for diverse purposes [58, 59, 60, 61].
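The counting step can be illustrated by a small post-processing routine over detector output; the detection tuples below are made up, and the only assumption taken from the detector is the standard set of COCO class indices that YOLOv8 uses for vehicle types:

```python
from collections import Counter

# Standard COCO class indices for common vehicle types (as used by YOLOv8).
VEHICLE_CLASSES = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}

def count_vehicles(detections, min_conf=0.5):
    """Count vehicle detections in one frame.

    detections: list of (class_id, confidence) pairs, as produced by an
    object detector such as YOLOv8 after non-max suppression.
    """
    counts = Counter(
        VEHICLE_CLASSES[cls]
        for cls, conf in detections
        if cls in VEHICLE_CLASSES and conf >= min_conf
    )
    return dict(counts)

# Hypothetical frame: two cars, a bus, a pedestrian (class 0), and a
# low-confidence truck that is filtered out.
frame = [(2, 0.91), (2, 0.88), (5, 0.77), (0, 0.95), (7, 0.40)]
print(count_vehicles(frame))  # → {'car': 2, 'bus': 1}
```

In the actual study, per-frame counts like these would be aggregated over the 40 minutes of footage and mapped to lanes to produce the traffic-load figures in Table 1.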

Using the vehicle counts and the Ahisemech junction structure as reference points, a simulation was constructed. This simulation replicated the junction structure and CSs, as presented in Figure 6, as well as accurately represented the traffic load, as presented in Table 1. CS5 emerged as the preferred choice for prioritizing vehicles with high-priority status because it does not share lanes with other CSs.

Figure 6.

Simulation of Ahisemech junction by lanes and CSs.

Lane       | CS1  | CS2  | CS3  | CS4  | CS5  | Total
L1 + L2    |      | V    | V    |      |      | 444
L3         |      |      | V    |      |      | 6
L4 + L5    | V    | V    |      |      |      | 373
L6         | V    |      |      |      |      | 62
L7         |      |      |      |      | V    | 16
L8         |      |      |      |      | V    | 81
L9         |      |      |      | V    |      | 18
L10        |      |      |      | V    |      | 103
Total      | 435  | 817  | 450  | 121  | 97   | 1103
Percentage | 0.23 | 0.43 | 0.23 | 0.06 | 0.05 |

Table 1.

Number of simulation vehicles by lane and by CS.

Applying the Q-learning algorithm to the real data streamed into the smart junction simulation showed that the algorithm gave preference both to CS5, the least congested CS but the one designated for social priority, and to CS2, the most congested CS with no social priority. To assess the order of green light allocation, the average position of each CS in the opening sequence was computed. Green lights were assigned to the various CSs in sequence until the junction was completely evacuated; a lower average position for a particular CS indicates that its green light was granted earlier in this sequence, resulting in a faster evacuation of that CS. The prioritization of CS5 and CS2 is evident from their low average position values: CS5 obtained a value of 13.4, constituting 64% of the total average position, whereas CS2 achieved a value of 6, accounting for 29% of the total average position. In addition, during the learning phase, which spanned 35 episodes, a 30% reduction was found for the preferred CS (CS5).
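The average position metric described above can be sketched as follows; the green-light sequence in the example is illustrative, not the published data:

```python
def average_positions(green_sequence, num_cs):
    """Average position in the green-light sequence per CS (lower = earlier).

    green_sequence: the order in which CSs received green lights until the
    junction emptied, e.g. [5, 2, 5, ...]; positions are 1-based.
    """
    positions = {cs: [] for cs in range(1, num_cs + 1)}
    for pos, cs in enumerate(green_sequence, start=1):
        positions[cs].append(pos)
    return {cs: sum(p) / len(p) for cs, p in positions.items() if p}

# Illustrative sequence in which CS5 and CS2 tend to open early:
seq = [5, 2, 5, 2, 1, 3, 2, 4, 1, 3]
print(average_positions(seq, 5))
```

In this made-up sequence, CS5 gets the lowest average position (it opened first and third), matching the interpretation above that a low value means earlier evacuation.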


7. Conclusions

In response to the growing challenge of traffic congestion, there is a proposal for an innovative approach to implement a social priority mechanism within Traffic Signal Control (TSC). This mechanism draws inspiration from various sectors, such as governmental, commercial, and healthcare domains, aiming to be integrated into the traffic control area.

A social preference based on driver characteristics or travel purpose is suggested, combined with the current traffic volume. This social preference is suggested to be implemented through dedicated lanes designated as fast lanes (FLs). To effectively manage the traffic flow between regular lanes and fast lanes, the reinforcement learning (RL) algorithm, specifically the Q-learning algorithm, is suggested, enabling a flexible usage of the fast lane depending on the current traffic load. Simulation of a junction, imitating the structure and traffic volume of an actual junction, has demonstrated the effectiveness of using RL to optimize traffic balance, considering both load and social preferences.

In a real-life traffic junction or lane, the algorithm could be deployed through smartphone or smart-car software that connects to road sensors. Social preferences could be validated more reliably through biometric authentication. The validation itself could be implemented in smart calendar applications, in which an institution such as a hospital or government office provides schedule validation for meetings and appointments.

Future research directions can focus on refining the social priority categories, enhancing the algorithm to encompass more complex environments connecting several junctions, and inferring social priority designated to a vehicle in real time.

References

  1. Rizwan P, Suresh K, Babu MR. Real-time smart traffic management system for smart cities by using Internet of Things and big data. In: 2016 International Conference on Emerging Technological Trends (ICETT). Kollam, India: IEEE; 2016. pp. 1-7
  2. Li X, Gui J, Liu J. Data-driven traffic congestion patterns analysis: A case of Beijing. Journal of Ambient Intelligence and Humanized Computing. 2023;14(7):9035-9048
  3. Almatar KM. Traffic congestion patterns in the urban road network: (Dammam metropolitan area). Ain Shams Engineering Journal. 2023;14(3):101886
  4. Kolat M, Kővári B, Bécsi T, Aradi S. Multi-agent reinforcement learning for traffic signal control: A cooperative approach. Sustainability. 2023;15(4):3479
  5. Zang J, Jiao P, Liu S, Zhang X, Song G, Yu L. Identifying traffic congestion patterns of urban road network based on traffic performance index. Sustainability. 2023;15(2):948
  6. Al-Abaid SAF. A smart traffic control system using image processing: A review. Journal of Southwest Jiaotong University. 2020;55(1):1-6
  7. Ata A, Khan MA, Abbas S, Ahmad G, Fatima A. Modelling smart road traffic congestion control system using machine learning techniques. Neural Network World. 2019;29(2):99-110
  8. Chong HF, Ng DWK. Development of IoT device for traffic management system. In: 2016 IEEE Student Conference on Research and Development (SCOReD). Kuala Lumpur, Malaysia: IEEE; 2016. pp. 1-6
  9. Firdous A, Niranjan V. Smart density based traffic light system. In: 2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). Noida, India: IEEE; 2020. pp. 497-500
  10. Hartanti D, Aziza RN, Siswipraptini PC. Optimization of smart traffic lights to prevent traffic congestion using fuzzy logic. TELKOMNIKA (Telecommunication Computing Electronics and Control). 2019;17(1):320-327
  11. Lana I, Del Ser J, Velez M, Vlahogianni EI. Road traffic forecasting: Recent advances and new challenges. IEEE Intelligent Transportation Systems Magazine. 2018;10(2):93-109
  12. Patil PG, Sharma S, Tamilisetti C, Prathap S. Real time smart traffic control system. International Journal of Research in Engineering, Science and Management. 2020;3(2):1-4
  13. Mannion P, Duggan J, Howley E. Parallel reinforcement learning for traffic signal control. Procedia Computer Science. 2015;52:956-961
  14. Arel I, Liu C, Urbanik T, Kohls AG. Reinforcement learning-based multi-agent system for network traffic signal control. IET Intelligent Transport Systems. 2010;4(2):128-135
  15. Balaji PG, German X, Srinivasan D. Urban traffic signal control using reinforcement learning agents. IET Intelligent Transport Systems. 2010;4(3):177-188
  16. Haydari A, Yılmaz Y. Deep reinforcement learning for intelligent transportation systems: A survey. IEEE Transactions on Intelligent Transportation Systems. 2020;23(1):11-32
  17. Miletić M, Ivanjko E, Gregurić M, Kušić K. A review of reinforcement learning applications in adaptive traffic signal control. IET Intelligent Transport Systems. 2020;16(10):1269-1285
  18. Noaeen M, Naik A, Goodman L, Crebo J, Abrar T, Abad ZSH, et al. Reinforcement learning in urban network traffic signal control: A systematic literature review. Expert Systems with Applications. 2022;199:116830
  19. Barzilai O, Rika H, Voloch N, Hajaj MM, Steiner OL, Ahituv N. Using machine learning techniques to incorporate social priorities in traffic monitoring in a junction with a fast lane. Transport and Telecommunication Journal. 2023;24(1):1-12
  20. Barzilai O, Voloch N, Hasgall A, Lavi Steiner O, Ahituv N. Traffic control in a smart intersection by an algorithm with social priorities. Contemporary Engineering Sciences. 2018;11(31):1499-1511
  21. Reynolds SJ, Ceranic TL. The effects of moral judgment and moral identity on moral behavior: An empirical examination of the moral individual. Journal of Applied Psychology. 2007;92(6):1610-1624. DOI: 10.1037/0021-9010.92.6.1610
  22. Gawronski B, Armstrong J, Conway P, Friesdorf R, Hütter M. Consequences, norms, and generalized inaction in moral dilemmas: The CNI model of moral decision-making. Journal of Personality and Social Psychology. 2017;113(3):343-376
  23. Veldscholten N. Moral Reasoning in Traffic: About the Possible Relations Between Moral Reasoning and Traffic Safety [Master's thesis]. University of Twente; 2015
  24. Chang HL, Yang CH. Do airline self-service check-in kiosks meet the needs of passengers? Tourism Management. 2008;29(5):980-993
  25. Carmon Z, Shanthikumar J, Carmon T. A psychological perspective on service segmentation models: The significance of accounting for consumers’ perceptions of waiting and service. Management Science. 1995;41(11):1806-1815
  26. Hirsh I, Bilger R, Deatherage B. The effect of auditory and visual background on apparent duration. The American Journal of Psychology. 1956;69(4):561-574
  27. Maister D. The psychology of waiting lines. In: Czepiel J, Solomon M, Surprenant C, editors. The Service Encounter: Managing Employee/Customer Interaction in Service Businesses. Lexington, MA: Lexington Books; 1985
  28. Witowska J, Schmidt S, Wittmann M. What happens while waiting? How self-regulation affects boredom and subjective time during a real waiting situation. Acta Psychologica. 2020;205:103061
  29. 29. Fagundes D. The social norms of waiting in line. Law & Social Inquiry. 2017;42(4):1179-1207
  30. 30. Larson RC. Perspectives on queues: Social justice and the psychology of queueing. Operations Research. 1987;35(6):895-905. DOI: 10.1287/opre.35.6.895
  31. 31. Rawls J. A Theory of Justice. Cambridge: Belknap Press of Harvard University Press; 1971
  32. 32. Zajda J, Majhanovich S, Rust V. Education and Social Justice. Heidelberg, Germany: Springer Verlag; 2006
  33. 33. Chatterjee S. Order of justice in queues of emerging markets. Journal of Consumer Marketing. 2020;37(6):605-616
  34. 34. Brady FN. Lining up for Star-Wars tickets: Some ruminations on ethics and economics based on an internet study of behavior in queues. Journal of Business Ethics. 2002;38:157-165
  35. 35. Cui S, Wang Z, Yang L. A brief review of research on priority queues with self-interested customers. In: Innovative Priority Mechanisms in Service Operations: Theory and Applications. Heidelberg, Germany: Springer; 2023. pp. 1-8
  36. 36. Alexander M, MacLaren A, O’Gorman K, White C. Priority queues: Where social justice and equity collide. Tourism Management. 2012;33(4):875-884
  37. 37. Yang L, Cui S, Wang Z. Design of covid-19 testing queues. Production and Operations Management. 2022;31(5):2204-2221
  38. 38. Wang Z, Cui S, Fang L. Distance-based service priority: An innovative mechanism to increase system throughput and social welfare. Manufacturing and Service Operations Management. 2022;25(1):353-369
  39. 39. Barzilai O, Voloch N, Hasgall A, Steiner OL. Real life applicative timing algorithm for a smart junction with social priorities and multiple parameters. In: 2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE). Eilat, Israel: IEEE; 2018. pp. 1-5
  40. 40. Xue Y, Cheng L, Zhong M, Sun X. Evaluation of bus lane layouts based on a bi-level programming model—Using part of the Qingshan Lake District of Nanchang City, China, as an example. Sustainability. 2023;15(11):8866
  41. 41. Russo A, Adler MW, van Ommeren JN. Dedicated bus lanes, bus speed and traffic congestion in Rome. Transportation Research Part A: Policy and Practice. 2022;160:298-310
  42. 42. Montero-Lamas Y, Novales M, Orro A, Currie G. A new big data approach to understanding general traffic impacts on bus passenger delays. Journal of Advanced Transportation. 2023:1-15
  43. 43. Kampouri A, Politis I, Georgiadis G. A system-optimum approach for bus lanes dynamically activated by road traffic. Research in Transportation Economics. 2022;92:101075
  44. 44. Boysen N, Briskorn D, Schwerdfeger S, Stephan K. Optimizing carpool formation along high-occupancy vehicle lanes. European Journal of Operational Research. 2021;293(3):1097-1112
  45. 45. Gitelman V, Doveh E. Examining the safety impacts of high-occupancy vehicle lanes: International experience and an evaluation of first operation in Israel. Sustainability. 2023;15(18):13976
  46. 46. De Palma A, Lindsey R. Traffic congestion pricing methodologies and technologies. Transportation Research Part C: Emerging Technologies. 2011;19(6):1377-1399
  47. 47. DeCorla-Souza P. Making the pricing of currently free highway lanes acceptable to the public. Transportation Quarterly. 2000;54(3):17-20
  48. 48. Lombardi C, Picado-Santos L, Annaswamy AM. Model-based dynamic toll pricing: An overview. Applied Sciences. 2021;11(11):4778
  49. 49. Martínez I, Jin WL. Dynamic Distance-Based Pricing Scheme for High-Occupancy-Toll Lanes Along a Freeway Corridor. arXiv preprint arXiv:2309.01990. 2023
  50. 50. Pulyassary H, Yang R, Zhang Z, Wu M. Capacity Allocation and Pricing of High Occupancy Toll Lane Systems with Heterogeneous Travelers. arXiv preprint arXiv:2304.09234. 2023
  51. 51. Pandey V, Boyles SD. Multiagent reinforcement learning algorithm for distributed dynamic pricing of managed lanes. In: Proceedings of the IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, Maui, HI, USA. IEEE; 4-7 Nov 2018. pp. 2346-2351
  52. 52. Pandey V, Wang E, Boyles SD. Deep reinforcement learning algorithm for dynamic pricing of express lanes with multiple access locations. Transportation Research Part C: Emerging Technologies. 2020;119:102715
  53. 53. Fine Z, Brayer E, Proshtisky I, Barzilai O, Voloch N, Steiner OL. Handling traffic loads in a smart junction by social priorities. In: 2019 IEEE International Conference on Microwaves, Antennas, Communications and Electronic Systems; November 2019: (COMCAS). Israel Tel Aviv: IEEE; 2019. pp. 1-5
  54. 54. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: A survey. Journal of Artificial Intelligence Research. 1996;4:237-285
  55. 55. Szepesvári C. Algorithms for reinforcement learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning 4.1. Heidelberg, Germany: Springer; 2010. pp. 1-103
  56. 56. Sutton RS, Barto AG. Reinforcement learning. Journal of Cognitive Neuroscience. 1999;11(1):126-134
  57. 57. Barzilai O, Rika H, Hassin Y. Smart social junction traffic control using reinforcement learning on real data. IET Intelligent Transport Systems. 2024
  58. 58. Chaudhary HK, Saraswat K, Yadav H, Puri H, Mishra AR, Chauhan SS. A Real Time Dynamic Approach for Management of Vehicle Generated Traffic. Transdisciplinary Research and Education Center for Green Technologies, Kyushu University; 2023. pp. 289-299
  59. 59. Gomaa A, Minematsu T, Abdelwahab MM, Abo-Zahhad M, Taniguchi RI. Faster CNN-based vehicle detection and counting strategy for fixed camera scenes. Multimedia Tools and Applications. 2022;81(18):25443-25471
  60. 60. Zhang Y, Guo Z, Wu J, Tian Y, Tang H, Guo X. Real-time vehicle detection based on improved yolo v5. Sustainability. 2022;14(19):12274
  61. 61. Rodríguez-Rangel H, Morales-Rosales LA, Imperial-Rojo R, Roman-Garay MA, Peralta-Peñuñuri GE, Lobato-Báez M. Analysis of statistical and artificial intelligence algorithms for real-time speed estimation based on vehicle detection with YOLO. Applied Sciences. 2022;12(6):2907
