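Editor's Note

In this series, we select articles published online in December 2025 in the leading journal European Journal of Operational Research (10 in total) and summarize their basic information, to help readers quickly grasp the latest developments in the field. The articles span frontier topics in operations research: railway maintenance scheduling, production shop sequencing, queueing theory for service systems, group decision consensus, warehouse logistics optimization, emergency evacuation management, maritime search-and-rescue path planning, shared mobility pricing, deep reinforcement learning for inventory management, and healthcare appointment system optimization.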

Article 1

● Title: The maintenance scheduling and location choice problem for railway rolling stock

铁路机车车辆的维护调度与地点选择问题

● Topic: Production, Manufacturing, Transportation and Logistics

● Authors

Jordi Zomer (a), Nikola Bešinović (a), Mathijs M. de Weerdt (b), Rob M.P. Goverde (a) *

(a)Department of Transport and Planning, Delft University of Technology, Delft, The Netherlands

(b)Department of Software and Computer Technology, Delft University of Technology, Delft, The Netherlands

● Published: 2025-12-06

● Link: https://doi.org/10.1016/j.ejor.2025.12.005

● Highlights

  • Introduction of a model for Maintenance Scheduling and Location Choice

  • Use of Logic-Based Benders’ Decomposition for faster, scalable solutions

  • Four cut generation methods tested with min-cut the fastest for suboptimal results

  • Binary search cuts are best for solving the problem with hard shifts to optimality

  • Tests on real-world data show a significant reduction in capacity violations

  • 引入了维护调度与地点选择(MSLCP)模型。

  • 使用基于逻辑的 Benders 分解(LBBD)以获得更快、可扩展的解决方案。

  • 测试了四种割平面生成方法,其中最小割(min-cut)方法在寻找次优解时最快。

  • 二分搜索割平面(Binary search cuts)最适合将具有硬性班次约束的问题求解至最优。

  • 基于真实数据的测试显示,容量违规情况显著减少。

    ● Abstract

    The increasing train traffic over railway networks stretches the demand for capacity of railway yards and rolling stock maintenance locations, which increasingly limits performance and further growth. Therefore, the scheduling of rolling stock maintenance and the choice regarding optimal locations to perform maintenance is increasingly complicated. This research introduces a Maintenance Scheduling and Location Choice Problem (MSLCP). It simultaneously determines maintenance locations and maintenance schedules of rolling stock, while considering the available capacity of maintenance locations. Solving the MSLCP using one large Mixed Integer Programming appears not to perform well enough. Therefore, to solve the MSLCP, an optimization framework based on Logic-Based Benders’ Decomposition (LBBD) is proposed by combining two models, the Maintenance Location Choice Problem (MLCP) and the Activity Planning Problem (APP), to assess the capacity of an MLCP solution. Within the LBBD, four variants of cut generation procedures are introduced to improve the computational performance: a naive procedure, two heuristic procedures and the so-called min-cut procedure that aims to exploit the specific characteristics of the problem at hand. The framework is demonstrated on realistic scenarios from the Dutch railways. It is shown that the best choice for the cut generation procedure depends on the objective: when aiming to find a good but not necessarily optimal solution, the min-cut procedure performs best, whereas when aiming for the optimal solution, one of the heuristic procedures is the preferred option. The techniques used in the current research are new to the current field and offer interesting next research opportunities.

    随着铁路网络列车流量的增加,铁路车场和机车车辆维护地点的容量需求日趋紧张,这越来越制约运营绩效和进一步增长。因此,机车车辆的维护调度以及维护地点的最优选择变得愈发复杂。本研究引入了维护调度与地点选择问题(MSLCP),在考虑维护地点可用容量的前提下,同时确定机车车辆的维护地点和维护时间表。直接使用一个大型混合整数规划求解 MSLCP 效果不佳。因此,为了解决 MSLCP,本文提出了一种基于逻辑的 Benders 分解(LBBD)优化框架,该框架结合了维护地点选择问题(MLCP)和活动规划问题(APP)两个模型,以评估 MLCP 解的容量可行性。在 LBBD 框架内,引入了四种割平面生成程序的变体以提高计算性能:朴素程序、两种启发式程序以及旨在利用当前问题特定特征的所谓最小割程序。该框架在荷兰铁路的现实场景中进行了演示。结果表明,割平面生成程序的最佳选择取决于目标:当旨在寻找良好但不一定是最优的解时,最小割程序表现最佳;而当旨在寻找最优解时,其中一种启发式程序是首选。本研究中使用的技术在当前领域是新颖的,并提供了有趣的未来研究机会。
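
The decomposition loop described in the abstract can be illustrated on a toy instance. The sketch below is not the authors' MLCP/APP formulation: the master problem is solved by brute-force enumeration, the subproblem is a simple head-count capacity check, and the names `solve_master`, `check_capacity`, `lbbd`, and all data are hypothetical. It only shows the general Logic-Based Benders pattern of solving a master problem, testing its solution in a subproblem, and feeding back a cut when the test fails.

```python
from itertools import product

def solve_master(costs, cuts):
    """Brute-force master: cheapest task -> location assignment that violates no cut."""
    tasks = sorted(costs)
    best, best_cost = None, float("inf")
    for choice in product(*(sorted(costs[t]) for t in tasks)):
        assignment = dict(zip(tasks, choice))
        # A cut forbids choosing all of its (task, location) pairs at once.
        if any(all(assignment[t] == loc for t, loc in cut) for cut in cuts):
            continue
        total = sum(costs[t][assignment[t]] for t in tasks)
        if total < best_cost:
            best, best_cost = assignment, total
    return best, best_cost

def check_capacity(assignment, capacity):
    """Toy subproblem: return the (task, location) pairs at the first overloaded location."""
    for loc in sorted(capacity):
        load = sorted(t for t, chosen in assignment.items() if chosen == loc)
        if len(load) > capacity[loc]:
            return [(t, loc) for t in load]
    return None

def lbbd(costs, capacity, max_iter=100):
    """Alternate between master and subproblem until the master solution is feasible."""
    cuts = []
    for _ in range(max_iter):
        assignment, cost = solve_master(costs, cuts)
        if assignment is None:
            return None, None, cuts  # master infeasible under the current cuts
        conflict = check_capacity(assignment, capacity)
        if conflict is None:
            return assignment, cost, cuts
        cuts.append(conflict)
    raise RuntimeError("iteration limit reached")

# Hypothetical data: three maintenance tasks, two locations with limited capacity.
costs = {"t1": {"A": 1, "B": 3}, "t2": {"A": 1, "B": 2}, "t3": {"A": 2, "B": 2}}
capacity = {"A": 1, "B": 2}
assignment, cost, cuts = lbbd(costs, capacity)
print(assignment, cost, len(cuts))
```

On this instance the loop adds three cuts before it reaches a capacity-feasible assignment (t1 at A, t2 and t3 at B) with cost 5. Each cut here covers only the tasks at the overloaded location, which is one simple way to make a feasibility cut stronger than excluding the whole assignment.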

    Article 2

    ● Title: Solving the paint shop problem with flexible management of multi-lane buffers using reinforcement learning and action masking

    利用强化学习和动作屏蔽解决多车道缓冲区灵活管理的喷漆车间问题

    ● Topic: Discrete Optimization

    ● Authors

    Mirko Stappert (a), Bernhard Lutz (a) (c) *, Janis Brammer (b), Dirk Neumann (a)

    (a)University of Freiburg, Rempartstr. 16, Freiburg, 79098, Germany

    (b)CARIAD SE, Berliner Ring 2, Wolfsburg, 38440, Germany

    (c)University of Vienna, Währinger Str. 29, Vienna, 1090, Austria

    ● Published: 2025-12-11

    ● Link: https://doi.org/10.1016/j.ejor.2025.12.017

    ● Highlights

    • RL approach for paint shop problem with flexible multi-lane buffers.

    • Problem variant does not restrict execution of store and retrieve operations.

    • Problem formalization as an integer linear programming problem.

    • Action masking enforces greedy storage, retrieval, and fast-track actions.

    • Reinforcement learning with action masking and policy sampling reduces color changes over baselines.

    • 针对具有灵活多车道缓冲区的喷漆车间问题的强化学习方法。

    • 该问题变体不限制存储和检索操作的执行顺序。

    • 将问题形式化为整数线性规划问题。

    • 动作屏蔽(Action masking)强制执行贪婪存储、检索和快速通道动作。

    • 带有动作屏蔽和策略采样的强化学习相比基准方法减少了颜色切换次数。

    ● Abstract

    In the paint shop problem, an unordered upstream sequence of cars assigned to different colors has to be reshuffled with the objective of minimizing the number of color changes. To reshuffle the upstream sequence, manufacturers can employ a first-in-first-out multi-lane buffer system allowing store and retrieve operations. So far, prior studies primarily focused on simple decision heuristics like greedy or simplified problem variants that do not allow full flexibility when performing store and retrieve operations. In this study, we propose a reinforcement learning approach to minimize color changes for the flexible problem variant, where store and retrieve operations can be performed in an arbitrary order. After proving that greedy retrieval is optimal, we incorporate this finding into the model using action masking. Our evaluation, based on 170 problem instances with 2-8 buffer lanes and 5–15 colors, shows that our approach reduces color changes compared to existing methods by considerable margins depending on the problem size. Furthermore, we demonstrate the robustness of our approach towards different buffer sizes and imbalanced color distributions.

    在喷漆车间问题中,分配了不同颜色的无序上游汽车序列必须重新排序,以最小化颜色切换次数。为了重新排列上游序列,制造商可以采用允许存储和检索操作的先进先出(FIFO)多车道缓冲系统。迄今为止,先前的研究主要集中在简单的决策启发式方法(如贪婪算法)或不允许在执行存储和检索操作时具有完全灵活性的简化问题变体上。在本研究中,我们提出了一种强化学习方法,以最小化灵活问题变体中的颜色切换,在该变体中,存储和检索操作可以按任意顺序执行。在证明了贪婪检索是最优的之后,我们将这一发现通过动作屏蔽整合到模型中。我们的评估基于 170 个问题实例(包含 2-8 个缓冲车道和 5-15 种颜色),结果表明,根据问题规模的不同,我们的方法相比现有方法显著减少了颜色切换次数。此外,我们还展示了该方法对不同缓冲区大小和不平衡颜色分布的鲁棒性。
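
To make the objective and the idea of action masking concrete, here is a minimal sketch. It assumes one plausible reading of "greedy retrieval": when some lane's front car matches the color painted last, only those lanes may be retrieved from. The functions `count_color_changes`, `retrieval_mask`, and `greedy_drain` are hypothetical names, and the drain routine is a fixed rule used for illustration, not the learned policy from the paper.

```python
def count_color_changes(sequence):
    """Number of adjacent positions whose colors differ."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)

def retrieval_mask(lanes, last_color):
    """Indices of lanes that may be retrieved from under the assumed greedy mask.

    lanes: list of FIFO lanes; index 0 of each lane is the next car out.
    If any lane head matches last_color, only those lanes are allowed;
    otherwise every non-empty lane is allowed.
    """
    nonempty = [i for i, lane in enumerate(lanes) if lane]
    matching = [i for i in nonempty if lanes[i][0] == last_color]
    return matching if matching else nonempty

def greedy_drain(lanes, last_color=None):
    """Empty the buffer, always taking the first allowed lane."""
    lanes = [list(lane) for lane in lanes]  # work on a copy
    out = []
    while any(lanes):
        lane = retrieval_mask(lanes, last_color)[0]
        last_color = lanes[lane].pop(0)
        out.append(last_color)
    return out

# Hypothetical example: an alternating upstream sequence stored in two lanes by color.
upstream = ["red", "blue", "red", "blue", "red", "blue"]
lanes = [["red", "red", "red"], ["blue", "blue", "blue"]]
downstream = greedy_drain(lanes)
print(count_color_changes(upstream), count_color_changes(downstream))
```

In a reinforcement learning setting, a mask like `retrieval_mask` would typically be applied to the policy's action probabilities so that disallowed actions are never sampled.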

    Article 3

    ● Title: Optimal server control with Two Customer Classes and Classification Errors

    具有两类客户和分类错误的最佳服务器控制

    ● Topic: Stochastics and Statistics

    ● Authors

    Sigrún Andradóttir, Hayriye Ayhan *

    H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205, USA

    ● Published: 2025-12-11

    ● Link: https://doi.org/10.1016/j.ejor.2025.11.032

    ● Highlights

    • Customer misclassification is a common phenomenon in many applications.

    • We provide analytical models of service systems with customer misclassification.

    • We identify the optimal allocation of specialists in systems with customer misclassification.

    • We investigate how the long-run average profit depends on the misclassification probability.

    • We identify under what conditions it is more profitable to serve customers with or without service continuity.

    • 客户误分类是许多应用中的常见现象。

    • 提供了具有客户误分类的服务系统的解析模型。

    • 确定了具有客户误分类的系统中专家(specialists)的最优分配。

    • 研究了长期平均利润如何取决于误分类概率。

    • 确定了在何种条件下,提供或不提供服务连续性(service continuity)更有利可图。

    ● Abstract

    We consider a Markovian queueing system with two types of customers (basic and advanced) and two types of servers (regular and specialist) in the presence of customer classification errors. We assume that there are always both types of customers waiting for service. When an advanced customer is misclassified as a basic customer, he needs to be served by a specialist after being served by a regular server. Our objective is to determine the dynamic assignment of the specialists between advanced and misclassified customers that maximizes the long-run average profit. We consider two versions of the problem that differ depending on whether the misclassified customers experience service continuity (the regular servers stay with misclassified customers while they wait for specialists, preventing the regular servers from serving other basic customers) or not (the regular servers continue serving other basic customers while misclassified customers wait for specialists). For both versions of the problem, we first characterize the optimal assignment of the specialists and then investigate how the optimal long-run average profit depends on the misclassification probability. We provide examples of systems where the optimal long-run average profit is not monotone in the misclassification probability, which is counter intuitive as one would expect misclassification to have a negative impact on system performance. We conclude our analysis by identifying under what conditions it is more profitable to serve customers with or without service continuity.

    我们考虑一个具有两类客户(基础和高级)和两类服务器(常规和专家)且存在客户分类错误的马尔可夫排队系统。假设始终有两类客户在等待服务。当一名高级客户被误分类为基础客户时,他在被常规服务器服务后,还需要由专家服务器进行服务。我们的目标是确定专家在高级客户和被误分类客户之间的动态分配,以最大化长期平均利润。我们考虑了该问题的两个版本,区别在于被误分类的客户是否通过服务连续性得到服务(即常规服务器在误分类客户等待专家时一直陪同,导致常规服务器无法服务其他基础客户)或不通过服务连续性(即常规服务器在误分类客户等待专家时继续服务其他基础客户)。对于这两个版本的问题,我们首先刻画了专家的最优分配,然后研究了最优长期平均利润如何取决于误分类概率。我们提供的系统示例表明,最优长期平均利润关于误分类概率并非单调的,这与直觉相反,因为人们通常认为误分类会对系统性能产生负面影响。最后,我们通过分析确定了在何种条件下,采用服务连续性或不采用服务连续性来服务客户更有利可图。

    Article 4

    ● Title: Measuring consensus and voter influence in ternary preferences

    衡量三元偏好中的共识与投票者影响力

    ● Topic: Decision Support

    ● Authors

    Alessandro Albano (a) *, José Luis García-Lapresta (b), Antonella Plaia (a), Mariangela Sciandra (a)

    (a)Department of Economics, Business and Statistics, University of Palermo, Palermo, Italy

    (b)IMUVA, PRESAD Research Group, Departamento de Economía Aplicada, Universidad de Valladolid, Valladolid, Spain

    ● Published: 2025-12-11

    ● Link: https://doi.org/10.1016/j.ejor.2025.12.016

    ● Highlights

    • Proposed a new consensus measure for ternary preferences.

    • Introduced marginal contributions to consensus inspired by the Banzhaf value.

    • Developed sampling-based estimations for large voter groups.

    • Validated methods via extensive simulation studies.

    • Applied methods to ISTAT and Balkan Barometer data.

    • 提出了一种针对三元偏好的新共识度量方法。

    • 引入了受 Banzhaf 值启发的对共识的边际贡献概念。

    • 开发了针对大型投票者群体的基于抽样的估算程序。

    • 通过广泛的模拟研究验证了方法。

    • 将方法应用于意大利国家统计局(ISTAT)和巴尔干晴雨表(Balkan Barometer)数据。

    ● Abstract

    This paper explores the concept of consensus in the context of ternary preferences, an extension of dichotomous preference approvals, where alternatives are classified into three categories: acceptable, neutral, and unacceptable. We propose a novel distance-based measure to quantify consensus among voters and introduce a method for calculating the marginal contribution of each voter to the overall consensus, drawing parallels to the Banzhaf value in cooperative game theory. To handle large voter groups, we also present an estimation procedure based on sampling techniques to derive the marginal contributions. We performed comprehensive simulation studies to validate the statistical properties and computational efficiency of the proposed approach. Finally, empirical analyses using data from the Italian National Institute of Statistics (ISTAT) and the Balkan Barometer highlight its practical applicability.

    本文探讨了三元偏好背景下的共识概念,三元偏好是二分偏好认可(dichotomous preference approvals)的扩展,其中备选方案被分类为三类:可接受、中立和不可接受。我们提出了一种新颖的基于距离的度量方法来量化投票者之间的共识,并引入了一种计算每个投票者对整体共识的边际贡献的方法,这与合作博弈论中的 Banzhaf 值有异曲同工之妙。为了处理大型投票者群体,我们还提出了一种基于抽样技术的估算程序来推导边际贡献。我们进行了全面的模拟研究,以验证所提出方法的统计特性和计算效率。最后,使用意大利国家统计局(ISTAT)和巴尔干晴雨表数据的实证分析突显了其实际适用性。
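
The ideas of a distance-based consensus measure and a sampled, Banzhaf-style marginal contribution can be sketched as follows. Everything specific here is an assumption made for illustration and is not taken from the paper: the numeric encoding (acceptable = 1, neutral = 0, unacceptable = -1), the normalized L1 distance, the convention that fewer than two voters have consensus 1.0, and the function names.

```python
import random
from itertools import combinations

def distance(p, q):
    """Normalized L1 distance between two ternary profiles (0 = identical, 1 = opposite)."""
    return sum(abs(a - b) for a, b in zip(p, q)) / (2 * len(p))

def consensus(profiles):
    """One minus the mean pairwise distance; set to 1.0 for fewer than two voters."""
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 1.0
    return 1.0 - sum(distance(p, q) for p, q in pairs) / len(pairs)

def marginal_contribution(profiles, i, n_samples=2000, seed=0):
    """Sampled Banzhaf-style contribution of voter i to the consensus."""
    rng = random.Random(seed)
    others = [p for j, p in enumerate(profiles) if j != i]
    total = 0.0
    for _ in range(n_samples):
        # Each other voter joins with probability 1/2, i.e. a uniformly random coalition.
        coalition = [p for p in others if rng.random() < 0.5]
        total += consensus(coalition + [profiles[i]]) - consensus(coalition)
    return total / n_samples

# Hypothetical data: three broadly similar voters and one dissenter, over three alternatives.
voters = [(1, 1, -1), (1, 1, -1), (1, 0, -1), (-1, -1, 1)]
print(consensus(voters))
print(marginal_contribution(voters, 0), marginal_contribution(voters, 3))
```

With this data the dissenting voter (index 3) receives a clearly lower estimated contribution than voter 0. Sampling coalitions instead of enumerating all of them is what keeps the estimate tractable for large groups, since the number of coalitions doubles with each additional voter.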

    Article 5

    ● Title: Order consolidation in warehouses with compact 3D sorter modules

    采用紧凑型 3D 分拣模块的仓库订单合并

    ● Topic: Innovative Applications of O.R.

    ● Authors

    Zhensheng Zhou (a), Nils Boysen (b) *, Konrad Stephan (b), Hu Yu (a), Yugang Yu (a)

    (a)School of Management, University of Science and Technology of China (USTC), Hefei, 230026, PR China

    (b)Lehrstuhl für Operations Management, Friedrich-Schiller-Universität Jena, Jena, 07743, Germany

    ● Published: 2025-12-14

    ● Link: https://doi.org/10.1016/j.ejor.2025.12.015

    ● Highlights

    • We are introducing a new type of sorting system to scientific literature.

    • A holistic model including all operational decisions is presented.

    • By reducing the problem, we make it accessible to an efficient matheuristic.

    • We are providing managerial decision support for 3D sorter applications.

    • 向科学文献介绍了一种新型分拣系统。

    • 提出了包含所有运营决策的整体模型。

    • 通过简化问题,使其适用于高效的数学启发式算法(matheuristic)。

    • 为 3D 分拣机应用提供了管理决策支持。

      ● Abstract

      In the fast-paced realm of e-commerce and omnichannel retail, the swift consolidation of orders is a crucial bottleneck in today’s distribution centers and warehouses. In this context, compact 3D sorter modules are among the latest inventions. Inside these modules an array of liftable trays circulates along loading stations and shelves to transport picked products from the former to the latter. Each shelf is tasked with gathering products for a particular customer order and is situated on one of the several movable rack carts arrayed around the module’s perimeter. Upon the completion of all orders on a rack, it is detached from the sorter, conveyed to the packing area, and supplanted by a vacant rack. Achieving efficient order consolidation with 3D sorters hinges on resolving several operational decision-making tasks: determining the allocation of orders to shelves, assigning products of identical stock keeping units to specific orders, and tray loading considering delays caused by liftable trays still occupied by a product awaiting dispatch when passing a loading station. This paper models these operational decisions, introduces efficient solution methods, and demonstrates that sophisticated optimization markedly enhances throughput efficiency beyond the capabilities of basic decision-making rules. Our computational results highlight potential performance improvements of 50 % and more.

      在电子商务和全渠道零售的快节奏领域,订单的快速合并是当今配送中心和仓库的关键瓶颈。在此背景下,紧凑型 3D 分拣模块属于最新发明之一。在这些模块内部,一系列可升降托盘沿着装载站和货架循环,将拣选的产品从前者运输到后者。每个货架负责收集特定客户订单的产品,并位于模块周边排列的多个可移动货架推车之一上。当一个推车上的所有订单完成后,它会从分拣机上卸下,运送到包装区,并由空推车取而代之。利用 3D 分拣机实现高效的订单合并取决于解决几个运营决策任务:确定订单到货架的分配、将相同库存量单位(SKU)的产品分配给特定订单,以及考虑因可升降托盘在经过装载站时仍被待发产品占用而导致的延迟的托盘装载。本文对这些运营决策进行建模,引入了高效的求解方法,并证明了复杂的优化方法显著提高了吞吐效率,超越了基本决策规则的能力。我们的计算结果突显了 50% 或更高的潜在性能提升。

      Article 6

      ● Title: Optimizing evacuation via budget constrained maximum dynamic flow with speed variation and intermediate storage

      基于预算约束、速度变化及中间存储的最大动态流疏散优化

      ● Topic: Innovative Applications of O.R.

      ● Authors

      Tanka Nath Dhamala (a), Durga Prasad Khanal (b) *, Stefan Nickel (c)

      (a)Central Department of Mathematics, Tribhuvan University, Kathmandu, Nepal

      (b)Saraswati Multiple Campus, Tribhuvan University, Kathmandu, Nepal

      (c)Institute for Operations Research, Karlsruhe Institute of Technology, Karlsruhe, Germany

      ● Published: 2025-12-16

      ● Link: https://doi.org/10.1016/j.ejor.2025.12.026

      ● Highlights

      • Flow improvement via bottleneck capacity increment and speed variation within the budget is applied.

      • Flow models, with or without intermediate storage, incorporating budget and speed constraints are introduced.

      • Polynomial time solutions using a temporally repeated approach for one-way and two-way networks are presented.

      • Algorithms are illustrated using a network within the Kathmandu Valley encompassing the Ring Road.

      • The approach adds theoretical and practical value to reduce congestion during emergencies and peak-hour traffic.

      • 在预算内通过瓶颈容量增加和速度变化来改善流量。

      • 引入了包含预算和速度约束的、带或不带中间存储的流量模型。

      • 提出了针对单向和双向网络的、使用时间重复(temporally repeated)方法的多项式时间解法。

      • 使用涵盖环路的加德满都谷地网络演示了算法。

      • 该方法具有理论和实践价值,可减少紧急情况和高峰时段的交通拥堵。

        ● Abstract

        During any type of disaster, managing the evacuation of people at risk and planning humanitarian support constitute a critical challenge due to the presence of heavy traffic congestion in urban areas. Among them, flow maximization and time minimization models in bi-directional contraflow network have been emerging in addressing these issues. Resource limitation is one of the critical issues in such scenarios. The main objective of this work is to maximize the number of evacuees by best utilizing budget allocation and improving speed adjustment, which minimize congestion during evacuation. Since a limited budget is available, a set of bottleneck arcs is first identified, and then the budget is optimally allocated to some of these arcs to increase their capacities within given space bounds. The remaining arcs are then updated with new speed adjustment, where the travel time should be reduced. In this model, the flow is increased by settling evacuees in intermediate shelters, intended for those who may not reach the final destination due to network capacity or permissible time window constraints. The presented algorithms are polynomial, and their validity is proved. This novel approach could be a milestone in saving lives and reducing traffic congestion, as it offers both theoretical and practical value and contributes to more effective traffic management during emergencies, special events, and rush hour periods. To demonstrate their efficacy, the proposed models are applied to a real-world network case study of the Kathmandu Valley.

        在任何类型的灾难中,由于城市地区存在严重的交通拥堵,管理高风险人群的疏散和规划人道主义支持构成了一个关键挑战。其中,双向逆向流(contraflow)网络中的流量最大化和时间最小化模型已成为解决这些问题的新兴方法。资源限制是此类场景中的关键问题之一。这项工作的主要目标是通过最佳地利用预算分配和改进速度调整来最大化疏散人数,从而最大限度地减少疏散期间的拥堵。由于可用预算有限,首先识别一组瓶颈弧,然后在给定的空间范围内将预算最优地分配给其中一些弧以增加其容量。剩余的弧随后通过新的速度调整进行更新,以减少由于拥堵导致的行程时间。在该模型中,通过将疏散人员安置在中间避难所(intermediate shelters)来增加流量,这些避难所旨在容纳那些由于网络容量或允许的时间窗口限制而可能无法到达最终目的地的疏散人员。所提出的算法是多项式的,并证明了其有效性。这种新颖的方法可能是挽救生命和减少交通拥堵的里程碑,因为它提供了理论和实践价值,并有助于在紧急情况、特殊事件和高峰时段进行更有效的交通管理。为了证明其有效性,所提出的模型被应用于加德满都谷地的真实网络案例研究。

        Article 7

        ● Title: Collaborative path optimization of ship and multiple drones for maritime search

        海事搜索中船舶与多无人机协同路径优化

        ● Topic: Discrete Optimization

        ● Authors

        Xinhao Hou, Mingjun Ji *, Lingrui Kong, Zhendi Gao, Di Wu, Jianfeng Zheng

        Department of Transportation Engineering, Dalian Maritime University, Dalian, 116026, China

        ● Published: 2025-12-20

        ● Link: https://doi.org/10.1016/j.ejor.2025.12.013

        ● Highlights

        • A maritime search strategy combining ship endurance and drone agility is proposed.

        • A novel model is proposed to optimize the paths of the ship and drones in a coordinated way.

        • The limited endurance of drones and the real-world conditions are addressed.

        • An adaptive two-stage iterative algorithm is developed for solving the MILP model.

        • The insights for determining optimal drone configurations are derived.

        • 提出了一种结合船舶续航能力和无人机敏捷性的海事搜索策略。

        • 提出了一个新颖的模型来协同优化船舶和无人机的路径。

        • 解决了无人机续航能力有限和现实世界条件的问题。

        • 开发了一种自适应两阶段迭代算法(ATIA)来求解混合整数线性规划(MILP)模型。

        • 得出了确定最佳无人机配置的见解。

        ● Abstract

        Maritime accidents are frequent, making efficient search and rescue operations critical. Traditional rescue ships often encounter challenges such as poor maneuverability and limited visibility. In response, maritime authorities are exploring the use of drones to quickly locate survivors. While drones offer excellent maneuverability, their limited endurance constrains their effectiveness in large-scale maritime searches. To address these challenges, this paper proposes a novel maritime search strategy that combines the sustained operational capabilities of the ship with the agility of drones. Specifically, we formulate a mixed integer linear programming (MILP) model to optimize the collaborated paths of ship and drones, as well as the battery-swapping plans for drones, with the goal of minimizing search time. The model fully accounts for practical factors, including irregular maritime areas, wind and ocean current effects, drone endurance limits, and variable drone and ship speeds. Given the complexity of the MILP model, we developed an adaptive two-stage iterative algorithm (ATIA) to solve the problem. Various experiments were conducted to evaluate ATIA’s performance: for small-scale instances, ATIA’s results deviated from Gurobi’s optimal solutions by no more than 0.3 %; for medium-scale instances, ATIA outperformed Gurobi in both solution quality and computation time. We also derived the problem’s lower bounds under the specified scenario; comparing ATIA’s results with these bounds further validated its effectiveness for large-scale instances. Additionally, numerical experiments showed that the number of drones and their endurance significantly impact search efficiency; this insight helps determine optimal drone configurations to enhance search performance.

        海事事故频发,使得高效的搜救行动至关重要。传统的救援船只通常面临机动性差和视野受限等挑战。作为回应,海事当局正在探索使用无人机来快速定位幸存者。虽然无人机具有出色的机动性,但其有限的续航能力限制了它们在大规模海事搜索中的有效性。为了应对这些挑战,本文提出了一种新颖的海事搜索策略,将船舶的持续作业能力与无人机的敏捷性相结合。具体而言,我们制定了一个混合整数线性规划(MILP)模型来优化船舶和无人机的协同路径以及无人机的电池更换计划,目标是最小化搜索时间。该模型充分考虑了实际因素,包括不规则海域、风和洋流影响、无人机续航限制以及可变的无人机和船舶速度。鉴于 MILP 模型的复杂性,我们开发了一种自适应两阶段迭代算法(ATIA)来解决该问题。进行了各种实验来评估 ATIA 的性能:对于小规模实例,ATIA 的结果与 Gurobi 的最优解偏差不超过 0.3%;对于中等规模实例,ATIA 在解的质量和计算时间上都优于 Gurobi。我们还推导了特定场景下问题的下界;将 ATIA 的结果与这些下界进行比较,进一步验证了其在大规模实例中的有效性。此外,数值实验表明,无人机数量和续航能力显著影响搜索效率——这一见解有助于确定最佳无人机配置以提高搜索性能。

        Article 8

        ● Title: Balancing profit and traveller acceptance in ride-pooling personalised fares

        平衡利润与旅客接受度的拼车个性化票价研究

        ● Topic: Production, Manufacturing, Transportation and Logistics

        ● Authors

        Michał Bujak (a) (b) *, Rafał Kucharski (a) *

        (a)Faculty of Mathematics and Computer Science, Jagiellonian University, ul. Prof. S. Łojasiewicza 6, 30-348 Kraków, Poland

        (b)Doctoral School of Exact and Natural Sciences, Jagiellonian University, ul. Prof. S. Łojasiewicza 11, 30-348 Kraków, Poland

        ● Published: 2025-12-23

        ● Link: https://doi.org/10.1016/j.ejor.2025.12.025

        ● Highlights

        • We study pricing mechanisms for a ride-pooling mobility service and propose a personalised pricing strategy.

        • Our method combines a traveller perspective with the economic characteristics of a shared trip.

        • In a stochastic framework, we introduce a method that maximises the expected profitability.

        • We leverage the population behavioural heterogeneity to decrease vehicle mileage and improve economic performance.

        • 研究了拼车出行服务的定价机制,并提出了个性化定价策略。

        • 我们的方法结合了旅客视角和共享出行的经济特征。

        • 在随机框架下,引入了一种最大化预期盈利能力的方法。

        • 利用人群行为的异质性来减少车辆行驶里程并提高经济绩效。

        ● Abstract

        In a ride-pooling system, travellers experience discomfort associated with a detour and a longer travel time, which is compensated with a sharing discount. Most studies assume homogeneous travellers that receive either a flat discount or, in rare cases, one proportional to the inconvenience. This simplified approach offers inaccurate results and leads to an underperforming service when tested against diverse and natural human behaviour. We improve the standard approach on two bases. First, we propose a stochastic setting, where we leverage the population distribution of behavioural traits to determine the acceptance probability. Second, we personalise fares. Each traveller receives a sharing discount based on their contribution to the system such that the operator maximises his expected profitability. In the study, we rigorously prove that the discount optimisation problem can be decomposed. We optimise discounts at a ride level to claim the system optimum. An operator, when proposing fares, encounters two counteracting effects. Low fares increase realisation probability while high fares improve profit from a realised ride. In the personalised discount optimisation, we seek the golden mean. Travellers, who are well-aligned and experience minimal discomfort of sharing, are offered higher fares than those who require more incentive to join the service. Unlike in previous methods, our approach naturally balances traveller satisfaction and profit maximisation. With an experiment set in NYC, we show that this leads to significant improvements over the flat discount baseline: the mileage is reduced by 4.5% and the operator generates more profit per mile (over 20% improvement).

        在拼车(ride-pooling)系统中,旅客会因绕路和更长的旅行时间而感到不适,这通常通过共享折扣来补偿。大多数研究假设旅客是同质的,要么接受统一折扣,要么(在极少数情况下)接受与不便程度成正比的折扣。这种简化的方法在针对多样化和自然的人类行为进行测试时,结果不准确且导致服务表现不佳。我们在两个方面改进了标准方法。首先,我们提出了一个随机设置,利用行为特征的人群分布来确定接受概率。其次,我们实行个性化票价。每位旅客根据其对系统的贡献获得共享折扣,以使运营商最大化其预期盈利能力。在研究中,我们严格证明了折扣优化问题是可以分解的。我们在行程层面优化折扣以求得系统最优。运营商在提出票价时会遇到两种相互抵消的效应:低票价增加实现概率,而高票价提高已实现行程的利润。在个性化折扣优化中,我们寻求折中点(golden mean)。对于那些配合度高且对共享带来的不适感最小的旅客,提供的票价较高;而对于那些需要更多激励才能加入服务的旅客,则提供较低的票价。与以前的方法不同,我们的方法自然地平衡了旅客满意度和利润最大化。通过以纽约市为背景的实验,结果表明该方法相对于统一折扣基准有显著改进:里程减少了 4.5%,运营商每英里利润提升超过 20%。
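
The two counteracting effects named in the abstract (low fares raise the realisation probability, high fares raise the profit of a realised ride) can be shown with a small numerical sketch. The logistic acceptance curve, its parameters, and the function names below are assumptions chosen for illustration; the paper's behavioural model and its ride-level decomposition are not reproduced here.

```python
import math

def acceptance_probability(fare, reference_fare, sensitivity):
    """Assumed logistic curve: 0.5 at the reference fare, falling as the fare rises."""
    return 1.0 / (1.0 + math.exp(sensitivity * (fare - reference_fare)))

def expected_profit(fare, cost, reference_fare, sensitivity):
    """Realisation probability times the profit of a realised ride."""
    return acceptance_probability(fare, reference_fare, sensitivity) * (fare - cost)

def best_fare(cost, reference_fare, sensitivity, step=0.05, max_fare=30.0):
    """Grid search for the fare with the highest expected profit."""
    n = int(round((max_fare - cost) / step))
    candidates = [cost + k * step for k in range(n + 1)]
    return max(candidates, key=lambda f: expected_profit(f, cost, reference_fare, sensitivity))

# Hypothetical travellers: a higher reference fare stands for less discomfort from sharing.
tolerant = best_fare(cost=4.0, reference_fare=12.0, sensitivity=0.8)
reluctant = best_fare(cost=4.0, reference_fare=9.0, sensitivity=0.8)
print(round(tolerant, 2), round(reluctant, 2))
```

Under these assumptions the profit-maximising fare lies strictly between the operating cost and the maximum fare, and the traveller who tolerates sharing better is offered the higher fare, which matches the qualitative behaviour described in the abstract.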

        Article 9

        ● Title: Zero-shot generalization in inventory management: Train, then Estimate and Decide

        库存管理中的零样本泛化:训练,然后估计与决策

        ● Topic: Stochastics and Statistics

        ● Authors

        Tarkan Temizöz (a) *, Christina Imdahl (a), Remco Dijkman (a), Douniel Lamghari-Idrissi (a) (b), Willem Van Jaarsveld (a)

        (a)Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, PO Box 513, Eindhoven, 5600 MB, Netherlands

        (b)ASML, 5504 DR, Veldhoven, The Netherlands

        ● Published: 2025-12-24

        ● Link: https://doi.org/10.1016/j.ejor.2025.12.033

        ● Highlights

        • Studies how to train a DRL policy for tasks with unknown parameters without retraining

        • Introduces Super-MDPs to unify inventory problems under parameter uncertainty

        • Proposes TED: Train, then Estimate and Decide framework for zero-shot generalization

        • Trains a single policy (GC-LSN) that handles cyclic demand and stochastic lead times

        • GC-LSN outperforms state-of-the-art online learning algorithms and heuristics

        • 研究如何在无需重新训练的情况下训练深度强化学习(DRL)策略以处理具有未知参数的任务。

        • 引入 Super-MDPs 以统一参数不确定性下的库存问题。

        • 提出 TED(训练,然后估计与决策)框架用于零样本泛化。

        • 训练单一策略(GC-LSN)处理周期性需求和随机提前期。

        • GC-LSN 优于最先进的在线学习算法和启发式算法。

        ● Abstract

        Deploying deep reinforcement learning (DRL) in real-world inventory management presents challenges, including dynamic environments and uncertain problem parameters, e.g. demand and lead time distributions. These challenges highlight a research gap, suggesting a need for a unifying framework to model and solve sequential decision-making under parameter uncertainty. We address this by exploring an underexplored area of DRL for inventory management: training generally capable agents (GCAs) under zero-shot generalization (ZSG). Here, GCAs are advanced DRL policies designed to handle a broad range of sampled problem instances with diverse inventory challenges. ZSG refers to the ability to successfully apply learned policies to unseen instances with unknown parameters without retraining.

        We propose a unifying Super-Markov Decision Process formulation and the Train, then Estimate and Decide (TED) framework to train and deploy a GCA tailored to inventory management applications. The TED framework consists of three phases: training a GCA on varied problem instances, continuously estimating problem parameters during deployment, and making decisions based on these estimates. Applied to periodic review inventory problems with lost sales, cyclic demand patterns, and stochastic lead times, our trained agent, Generally Capable Lost Sales Network (GC-LSN) consistently outperforms well-known traditional policies when problem parameters are known. Moreover, under conditions where demand and/or lead time distributions are initially unknown and must be estimated, we benchmark against online learning methods that provide worst-case performance guarantees. Our GC-LSN policy, paired with the Kaplan-Meier estimator, is demonstrated to complement these methods by providing superior empirical performance.

        在现实世界的库存管理中部署深度强化学习(DRL)面临着挑战,包括动态环境和不确定的问题参数,例如需求和提前期分布。这些挑战凸显了一个研究空白,表明需要一个统一的框架来建模和解决参数不确定性下的顺序决策问题。我们通过探索库存管理 DRL 中一个未被充分探索的领域来解决这个问题:在零样本泛化(ZSG)下训练通用能力智能体(Generally Capable Agents, GCAs)。在这里,GCAs 是先进的 DRL 策略,旨在处理各种具有不同库存挑战的采样问题实例。ZSG 指的是将学习到的策略成功应用于具有未知参数的未见实例而无需重新训练的能力。 我们提出了一个统一的超级马尔可夫决策过程(Super-MDP)公式和“训练,然后估计与决策”(TED)框架,以训练和部署专为库存管理应用量身定制的 GCA。TED 框架包括三个阶段:在不同的问题实例上训练 GCA,在部署期间持续估计问题参数,并根据这些估计做出决策。应用于具有销售损失、周期性需求模式和随机提前期的定期盘点库存问题,当问题参数已知时,我们训练的智能体——通用能力销售损失网络(GC-LSN)始终优于著名的传统策略。此外,在需求和/或提前期分布最初未知且必须进行估计的条件下,我们以提供最坏情况性能保证的在线学习方法为基准进行了比较。我们的 GC-LSN 策略与 Kaplan-Meier 估计器相结合,被证明可以通过提供卓越的实证性能来补充这些方法。
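
The "Estimate" phase relies on estimating demand from sales data, and with lost sales the observed sales understate demand whenever a stockout occurs. The Kaplan-Meier estimator mentioned in the abstract is a standard tool for such right-censored data. The sketch below is a plain textbook implementation on hypothetical data, not the authors' code; it treats a stockout observation as demand exceeding the recorded sales, which is a simplification for discrete demand.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival estimate from (value, censored) pairs.

    Returns a list of (value, S) at each uncensored value, where S
    estimates the probability that demand exceeds that value.
    """
    survival, curve = 1.0, []
    uncensored_values = sorted({value for value, censored in observations if not censored})
    for v in uncensored_values:
        # Observations still "at risk": every record with value at least v.
        at_risk = sum(1 for value, _ in observations if value >= v)
        events = sum(1 for value, censored in observations if value == v and not censored)
        survival *= 1.0 - events / at_risk
        curve.append((v, survival))
    return curve

# Hypothetical sales log: (units sold, stocked_out). A stockout censors the true demand.
sales_log = [(2, False), (3, False), (3, True), (5, False), (4, True), (1, False)]
curve = kaplan_meier(sales_log)
for value, s in curve:
    print(value, round(s, 4))
```

Simply averaging the sales figures would bias the demand estimate downward, because the two stockout periods are recorded at less than their true demand. The estimator instead keeps those periods in the at-risk counts without treating them as observed demand values.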

        Article 10

        ● Title: Mitigating concentrated and cascading delays in preference-based appointment systems: A cost and risk perspective

        缓解基于偏好的预约系统中的集中性与级联性延误:成本与风险视角

        ● Topic: Innovative Applications of O.R.

        ● Authors

        Lixiang Zhao, Min Zhang *, Han Zhu *, Shujun Li

        Dongbei University of Finance and Economics, Dalian, 116025, China

        ● Published: 2025-12-27

        ● Link: https://doi.org/10.1016/j.ejor.2025.12.038

        ● Highlights

        • Identifies two concurrent service delays in preference-based appointment systems.

        • One service delay stems from physician preference, the other from time preference.

        • Proposes a Cost-Risk model to mitigate delays under uncertain service durations.

        • Develops a constraint generation-embedded branch-and-cut method proven convergent.

        • Extends the model to address patient no-shows and unpunctuality in practice.

        • 识别了基于偏好的预约系统中两种并发的服务延误:集中性延误和级联性延误。

        • 一种服务延误源于医生偏好,另一种源于时间偏好。

        • 提出了成本-风险模型以在不确定的服务持续时间下缓解延误。

        • 开发了一种被证明收敛的嵌入约束生成的分支切割方法。

        • 扩展了模型以解决实践中的患者爽约和不准时问题。

        ● Abstract

        Our study is motivated by real-world observations indicating that preference-based appointment systems suffer from two concurrent types of service delays: concentrated delay and cascading delay. Concentrated delay refers to the physician-preference phenomenon in which patients opting for popular physicians experience significantly longer wait times compared to those selecting less popular providers. Cascading delay, on the other hand, denotes a time-preference pattern whereby, for any given physician, patients selecting later appointment slots tend to face longer delays than those with earlier slots. In a context with uncertain service duration, to mitigate both types of delays simultaneously, we propose a Cost-Risk model that incorporates delay-balancing constraints to mitigate their combined impact. The model robustly minimizes the weighted sum of patient rejection costs and service delay risks, based on the manager’s delay aversion level, by jointly optimizing decisions on service request rejection, patient-to-physician assignment, patient sequencing, and patient-to-slot assignment. Service delay risks are evaluated using the event-wise ambiguity set-based Delay Riskiness Index (DRI), a measure that captures both the probability and intensity of random delays. We propose a constraint generation-embedded branch-and-cut algorithm to exactly solve the model. We further extend the model to incorporate uncertain patient no-shows and unpunctuality, thereby enhancing its validity in practical applications. Numerical experiments validate that the proposed model and algorithm offer healthcare managers a practical tool for effectively and efficiently mitigating concentrated and cascading delays. We also provide insights into the impact of crucial parameters and model extensions on operational performance, particularly regarding profit and service punctuality.

        我们的研究源于现实世界的观察,即基于偏好的预约系统遭受两种并发类型的服务延误:集中性延误(concentrated delay)和级联性延误(cascading delay)。集中性延误是指医生偏好现象,即选择热门医生的患者比选择较不热门医生的患者面临更长的等待时间。另一方面,级联性延误表示一种时间偏好模式,即对于任何给定的医生,选择较晚预约时段的患者往往比选择较早时段的患者面临更长的延误。在服务持续时间不确定的背景下,为了同时缓解这两种类型的延误,我们提出了一个包含延误平衡约束的成本-风险模型,以减轻它们的综合影响。该模型基于管理者的延误厌恶水平,通过联合优化服务请求拒绝、患者-医生分配、患者排序和患者-时段分配决策,稳健地最小化患者拒绝成本和服务延误风险的加权和。服务延误风险使用基于事件模糊集的延误风险指数(DRI)进行评估,该指标同时捕捉随机延误的概率和强度。我们提出了一种嵌入约束生成的分支切割算法来精确求解该模型。我们进一步扩展了模型以纳入不确定的患者爽约和不准时,从而增强了其在实际应用中的有效性。数值实验验证了所提出的模型和算法为医疗管理者提供了一个实用工具,用于有效且高效地缓解集中性和级联性延误。我们还就关键参数和模型扩展对运营绩效(尤其是利润和服务准时性)的影响提供了见解。
