Mathematical Modeling of Barrier-Free Public Transportation Networks in Shanghai 上海无障碍公共交通网络的数学建模
July 29, 2025 2025 年 7 月 29 日
Abstract 摘要
1 Abstract 1 摘要
This paper presents a comprehensive mathematical model for optimizing barrier-free travel in Shanghai’s public transportation system, focusing on wheelchair users. We utilize graph theory to represent bus and metro stations as nodes in a network, with accessible connections between these stations as edges. The model introduces an accessibility-adjusted edge weight function to prioritize barrier-free paths. Dijkstra’s algorithm is employed for shortest-path computation, while betweenness centrality is used to identify critical points. Network resilience is quantified through connectivity, path availability, and accessibility recovery indices. The model integrates 2025 Shanghai metro data (19 lines, 508 stations) and bus system data (with approximately 85%85 \% accessibility), using graded accessibility attributes derived from facility counts. Implemented in Python using NetworkX, the model demonstrates the network’s connectivity and the effectiveness of accessibility-optimized paths. Results indicate bottlenecks in low-access areas, suggesting that increasing redundant facilities could enhance travel equity. 本文提出了一个针对上海公共交通系统无障碍出行优化的综合数学模型,重点关注轮椅使用者。我们利用图论将公交和地铁站点表示为网络中的节点,站点之间的无障碍连接作为边。模型引入了无障碍调整的边权函数,以优先考虑无障碍路径。采用 Dijkstra 算法进行最短路径计算,同时利用中介中心性识别关键节点。通过连通性、路径可用性和无障碍恢复指数量化网络的韧性。模型整合了 2025 年上海地铁数据(19 条线路,508 个站点)和公交系统数据(无障碍设施约 85%85 \% 个),使用基于设施数量的分级无障碍属性。模型在 Python 中使用 NetworkX 实现,展示了网络的连通性及无障碍优化路径的有效性。结果显示低无障碍区域存在瓶颈,建议增加冗余设施以提升出行公平性。
Accessibility: a_(v)in[0,1]a_{v} \in[0,1], where 1 indicates full barrier-free support (graded based on facilities). 无障碍性: a_(v)in[0,1]a_{v} \in[0,1] ,其中 1 表示完全无障碍支持(基于设施分级)。
For each edge e=(u,v)in Ee=(u, v) \in E, the following attributes are defined: 对于每条边 e=(u,v)in Ee=(u, v) \in E ,定义了以下属性:
1. Distance Cost (Calculated Using the Haversine Formula for Geographic Distance Between Two Points): 1. 距离成本(使用哈弗辛公式计算两点之间的地理距离):
# Haversine distance function (km)
def haversine(lon1, lat1, lon2, lat2):
R = 6371.0 # Earth radius
lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat / 2) **2 + cos (lat1) * cos(lat2) * sin(dlon / 2) **2
c = 2 * atan2(sqrt(a), sqrt(1 - a))
return R * c
Let’s think about how this formula comes about. The Earth is a large sphere, not flat, so we can’t use a straight-line distance. To calculate the distance between two points, we need to account for the Earth’s curvature. The Haversine formula is specifically designed for this purpose. It first calculates the difference in latitude ( Delta phi\Delta \phi ) and longitude ( Delta lambda\Delta \lambda ), then uses sine and cosine trigonometric functions to convert these differences into an angle, multiplies by the Earth’s radius RR to get kilometers. The value aa is an intermediate result, representing the “curvature” in squared form. Why use arctan? Because it converts the angle into a real distance. RR is fixed at 6371 km because that’s the average radius of the Earth. This way, we can determine the distance between stations, just like measuring a route with a map app. 让我们来思考这个公式是如何得出的。地球是一个大球体,而不是平面,所以我们不能使用直线距离。为了计算两点之间的距离,我们需要考虑地球的曲率。Haversine 公式正是为此设计的。它首先计算纬度( Delta phi\Delta \phi )和经度( Delta lambda\Delta \lambda )的差值,然后使用正弦和余弦三角函数将这些差值转换成一个角度,再乘以地球半径 RR 得到公里数。数值 aa 是一个中间结果,表示“曲率”的平方形式。为什么使用 arctan?因为它将角度转换为实际距离。 RR 固定为 6371 公里,因为这是地球的平均半径。通过这种方式,我们可以确定站点之间的距离,就像用地图应用测量路线一样。
Where: 位置:
R=6371kmR=6371 \mathrm{~km}, the Earth’s radius; R=6371kmR=6371 \mathrm{~km} ,地球半径;
phi_(u),phi_(v)\phi_{u}, \phi_{v} are the latitudes of the two stations (in radians); phi_(u),phi_(v)\phi_{u}, \phi_{v} 是两个站点的纬度(以弧度为单位);
Delta phi=phi_(u)-phi_(v),Delta lambda=lambda_(u)-lambda_(v)\Delta \phi=\phi_{u}-\phi_{v}, \Delta \lambda=\lambda_{u}-\lambda_{v}, the differences in latitude and longitude. Delta phi=phi_(u)-phi_(v),Delta lambda=lambda_(u)-lambda_(v)\Delta \phi=\phi_{u}-\phi_{v}, \Delta \lambda=\lambda_{u}-\lambda_{v} ,纬度和经度的差异。
2. Accessibility (Graded Based on Shanghai Facilities; Score from 0 to 1) Optimized Formula: 2. 无障碍性(基于上海设施分级;得分范围 0 到 1)优化公式:
# Optimized graded accessibility from Shanghai CSV data
def calculate_accessibility(facilities):
n_e = sum(1 for f in facilities if '无障碍电梯' in f['type'])
n_s = sum(1 for f in facilities if f['self'] == '自助')
n_p = sum(1 for f in facilities if '升降平台' in f['type'] or '斜挂梯' in f['type'
])
score = min(1.0, (n_e + 0.5 * n_s + 0.3 * n_p) / 5.0)
return score
for node in G.nodes():
G.nodes[node]['accessibility'] = calculate_accessibility(node_facilities[node])
for u, v in G.edges():
a_u = G.nodes[u]['accessibility']
a_v = G.nodes[v]['accessibility']
n_e = (len([f for f in node_facilities[u] if '无障碍电梯' in f['type']]) +
len([f for f in node_facilities[v] if '无障碍电梯' in f['type']])) / 2
a_e = min(a_u, a_v) * (1 + 0.1 * n_e)
G[u][v]['accessibility'] = min(a_e, 1.0) # Cap at 1
This formula is concise:Single min and linear terms,clear weights(e.g.,elevators prioritized). Practical:Boosts edges with more facilities,encouraging robust paths in Shanghai(e.g.,transfers with multiple elevators). 该公式简洁:单一最小值和线性项,权重明确(例如,优先考虑电梯)。实用性强:提升设施较多的边权,鼓励上海的稳健路径(例如,多个电梯的换乘)。
We construct the barrier-free subgraph as: 我们构建无障碍子图为:
G^(')=(V^('),E^(')),quad" where "V^(')={v in V∣a_(v) >= 0.5},quadE^(')={e in E∣a(e) >= 0.5}G^{\prime}=\left(V^{\prime}, E^{\prime}\right), \quad \text { where } V^{\prime}=\left\{v \in V \mid a_{v} \geq 0.5\right\}, \quad E^{\prime}=\{e \in E \mid a(e) \geq 0.5\}
Why do we build this subgraph?Because the original graph has bad spots,and we only want to use the good parts.The process:first select nodes with a_(v) >= 0.5a_{v} \geq 0.5 for V^(')V^{\prime}(threshold for partial access in Shanghai),then only add edges with a(e) >= 0.5a(e) \geq 0.5 to E^(')E^{\prime} .Think of it like tidying up toys,keeping only the unbroken ones. 为什么要构建这个子图?因为原始图中存在不便之处,我们只想使用好的部分。过程是:首先选择节点中 V^(')V^{\prime} 为 a_(v) >= 0.5a_{v} \geq 0.5 的(上海部分通行阈值),然后只添加 a(e) >= 0.5a(e) \geq 0.5 到 E^(')E^{\prime} 的边。可以把它想象成整理玩具,只保留完好的。
3 Edge Weight Function Design 3 边权重函数设计
To prioritize barrier-free paths,we define the weighted cost of each edge as: 为了优先考虑无障碍路径,我们将每条边的加权成本定义为:
w(e)=(c(e))/(a(e))w(e)=\frac{c(e)}{a(e)}
for u, v in G.edges():
cost = G[u][v]['cost']
access = G[u][v]['accessibility']
G[u][v]['weight'] = cost / max(access, 0.01) # Avoid division by zero
w(e)w(e) is the total cost-how did we come up with it? Distance ( c(e)c(e) ) divided by accessibility ( a(e)a(e) ), so higher accessibility lowers the effective cost. Why? The algorithm chooses low w(e) paths, favoring highly accessible routes, aligning with wheelchair users’ needs in Shanghai. w(e)w(e) 是总成本——我们是如何得出它的?距离( c(e)c(e) )除以可达性( a(e)a(e) ),因此更高的可达性会降低有效成本。为什么?算法选择低 w(e) 的路径,偏好高度可达的路线,符合上海轮椅使用者的需求。
Accessibility Cost Derivation (Explanation of Why This Weight Function is Adopted): 可达性成本推导(解释为何采用此权重函数):
Consider the effective cost adjusted for accessibility: 考虑调整了可达性的有效成本:
# Conceptual derivation: Effective cost as distance penalized by inverse accessibility
# Aligns with w(e) = c(e) / a(e)
E[E[ cost ]] is the adjusted cost-how do we get it? Low accessibility increases “effort” for wheelchair users, modeled as dividing distance by a(e) (e.g., a(e)=0.5\mathrm{a}(\mathrm{e})=0.5 doubles perceived cost). This derivation prioritizes barrier-free optimization, ensuring the model is practical for Shanghai’s facilities data. E[E[ 成本 ]] 是调整后的成本——我们如何得到它?低可达性增加了轮椅使用者的“努力”,通过将距离除以 a(e) 来建模(例如, a(e)=0.5\mathrm{a}(\mathrm{e})=0.5 会使感知成本加倍)。这一推导优先考虑无障碍优化,确保模型适用于上海的设施数据。
The shortest weighted path from a starting point ss to an ending point tt in the barrier-free subgraph G^(')G^{\prime} can be solved using Dijkstra’s algorithm. The steps are as follows: 在无障碍子图 G^(')G^{\prime} 中,从起点 ss 到终点 tt 的最短加权路径可以使用 Dijkstra 算法求解。步骤如下:
Initialize the distance of all points to oo\infty, with the starting point ss set to 0 ; 将所有点的距离初始化为 oo\infty ,起点 ss 的距离设为 0;
Add all points to a priority queue QQ, sorted by distance; 将所有点加入优先队列 QQ ,按距离排序;
Each time, remove the current shortest point uu from QQ, and traverse its adjacent points vv. If: 每次从 QQ 中移除当前距离最短的点 uu ,并遍历其相邻点 vv 。如果:
if dist[v] > dist[u] + G[u][v]['weight']:
dist[v] = dist[u] + G[u][v]['weight']
prev[v] = u
then update dist[v]\operatorname{dist}[v] and the predecessor node. 然后更新 dist[v]\operatorname{dist}[v] 和前驱节点。
4. Repeat until reaching the endpoint tt or the queue is empty; 4. 重复上述步骤,直到到达终点 tt 或队列为空;
5. Trace back through the predecessor nodes to obtain the path. 5. 通过前驱节点回溯以获得路径。
Why do it this way? It’s like exploring a path: start from ss, try the closest point uu, then check if going from uu to vv is better ( dist[u]+w < dist[v]\operatorname{dist}[u]+w<\operatorname{dist}[v] ). The priority queue QQ ensures we try the shortest paths first. The complexity is O(|E|log |V|)O(|E| \log |V|) because heap sorting is fast. 为什么要这样做?这就像探索一条路径:从 ss 开始,尝试最近的点 uu ,然后检查从 uu 到 vv 是否更好( dist[u]+w < dist[v]\operatorname{dist}[u]+w<\operatorname{dist}[v] )。优先队列 QQ 确保我们优先尝试最短路径。复杂度是 O(|E|log |V|)O(|E| \log |V|) ,因为堆排序速度很快。
Time Complexity is O(|E|log |V|)O(|E| \log |V|). 时间复杂度是 O(|E|log |V|)O(|E| \log |V|) 。
5 Vulnerability Analysis 5 漏洞分析
1. Low-Access Edges: 1. 低可达性边:
Defined as edges with accessibility below 0.5 : 定义为可达性低于 0.5 的边:
{e in E∣a(e) < 0.5}\{e \in E \mid a(e)<0.5\}
low_access_edges = [(u, v) for u, v in G.edges() if G[u][v]['accessibility'] < 0.5]
Why 0.5 ? With accessibility ranging from 0 to 1 , below 0.5 is considered low, like a threshold for usability. We pick these edges because they’ re barriers for wheelchair users and need improvement. 为什么是 0.5?可达性的范围是从 0 到 1,低于 0.5 被认为是低的,就像可用性的阈值。我们选择这些边是因为它们对轮椅使用者来说是障碍,需要改进。
2. High Betweenness Centrality Nodes: 2. 高介数中心性节点:
Betweenness centrality is defined as follows: 介数中心性定义如下:
b(v)=sum_(s!=t!=v)(sigma_(st)(v))/(sigma_(st))b(v)=\sum_{s \neq t \neq v} \frac{\sigma_{s t}(v)}{\sigma_{s t}}
b(v)b(v) measures the importance of vv-how does it come about? If many shortest paths go through v,vv, v is a key hub. sigma_(st)\sigma_{s t} is the total number of shortest paths from ss to tt, and sigma_(st)(v)\sigma_{s t}(v) is those passing through vv. The ratio is summed over all pairs. It’s like counting how often vv is “used.” High b(v)b(v) nodes are bottlenecks that need protection. The complexity is O(|V|*|E|)O(|V| \cdot|E|) because it calculates all paths. b(v)b(v) 衡量 vv 的重要性——它是如何产生的?如果许多最短路径经过 v,vv, v ,则 v,vv, v 是一个关键枢纽。 sigma_(st)\sigma_{s t} 是从 ss 到 tt 的最短路径总数, sigma_(st)(v)\sigma_{s t}(v) 是那些经过 vv 的路径数。该比率对所有节点对进行求和。它就像是在计算 vv 被“使用”的频率。高 b(v)b(v) 节点是需要保护的瓶颈。其复杂度为 O(|V|*|E|)O(|V| \cdot|E|) ,因为它计算了所有路径。
Where: 位置:
sigma_(st)\sigma_{s t} : The number of shortest paths from ss to tt; sigma_(st)\sigma_{s t} :从 ss 到 tt 的最短路径数量;
sigma_(st)(v)\sigma_{s t}(v) : The number of those paths passing through node vv. sigma_(st)(v)\sigma_{s t}(v) :通过节点 vv 的路径数量。
Time Complexity is O(|V|*|E|)O(|V| \cdot|E|). 时间复杂度是 O(|V|*|E|)O(|V| \cdot|E|) 。
6 Network Resilience Indicators 6 网络韧性指标
1. Barrier-Free Connectivity (BC): 1. 无障碍连通性(BC):
BC=1-(C-1)/(N)B C=1-\frac{C-1}{N}
bc = 1 - (connected_components - 1) / num_nodes
BCB C checks how well-connected the network is-how do we think about it? CC is the number of blocks, and NN is the total number of nodes. When C=1,BC=1C=1, B C=1, meaning fully connected; when C=N,BC=0C=N, B C=0, meaning completely disconnected. (C-1)//N(C-1) / N is like a penalty for extra blocks, and 1 minus that leaves the good part. BCB C 检查网络的连通性——我们如何理解它? CC 是区块数量, NN 是节点总数。当 C=1,BC=1C=1, B C=1 时,表示完全连通;当 C=N,BC=0C=N, B C=0 时,表示完全断开。 (C-1)//N(C-1) / N 类似于对额外区块的惩罚,1 减去它就是好的部分。
Where CC is the number of connected components in the barrier-free subgraph, and NN is the total number of nodes. When C=1C=1, the network is fully connected, and BC=1\mathrm{BC}=1. 其中 CC 是无障碍子图中的连通分量数量, NN 是节点总数。当 C=1C=1 时,网络是完全连通的, BC=1\mathrm{BC}=1 。
2. Path Availability (PA): 2. 路径可用性(PA):
PA=(ASL_("original "))/(ASL_("failed "))P A=\frac{A S L_{\text {original }}}{A S L_{\text {failed }}}
asl_original = nx.average_shortest_path_length(G, weight='weight') if
connected_components == 1 else float('inf')
asl_failed = nx.average_shortest_path_length(G_failed, weight='weight') if
connected_components_failed == 1 else float('inf')
pa = asl_original / asl_failed if asl_failed != float('inf') and asl_original != float
('inf') else 0
PAP A looks at changes after failure. ASLA S L is the original average path length, and failed is after low-access edges are removed. If failed is large, PAP A is small, meaning worse. It’s like comparing road times before and after-longer means less availability. PAP A 关注故障后的变化。 ASLA S L 是原始的平均路径长度,failed 是移除低访问边后的路径长度。如果 failed 较大, PAP A 较小,意味着情况更糟。就像比较故障前后的路程时间——时间变长意味着可用性降低。
Where ASL represents the average shortest path length: 其中 ASL 表示平均最短路径长度:
ASL=(1)/(N(N-1))sum_(s!=t)dist(s,t)A S L=\frac{1}{N(N-1)} \sum_{s \neq t} \operatorname{dist}(s, t)
How do we calculate ASLA S L ? Sum the dist for all pairs and divide by the number of pairs N(N-1)N(N-1). It’s like the class average for road lengths. 我们如何计算 ASLA S L ?将所有节点对的距离求和,再除以节点对的数量 N(N-1)N(N-1) 。这就像是道路长度的班级平均值。
When PA is close to 1, it indicates strong network resilience; if PA drops significantly, it suggests some paths have lengthened due to the failure of low-access edges. 当 PA 接近 1 时,表示网络具有较强的韧性;如果 PA 显著下降,则表明由于低访问边的故障,部分路径变长。
3. Accessibility Recovery Index (ARI): 3. 可达性恢复指数(ARI):
ARI=((sum_(e inE^('))(1-a(e)))/(|E^(')|))xx TA R I=\left(\frac{\sum_{e \in E^{\prime}}(1-a(e))}{\left|E^{\prime}\right|}\right) \times T
avg_access_gap = np.mean([1 - d['accessibility'] for u, v, d in G.edges(data=True)])
if num_edges > 0 else 0
ari = avg_access_gap * 10 # Scaling factor T=10 hours for "recovery" to full access
ARIA R I estimates time to upgrade low-access edges. The average gap ( 1-a(e)1-a(e) ) multiplied by TT (10 hours). If the average gap is high, ARIA R I is long. It’s like the network’s average “access deficit” multiplied by a base upgrade time. ARIA R I 估计升级低访问边所需的时间。平均差距( 1-a(e)1-a(e) )乘以 TT (10 小时)。如果平均差距较大, ARIA R I 时间较长。这就像网络的平均“访问缺口”乘以基础升级时间。
Where: - a(e)a(e) is the accessibility of each edge; - TT is the unit upgrade time, set to 10 hours in this paper; - ARI represents the average estimated upgrade time for the entire network to achieve full barrier-free status. 其中:- a(e)a(e) 是每条边的可达性;- TT 是单位升级时间,本文设定为 10 小时;- ARI 表示整个网络实现完全无障碍状态的平均预计升级时间。