Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks [AAMAS 2017]

achrxme 2023. 8. 13. 13:02

Contribution

1. Formalize the MAPD(Multi-Agent Pickup and Delivery) Problem

2. Present 2 decoupled MAPD algorithms (TP and TPTS)

1. Well-Formed MAPD Instances

: An MAPD instance, which agents should only be allowed in rest(= stay forever) in loactions where they cannot block other agents.

Formally..

1. Let the set of 'endpoints' : $V_{ep}$

$V_{ep}$ includes

1) Every initial locations of agents

2) Every pickup & delivery loacations of tasks ($\doteq V_{tsk}$ , task endpoints)

3) Additional designated parking locaations (if needed)

* $V_{ep} \setminus V_{tsk}$ , non-task endpoints

2. Define well-formed MAPD Instance

Define1

A MAPD instance is 'well-formed' IFF

1) # of tasks : finite

2) $\left| V_{ep} \setminus V_{tsk} \right| \geq \left|A\right|$

3) for every 2 endpoins, there exists a path btn them that traverses no other endpoints

2. Token Passing Algorithm (TP)

Cooperative A* 와 유사함

(Cooperative A* : Reservation table에 다른 agent들의 planned path를 저장해 두고, 각 agent는 하나씩 차례대로 resevation table을 shared memory로 사용하여 서로 간의 충돌을 피하며 path finding 수행)

Task set : 어떤 agent에게도 할당 되지 않은 task들의 집합 (TPTS에서는 다르게 정의 됨)

Token : a synchronized shared block of memory

-> 각 agent가 task를 할당 받기 위해서는 token을 가지고 있어야 함. (전체 system에 토큰은 딱 1개만 존재)

-> Token에 저장 된 정보 : 모든 agent들의 현재 path + task set + 모든 agent들의 assignment 상태

Algorithm 요약

1) 각자의 path의 끝에 도달한 agent들은 매 time step 마다 token을 request함 (system으로부터)

2) Task set에서 현재 token에 저장 된 경로들이 task의 start location 과 goal location을 지나지 않는 task들을 분리하여 $T^{'}$에 저장

3) 그런 task들이 있으면 각 agent의 현재 위치와 task의 goal location 사이의 heuristic cost (h-value)가 가장 작은 task($\tau$)를 assign -> Path1 함수 사용하여 이동 (current location -> pickup location -> delivery location)

$\tau \leftarrow argmin_{\tau_{j}\in T^{'}}h(loc(a_i),s_j)$

4) 없는 경우

4-1) 현재 agent의 위치가 goal location인 task가 taskset에 없는 경우 : 현 위치에 대기

4-2) 현재 agent의 위치가 goal location인 task가 taskset에 있는 경우 : 다른 endpoint 중 하나로 이동 -> Path2함수 사용(current location -> another endpoint)

3. Token Passing with Task Swaps Algorithm (TPTS)

TP와 Task set을 다르게 정의함

TPTS : 모든 unexcuted tasks (assign은 되었지만 아직 agent가 pickup location에 도달하지 못한 task들도 포함)

cf. TP : assign 되는 순간 task set에서 빠짐

-> Assign 되어 해당 pick up location으로 이동하는 동안 Task Swap이 일어날 수 있음

-> TS가 일어나는 경우 : 현재 해당 pick up location으로 이동하는 agent 보다 해당 pick up location에 먼저 도착할 수 있는 agent가 존재하는 경우

* 대부분의 경우에 Task Swap을 수행하는 것이 더욱 효과적이지만, 아래 그림 처럼 그렇지 않은 경우도 있음. 자세한 설명은 논문 참조

왼 : a1에게 먼저 token이 할당 된 상황 / 가운데 : TS가 없으면 a2는 t2로 할당 됨 / 오 : TS가 있으면 a2가 t1에 할당되지만 이는 전체 makespan을 더 길어지게 함

4. Centralized Algorithm

Agent Assignment

1. Unexcuted task의 pickup location에서 rest 중인 agent들을 각각 하나씩 차례대로 해당하는 task에 배정

2. Free agent들을 an unexecuted task의 pickup location 혹은 parking location이라 불리는 다른 endpoint에 배정

-> 이때 assigned delivery location들은 다른 모든 agent들이 배정받은 endpoint와 달라야 함. 이를 위해 아래와 같은 과정을 거침

1) Greedily constructs a set of possible endpoints $X$ for the free agents

1-1) Greedily constructs a subset $T'$ of unexecuted tasks one after the other

(해당 task의 pickup and delivery location이 모든 exected tasks의 delivery location과 다르고 + 모든 unexected tasks의 pickup and delivery location과 다르고 + 이미 포함되어 있지 않은지 확인 후)

1-2) $X$ 를 $T'$의 모든 task들의 pickup location들로 설정

1-3) 만약 free agent의 수가 $X$의 수보다 많으면 몇몇 endpoint들을 parking location으로 $X$에 추가

-> Parking location 결정 : greedily

--> 각각의 free agent $a_i$에게 cost $c(a_i, e)$를 최소화 하는 endpoint $e$를 배정 (agent에게 가장 가까운)

2) 각각의 free agent를 $X$의 endpoint 중 하나에 배정 : Hungarian method 사용

Modified cost $c'(a_i, e)$ 사용

- If $e$ : pickup location of a task, then $c'(a_i, e) = c \cdot C \cdot c(a_i, e)$

- If $e$ : parking location, then $c'(a_i, e)=c\cdot C^2+c(a_i,e)$

*$C$ : a sufficiently large constant

-> 이렇게 하면 pickup location을 assign하는 것이 rest 혹은 parking location에 assign하는 것보다 중요하게 됨

Agent Assignment

Optimally effective한 Conflict-Based Search를 사용하여 각각의 agent가 현재 위치에서 할당받은 endpoint로 향하는 collision free path를 만듬