Borrero, Juan S.
(2017)
Sequential Bilevel Linear Programming with Incomplete Information and Learning.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
We present a framework for a class of sequential decision-making problems in the context of bilevel linear programming, where a leader and a follower repeatedly interact. At each period, the leader allocates resources that can modify the performance of the follower (e.g., as in interdiction or defender-attacker problems). The follower, in turn, optimizes some cost function over a set of activities that depends on the leader's decision. While the follower has complete knowledge of his problem, the leader, who decides as to optimize her objective function, has only partial information. As such, she needs to learn about the cost parameters, available resources, and the follower's activities from the feedback generated by the follower's actions. We measure the performance of any given leader's decision-making policy in terms of its time-stability, defined as the number of periods it takes the policy to match the actions of an oracle decision-maker with complete information of the bilevel problem.
Three types of bilevel models are considered: Shortest path interdiction, max-min bilevel linear problems, and asymmetric bilevel linear problems. For shortest path interdiction we discuss greedy and pessimistic policies, and show that their time stability is upper-bounded by the number of arcs in the network; moreover, these policies are not dominated by any non-greedy or non-pessimistic policy. We refine these ideas into the more general max-min bilevel problems. Here we show that there is a class of greedy and robust policies that have the best possible worst-case performance, eventually match the oracle's actions, provide a real-time optimality certificate, and can be computed using mixed-integer linear programming. These policies, however, do not retain their features for asymmetric bilevel problems. For this setting we study the performance of greedy and best-case policies and show that they keep many of the attractive properties that the greedy and robust policies have for the max-min case.
By performing computational experiments under different configurations, we show that the proposed policies compare favorably against different benchmark policies. Moreover, they perform reasonably close to the semi-oracle, that is a novel decision-maker we introduce that provides a lower bound on the time-stability of any policy.
Share
Citation/Export: |
|
Social Networking: |
|
Details
Item Type: |
University of Pittsburgh ETD
|
Status: |
Unpublished |
Creators/Authors: |
|
ETD Committee: |
|
Date: |
27 September 2017 |
Date Type: |
Publication |
Defense Date: |
15 June 2017 |
Approval Date: |
27 September 2017 |
Submission Date: |
13 July 2017 |
Access Restriction: |
3 year -- Restrict access to University of Pittsburgh for a period of 3 years. |
Number of Pages: |
175 |
Institution: |
University of Pittsburgh |
Schools and Programs: |
Swanson School of Engineering > Industrial Engineering |
Degree: |
PhD - Doctor of Philosophy |
Thesis Type: |
Doctoral Dissertation |
Refereed: |
Yes |
Uncontrolled Keywords: |
Bilevel optimization, Online Optimization, Robust Optimization, Interdiction, Learning |
Date Deposited: |
27 Sep 2017 18:56 |
Last Modified: |
27 Sep 2020 05:15 |
URI: |
http://d-scholarship.pitt.edu/id/eprint/32752 |
Metrics
Monthly Views for the past 3 years
Plum Analytics
Actions (login required)
|
View Item |