27 Nov 2018
In this post we look at an interesting application motivating dynamic matching: kidney exchanges. With the stakes being life and death, we need to squeeze every bit of efficiency out of the system. Prior-free algorithms usually perform unacceptably in this domain. However, encoding distributional information in deterministic algorithms usually involves solving for sample trajectories over future time periods, which can become computationally prohibitive very quickly. The paper we discuss assigns importance potentials to the elements of the graph over which the matching is to be done, after which, at runtime, an offline mechanism is run on the weighted graph.
The model considers two kinds of users - patient-donor pairs and altruistic donors. Patient-donor pairs exchange with other such pairs to form disjoint cycles, while altruistic donors can trigger chains in the graph. Different blood types have differing latitudes with regard to the number of agents they can exchange with. The model allows for a central planner across time who can observe when an agent becomes critical and is about to leave the market. The algorithm thus assigns potentials to graph elements, which allows it to sacrifice lower-valued elements in immediately required matchings when an agent becomes critical, and to save the more versatile elements to trigger larger chains or cycles later, when the opportunity presents itself.
The weights are learnt during a training phase using ParamILS, a well-known automatic parameter-tuning tool, run over 1500 synthetically generated graphs with realistic agent expiration rates simulated. The tuner tries to maximize the number of matches, with the graph element weights as the learnable parameters. Once the weights are learnt, at every moment of criticality the method solves the weighted graph using a branch-and-price solver, which has been shown to solve this NP-hard problem to optimality for up to 10,000 agents.
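To make the role of the vertex potentials concrete, here is a minimal sketch of scoring candidate cycles and chains by summed learned potentials and selecting a disjoint set. Note this is purely illustrative: the paper solves the selection exactly with a branch-and-price ILP, not the greedy pass below, and the potentials and structures are made-up toy data.

```python
# Hypothetical sketch: score candidate cycles/chains by the learned vertex
# potentials and greedily pick a vertex-disjoint set. The actual method uses
# a branch-and-price ILP; this greedy loop is only for illustration.

def select_structures(structures, potential):
    """Pick vertex-disjoint cycles/chains, highest total potential first."""
    scored = sorted(structures,
                    key=lambda s: sum(potential[v] for v in s),
                    reverse=True)
    used, chosen = set(), []
    for s in scored:
        if used.isdisjoint(s):  # structures must not share any vertex
            chosen.append(s)
            used.update(s)
    return chosen

# Toy data: higher potentials mark the more versatile vertices.
potential = {"a": 1.0, "b": 1.0, "c": 3.0, "d": 3.0, "e": 3.0}
structures = [("a", "b"), ("b", "c"), ("c", "d", "e")]
print(select_structures(structures, potential))  # [('c', 'd', 'e'), ('a', 'b')]
```

The weights steer the selection away from spending a versatile vertex (here `c`) on a short exchange when it can anchor a longer chain.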
Finally, the authors demonstrate how potentials may be learnt for arbitrarily complex graph structures, improving theoretical performance. There are a few pathological examples wherein additionally learning edge potentials significantly outperforms learning just vertex potentials, additionally learning cycle potentials significantly outperforms learning just edge and vertex potentials, and so on. These, however, quickly become computationally intractable, and the authors show demonstrable improvements over existing methods using just vertex potentials.
26 Nov 2018
In this post we look at solutions to static matching problems under distributional constraints. A motivating example would be a school matching system wherein both students and schools have preference orders over one another, and there exist constraints on the student distribution, for example that the least coveted schools must get at least some set proportion of the number of students that the most coveted school gets. Additionally, each student can only be matched to one school.
Before we can get into it a few notions need to be clarified:
- Strategyproof: A mechanism is strategyproof if misreporting one's preference order does not confer on that agent any advantage in terms of the match obtained.
- Nonwasteful: A mechanism is nonwasteful if it never produces a matching where a student claims an empty seat in a school different from the one he is allotted and switching to the desired school would still satisfy the distributional constraints.
- Fairness: A mechanism is fair if it never leads to a situation where a student and a school that are not matched would rather be matched with each other.
The notions of nonwastefulness and fairness together constitute stability in the Gale-Shapley sense and guarantee that rational agents will not try to deviate from the determined matching. The paper goes into two existing algorithms:
- Serial Dictatorship (SD): In SD a common priority order over students is dictated by a master list, and students are processed sequentially according to it. Each student may be matched only to the schools where their placement would not violate any constraints, and chooses the best among those. SD is strategyproof and nonwasteful but unfair.
- Artificial Cap Deferred Acceptance (ACDA): In ACDA, the maximum quota of each school is artificially lowered such that deferred acceptance (DA) run over the lowered quotas satisfies all constraints. It is strategyproof and fair.
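Since all the methods here build on the standard student-proposing deferred acceptance, a minimal sketch may help. The preferences and names below are toy illustrations; real implementations additionally enforce the distributional constraints.

```python
# Minimal student-proposing deferred acceptance (DA) sketch over toy data.

def deferred_acceptance(student_prefs, school_prefs, quotas):
    """student_prefs: {student: [schools]}; school_prefs: {school: [students]}."""
    rank = {s: {st: i for i, st in enumerate(p)} for s, p in school_prefs.items()}
    next_choice = {st: 0 for st in student_prefs}
    held = {s: [] for s in quotas}          # tentative acceptances per school
    free = list(student_prefs)
    while free:
        st = free.pop()
        if next_choice[st] >= len(student_prefs[st]):
            continue  # student has proposed to every school on their list
        school = student_prefs[st][next_choice[st]]
        next_choice[st] += 1
        held[school].append(st)
        held[school].sort(key=lambda x: rank[school][x])
        if len(held[school]) > quotas[school]:
            free.append(held[school].pop())  # tentatively reject the worst-ranked
    return held

students = {"s1": ["A", "B"], "s2": ["A", "B"], "s3": ["B", "A"]}
schools = {"A": ["s2", "s1", "s3"], "B": ["s1", "s2", "s3"]}
print(deferred_acceptance(students, schools, {"A": 1, "B": 2}))
# {'A': ['s2'], 'B': ['s1', 's3']}
```

Acceptances stay tentative until the last proposal is resolved, which is what gives DA its stability guarantee.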
This work introduces a method called Adaptive Deferred Acceptance (ADA), a round-by-round method that maintains a set called a forbidden list and starts with a separate complete list of students ranked by how desirable they are to the schools. Every round, a number of top students commensurate with the round number is taken off the ranked list and their matches are determined using DA. If a valid match is obtained for all students, the method terminates. If not, the quotas are reduced by the number of students assigned and we move on to the next round. If a school violates the distributional constraints, its quota is set to zero, the currently matched students are removed from the student list, and the algorithm moves to the next round.
ADA is strategyproof and nonwasteful, and in fairness it lies between ACDA and SD. The paper also proves an important result: no other strategyproof method dominates ADA in all cases.
The paper introduces a new algorithm called Quota Reduction Deferred Acceptance (QRDA), which solves the above problem by repeatedly applying standard DA to sequentially reduced, artificially imposed maximum quotas. This algorithm is strategyproof and fair, and performs best amongst all the mentioned algorithms in simulations. However, in some pathological cases it is shown to underperform ACDA.
23 Nov 2018
This is the start of a series on summarizing a few seminal and interesting papers in the field of optimal graph matchings.
This paper looks into the dynamic resource assignment problem, wherein agents have varying arrival and departure times and preference orders over a set of resources. Agents reporting their preferences truthfully is not a given, and four algorithms are evaluated to this end:
- Arrival Priority Serial Dictatorship (APSD): In this method each agent on arrival selects the resource for which he has the highest valuation, and no payments are made to the platform. This method has been proven to be the only one in which truthful value elicitation over resources is a dominant strategy, given that payments are not allowed and that an agent cannot misreport his arrival and departure times to be earlier and later than actual, respectively.
- Split Dynamic VCG (SDV): It finds an assignment that maximizes the sum of agents' valuations for their allotted resources, with payments determined by the extent to which they have affected other agents' allotments. It was originally proposed for static settings but has been adapted to dynamic settings by partitioning the agents into groups that exist in the system simultaneously. At each timestep the VCG auction is rerun with the remaining resources.
- E-Auction: This method adapts the solution to the standard secretary problem to the multiple-resources setting. The platform waits for the first \(\frac{n}{e}\) of the \(n\) agents to arrive in the first phase and sets the reserve price for each resource as the highest valuation received for it in that phase. The first agent to arrive in the second phase with a bid surpassing the reserve price is awarded the resource. This mechanism is also truthful for settings in which agents do not misreport their arrivals and departures to be later and earlier than actual, respectively.
- Online Ranked Competition Auction (ORCA): This method extends the E-Auction by considering that some resources may be more coveted across the board by all agents, so it makes sense to have differing thresholds in the multiple-secretaries problem, with less coveted resources having lower thresholds. These thresholds are solved for using dynamic programming, and the resulting algorithm has truthfulness characteristics similar to the E-Auction's.
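The E-Auction's two-phase rule can be sketched for a single resource as follows (bids and numbers are toy data, not from the paper):

```python
# Sketch of the E-Auction for one resource: observe the first n/e bidders
# to set a reserve price, then award the resource to the first later bidder
# who beats it. Toy data for illustration.
import math

def e_auction(bids):
    """bids: valuations in arrival order; returns (winner_index, price) or None."""
    cutoff = max(1, int(len(bids) / math.e))
    reserve = max(bids[:cutoff])             # best valuation seen in phase one
    for i in range(cutoff, len(bids)):
        if bids[i] > reserve:                # first bid surpassing the reserve wins
            return i, reserve
    return None                              # no bid beat the reserve

print(e_auction([3, 7, 2, 9, 5, 8]))  # cutoff 2, reserve 7 -> (3, 7)
```

With six bids, the first two set the reserve at 7, and the agent arriving third in phase two (bid 9) is the first to surpass it.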
The above algorithms are compared to the offline optimal: an omniscient algorithm that knows all the user valuations from the start, no matter when the users enter the market. Once all valuations are known, an optimization problem can be solved to get the best possible result. The ratio of the value extracted by the offline algorithm to that of an online algorithm is the latter's competitive ratio. APSD and SDV have competitive ratios on the order of the number of agents, while the E-Auction has a competitive ratio of \(\mathrm{e}\) and ORCA the best competitive ratio, of less than \(\mathrm{e}\). The paper also compares the algorithms in differing scenarios and finds that with low competition SDV is the best-performing method, while with high competition ORCA performs best.
This paper looks into the scenario where both sides of the market are dynamic rather than just one side. There are a couple of working assumptions:
- The arrivals of workers and tasks follow known independent distributions that can be inferred from past data.
- The number of different distributions on one side is far smaller than that on the other.
The paper contributes one adaptive and one non-adaptive approach. The adaptive approach changes its matching strategy depending on the type and number of agents in the system, while the non-adaptive matching strategy is fixed before the algorithm starts. The paper uses a stochastic tool called the Two-Stage Birth-Death Process to draw parallels with the problem and uses it to derive competitive ratios and hardness results, such as the theoretically maximum possible competitive ratio. The pertinent observations follow:
- The non-adaptive greedy algorithm achieves a Competitive Ratio (CR) of 0.3, which the stochastic process shows to be the maximum possible for a non-adaptive method. This beats a previous result of 0.25 that assumed random arrival order, demonstrating the value of using past data to infer arrival distributions.
- An adaptive greedy algorithm is shown in simulations to outperform the best non-adaptive algorithm, and for a node-weighted relaxation of the problem (each agent has the same value over all resources) a linear program with tight constraints exploiting this achieves a CR of 0.34. Here again the stochastic tool provides a theoretical bound of 0.58, indicating scope for improvement.
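The non-adaptive greedy policy can be sketched as follows. The event stream, the `compatible` predicate, and the names are hypothetical; the paper's setting additionally has agents departing and types drawn from the inferred distributions.

```python
# Illustrative non-adaptive greedy for a two-sided dynamic market: each
# arriving agent is matched immediately to any waiting compatible agent on
# the other side; otherwise it joins the waiting pool. Toy data throughout.
from collections import deque

def greedy_match(events, compatible):
    """events: list of (side, agent) in arrival order; sides are 'worker'/'task'."""
    waiting = {"worker": deque(), "task": deque()}
    matches = []
    for side, agent in events:
        other = "task" if side == "worker" else "worker"
        pool = waiting[other]
        for i, cand in enumerate(pool):
            if compatible(agent, cand):
                del pool[i]  # match formed, candidate leaves the pool
                matches.append((agent, cand) if side == "worker" else (cand, agent))
                break
        else:
            waiting[side].append(agent)  # no compatible partner yet: wait
    return matches

events = [("worker", "w1"), ("worker", "w2"),
          ("task", "t1"), ("task", "t2"), ("task", "t3")]
print(greedy_match(events, lambda a, b: True))  # [('w1', 't1'), ('w2', 't2')]
```

The policy is non-adaptive in that the matching rule never changes with the system state, which is exactly what caps its CR at 0.3 in the paper's analysis.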
This is a largely theoretical paper that looks into the repercussions of a blocking coalition (a subset of users who can all improve their utilities by deviating from the specified matching) imposing its deviations on the rest of the system, given its differing possible attitudes towards the rest. An additive model of externalities is assumed, wherein the utility of an agent is the sum of the values of the matches it is a part of and the externalities arising from the matches of other agents. The assumed externalities depend on the attitude of the agents in the blocking coalition:
- Neutral Attitude: Agents in the blocking coalition assume that the rest will not react to their deviation.
- Optimistic Attitude: Agents in the blocking coalition assume that the rest will act to their own best interest.
- Pessimistic Attitude: Agents in the blocking coalition assume that the rest will act to punish them for their deviations.
The work looks into the questions of membership and non-emptiness, i.e. whether a given matching is stable under the given attitude, and whether a given problem with a given attitude even admits a stable matching, respectively. The paper contributes the result that under additive externalities only Gale-Shapley-style pairwise-stability questions of non-emptiness and membership can be answered in polynomial time. Questions of membership can also always be answered in polynomial time for optimistic attitudes, whether the stability is pairwise or setwise. All other combinations of the problem are computationally intractable.
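The additive externality model can be made concrete with a small sketch; all names and numbers below are made up for illustration.

```python
# Toy illustration of additive externalities: an agent's utility is the value
# of its own matches plus the externalities it assigns to everyone else's
# matches. Data is invented, not from the paper.

def utility(agent, matching, value, externality):
    own = sum(value[m] for m in matching if agent in m)
    others = sum(externality.get((agent, m), 0) for m in matching if agent not in m)
    return own + others

matching = [("a", "b"), ("c", "d")]
value = {("a", "b"): 5, ("c", "d"): 4}
externality = {("a", ("c", "d")): -2}  # a dislikes c and d being matched
print(utility("a", matching, value, externality))  # 5 - 2 = 3
```

Whether a coalition blocks then depends on how these externality terms are assumed to change after a deviation, which is exactly what the neutral, optimistic, and pessimistic attitudes encode.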