Commit | Line | Data |
---|---|---|
f3084293 MM |
1 | \documentclass[11pt]{article} |
2 | \usepackage{url} | |
3 | \usepackage[square,comma,numbers]{natbib} | |
4 | \usepackage{color} | |
5 | ||
6 | % Figure stuff. To add a new figure named `foo.fig', put | |
7 | % `\includexfig{foo}' in the appropriate place and add `foo' to the | |
8 | % figures line at the top of the Makefile. | |
9 | \usepackage{ifpdf} | |
10 | \ifpdf | |
11 | \usepackage[pdftex]{graphicx} | |
12 | \def\includexfig#1{\input{#1.pdf_t}} | |
13 | \else | |
14 | \usepackage[dvips]{graphicx} | |
15 | \def\includexfig#1{\input{#1.ps_t}} | |
16 | \fi | |
17 | ||
18 | \usepackage{amssymb, amsmath, amsthm} | |
19 | \newtheorem{theorem}{Theorem} | |
20 | \newtheorem{lemma}{Lemma} | |
21 | ||
22 | \usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry} | |
23 | ||
24 | \usepackage{floatflt} | |
25 | \usepackage{pstricks} | |
26 | \usepackage{delarray} | |
27 | ||
17e0995d MM |
28 | \def\code#1{\texttt{#1}} |
29 | ||
30 | \title{Assigning Papers to Reviewers} | |
f3084293 MM |
31 | |
32 | \author{ | |
33 | Samir Khuller\thanks{Research currently supported by CCF-0728839. | |
34 | Email:{\tt samir@cs.umd.edu}.}\\ | |
35 | Dept.\ of Computer Science\\ | |
36 | University of Maryland, College Park MD 20742. | |
37 | \and | |
38 | Richard Matthew McCutchen\thanks{ | |
39 | The bulk of this work was done while Matt was at the Dept.\ of Computer Science, | |
40 | University of Maryland, College Park, and supported by REU Supplement to | |
41 | CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\ | |
42 | %Department of Computer Science\\ | |
43 | %University of Maryland, College Park, MD 20742. | |
44 | } | |
45 | ||
46 | \date{} | |
47 | ||
48 | \begin{document} | |
49 | ||
50 | \maketitle | |
51 | ||
52 | %\begin{abstract} | |
53 | % | |
54 | %\end{abstract} | |
55 | ||
56 | \section{Introduction} | |
57 | Assignment problems arise in a variety of settings. | |
58 | For funding agencies such as NSF program directors | |
59 | that co-ordinate panels, assigning proposals to | |
60 | reviewers is a major challenge. It is important that each proposal receive | |
61 | sufficient review by qualified experts, and at the same time we would like to | |
62 | roughly balance the workload across reviewers and to honor the reviewers' | |
63 | preferences for which proposals they would like to read. The same | |
64 | issue arises for a program committee chair, who may have to assign | |
65 | literally hundreds of papers to a program committee consisting of | |
66 | thirty to forty program committee members. | |
67 | ||
68 | %{\em What does CMT use? What does Easychair use?} | |
69 | ||
70 | From now on we will focus on the problem of assigning papers to | |
71 | reviewers. | |
72 | We assume that each reviewer is given access to the | |
73 | list of papers to be reviewed, and gives each paper both a ``desirability'' | |
74 | score indicating his/her level of interest in reviewing the paper and an | |
75 | ``expertise'' score indicating how qualified he/she is to evaluate the paper. | |
76 | (Some organizations may choose to use a single set of scores for both | |
77 | desirability and expertise. We believe that making this distinction may better | |
78 | model the real-world objective.) | |
17e0995d | 79 | A reviewer may also declare a conflict of interest with a particular paper, |
f3084293 MM |
80 | meaning that he/she is forbidden to review the paper. |
81 | ||
82 | We do not consider stable marriage type preference lists, | |
83 | because a strict ranking of papers would be rather tedious | |
84 | to produce. In this scheme, the papers are essentially grouped | |
85 | into a few categories. | |
86 | ||
87 | Let $N$ be the number of papers and $P$ be the number of reviewers. | |
17e0995d | 88 | Suppose that each paper needs $q$ reviews, so a total of $qN$ |
f3084293 MM |
89 | reviews need to be generated. |
90 | Ideally, from the perspective of the papers, we would like to | |
17e0995d | 91 | assign each paper the $q$ most qualified reviewers for the paper. |
f3084293 MM |
92 | Of course, this could lead to a load imbalanced solution where |
93 | the load on some program committee members is very high, and the | |
94 | load on others is low. On the other hand, we could insist | |
95 | on a perfectly load balanced solution in which the number | |
96 | of papers assigned to each program committee member does not | |
17e0995d | 97 | exceed $L= \lceil qN/P \rceil$. However, this may lead to a solution |
f3084293 MM |
98 | which is not optimal from the perspective of the papers. |
99 | ||
100 | One of our goals is to study precisely this tradeoff, and allow each | |
101 | reviewer to be assigned up to $L + C$ papers, | |
102 | where $C$ is the {\em load tolerance}. We consider the question: | |
103 | is it possible to obtain a high quality | |
104 | assignment with a fairly low value of $C$? One can also ask whether, | |
105 | in such an assignment, the reviewers receive the papers that they would | |
106 | have most liked to review. | |
107 | ||
108 | {\em Stinkers} are papers that pretty much no-one wanted to review. | |
109 | We would like to spread the load of the stinkers as evenly as possible. | |
110 | ||
111 | ||
17e0995d MM |
112 | \section{Formulation as a Min-Cost Max-Flow Problem} |
113 | ||
114 | \begin{figure*} | |
115 | \begin{center} | |
116 | \def\eg#1{\fbox{#1}} | |
117 | \centerline{\includexfig{flow}} | |
118 | \caption{Flow Construction.} | |
119 | \label{flow-fig} | |
120 | \end{center} | |
121 | \end{figure*} | |
f3084293 | 122 | |
17e0995d MM |
123 | We formulate the assignment problem as a min-cost max-flow problem, |
124 | where each unit of flow represents one review. | |
f3084293 | 125 | The construction is somewhat involved in order to incorporate all |
17e0995d | 126 | the desired incentives. In several places, it makes |
f3084293 MM |
127 | use of sets of ``parallel edges'' of different costs connecting a |
128 | single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$ | |
129 | at a cost that grows faster than linear in the amount of the flow. For | |
130 | example, if there are five unit-capacity edges from $x$ to $y$ of costs | |
131 | 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be | |
132 | sent from $x$ to $y$ at a cost of $f^2$. | |
133 | ||
17e0995d MM |
134 | An example of the graph is shown in Figure~\ref{flow-fig}. |
135 | We have a source $s$ and a sink $t$. | |
f3084293 MM |
136 | For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each |
137 | reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$. (The rationale | |
17e0995d MM |
138 | for these sets is discussed below.) |
139 | Flow can pass from $s$ through one or more of the nodes $r^t_i$ and | |
f3084293 MM |
140 | one or more of the nodes $p^t_j$ to the sink to represent a review |
141 | by reviewer $i$ of paper $j$. | |
142 | ||
143 | Each paper has an edge of capacity $r$ to | |
144 | the sink, indicating that it needs $r$ reviews. In general, these | |
145 | edges will constitute the min cut, so any max flow will saturate them | |
146 | and thereby provide all the required reviews. We take the min-cost | |
147 | max flow in order to provide the reviews in the ``best'' possible way. | |
148 | ||
149 | Each reviewer has a zero-cost edge of capacity $L$ from the source so that | |
150 | he/she can be assigned $L$ papers. If that were all, we would get a perfectly | |
151 | load-balanced solution, but we may be able to improve the quality of the | |
152 | assignments by allowing some imbalance. | |
153 | Therefore, we allow each reviewer to be overloaded by up to $C$ papers | |
154 | ($C$ is the load imbalance parameter) via a set | |
155 | of $C$ additional unit-capacity edges from the source. We make the cost of the | |
156 | $l$th edge an increasing function $f(l)$ to provide an incentive to | |
157 | balance load across reviewers. Since $2f(1) < f(1) + f(2)$, | |
158 | a solution that loads two reviewers each by $L+1$ will be preferred | |
159 | to a solution that loads one reviewer by $L$ and the other by $L+2$ | |
160 | unless the load imbalance in the second solution is outweighed by | |
161 | other benefits. | |
162 | ||
4e46fa87 | 163 | For each reviewer $i$ and paper $j$, there is a unit-capacity edge from $i$ |
17e0995d MM |
164 | to $j$ allowing that pair to be assigned, unless the reviewer declared a |
165 | conflict of interest, in which case the edge is not present. The edge cost is | |
4e46fa87 | 166 | based on the desirability value $d_{ij}$ stated by reviewer $i$ for paper |
17e0995d MM |
167 | $j$. For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost |
168 | function $(10 + d_{ij})^2$, in an attempt to provide an incentive to avoid | |
169 | really bad matched pairs without completely masking the difference between a | |
170 | good matched pair and an excellent one. This choice seeks only to achieve a | |
171 | natural relationship between a linear preference scale as normally interpreted | |
172 | and the costs to be used in the optimization. We realize that strategic | |
173 | reviewers will take the cost function into account in choosing what desirability | |
174 | values to submit, in which case its form matters little. | |
175 | ||
176 | Alongside these purely additive per-review costs, | |
177 | we want to avoid an individual reviewer | |
178 | getting too many papers he/she does not like. | |
179 | With respect to a reviewer $i$, we classify papers as ``interesting'', | |
180 | ``boring'', or ``very boring'' based on their desirability values; | |
181 | the thresholds for these classes are currently | |
182 | the same for all reviewers. The edge for reviewer $i$ and paper | |
f3084293 MM |
183 | $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or |
184 | $r^3_i$ if $j$ is very boring. Since all edges from the source enter $r^1_i$, | |
17e0995d MM |
185 | flow for boring and very boring papers is forced to pass through a set of |
186 | parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers | |
f3084293 MM |
187 | must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$. |
188 | In each of these sets, we make the cost of the $l$th edge an increasing | |
189 | function of $l$ to provide an incentive to balance the load of boring | |
17e0995d MM |
190 | papers in the same way as for overload. |
191 | ||
192 | Finally, we would like at least one or two of each paper's reviews to be well | |
193 | qualified, if possible. The method is the same as that for reviewer fairness. | |
194 | With respect to a paper $j$, we classify reviewers as ``expert'', | |
195 | ``knowledgeable'', or ``general'' by comparing the expertise values to uniform | |
196 | thresholds. (Since this is the only place the expertise values are used, we | |
197 | effectively assume expertise takes on only these three values.) | |
198 | Edges representing | |
f3084293 MM |
199 | expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$, |
200 | and edges for general reviews enter $p^3_j$; the edge to the sink leaves | |
201 | $p^3_j$. A paper's first knowledgeable (or expert) review scores a bonus $c_2$ | |
202 | by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$, | |
203 | and an additional expert review scores another bonus $c_1$ by traversing a | |
204 | unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$. | |
205 | In addition to the bonus edges, | |
206 | there are edges of zero cost and unlimited capacity that reviews can follow | |
207 | from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink. | |
208 | The choice to offer bonuses for two reviews was based on the value $r = 3$; | |
209 | this would be easy to change for other values of $r$. | |
210 | ||
17e0995d MM |
211 | In the example in Figure~\ref{flow-fig}, |
212 | paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3. | |
213 | Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable. | |
214 | (Reviewer edges for paper 2 are not shown.) | |
215 | This illustrates how, in principle, | |
216 | the desirability and expertise relations might differ. | |
217 | Each is taken into account at a different stage of the construction. | |
218 | ||
219 | The cost of a flow (assignment) is the sum of its reviewer overload costs, | |
220 | per-review costs, and reviewer boring / very boring load costs, | |
221 | minus paper bonuses. Any one of these components can be traded off against | |
222 | the others. | |
223 | ||
224 | \section{Evaluation} | |
225 | ||
226 | We have implemented the matching algorithm, based on the construction above, in | |
227 | Haskell. | |
228 | The construction is intended to illustrate | |
229 | ways to model real concerns that we find reasonable a priori. | |
230 | We do not have enough experience with real-world | |
231 | instances to be confident that each part of the construction serves its intended | |
232 | purpose or that the parameter values we have chosen are suitable. | |
233 | Fortunately, the parameter values are easy to change in the source code of our | |
234 | implementation, and even substantive changes to the graph structure are not too | |
235 | hard to make. At a minimum, we believe that min-cost max-flow provides a | |
236 | reasonable framework for global optimization of an assignment. | |
237 | ||
238 | We have worked with Michael Hicks, program chair of POPL 2012, to apply our | |
239 | method to the paper assignment problem for that conference. | |
240 | The set of POPL 2012 reviewers consisted of the | |
241 | program committee (PC) and an external review committee (ERC), | |
242 | where the ERC served two purposes: | |
243 | \begin{itemize} | |
244 | \item To provide up to one knowledgeable (or expert) review per paper if | |
245 | prior knowledge of the topic was hard to find on the PC. | |
246 | \item To provide all reviews of papers with an author on the PC (``PC papers''), | |
247 | which were considered to | |
248 | pose an implicit conflict of interest to all PC members. | |
249 | \end{itemize} | |
250 | The special policy for ERC reviews of non-PC papers was realized via a more | |
251 | complex paper-side gadget, not described here. | |
252 | Based on an evaluation tool he wrote as well as manual inspection, | |
253 | Dr.~Hicks was satisfied that the decisions made by the matching tool | |
254 | closely reflected the best he could have done manually. | |
255 | ||
256 | We are looking for additional opportunities to apply the matching tool. | |
257 | Anyone interested is invited to contact us so we can help adapt it | |
258 | to the scenario and document the experience gained. | |
259 | ||
260 | %Waiting for data from NSF. | |
261 | ||
262 | %Synthetic Data. | |
263 | ||
264 | \section{Getting the Tool} | |
265 | ||
266 | A distribution containing the source code for the matching tool as well as this | |
267 | document may be browsed or downloaded at (NOT YET): | |
268 | \[\hbox{\url{https://mattmccutchen.net/match/}}\] | |
269 | There are currently two branches: | |
270 | \begin{itemize} | |
271 | \item \code{master} has the tool as originally designed for NSF, with no | |
272 | distinction between desirability and expertise. | |
273 | \item \code{popl2012} is the basis of the version used for POPL 2012. The main | |
274 | differences are that it has separate desirability and expertise, support for | |
275 | ``fixing'' previously chosen reviewer-paper pairs (buggy, however), | |
276 | and the special ERC gadget. | |
277 | \end{itemize} | |
278 | We regret that we do not currently have a single well-maintained version of the | |
279 | tool to recommend. | |
f3084293 MM |
280 | |
281 | \begin{thebibliography}{99} | |
282 | ||
283 | \bibitem{Flow} | |
284 | R. Ahuja, T. Magnanti and J. Orlin. | |
285 | Network Flows: Theory and Applications. | |
286 | {\em Prentice Hall}. | |
287 | ||
288 | ||
289 | ||
290 | \end{thebibliography} | |
291 | ||
292 | \end{document} |