Commit | Line | Data |
---|---|---|
f3084293 MM |
1 | \documentclass[11pt]{article} |
2 | \usepackage{url} | |
3 | \usepackage[square,comma,numbers]{natbib} | |
4 | \usepackage{color} | |
5 | ||
6 | % Figure stuff. To add a new figure named `foo.fig', put | |
7 | % `\includexfig{foo}' in the appropriate place and add `foo' to the | |
8 | % figures line at the top of the Makefile. | |
9 | \usepackage{ifpdf} | |
10 | \ifpdf | |
11 | \usepackage[pdftex]{graphicx} | |
12 | \def\includexfig#1{\input{#1.pdf_t}} | |
13 | \else | |
14 | \usepackage[dvips]{graphicx} | |
15 | \def\includexfig#1{\input{#1.ps_t}} | |
16 | \fi | |
17 | ||
18 | \usepackage{amssymb, amsmath, amsthm} | |
19 | \newtheorem{theorem}{Theorem} | |
20 | \newtheorem{lemma}{Lemma} | |
21 | ||
22 | \usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry} | |
23 | ||
f3084293 MM |
24 | \usepackage{pstricks} |
25 | \usepackage{delarray} | |
26 | ||
17e0995d MM |
27 | \def\code#1{\texttt{#1}} |
28 | ||
29 | \title{Assigning Papers to Reviewers} | |
f3084293 MM |
30 | |
31 | \author{ | |
32 | Samir Khuller\thanks{Research currently supported by CCF-0728839. | |
33 | Email:{\tt samir@cs.umd.edu}.}\\ | |
34 | Dept.\ of Computer Science\\ | |
35 | University of Maryland, College Park MD 20742. | |
36 | \and | |
37 | Richard Matthew McCutchen\thanks{ | |
38 | The bulk of this work was done while Matt was at the Dept.\ of Computer Science, | |
39 | University of Maryland, College Park, and supported by REU Supplement to | |
40 | CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\ | |
41 | %Department of Computer Science\\ | |
42 | %University of Maryland, College Park, MD 20742. | |
43 | } | |
44 | ||
45 | \date{} | |
46 | ||
47 | \begin{document} | |
48 | ||
49 | \maketitle | |
50 | ||
51 | %\begin{abstract} | |
52 | % | |
53 | %\end{abstract} | |
54 | ||
55 | \section{Introduction} | |
56 | Assignment problems arise in a variety of settings. | |
57 | For funding agencies such as NSF program directors | |
58 | that co-ordinate panels, assigning proposals to | |
59 | reviewers is a major challenge. It is important that each proposal receive | |
60 | sufficient review by qualified experts, and at the same time we would like to | |
61 | roughly balance the workload across reviewers and to honor the reviewers' | |
62 | preferences for which proposals they would like to read. The same | |
63 | issue arises for a program committee chair, who may have to assign | |
64 | literally hundreds of papers to a program committee consisting of | |
65 | thirty to forty program committee members. | |
66 | ||
67 | %{\em What does CMT use? What does Easychair use?} | |
68 | ||
69 | From now on we will focus on the problem of assigning papers to | |
70 | reviewers. | |
71 | We assume that each reviewer is given access to the | |
08f65a95 | 72 | list of papers to be reviewed, and gives each paper both a ``preference'' |
f3084293 MM |
73 | score indicating his/her level of interest in reviewing the paper and an |
74 | ``expertise'' score indicating how qualified he/she is to evaluate the paper. | |
08f65a95 MM |
75 | (Some organizations may use a single preference score and assume that it |
76 | also indicates expertise. We believe that making the distinction may better | |
f3084293 | 77 | model the real-world objective.) |
17e0995d | 78 | A reviewer may also declare a conflict of interest with a particular paper, |
f3084293 MM |
79 | meaning that he/she is forbidden to review the paper. |
80 | ||
81 | We do not consider stable marriage type preference lists, | |
82 | because a strict ranking of papers would be rather tedious | |
83 | to produce. In this scheme, the papers are essentially grouped | |
84 | into a few categories. | |
85 | ||
86 | Let $N$ be the number of papers and $P$ be the number of reviewers. | |
17e0995d | 87 | Suppose that each paper needs $q$ reviews, so a total of $qN$ |
f3084293 MM |
88 | reviews need to be generated. |
89 | Ideally, from the perspective of the papers, we would like to | |
17e0995d | 90 | assign each paper the $q$ most qualified reviewers for the paper. |
f3084293 MM |
91 | Of course, this could lead to a load imbalanced solution where |
92 | the load on some program committee members is very high, and the | |
93 | load on others is low. On the other hand, we could insist | |
94 | on a perfectly load balanced solution in which the number | |
95 | of papers assigned to each program committee member does not | |
17e0995d | 96 | exceed $L= \lceil qN/P \rceil$. However, this may lead to a solution |
f3084293 MM |
97 | which is not optimal from the perspective of the papers. |
98 | ||
99 | One of our goals is to study precisely this tradeoff, and allow each | |
100 | reviewer to be assigned up to $L + C$ papers, | |
101 | where $C$ is the {\em load tolerance}. We consider the question: | |
102 | is it possible to obtain a high quality | |
103 | assignment with a fairly low value of $C$? One can also ask whether, | |
104 | in such an assignment, the reviewers receive the papers that they would | |
105 | have most liked to review. | |
106 | ||
107 | {\em Stinkers} are papers that pretty much no-one wanted to review. | |
108 | We would like to spread the load of the stinkers as evenly as possible. | |
109 | ||
110 | ||
17e0995d MM |
111 | \section{Formulation as a Min-Cost Max-Flow Problem} |
112 | ||
113 | \begin{figure*} | |
114 | \begin{center} | |
115 | \def\eg#1{\fbox{#1}} | |
116 | \centerline{\includexfig{flow}} | |
117 | \caption{Flow Construction.} | |
118 | \label{flow-fig} | |
119 | \end{center} | |
120 | \end{figure*} | |
f3084293 | 121 | |
17e0995d MM |
122 | We formulate the assignment problem as a min-cost max-flow problem, |
123 | where each unit of flow represents one review. | |
f3084293 | 124 | The construction is somewhat involved in order to incorporate all |
17e0995d | 125 | the desired incentives. In several places, it makes |
f3084293 MM |
126 | use of sets of ``parallel edges'' of different costs connecting a |
127 | single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$ | |
128 | at a cost that grows faster than linear in the amount of the flow. For | |
129 | example, if there are five unit-capacity edges from $x$ to $y$ of costs | |
130 | 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be | |
131 | sent from $x$ to $y$ at a cost of $f^2$. | |
132 | ||
17e0995d MM |
133 | An example of the graph is shown in Figure~\ref{flow-fig}. |
134 | We have a source $s$ and a sink $t$. | |
f3084293 MM |
135 | For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each |
136 | reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$. (The rationale | |
17e0995d MM |
137 | for these sets is discussed below.) |
138 | Flow can pass from $s$ through one or more of the nodes $r^t_i$ and | |
f3084293 MM |
139 | one or more of the nodes $p^t_j$ to the sink to represent a review |
140 | by reviewer $i$ of paper $j$. | |
141 | ||
b43dc33a MM |
142 | Each paper has an edge of capacity $q$ to |
143 | the sink, indicating that it needs $q$ reviews. In general, these | |
f3084293 MM |
144 | edges will constitute the min cut, so any max flow will saturate them |
145 | and thereby provide all the required reviews. We take the min-cost | |
146 | max flow in order to provide the reviews in the ``best'' possible way. | |
147 | ||
148 | Each reviewer has a zero-cost edge of capacity $L$ from the source so that | |
149 | he/she can be assigned $L$ papers. If that were all, we would get a perfectly | |
150 | load-balanced solution, but we may be able to improve the quality of the | |
151 | assignments by allowing some imbalance. | |
152 | Therefore, we allow each reviewer to be overloaded by up to $C$ papers | |
153 | ($C$ is the load imbalance parameter) via a set | |
154 | of $C$ additional unit-capacity edges from the source. We make the cost of the | |
155 | $l$th edge an increasing function $f(l)$ to provide an incentive to | |
156 | balance load across reviewers. Since $2f(1) < f(1) + f(2)$, | |
157 | a solution that loads two reviewers each by $L+1$ will be preferred | |
158 | to a solution that loads one reviewer by $L$ and the other by $L+2$ | |
159 | unless the load imbalance in the second solution is outweighed by | |
160 | other benefits. | |
161 | ||
4e46fa87 | 162 | For each reviewer $i$ and paper $j$, there is a unit-capacity edge from $i$ |
17e0995d MM |
163 | to $j$ allowing that pair to be assigned, unless the reviewer declared a |
164 | conflict of interest, in which case the edge is not present. The edge cost is | |
08f65a95 | 165 | based on the preference value $a_{ij}$ stated by reviewer $i$ for paper |
17e0995d | 166 | $j$. For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost |
08f65a95 | 167 | function $(10 + a_{ij})^2$, in an attempt to provide an incentive to avoid |
17e0995d MM |
168 | really bad matched pairs without completely masking the difference between a |
169 | good matched pair and an excellent one. This choice seeks only to achieve a | |
170 | natural relationship between a linear preference scale as normally interpreted | |
171 | and the costs to be used in the optimization. We realize that strategic | |
08f65a95 | 172 | reviewers will take the cost function into account in choosing what preference |
17e0995d MM |
173 | values to submit, in which case its form matters little. |
174 | ||
175 | Alongside these purely additive per-review costs, | |
176 | we want to avoid an individual reviewer | |
177 | getting too many papers he/she does not like. | |
178 | With respect to a reviewer $i$, we classify papers as ``interesting'', | |
08f65a95 | 179 | ``boring'', or ``very boring'' based on their preference values; |
17e0995d MM |
180 | the thresholds for these classes are currently |
181 | the same for all reviewers. The edge for reviewer $i$ and paper | |
f3084293 MM |
182 | $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or |
183 | $r^3_i$ if $j$ is very boring. Since all edges from the source enter $r^1_i$, | |
17e0995d MM |
184 | flow for boring and very boring papers is forced to pass through a set of |
185 | parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers | |
f3084293 MM |
186 | must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$. |
187 | In each of these sets, we make the cost of the $l$th edge an increasing | |
188 | function of $l$ to provide an incentive to balance the load of boring | |
17e0995d MM |
189 | papers in the same way as for overload. |
190 | ||
191 | Finally, we would like at least one or two of each paper's reviews to be well | |
192 | qualified, if possible. The method is the same as that for reviewer fairness. | |
193 | With respect to a paper $j$, we classify reviewers as ``expert'', | |
194 | ``knowledgeable'', or ``general'' by comparing the expertise values to uniform | |
195 | thresholds. (Since this is the only place the expertise values are used, we | |
196 | effectively assume expertise takes on only these three values.) | |
197 | Edges representing | |
f3084293 MM |
198 | expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$, |
199 | and edges for general reviews enter $p^3_j$; the edge to the sink leaves | |
200 | $p^3_j$. A paper's first knowledgeable (or expert) review scores a bonus $c_2$ | |
201 | by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$, | |
202 | and an additional expert review scores another bonus $c_1$ by traversing a | |
203 | unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$. | |
204 | In addition to the bonus edges, | |
205 | there are edges of zero cost and unlimited capacity that reviews can follow | |
206 | from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink. | |
b43dc33a MM |
207 | The choice to offer bonuses for two reviews was based on the value $q = 3$; |
208 | this would be easy to change for other values of $q$. | |
f3084293 | 209 | |
17e0995d MM |
210 | In the example in Figure~\ref{flow-fig}, |
211 | paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3. | |
212 | Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable. | |
213 | (Reviewer edges for paper 2 are not shown.) | |
214 | This illustrates how, in principle, | |
08f65a95 | 215 | the preference and expertise relations might differ. |
17e0995d MM |
216 | Each is taken into account at a different stage of the construction. |
217 | ||
218 | The cost of a flow (assignment) is the sum of its reviewer overload costs, | |
219 | per-review costs, and reviewer boring / very boring load costs, | |
220 | minus paper bonuses. Any one of these components can be traded off against | |
221 | the others. | |
222 | ||
223 | \section{Evaluation} | |
224 | ||
225 | We have implemented the matching algorithm, based on the construction above, in | |
226 | Haskell. | |
227 | The construction is intended to illustrate | |
228 | ways to model real concerns that we find reasonable a priori. | |
229 | We do not have enough experience with real-world | |
230 | instances to be confident that each part of the construction serves its intended | |
231 | purpose or that the parameter values we have chosen are suitable. | |
232 | Fortunately, the parameter values are easy to change in the source code of our | |
233 | implementation, and even substantive changes to the graph structure are not too | |
234 | hard to make. At a minimum, we believe that min-cost max-flow provides a | |
235 | reasonable framework for global optimization of an assignment. | |
236 | ||
237 | We have worked with Michael Hicks, program chair of POPL 2012, to apply our | |
238 | method to the paper assignment problem for that conference. | |
239 | The set of POPL 2012 reviewers consisted of the | |
240 | program committee (PC) and an external review committee (ERC), | |
241 | where the ERC served two purposes: | |
242 | \begin{itemize} | |
243 | \item To provide up to one knowledgeable (or expert) review per paper if | |
244 | prior knowledge of the topic was hard to find on the PC. | |
245 | \item To provide all reviews of papers with an author on the PC (``PC papers''), | |
246 | which were considered to | |
247 | pose an implicit conflict of interest to all PC members. | |
248 | \end{itemize} | |
249 | The special policy for ERC reviews of non-PC papers was realized via a more | |
250 | complex paper-side gadget, not described here. | |
251 | Based on an evaluation tool he wrote as well as manual inspection, | |
252 | Dr.~Hicks was satisfied that the decisions made by the matching tool | |
253 | closely reflected the best he could have done manually. | |
254 | ||
255 | We are looking for additional opportunities to apply the matching tool. | |
256 | Anyone interested is invited to contact us so we can help adapt it | |
257 | to the scenario and document the experience gained. | |
258 | ||
259 | %Waiting for data from NSF. | |
260 | ||
261 | %Synthetic Data. | |
262 | ||
263 | \section{Getting the Tool} | |
264 | ||
265 | A distribution containing the source code for the matching tool as well as this | |
fcdd42c0 | 266 | document may be browsed or downloaded at: |
17e0995d MM |
267 | \[\hbox{\url{https://mattmccutchen.net/match/}}\] |
268 | There are currently two branches: | |
269 | \begin{itemize} | |
270 | \item \code{master} has the tool as originally designed for NSF, with no | |
08f65a95 | 271 | distinction between preference and expertise. |
17e0995d | 272 | \item \code{popl2012} is the basis of the version used for POPL 2012. The main |
08f65a95 | 273 | differences are that it has separate preference and expertise, support for |
17e0995d MM |
274 | ``fixing'' previously chosen reviewer-paper pairs (buggy, however), |
275 | and the special ERC gadget. | |
276 | \end{itemize} | |
277 | We regret that we do not currently have a single well-maintained version of the | |
278 | tool to recommend. | |
f3084293 MM |
279 | |
280 | \begin{thebibliography}{99} | |
281 | ||
282 | \bibitem{Flow} | |
283 | R. Ahuja, T. Magnanti and J. Orlin. | |
284 | Network Flows: Theory and Applications. | |
285 | {\em Prentice Hall}. | |
286 | ||
287 | ||
288 | ||
289 | \end{thebibliography} | |
290 | ||
291 | \end{document} |