| 1 | \documentclass[11pt]{article} |
| 2 | \usepackage{url} |
| 3 | \usepackage[square,comma,numbers]{natbib} |
| 4 | \usepackage{color} |
| 5 | |
| 6 | % Figure stuff. To add a new figure named `foo.fig', put |
| 7 | % `\includexfig{foo}' in the appropriate place and add `foo' to the |
| 8 | % figures line at the top of the Makefile. |
| 9 | \usepackage{ifpdf} |
| 10 | \ifpdf |
| 11 | \usepackage[pdftex]{graphicx} |
| 12 | \def\includexfig#1{\input{#1.pdf_t}} |
| 13 | \else |
| 14 | \usepackage[dvips]{graphicx} |
| 15 | \def\includexfig#1{\input{#1.ps_t}} |
| 16 | \fi |
| 17 | |
| 18 | \usepackage{amssymb, amsmath, amsthm} |
| 19 | \newtheorem{theorem}{Theorem} |
| 20 | \newtheorem{lemma}{Lemma} |
| 21 | |
| 22 | \usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry} |
| 23 | |
| 24 | \usepackage{floatflt} |
| 25 | \usepackage{pstricks} |
| 26 | \usepackage{delarray} |
| 27 | |
| 28 | \def\code#1{\texttt{#1}} |
| 29 | |
| 30 | \title{Assigning Papers to Reviewers} |
| 31 | |
| 32 | \author{ |
| 33 | Samir Khuller\thanks{Research currently supported by CCF-0728839. |
| 34 | Email:{\tt samir@cs.umd.edu}.}\\ |
| 35 | Dept.\ of Computer Science\\ |
| 36 | University of Maryland, College Park MD 20742. |
| 37 | \and |
| 38 | Richard Matthew McCutchen\thanks{ |
| 39 | The bulk of this work was done while Matt was at the Dept.\ of Computer Science, |
| 40 | University of Maryland, College Park, and supported by REU Supplement to |
| 41 | CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\ |
| 42 | %Department of Computer Science\\ |
| 43 | %University of Maryland, College Park, MD 20742. |
| 44 | } |
| 45 | |
| 46 | \date{} |
| 47 | |
| 48 | \begin{document} |
| 49 | |
| 50 | \maketitle |
| 51 | |
| 52 | %\begin{abstract} |
| 53 | % |
| 54 | %\end{abstract} |
| 55 | |
| 56 | \section{Introduction} |
| 57 | Assignment problems arise in a variety of settings. |
| 58 | For funding agencies such as NSF program directors |
| 59 | that co-ordinate panels, assigning proposals to |
| 60 | reviewers is a major challenge. It is important that each proposal receive |
| 61 | sufficient review by qualified experts, and at the same time we would like to |
| 62 | roughly balance the workload across reviewers and to honor the reviewers' |
| 63 | preferences for which proposals they would like to read. The same |
| 64 | issue arises for a program committee chair, who may have to assign |
| 65 | literally hundreds of papers to a program committee consisting of |
| 66 | thirty to forty program committee members. |
| 67 | |
| 68 | %{\em What does CMT use? What does Easychair use?} |
| 69 | |
| 70 | From now on we will focus on the problem of assigning papers to |
| 71 | reviewers. |
| 72 | We assume that each reviewer is given access to the |
| 73 | list of papers to be reviewed, and gives each paper both a ``desirability'' |
| 74 | score indicating his/her level of interest in reviewing the paper and an |
| 75 | ``expertise'' score indicating how qualified he/she is to evaluate the paper. |
| 76 | (Some organizations may choose to use a single set of scores for both |
| 77 | desirability and expertise. We believe that making this distinction may better |
| 78 | model the real-world objective.) |
| 79 | A reviewer may also declare a conflict of interest with a particular paper, |
| 80 | meaning that he/she is forbidden to review the paper. |
| 81 | |
| 82 | We do not consider stable marriage type preference lists, |
| 83 | because a strict ranking of papers would be rather tedious |
| 84 | to produce. In this scheme, the papers are essentially grouped |
| 85 | into a few categories. |
| 86 | |
| 87 | Let $N$ be the number of papers and $P$ be the number of reviewers. |
| 88 | Suppose that each paper needs $q$ reviews, so a total of $qN$ |
| 89 | reviews need to be generated. |
| 90 | Ideally, from the perspective of the papers, we would like to |
| 91 | assign each paper the $q$ most qualified reviewers for the paper. |
| 92 | Of course, this could lead to a load imbalanced solution where |
| 93 | the load on some program committee members is very high, and the |
| 94 | load on others is low. On the other hand, we could insist |
| 95 | on a perfectly load balanced solution in which the number |
| 96 | of papers assigned to each program committee member does not |
| 97 | exceed $L= \lceil qN/P \rceil$. However, this may lead to a solution |
| 98 | which is not optimal from the perspective of the papers. |
| 99 | |
| 100 | One of our goals is to study precisely this tradeoff, and allow each |
| 101 | reviewer to be assigned up to $L + C$ papers, |
| 102 | where $C$ is the {\em load tolerance}. We consider the question: |
| 103 | is it possible to obtain a high quality |
| 104 | assignment with a fairly low value of $C$? One can also ask whether, |
| 105 | in such an assignment, the reviewers receive the papers that they would |
| 106 | have most liked to review. |
| 107 | |
| 108 | {\em Stinkers} are papers that pretty much no-one wanted to review. |
| 109 | We would like to spread the load of the stinkers as evenly as possible. |
| 110 | |
| 111 | |
| 112 | \section{Formulation as a Min-Cost Max-Flow Problem} |
| 113 | |
| 114 | \begin{figure*} |
| 115 | \begin{center} |
| 116 | \def\eg#1{\fbox{#1}} |
| 117 | \centerline{\includexfig{flow}} |
| 118 | \caption{Flow Construction.} |
| 119 | \label{flow-fig} |
| 120 | \end{center} |
| 121 | \end{figure*} |
| 122 | |
| 123 | We formulate the assignment problem as a min-cost max-flow problem, |
| 124 | where each unit of flow represents one review. |
| 125 | The construction is somewhat involved in order to incorporate all |
| 126 | the desired incentives. In several places, it makes |
| 127 | use of sets of ``parallel edges'' of different costs connecting a |
| 128 | single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$ |
| 129 | at a cost that grows faster than linear in the amount of the flow. For |
| 130 | example, if there are five unit-capacity edges from $x$ to $y$ of costs |
| 131 | 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be |
| 132 | sent from $x$ to $y$ at a cost of $f^2$. |
| 133 | |
| 134 | An example of the graph is shown in Figure~\ref{flow-fig}. |
| 135 | We have a source $s$ and a sink $t$. |
| 136 | For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each |
| 137 | reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$. (The rationale |
| 138 | for these sets is discussed below.) |
| 139 | Flow can pass from $s$ through one or more of the nodes $r^t_i$ and |
| 140 | one or more of the nodes $p^t_j$ to the sink to represent a review |
| 141 | by reviewer $i$ of paper $j$. |
| 142 | |
| 143 | Each paper has an edge of capacity $r$ to |
| 144 | the sink, indicating that it needs $r$ reviews. In general, these |
| 145 | edges will constitute the min cut, so any max flow will saturate them |
| 146 | and thereby provide all the required reviews. We take the min-cost |
| 147 | max flow in order to provide the reviews in the ``best'' possible way. |
| 148 | |
| 149 | Each reviewer has a zero-cost edge of capacity $L$ from the source so that |
| 150 | he/she can be assigned $L$ papers. If that were all, we would get a perfectly |
| 151 | load-balanced solution, but we may be able to improve the quality of the |
| 152 | assignments by allowing some imbalance. |
| 153 | Therefore, we allow each reviewer to be overloaded by up to $C$ papers |
| 154 | ($C$ is the load imbalance parameter) via a set |
| 155 | of $C$ additional unit-capacity edges from the source. We make the cost of the |
| 156 | $l$th edge an increasing function $f(l)$ to provide an incentive to |
| 157 | balance load across reviewers. Since $2f(1) < f(1) + f(2)$, |
| 158 | a solution that loads two reviewers each by $L+1$ will be preferred |
| 159 | to a solution that loads one reviewer by $L$ and the other by $L+2$ |
| 160 | unless the load imbalance in the second solution is outweighed by |
| 161 | other benefits. |
| 162 | |
| 163 | For each reviewer $i$ and proposal $j$, there is a unit-capacity edge from $i$ |
| 164 | to $j$ allowing that pair to be assigned, unless the reviewer declared a |
| 165 | conflict of interest, in which case the edge is not present. The edge cost is |
| 166 | based on the desirability value $d_{ij}$ stated by reviewer $i$ for proposal |
| 167 | $j$. For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost |
| 168 | function $(10 + d_{ij})^2$, in an attempt to provide an incentive to avoid |
| 169 | really bad matched pairs without completely masking the difference between a |
| 170 | good matched pair and an excellent one. This choice seeks only to achieve a |
| 171 | natural relationship between a linear preference scale as normally interpreted |
| 172 | and the costs to be used in the optimization. We realize that strategic |
| 173 | reviewers will take the cost function into account in choosing what desirability |
| 174 | values to submit, in which case its form matters little. |
| 175 | |
| 176 | Alongside these purely additive per-review costs, |
| 177 | we want to avoid an individual reviewer |
| 178 | getting too many papers he/she does not like. |
| 179 | With respect to a reviewer $i$, we classify papers as ``interesting'', |
| 180 | ``boring'', or ``very boring'' based on their desirability values; |
| 181 | the thresholds for these classes are currently |
| 182 | the same for all reviewers. The edge for reviewer $i$ and paper |
| 183 | $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or |
| 184 | $r^3_i$ if $j$ is very boring. Since all edges from the source enter $r^1_i$, |
| 185 | flow for boring and very boring papers is forced to pass through a set of |
| 186 | parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers |
| 187 | must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$. |
| 188 | In each of these sets, we make the cost of the $l$th edge an increasing |
| 189 | function of $l$ to provide an incentive to balance the load of boring |
| 190 | papers in the same way as for overload. |
| 191 | |
| 192 | Finally, we would like at least one or two of each paper's reviews to be well |
| 193 | qualified, if possible. The method is the same as that for reviewer fairness. |
| 194 | With respect to a paper $j$, we classify reviewers as ``expert'', |
| 195 | ``knowledgeable'', or ``general'' by comparing the expertise values to uniform |
| 196 | thresholds. (Since this is the only place the expertise values are used, we |
| 197 | effectively assume expertise takes on only these three values.) |
| 198 | Edges representing |
| 199 | expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$, |
| 200 | and edges for general reviews enter $p^3_j$; the edge to the sink leaves |
| 201 | $p^3_j$. A paper's first knowledgeable (or expert) review scores a bonus $c_2$ |
| 202 | by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$, |
| 203 | and an additional expert review scores another bonus $c_1$ by traversing a |
| 204 | unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$. |
| 205 | In addition to the bonus edges, |
| 206 | there are edges of zero cost and unlimited capacity that reviews can follow |
| 207 | from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink. |
| 208 | The choice to offer bonuses for two reviews was based on the value $r = 3$; |
| 209 | this would be easy to change for other values of $r$. |
| 210 | |
| 211 | In the example in Figure~\ref{flow-fig}, |
| 212 | paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3. |
| 213 | Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable. |
| 214 | (Reviewer edges for paper 2 are not shown.) |
| 215 | This illustrates how, in principle, |
| 216 | the desirability and expertise relations might differ. |
| 217 | Each is taken into account at a different stage of the construction. |
| 218 | |
| 219 | The cost of a flow (assignment) is the sum of its reviewer overload costs, |
| 220 | per-review costs, and reviewer boring / very boring load costs, |
| 221 | minus paper bonuses. Any one of these components can be traded off against |
| 222 | the others. |
| 223 | |
| 224 | \section{Evaluation} |
| 225 | |
| 226 | We have implemented the matching algorithm, based on the construction above, in |
| 227 | Haskell. |
| 228 | The construction is intended to illustrate |
| 229 | ways to model real concerns that we find reasonable a priori. |
| 230 | We do not have enough experience with real-world |
| 231 | instances to be confident that each part of the construction serves its intended |
| 232 | purpose or that the parameter values we have chosen are suitable. |
| 233 | Fortunately, the parameter values are easy to change in the source code of our |
| 234 | implementation, and even substantive changes to the graph structure are not too |
| 235 | hard to make. At a minimum, we believe that min-cost max-flow provides a |
| 236 | reasonable framework for global optimization of an assignment. |
| 237 | |
| 238 | We have worked with Michael Hicks, program chair of POPL 2012, to apply our |
| 239 | method to the paper assignment problem for that conference. |
| 240 | The set of POPL 2012 reviewers consisted of the |
| 241 | program committee (PC) and an external review committee (ERC), |
| 242 | where the ERC served two purposes: |
| 243 | \begin{itemize} |
| 244 | \item To provide up to one knowledgeable (or expert) review per paper if |
| 245 | prior knowledge of the topic was hard to find on the PC. |
| 246 | \item To provide all reviews of papers with an author on the PC (``PC papers''), |
| 247 | which were considered to |
| 248 | pose an implicit conflict of interest to all PC members. |
| 249 | \end{itemize} |
| 250 | The special policy for ERC reviews of non-PC papers was realized via a more |
| 251 | complex paper-side gadget, not described here. |
| 252 | Based on an evaluation tool he wrote as well as manual inspection, |
| 253 | Dr.~Hicks was satisfied that the decisions made by the matching tool |
| 254 | closely reflected the best he could have done manually. |
| 255 | |
| 256 | We are looking for additional opportunities to apply the matching tool. |
| 257 | Anyone interested is invited to contact us so we can help adapt it |
| 258 | to the scenario and document the experience gained. |
| 259 | |
| 260 | %Waiting for data from NSF. |
| 261 | |
| 262 | %Synthetic Data. |
| 263 | |
| 264 | \section{Getting the Tool} |
| 265 | |
| 266 | A distribution containing the source code for the matching tool as well as this |
| 267 | document may be browsed or downloaded at (NOT YET): |
| 268 | \[\hbox{\url{https://mattmccutchen.net/match/}}\] |
| 269 | There are currently two branches: |
| 270 | \begin{itemize} |
| 271 | \item \code{master} has the tool as originally designed for NSF, with no |
| 272 | distinction between desirability and expertise. |
| 273 | \item \code{popl2012} is the basis of the version used for POPL 2012. The main |
| 274 | differences are that it has separate desirability and expertise, support for |
| 275 | ``fixing'' previously chosen reviewer-paper pairs (buggy, however), |
| 276 | and the special ERC gadget. |
| 277 | \end{itemize} |
| 278 | We regret that we do not currently have a single well-maintained version of the |
| 279 | tool to recommend. |
| 280 | |
| 281 | \begin{thebibliography}{99} |
| 282 | |
| 283 | \bibitem{Flow} |
| 284 | R. Ahuja, T. Magnanti and J. Orlin. |
| 285 | Network Flows: Theory and Applications. |
| 286 | {\em Prentice Hall}. |
| 287 | |
| 288 | |
| 289 | |
| 290 | \end{thebibliography} |
| 291 | |
| 292 | \end{document} |