paper/paper.tex

   1 \documentclass[11pt]{article}
   2 \usepackage{url}
   3 \usepackage[square,comma,numbers]{natbib}
   4 \usepackage{color}
   5
   6 % Figure stuff.  To add a new figure named `foo.fig', put
   7 % `\includexfig{foo}' in the appropriate place and add `foo' to the
   8 % figures line at the top of the Makefile.
   9 \usepackage{ifpdf}
  10 \ifpdf
  11  \usepackage[pdftex]{graphicx}
  12  \def\includexfig#1{\input{#1.pdf_t}}
  13 \else
  14  \usepackage[dvips]{graphicx}
  15  \def\includexfig#1{\input{#1.ps_t}}
  16 \fi
  17
  18 \usepackage{amssymb, amsmath, amsthm}
  19 \newtheorem{theorem}{Theorem}
  20 \newtheorem{lemma}{Lemma}
  21
  22 \usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry}
  23
  24 \usepackage{pstricks}
  25 \usepackage{delarray}
  26
  27 \def\code#1{\texttt{#1}}
  28
  29 \title{Assigning Papers to Reviewers}
  30
  31 \author{
  32 Samir Khuller\thanks{Research currently supported by CCF-0728839.
  33 Email:{\tt samir@cs.umd.edu}.}\\
  34 Dept.\ of Computer Science\\
  35 University of Maryland, College Park MD 20742.
  36 \and
  37 Richard Matthew McCutchen\thanks{
  38 The bulk of this work was done while Matt was at the Dept.\ of Computer Science,
  39 University of Maryland, College Park, and supported by REU Supplement to
  40 CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\
  41 %Department of Computer Science\\
  42 %University of Maryland, College Park, MD 20742.
  43 }
  44
  45 \date{}
  46
  47 \begin{document}
  48
  49 \maketitle
  50
  51 %\begin{abstract}
  52 %
  53 %\end{abstract}
  54
  55 \section{Introduction}
  56 Assignment problems arise in a variety of settings.
  57 For funding agencies such as NSF program directors
  58 that co-ordinate panels, assigning proposals to
  59 reviewers is a major challenge. It is important that each proposal receive
  60 sufficient review by qualified experts, and at the same time we would like to
  61 roughly balance the workload across reviewers and to honor the reviewers'
  62 preferences for which proposals they would like to read. The same
  63 issue arises for a program committee chair, who may have to assign
  64 literally hundreds of papers to a program committee consisting of
  65 thirty to forty program committee members.
  66
  67 %{\em What does CMT use? What does Easychair use?}
  68
  69 From now on we will focus on the problem of assigning papers to
  70 reviewers.
  71 We assume that each reviewer is given access to the
  72 list of papers to be reviewed, and gives each paper both a ``desirability''
  73 score indicating his/her level of interest in reviewing the paper and an
  74 ``expertise'' score indicating how qualified he/she is to evaluate the paper.
  75 (Some organizations may choose to use a single set of scores for both
  76 desirability and expertise. We believe that making this distinction may better
  77 model the real-world objective.)
  78 A reviewer may also declare a conflict of interest with a particular paper,
  79 meaning that he/she is forbidden to review the paper.
  80
  81 We do not consider stable marriage type preference lists,
  82 because a strict ranking of papers would be rather tedious
  83 to produce. In this scheme, the papers are essentially grouped
  84 into a few categories.
  85
  86 Let $N$ be the number of papers and $P$ be the number of reviewers.
  87 Suppose that each paper needs $q$ reviews, so a total of $qN$
  88 reviews need to be generated.
  89 Ideally, from the perspective of the papers, we would like to
  90 assign each paper the $q$ most qualified reviewers for the paper.
  91 Of course, this could lead to a load imbalanced solution where
  92 the load on some program committee members is very high, and the
  93 load on others is low. On the other hand, we could insist
  94 on a perfectly load balanced solution in which the number
  95 of papers assigned to each program committee member does not
  96 exceed $L= \lceil qN/P  \rceil$. However, this may lead to a solution
  97 which is not optimal from the perspective of the papers.
  98
  99 One of our goals is to study precisely this tradeoff, and allow each
 100 reviewer to be assigned up to $L + C$ papers,
 101 where $C$ is the {\em load tolerance}. We consider the question:
 102 is it possible to obtain a high quality
 103 assignment with a fairly low value of $C$? One can also ask whether,
 104 in such an assignment, the reviewers receive the papers that they would
 105 have most liked to review.
 106
 107 {\em Stinkers} are papers that pretty much no-one wanted to review.
 108 We would like to spread the load of the stinkers as evenly as possible.
 109
 110
 111 \section{Formulation as a Min-Cost Max-Flow Problem}
 112
 113 \begin{figure*}
 114 \begin{center}
 115 \def\eg#1{\fbox{#1}}
 116 \centerline{\includexfig{flow}}
 117 \caption{Flow Construction.}
 118 \label{flow-fig}
 119 \end{center}
 120 \end{figure*}
 121
 122 We formulate the assignment problem as a min-cost max-flow problem,
 123 where each unit of flow represents one review.
 124 The construction is somewhat involved in order to incorporate all
 125 the desired incentives.  In several places, it makes
 126 use of sets of ``parallel edges'' of different costs connecting a
 127 single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$
 128 at a cost that grows faster than linear in the amount of the flow.  For
 129 example, if there are five unit-capacity edges from $x$ to $y$ of costs
 130 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be
 131 sent from $x$ to $y$ at a cost of $f^2$.
 132
 133 An example of the graph is shown in Figure~\ref{flow-fig}.
 134 We have a source $s$ and a sink $t$.
 135 For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each
 136 reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$.  (The rationale
 137 for these sets is discussed below.)
 138 Flow can pass from $s$ through one or more of the nodes $r^t_i$ and
 139 one or more of the nodes $p^t_j$ to the sink to represent a review
 140 by reviewer $i$ of paper $j$.
 141
 142 Each paper has an edge of capacity $q$ to
 143 the sink, indicating that it needs $q$ reviews.  In general, these
 144 edges will constitute the min cut, so any max flow will saturate them
 145 and thereby provide all the required reviews.  We take the min-cost
 146 max flow in order to provide the reviews in the ``best'' possible way.
 147
 148 Each reviewer has a zero-cost edge of capacity $L$ from the source so that
 149 he/she can be assigned $L$ papers.  If that were all, we would get a perfectly
 150 load-balanced solution, but we may be able to improve the quality of the
 151 assignments by allowing some imbalance.
 152 Therefore, we allow each reviewer to be overloaded by up to $C$ papers
 153 ($C$ is the load imbalance parameter) via a set
 154 of $C$ additional unit-capacity edges from the source.  We make the cost of the
 155 $l$th edge an increasing function $f(l)$ to provide an incentive to
 156 balance load across reviewers.  Since $2f(1) < f(1) + f(2)$,
 157 a solution that loads two reviewers each by $L+1$ will be preferred
 158 to a solution that loads one reviewer by $L$ and the other by $L+2$
 159 unless the load imbalance in the second solution is outweighed by
 160 other benefits.
 161
 162 For each reviewer $i$ and paper $j$, there is a unit-capacity edge from $i$
 163 to $j$ allowing that pair to be assigned, unless the reviewer declared a
 164 conflict of interest, in which case the edge is not present.  The edge cost is
 165 based on the desirability value $d_{ij}$ stated by reviewer $i$ for paper
 166 $j$.  For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost
 167 function $(10 + d_{ij})^2$, in an attempt to provide an incentive to avoid
 168 really bad matched pairs without completely masking the difference between a
 169 good matched pair and an excellent one.  This choice seeks only to achieve a
 170 natural relationship between a linear preference scale as normally interpreted
 171 and the costs to be used in the optimization.  We realize that strategic
 172 reviewers will take the cost function into account in choosing what desirability
 173 values to submit, in which case its form matters little.
 174
 175 Alongside these purely additive per-review costs,
 176 we want to avoid an individual reviewer
 177 getting too many papers he/she does not like.
 178 With respect to a reviewer $i$, we classify papers as ``interesting'',
 179 ``boring'', or ``very boring'' based on their desirability values;
 180 the thresholds for these classes are currently
 181 the same for all reviewers.  The edge for reviewer $i$ and paper
 182 $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or
 183 $r^3_i$ if $j$ is very boring.  Since all edges from the source enter $r^1_i$,
 184 flow for boring and very boring papers is forced to pass through a set of
 185 parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers
 186 must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$.
 187 In each of these sets, we make the cost of the $l$th edge an increasing
 188 function of $l$ to provide an incentive to balance the load of boring
 189 papers in the same way as for overload.
 190
 191 Finally, we would like at least one or two of each paper's reviews to be well
 192 qualified, if possible.  The method is the same as that for reviewer fairness.
 193 With respect to a paper $j$, we classify reviewers as ``expert'',
 194 ``knowledgeable'', or ``general'' by comparing the expertise values to uniform
 195 thresholds.  (Since this is the only place the expertise values are used, we
 196 effectively assume expertise takes on only these three values.)
 197 Edges representing
 198 expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$,
 199 and edges for general reviews enter $p^3_j$; the edge to the sink leaves
 200 $p^3_j$.  A paper's first knowledgeable (or expert) review scores a bonus $c_2$
 201 by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$,
 202 and an additional expert review scores another bonus $c_1$ by traversing a
 203 unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$.
 204 In addition to the bonus edges,
 205 there are edges of zero cost and unlimited capacity that reviews can follow
 206 from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink.
 207 The choice to offer bonuses for two reviews was based on the value $q = 3$;
 208 this would be easy to change for other values of $q$.
 209
 210 In the example in Figure~\ref{flow-fig},
 211 paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3.
 212 Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable.
 213 (Reviewer edges for paper 2 are not shown.)
 214 This illustrates how, in principle,
 215 the desirability and expertise relations might differ.
 216 Each is taken into account at a different stage of the construction.
 217
 218 The cost of a flow (assignment) is the sum of its reviewer overload costs,
 219 per-review costs, and reviewer boring / very boring load costs,
 220 minus paper bonuses.  Any one of these components can be traded off against
 221 the others.
 222
 223 \section{Evaluation}
 224
 225 We have implemented the matching algorithm, based on the construction above, in
 226 Haskell.
 227 The construction is intended to illustrate
 228 ways to model real concerns that we find reasonable a priori.
 229 We do not have enough experience with real-world
 230 instances to be confident that each part of the construction serves its intended
 231 purpose or that the parameter values we have chosen are suitable.
 232 Fortunately, the parameter values are easy to change in the source code of our
 233 implementation, and even substantive changes to the graph structure are not too
 234 hard to make.  At a minimum, we believe that min-cost max-flow provides a
 235 reasonable framework for global optimization of an assignment.
 236
 237 We have worked with Michael Hicks, program chair of POPL 2012, to apply our
 238 method to the paper assignment problem for that conference.
 239 The set of POPL 2012 reviewers consisted of the
 240 program committee (PC) and an external review committee (ERC),
 241 where the ERC served two purposes:
 242 \begin{itemize}
 243 \item To provide up to one knowledgeable (or expert) review per paper if
 244 prior knowledge of the topic was hard to find on the PC.
 245 \item To provide all reviews of papers with an author on the PC (``PC papers''),
 246 which were considered to
 247 pose an implicit conflict of interest to all PC members.
 248 \end{itemize}
 249 The special policy for ERC reviews of non-PC papers was realized via a more
 250 complex paper-side gadget, not described here.
 251 Based on an evaluation tool he wrote as well as manual inspection,
 252 Dr.~Hicks was satisfied that the decisions made by the matching tool
 253 closely reflected the best he could have done manually.
 254
 255 We are looking for additional opportunities to apply the matching tool.
 256 Anyone interested is invited to contact us so we can help adapt it
 257 to the scenario and document the experience gained.
 258
 259 %Waiting for data from NSF.
 260
 261 %Synthetic Data.
 262
 263 \section{Getting the Tool}
 264
 265 A distribution containing the source code for the matching tool as well as this
 266 document may be browsed or downloaded at (NOT YET):
 267 \[\hbox{\url{https://mattmccutchen.net/match/}}\]
 268 There are currently two branches:
 269 \begin{itemize}
 270 \item \code{master} has the tool as originally designed for NSF, with no
 271 distinction between desirability and expertise.
 272 \item \code{popl2012} is the basis of the version used for POPL 2012.  The main
 273 differences are that it has separate desirability and expertise, support for
 274 ``fixing'' previously chosen reviewer-paper pairs (buggy, however),
 275 and the special ERC gadget.
 276 \end{itemize}
 277 We regret that we do not currently have a single well-maintained version of the
 278 tool to recommend.
 279
 280 \begin{thebibliography}{99}
 281
 282 \bibitem{Flow}
 283 R. Ahuja, T. Magnanti and J. Orlin.
 284 Network Flows: Theory and Applications.
 285 {\em Prentice Hall}.
 286
 287
 288
 289 \end{thebibliography}
 290
 291 \end{document}