paper/paper.tex

   1 \documentclass[11pt]{article}
   2 \usepackage{url}
   3 \usepackage[square,comma,numbers]{natbib}
   4 \usepackage{color}
   5
   6 % Figure stuff.  To add a new figure named `foo.fig', put
   7 % `\includexfig{foo}' in the appropriate place and add `foo' to the
   8 % figures line at the top of the Makefile.
   9 \usepackage{ifpdf}
  10 \ifpdf
  11  \usepackage[pdftex]{graphicx}
  12  \def\includexfig#1{\input{#1.pdf_t}}
  13 \else
  14  \usepackage[dvips]{graphicx}
  15  \def\includexfig#1{\input{#1.ps_t}}
  16 \fi
  17
  18 \usepackage{amssymb, amsmath, amsthm}
  19 \newtheorem{theorem}{Theorem}
  20 \newtheorem{lemma}{Lemma}
  21
  22 \usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry}
  23
  24 \usepackage{floatflt}
  25 \usepackage{pstricks}
  26 \usepackage{delarray}
  27
  28 \def\code#1{\texttt{#1}}
  29
  30 \title{Assigning Papers to Reviewers}
  31
  32 \author{
  33 Samir Khuller\thanks{Research currently supported by CCF-0728839.
  34 Email:{\tt samir@cs.umd.edu}.}\\
  35 Dept.\ of Computer Science\\
  36 University of Maryland, College Park MD 20742.
  37 \and
  38 Richard Matthew McCutchen\thanks{
  39 The bulk of this work was done while Matt was at the Dept.\ of Computer Science,
  40 University of Maryland, College Park, and supported by REU Supplement to
  41 CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\
  42 %Department of Computer Science\\
  43 %University of Maryland, College Park, MD 20742.
  44 }
  45
  46 \date{}
  47
  48 \begin{document}
  49
  50 \maketitle
  51
  52 %\begin{abstract}
  53 %
  54 %\end{abstract}
  55
  56 \section{Introduction}
  57 Assignment problems arise in a variety of settings.
  58 For funding agencies such as NSF program directors
  59 that co-ordinate panels, assigning proposals to
  60 reviewers is a major challenge. It is important that each proposal receive
  61 sufficient review by qualified experts, and at the same time we would like to
  62 roughly balance the workload across reviewers and to honor the reviewers'
  63 preferences for which proposals they would like to read. The same
  64 issue arises for a program committee chair, who may have to assign
  65 literally hundreds of papers to a program committee consisting of
  66 thirty to forty program committee members.
  67
  68 %{\em What does CMT use? What does Easychair use?}
  69
  70 From now on we will focus on the problem of assigning papers to
  71 reviewers.
  72 We assume that each reviewer is given access to the
  73 list of papers to be reviewed, and gives each paper both a ``desirability''
  74 score indicating his/her level of interest in reviewing the paper and an
  75 ``expertise'' score indicating how qualified he/she is to evaluate the paper.
  76 (Some organizations may choose to use a single set of scores for both
  77 desirability and expertise. We believe that making this distinction may better
  78 model the real-world objective.)
  79 A reviewer may also declare a conflict of interest with a particular paper,
  80 meaning that he/she is forbidden to review the paper.
  81
  82 We do not consider stable marriage type preference lists,
  83 because a strict ranking of papers would be rather tedious
  84 to produce. In this scheme, the papers are essentially grouped
  85 into a few categories.
  86
  87 Let $N$ be the number of papers and $P$ be the number of reviewers.
  88 Suppose that each paper needs $q$ reviews, so a total of $qN$
  89 reviews need to be generated.
  90 Ideally, from the perspective of the papers, we would like to
  91 assign each paper the $q$ most qualified reviewers for the paper.
  92 Of course, this could lead to a load imbalanced solution where
  93 the load on some program committee members is very high, and the
  94 load on others is low. On the other hand, we could insist
  95 on a perfectly load balanced solution in which the number
  96 of papers assigned to each program committee member does not
  97 exceed $L= \lceil qN/P  \rceil$. However, this may lead to a solution
  98 which is not optimal from the perspective of the papers.
  99
 100 One of our goals is to study precisely this tradeoff, and allow each
 101 reviewer to be assigned up to $L + C$ papers,
 102 where $C$ is the {\em load tolerance}. We consider the question:
 103 is it possible to obtain a high quality
 104 assignment with a fairly low value of $C$? One can also ask whether,
 105 in such an assignment, the reviewers receive the papers that they would
 106 have most liked to review.
 107
 108 {\em Stinkers} are papers that pretty much no-one wanted to review.
 109 We would like to spread the load of the stinkers as evenly as possible.
 110
 111
 112 \section{Formulation as a Min-Cost Max-Flow Problem}
 113
 114 \begin{figure*}
 115 \begin{center}
 116 \def\eg#1{\fbox{#1}}
 117 \centerline{\includexfig{flow}}
 118 \caption{Flow Construction.}
 119 \label{flow-fig}
 120 \end{center}
 121 \end{figure*}
 122
 123 We formulate the assignment problem as a min-cost max-flow problem,
 124 where each unit of flow represents one review.
 125 The construction is somewhat involved in order to incorporate all
 126 the desired incentives.  In several places, it makes
 127 use of sets of ``parallel edges'' of different costs connecting a
 128 single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$
 129 at a cost that grows faster than linear in the amount of the flow.  For
 130 example, if there are five unit-capacity edges from $x$ to $y$ of costs
 131 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be
 132 sent from $x$ to $y$ at a cost of $f^2$.
 133
 134 An example of the graph is shown in Figure~\ref{flow-fig}.
 135 We have a source $s$ and a sink $t$.
 136 For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each
 137 reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$.  (The rationale
 138 for these sets is discussed below.)
 139 Flow can pass from $s$ through one or more of the nodes $r^t_i$ and
 140 one or more of the nodes $p^t_j$ to the sink to represent a review
 141 by reviewer $i$ of paper $j$.
 142
 143 Each paper has an edge of capacity $r$ to
 144 the sink, indicating that it needs $r$ reviews.  In general, these
 145 edges will constitute the min cut, so any max flow will saturate them
 146 and thereby provide all the required reviews.  We take the min-cost
 147 max flow in order to provide the reviews in the ``best'' possible way.
 148
 149 Each reviewer has a zero-cost edge of capacity $L$ from the source so that
 150 he/she can be assigned $L$ papers.  If that were all, we would get a perfectly
 151 load-balanced solution, but we may be able to improve the quality of the
 152 assignments by allowing some imbalance.
 153 Therefore, we allow each reviewer to be overloaded by up to $C$ papers
 154 ($C$ is the load imbalance parameter) via a set
 155 of $C$ additional unit-capacity edges from the source.  We make the cost of the
 156 $l$th edge an increasing function $f(l)$ to provide an incentive to
 157 balance load across reviewers.  Since $2f(1) < f(1) + f(2)$,
 158 a solution that loads two reviewers each by $L+1$ will be preferred
 159 to a solution that loads one reviewer by $L$ and the other by $L+2$
 160 unless the load imbalance in the second solution is outweighed by
 161 other benefits.
 162
 163 For each reviewer $i$ and proposal $j$, there is a unit-capacity edge from $i$
 164 to $j$ allowing that pair to be assigned, unless the reviewer declared a
 165 conflict of interest, in which case the edge is not present.  The edge cost is
 166 based on the desirability value $d_{ij}$ stated by reviewer $i$ for proposal
 167 $j$.  For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost
 168 function $(10 + d_{ij})^2$, in an attempt to provide an incentive to avoid
 169 really bad matched pairs without completely masking the difference between a
 170 good matched pair and an excellent one.  This choice seeks only to achieve a
 171 natural relationship between a linear preference scale as normally interpreted
 172 and the costs to be used in the optimization.  We realize that strategic
 173 reviewers will take the cost function into account in choosing what desirability
 174 values to submit, in which case its form matters little.
 175
 176 Alongside these purely additive per-review costs,
 177 we want to avoid an individual reviewer
 178 getting too many papers he/she does not like.
 179 With respect to a reviewer $i$, we classify papers as ``interesting'',
 180 ``boring'', or ``very boring'' based on their desirability values;
 181 the thresholds for these classes are currently
 182 the same for all reviewers.  The edge for reviewer $i$ and paper
 183 $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or
 184 $r^3_i$ if $j$ is very boring.  Since all edges from the source enter $r^1_i$,
 185 flow for boring and very boring papers is forced to pass through a set of
 186 parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers
 187 must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$.
 188 In each of these sets, we make the cost of the $l$th edge an increasing
 189 function of $l$ to provide an incentive to balance the load of boring
 190 papers in the same way as for overload.
 191
 192 Finally, we would like at least one or two of each paper's reviews to be well
 193 qualified, if possible.  The method is the same as that for reviewer fairness.
 194 With respect to a paper $j$, we classify reviewers as ``expert'',
 195 ``knowledgeable'', or ``general'' by comparing the expertise values to uniform
 196 thresholds.  (Since this is the only place the expertise values are used, we
 197 effectively assume expertise takes on only these three values.)
 198 Edges representing
 199 expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$,
 200 and edges for general reviews enter $p^3_j$; the edge to the sink leaves
 201 $p^3_j$.  A paper's first knowledgeable (or expert) review scores a bonus $c_2$
 202 by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$,
 203 and an additional expert review scores another bonus $c_1$ by traversing a
 204 unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$.
 205 In addition to the bonus edges,
 206 there are edges of zero cost and unlimited capacity that reviews can follow
 207 from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink.
 208 The choice to offer bonuses for two reviews was based on the value $r = 3$;
 209 this would be easy to change for other values of $r$.
 210
 211 In the example in Figure~\ref{flow-fig},
 212 paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3.
 213 Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable.
 214 (Reviewer edges for paper 2 are not shown.)
 215 This illustrates how, in principle,
 216 the desirability and expertise relations might differ.
 217 Each is taken into account at a different stage of the construction.
 218
 219 The cost of a flow (assignment) is the sum of its reviewer overload costs,
 220 per-review costs, and reviewer boring / very boring load costs,
 221 minus paper bonuses.  Any one of these components can be traded off against
 222 the others.
 223
 224 \section{Evaluation}
 225
 226 We have implemented the matching algorithm, based on the construction above, in
 227 Haskell.
 228 The construction is intended to illustrate
 229 ways to model real concerns that we find reasonable a priori.
 230 We do not have enough experience with real-world
 231 instances to be confident that each part of the construction serves its intended
 232 purpose or that the parameter values we have chosen are suitable.
 233 Fortunately, the parameter values are easy to change in the source code of our
 234 implementation, and even substantive changes to the graph structure are not too
 235 hard to make.  At a minimum, we believe that min-cost max-flow provides a
 236 reasonable framework for global optimization of an assignment.
 237
 238 We have worked with Michael Hicks, program chair of POPL 2012, to apply our
 239 method to the paper assignment problem for that conference.
 240 The set of POPL 2012 reviewers consisted of the
 241 program committee (PC) and an external review committee (ERC),
 242 where the ERC served two purposes:
 243 \begin{itemize}
 244 \item To provide up to one knowledgeable (or expert) review per paper if
 245 prior knowledge of the topic was hard to find on the PC.
 246 \item To provide all reviews of papers with an author on the PC (``PC papers''),
 247 which were considered to
 248 pose an implicit conflict of interest to all PC members.
 249 \end{itemize}
 250 The special policy for ERC reviews of non-PC papers was realized via a more
 251 complex paper-side gadget, not described here.
 252 Based on an evaluation tool he wrote as well as manual inspection,
 253 Dr.~Hicks was satisfied that the decisions made by the matching tool
 254 closely reflected the best he could have done manually.
 255
 256 We are looking for additional opportunities to apply the matching tool.
 257 Anyone interested is invited to contact us so we can help adapt it
 258 to the scenario and document the experience gained.
 259
 260 %Waiting for data from NSF.
 261
 262 %Synthetic Data.
 263
 264 \section{Getting the Tool}
 265
 266 A distribution containing the source code for the matching tool as well as this
 267 document may be browsed or downloaded at (NOT YET):
 268 \[\hbox{\url{https://mattmccutchen.net/match/}}\]
 269 There are currently two branches:
 270 \begin{itemize}
 271 \item \code{master} has the tool as originally designed for NSF, with no
 272 distinction between desirability and expertise.
 273 \item \code{popl2012} is the basis of the version used for POPL 2012.  The main
 274 differences are that it has separate desirability and expertise, support for
 275 ``fixing'' previously chosen reviewer-paper pairs (buggy, however),
 276 and the special ERC gadget.
 277 \end{itemize}
 278 We regret that we do not currently have a single well-maintained version of the
 279 tool to recommend.
 280
 281 \begin{thebibliography}{99}
 282
 283 \bibitem{Flow}
 284 R. Ahuja, T. Magnanti and J. Orlin.
 285 Network Flows: Theory and Applications.
 286 {\em Prentice Hall}.
 287
 288
 289
 290 \end{thebibliography}
 291
 292 \end{document}