\documentclass[11pt]{article}
\usepackage{url}
\usepackage[square,comma,numbers]{natbib}
\usepackage{color}

% Figure stuff. To add a new figure named `foo.fig', put
% `\includexfig{foo}' in the appropriate place and add `foo' to the
% figures line at the top of the Makefile.
\usepackage{ifpdf}
\ifpdf
\usepackage[pdftex]{graphicx}
\def\includexfig#1{\input{#1.pdf_t}}
\else
\usepackage[dvips]{graphicx}
\def\includexfig#1{\input{#1.ps_t}}
\fi

\usepackage{amssymb, amsmath, amsthm}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry}
\usepackage{floatflt}
\usepackage{pstricks}
\usepackage{delarray}

\title{Assigning Reviewers to Proposals}
\author{
Samir Khuller\thanks{Research currently supported by CCF-0728839. Email: {\tt samir@cs.umd.edu}.}\\
Dept.\ of Computer Science\\
University of Maryland, College Park MD 20742.
\and
Richard Matthew McCutchen\thanks{The bulk of this work was done while Matt was at the Dept.\ of Computer Science, University of Maryland, College Park, and supported by an REU Supplement to CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\
%Department of Computer Science\\
%University of Maryland, College Park, MD 20742.
}
\date{}

\begin{document}
\maketitle

%\begin{abstract}
%
%\end{abstract}

\section{Introduction}

Assignment problems arise in a variety of settings. At funding agencies such as the NSF, program directors who coordinate panels face the major challenge of assigning proposals to reviewers. It is important that each proposal receive sufficient review by qualified experts; at the same time, we would like to roughly balance the workload across reviewers and to honor the reviewers' preferences for which proposals they would like to read. The same issue arises for a program committee chair, who may have to assign hundreds of papers to a program committee of thirty to forty members.
%{\em What does CMT use? What does Easychair use?}
From now on we focus on the problem of assigning papers to reviewers. We assume that each reviewer is given access to the list of papers to be reviewed and gives each paper both a ``desirability'' score, indicating his/her level of interest in reviewing the paper, and an ``expertise'' score, indicating how qualified he/she is to evaluate the paper. (Some organizations may choose to use a single set of scores for both desirability and expertise; we believe that making the distinction may better model the real-world objective.) A reviewer may also report a conflict of interest with a particular paper, meaning that he/she is forbidden to review it. We do not consider stable-marriage-style preference lists, because a strict ranking of the papers would be rather tedious to produce; under our scheme, the papers are essentially grouped into a few categories.

Let $N$ be the number of papers and $P$ the number of reviewers, and suppose that each paper needs $r$ reviews, so that a total of $rN$ reviews must be generated. Ideally, from the perspective of the papers, we would assign each paper its $r$ most qualified reviewers. Of course, this could lead to a load-imbalanced solution in which some program committee members carry a very high load while others carry a low one. On the other hand, we could insist on a perfectly load-balanced solution in which the number of papers assigned to each program committee member does not exceed $L = \lceil rN/P \rceil$; however, this may produce a solution that is not optimal from the perspective of the papers. One of our goals is to study precisely this tradeoff: we allow each reviewer to be assigned up to $L + C$ papers, where $C$ is the {\em load tolerance}, and we consider the question of whether it is possible to obtain a high-quality assignment with a fairly low value of $C$.
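As a small illustration of the load bound (the function and the particular numbers below are hypothetical, not taken from any real panel):

```python
import math

def load_bound(num_papers, num_reviewers, reviews_per_paper):
    """Balanced per-reviewer limit L = ceil(r * N / P)."""
    return math.ceil(reviews_per_paper * num_papers / num_reviewers)

# For instance, 120 papers needing 3 reviews each, spread over 35
# reviewers, give L = ceil(360/35) = 11; with load tolerance C = 2,
# no reviewer may be assigned more than L + C = 13 papers.
```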
One can also ask whether, in such an assignment, the reviewers receive the papers they would most have liked to review. {\em Stinkers} are papers that essentially no one wanted to review; we would like to spread the load of the stinkers as evenly as possible.

\section{Formulation as a Min-Cost Flow Problem}

Our main approach is to formulate the assignment task as a min-cost flow problem. The construction is somewhat involved in order to incorporate all the appropriate incentives. It makes use of sets of ``parallel edges'' of different costs connecting a single pair of nodes $(x, y)$, which allow flow to be sent from $x$ to $y$ at a cost that grows faster than linearly in the amount of flow. For example, if there are five unit-capacity edges from $x$ to $y$ of costs 1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be sent from $x$ to $y$ at a cost of $f^2$, since a min-cost solution uses the cheapest edges first.

The construction is as follows. We have a source $s$ and a sink $t$. For each paper $j$ we create a set of nodes $p^1_j, p^2_j, p^3_j$, and for each reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$. (The rationale for these sets is discussed below.) See Figure~\ref{flow-fig} for an example. Flow can pass from $s$ through one or more of the nodes $r^k_i$ and one or more of the nodes $p^k_j$ to the sink to represent a review by reviewer $i$ of paper $j$. Each paper has an edge of capacity $r$ to the sink, indicating that it needs $r$ reviews. In general, these edges constitute the min cut, so any max flow saturates them and thereby provides all the required reviews; we take the min-cost max flow in order to provide the reviews in the ``best'' possible way. Each reviewer has a zero-cost edge of capacity $L$ from the source so that he/she can be assigned up to $L$ papers. If that were all, we would get a perfectly load-balanced solution, but we may be able to improve the quality of the assignments by allowing some imbalance.
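The parallel-edge gadget above can be checked numerically: a min-cost solution uses the cheapest edges first, so sending $f$ units costs the sum of the first $f$ odd numbers, which is $f^2$. A minimal sketch (function names are ours, not part of the construction):

```python
def parallel_edge_costs(k):
    """Costs 1, 3, 5, ... of k unit-capacity parallel edges from x to y."""
    return [2 * l - 1 for l in range(1, k + 1)]

def min_cost_to_send(f, edge_costs):
    """A min-cost flow routes f units over the f cheapest parallel edges."""
    return sum(sorted(edge_costs)[:f])

# With costs 1, 3, 5, 7, 9, sending f units costs 1 + 3 + ... + (2f - 1)
# = f^2: the gadget charges a convex (quadratic) total cost in the flow.
```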
To permit such imbalance, we allow each reviewer to be overloaded by up to $C$ papers, where $C$ is the load tolerance, via a set of $C$ additional unit-capacity edges from the source. We make the cost of the $l$th such edge an increasing function $f(l)$ to provide an incentive to balance load across reviewers: since $2f(1) < f(1) + f(2)$, a solution that loads two reviewers with $L+1$ papers each is preferred to one that loads one reviewer with $L$ papers and another with $L+2$, unless the imbalance in the second solution is outweighed by other benefits.

For each reviewer $i$ and paper $j$ there is a unit-capacity edge from $i$ to $j$ allowing that assignment to be made, unless the reviewer declared a conflict of interest, in which case the edge is absent. The edge has cost $(10 + p_{ij})^2$, where $p_{ij}$ is the preference value expressed by reviewer $i$ for paper $j$; we assume $p_{ij}$ is a value between 1 and 40 (as used by NSF), with lower values indicating stronger preference. This cost function was chosen to provide an incentive to avoid really bad assignments without completely masking the difference between a good assignment and an excellent one.

These purely additive assignment costs make the algorithm prefer better assignments in general, but they do nothing to promote fairness among reviewers or among papers. To do that, we introduce additional reviewer-side costs and paper-side bonuses. With respect to a reviewer $i$, we classify papers as interesting (preference value 1 to 15), boring (16 to 25), or very boring (26 to 40). The edge for reviewer $i$ and paper $j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or $r^3_i$ if $j$ is very boring. Since all edges from the source enter $r^1_i$, flow for boring and very boring assignments is forced to pass through a set of parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring assignments must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$.
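The construction up to this point (the source-side load and overload edges and the assignment edges; the reviewer-splitting and paper-side gadgets are omitted) can be sketched as an edge list suitable for an off-the-shelf min-cost max-flow solver. All names and the toy preference matrix below are hypothetical:

```python
def build_basic_network(prefs, r, L, C, f):
    """Edges (u, v, capacity, cost) for the core of the construction.

    prefs[i][j] is reviewer i's preference (1..40) for paper j,
    or None if a conflict of interest was declared.
    f(l) is the increasing cost of the l-th overload edge.
    """
    edges = []
    for i, row in enumerate(prefs):
        edges.append(("s", ("rev", i), L, 0))       # up to L papers at no cost
        for l in range(1, C + 1):                   # C overload edges, cost f(l)
            edges.append(("s", ("rev", i), 1, f(l)))
        for j, p in enumerate(row):
            if p is not None:                       # no edge on a conflict
                edges.append((("rev", i), ("pap", j), 1, (10 + p) ** 2))
    for j in range(len(prefs[0])):
        edges.append((("pap", j), "t", r, 0))       # each paper needs r reviews
    return edges
```

With the gadgets added, each reviewer node would be split into $r^1_i, r^2_i, r^3_i$ and each paper node into $p^1_j, p^2_j, p^3_j$ as described in the text.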
In each of these sets of parallel edges ($r^1_i$ to $r^2_i$ and $r^2_i$ to $r^3_i$), the cost of the $l$th edge is an increasing function of $l$, providing an incentive to balance the load of boring assignments across reviewers in the same way as for overload.

Similarly, we wish to guarantee each paper at least one or two good reviews. With respect to a paper $j$, we classify reviewers as expert (preference value 1 to 10), knowledgeable (11 to 20), or general (21 to 40). Edges representing expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$, and edges for general reviews enter $p^3_j$; the edge to the sink leaves $p^3_j$. A paper's first knowledgeable (or expert) review scores a bonus $c_2$ by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$, and an additional expert review scores a further bonus $c_1$ by traversing a unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$. In addition to the bonus edges, there are zero-cost, unlimited-capacity edges from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ that reviews can follow to reach the sink. The choice to offer bonuses for two reviews was based on the value $r = 3$; it would be easy to change for other values of $r$.

The cost of a flow is the sum of its reviewer overload costs, assignment costs, and reviewer boring/very-boring load costs, minus the paper bonuses. Any one of these components can be traded off against the others. We attempted to assign reasonable weights to each component, but it is difficult to know how well they work without testing the algorithm on real data; in any event, all the parameters are easy to tune to realize the priorities of a particular application.

\begin{figure*}
\begin{center}
\centerline{\includexfig{flow}}
\caption{Flow Construction.}
\label{flow-fig}
\end{center}
\end{figure*}

\section{Experimental Results}

We are awaiting data from NSF; we also plan experiments on synthetic data.
\section{Conclusions}

The source code for the current version of the proposal matcher may be browsed or downloaded at:
\[\hbox{\url{TODO}}\]

\begin{thebibliography}{99}
\bibitem{Flow} R. Ahuja, T. Magnanti, and J. Orlin. {\em Network Flows: Theory, Algorithms, and Applications}. Prentice Hall, 1993.
\end{thebibliography}

\end{document}