Matt McCutchen's Web Site - match/match.git/blame

Commit	Line	Data
f3084293 MM	1	\documentclass[11pt]{article}
	2	\usepackage{url}
	3	\usepackage[square,comma,numbers]{natbib}
	4	\usepackage{color}
	5
	6	% Figure stuff. To add a new figure named `foo.fig', put
	7	% `\includexfig{foo}' in the appropriate place and add `foo' to the
	8	% figures line at the top of the Makefile.
	9	\usepackage{ifpdf}
	10	\ifpdf
	11	\usepackage[pdftex]{graphicx}
	12	\def\includexfig#1{\input{#1.pdf_t}}
	13	\else
	14	\usepackage[dvips]{graphicx}
	15	\def\includexfig#1{\input{#1.ps_t}}
	16	\fi
	17
	18	\usepackage{amssymb, amsmath, amsthm}
	19	\newtheorem{theorem}{Theorem}
	20	\newtheorem{lemma}{Lemma}
	21
	22	\usepackage[letterpaper,left=2.6cm,right=2.6cm,top=2.6cm,bottom=2.6cm]{geometry}
	23
f3084293 MM	24	\usepackage{pstricks}
	25	\usepackage{delarray}
	26
17e0995d MM	27	\def\code#1{\texttt{#1}}
	28
	29	\title{Assigning Papers to Reviewers}
f3084293 MM	30
	31	\author{
	32	Samir Khuller\thanks{Research currently supported by CCF-0728839.
	33	Email:{\tt samir@cs.umd.edu}.}\\
	34	Dept.\ of Computer Science\\
	35	University of Maryland, College Park MD 20742.
	36	\and
	37	Richard Matthew McCutchen\thanks{
	38	The bulk of this work was done while Matt was at the Dept.\ of Computer Science,
	39	University of Maryland, College Park, and supported by REU Supplement to
	40	CCF-0430650. Current email: {\tt matt@mattmccutchen.net}.}\\
	41	%Department of Computer Science\\
	42	%University of Maryland, College Park, MD 20742.
	43	}
	44
	45	\date{}
	46
	47	\begin{document}
	48
	49	\maketitle
	50
	51	%\begin{abstract}
	52	%
	53	%\end{abstract}
	54
	55	\section{Introduction}
	56	Assignment problems arise in a variety of settings.
	57	For funding agencies such as NSF program directors
	58	that co-ordinate panels, assigning proposals to
	59	reviewers is a major challenge. It is important that each proposal receive
	60	sufficient review by qualified experts, and at the same time we would like to
	61	roughly balance the workload across reviewers and to honor the reviewers'
	62	preferences for which proposals they would like to read. The same
	63	issue arises for a program committee chair, who may have to assign
	64	literally hundreds of papers to a program committee consisting of
	65	thirty to forty program committee members.
	66
	67	%{\em What does CMT use? What does Easychair use?}
	68
	69	From now on we will focus on the problem of assigning papers to
	70	reviewers.
	71	We assume that each reviewer is given access to the
08f65a95	72	list of papers to be reviewed, and gives each paper both a ``preference''
f3084293 MM	73	score indicating his/her level of interest in reviewing the paper and an
f3084293 MM	74	``expertise'' score indicating how qualified he/she is to evaluate the paper.
08f65a95 MM	75	(Some organizations may use a single preference score and assume that it
08f65a95 MM	76	also indicates expertise. We believe that making the distinction may better
f3084293	77	model the real-world objective.)
17e0995d	78	A reviewer may also declare a conflict of interest with a particular paper,
f3084293 MM	79	meaning that he/she is forbidden to review the paper.
	80
	81	We do not consider stable marriage type preference lists,
	82	because a strict ranking of papers would be rather tedious
	83	to produce. In this scheme, the papers are essentially grouped
	84	into a few categories.
	85
	86	Let $N$ be the number of papers and $P$ be the number of reviewers.
17e0995d	87	Suppose that each paper needs $q$ reviews, so a total of $qN$
f3084293 MM	88	reviews need to be generated.
f3084293 MM	89	Ideally, from the perspective of the papers, we would like to
17e0995d	90	assign each paper the $q$ most qualified reviewers for the paper.
f3084293 MM	91	Of course, this could lead to a load imbalanced solution where
	92	the load on some program committee members is very high, and the
	93	load on others is low. On the other hand, we could insist
	94	on a perfectly load balanced solution in which the number
	95	of papers assigned to each program committee member does not
17e0995d	96	exceed $L= \lceil qN/P \rceil$. However, this may lead to a solution
f3084293 MM	97	which is not optimal from the perspective of the papers.
	98
	99	One of our goals is to study precisely this tradeoff, and allow each
	100	reviewer to be assigned up to $L + C$ papers,
	101	where $C$ is the {\em load tolerance}. We consider the question:
	102	is it possible to obtain a high quality
	103	assignment with a fairly low value of $C$? One can also ask whether,
	104	in such an assignment, the reviewers receive the papers that they would
	105	have most liked to review.
	106
	107	{\em Stinkers} are papers that pretty much no-one wanted to review.
	108	We would like to spread the load of the stinkers as evenly as possible.
	109
	110
17e0995d MM	111	\section{Formulation as a Min-Cost Max-Flow Problem}
	112
	113	\begin{figure*}
	114	\begin{center}
	115	\def\eg#1{\fbox{#1}}
	116	\centerline{\includexfig{flow}}
	117	\caption{Flow Construction.}
	118	\label{flow-fig}
	119	\end{center}
	120	\end{figure*}
f3084293	121
17e0995d MM	122	We formulate the assignment problem as a min-cost max-flow problem,
17e0995d MM	123	where each unit of flow represents one review.
f3084293	124	The construction is somewhat involved in order to incorporate all
17e0995d	125	the desired incentives. In several places, it makes
f3084293 MM	126	use of sets of ``parallel edges'' of different costs connecting a
	127	single pair of nodes $(x, y)$ to allow flow to be sent from $x$ to $y$
	128	at a cost that grows faster than linear in the amount of the flow. For
	129	example, if there are five unit-capacity edges from $x$ to $y$ of costs
	130	1, 3, 5, 7, and 9, then any integer amount $f \le 5$ of flow can be
	131	sent from $x$ to $y$ at a cost of $f^2$.
	132
17e0995d MM	133	An example of the graph is shown in Figure~\ref{flow-fig}.
17e0995d MM	134	We have a source $s$ and a sink $t$.
f3084293 MM	135	For each paper $j$ we create a set of nodes $p^1_j, p^2_j,p^3_j$, and for each
f3084293 MM	136	reviewer $i$ we create a set of nodes $r^1_i, r^2_i, r^3_i$. (The rationale
17e0995d MM	137	for these sets is discussed below.)
17e0995d MM	138	Flow can pass from $s$ through one or more of the nodes $r^t_i$ and
f3084293 MM	139	one or more of the nodes $p^t_j$ to the sink to represent a review
	140	by reviewer $i$ of paper $j$.
	141
b43dc33a MM	142	Each paper has an edge of capacity $q$ to
b43dc33a MM	143	the sink, indicating that it needs $q$ reviews. In general, these
f3084293 MM	144	edges will constitute the min cut, so any max flow will saturate them
	145	and thereby provide all the required reviews. We take the min-cost
	146	max flow in order to provide the reviews in the ``best'' possible way.
	147
	148	Each reviewer has a zero-cost edge of capacity $L$ from the source so that
	149	he/she can be assigned $L$ papers. If that were all, we would get a perfectly
	150	load-balanced solution, but we may be able to improve the quality of the
	151	assignments by allowing some imbalance.
	152	Therefore, we allow each reviewer to be overloaded by up to $C$ papers
	153	($C$ is the load imbalance parameter) via a set
	154	of $C$ additional unit-capacity edges from the source. We make the cost of the
	155	$l$th edge an increasing function $f(l)$ to provide an incentive to
	156	balance load across reviewers. Since $2f(1) < f(1) + f(2)$,
	157	a solution that loads two reviewers each by $L+1$ will be preferred
	158	to a solution that loads one reviewer by $L$ and the other by $L+2$
	159	unless the load imbalance in the second solution is outweighed by
	160	other benefits.
	161
4e46fa87	162	For each reviewer $i$ and paper $j$, there is a unit-capacity edge from $i$
17e0995d MM	163	to $j$ allowing that pair to be assigned, unless the reviewer declared a
17e0995d MM	164	conflict of interest, in which case the edge is not present. The edge cost is
08f65a95	165	based on the preference value $a_{ij}$ stated by reviewer $i$ for paper
17e0995d	166	$j$. For values on the NSF scale of 1 (best) to 40 (worst), we chose the cost
08f65a95	167	function $(10 + a_{ij})^2$, in an attempt to provide an incentive to avoid
17e0995d MM	168	really bad matched pairs without completely masking the difference between a
	169	good matched pair and an excellent one. This choice seeks only to achieve a
	170	natural relationship between a linear preference scale as normally interpreted
	171	and the costs to be used in the optimization. We realize that strategic
08f65a95	172	reviewers will take the cost function into account in choosing what preference
17e0995d MM	173	values to submit, in which case its form matters little.
	174
	175	Alongside these purely additive per-review costs,
	176	we want to avoid an individual reviewer
	177	getting too many papers he/she does not like.
	178	With respect to a reviewer $i$, we classify papers as ``interesting'',
08f65a95	179	``boring'', or ``very boring'' based on their preference values;
17e0995d MM	180	the thresholds for these classes are currently
17e0995d MM	181	the same for all reviewers. The edge for reviewer $i$ and paper
f3084293 MM	182	$j$ leaves from $r^1_i$ if $j$ is interesting, $r^2_i$ if $j$ is boring, or
f3084293 MM	183	$r^3_i$ if $j$ is very boring. Since all edges from the source enter $r^1_i$,
17e0995d MM	184	flow for boring and very boring papers is forced to pass through a set of
17e0995d MM	185	parallel edges from $r^1_i$ to $r^2_i$, and flow for very boring papers
f3084293 MM	186	must pass through an additional set of parallel edges from $r^2_i$ to $r^3_i$.
	187	In each of these sets, we make the cost of the $l$th edge an increasing
	188	function of $l$ to provide an incentive to balance the load of boring
17e0995d MM	189	papers in the same way as for overload.
	190
	191	Finally, we would like at least one or two of each paper's reviews to be well
	192	qualified, if possible. The method is the same as that for reviewer fairness.
	193	With respect to a paper $j$, we classify reviewers as ``expert'',
	194	``knowledgeable'', or ``general'' by comparing the expertise values to uniform
	195	thresholds. (Since this is the only place the expertise values are used, we
	196	effectively assume expertise takes on only these three values.)
	197	Edges representing
f3084293 MM	198	expert reviews enter $p^1_j$, edges for knowledgeable reviews enter $p^2_j$,
	199	and edges for general reviews enter $p^3_j$; the edge to the sink leaves
	200	$p^3_j$. A paper's first knowledgeable (or expert) review scores a bonus $c_2$
	201	by traversing a unit-capacity edge of cost $-c_2$ from $p^2_j$ to $p^3_j$,
	202	and an additional expert review scores another bonus $c_1$ by traversing a
	203	unit-capacity edge of cost $-c_1$ from $p^1_j$ to $p^3_j$.
	204	In addition to the bonus edges,
	205	there are edges of zero cost and unlimited capacity that reviews can follow
	206	from $p^1_j$ to $p^2_j$ and from $p^2_j$ to $p^3_j$ in order to reach the sink.
b43dc33a MM	207	The choice to offer bonuses for two reviews was based on the value $q = 3$;
b43dc33a MM	208	this would be easy to change for other values of $q$.
f3084293	209
17e0995d MM	210	In the example in Figure~\ref{flow-fig},
	211	paper 1 is interesting to reviewer 1 and boring to reviewers 2 and 3.
	212	Reviewer 2 is expert on paper 1, with reviewers 1 and 3 merely knowledgeable.
	213	(Reviewer edges for paper 2 are not shown.)
	214	This illustrates how, in principle,
08f65a95	215	the preference and expertise relations might differ.
17e0995d MM	216	Each is taken into account at a different stage of the construction.
	217
	218	The cost of a flow (assignment) is the sum of its reviewer overload costs,
	219	per-review costs, and reviewer boring / very boring load costs,
	220	minus paper bonuses. Any one of these components can be traded off against
	221	the others.
	222
	223	\section{Evaluation}
	224
	225	We have implemented the matching algorithm, based on the construction above, in
	226	Haskell.
	227	The construction is intended to illustrate
	228	ways to model real concerns that we find reasonable a priori.
	229	We do not have enough experience with real-world
	230	instances to be confident that each part of the construction serves its intended
	231	purpose or that the parameter values we have chosen are suitable.
	232	Fortunately, the parameter values are easy to change in the source code of our
	233	implementation, and even substantive changes to the graph structure are not too
	234	hard to make. At a minimum, we believe that min-cost max-flow provides a
	235	reasonable framework for global optimization of an assignment.
	236
	237	We have worked with Michael Hicks, program chair of POPL 2012, to apply our
	238	method to the paper assignment problem for that conference.
	239	The set of POPL 2012 reviewers consisted of the
	240	program committee (PC) and an external review committee (ERC),
	241	where the ERC served two purposes:
	242	\begin{itemize}
	243	\item To provide up to one knowledgeable (or expert) review per paper if
	244	prior knowledge of the topic was hard to find on the PC.
	245	\item To provide all reviews of papers with an author on the PC (``PC papers''),
	246	which were considered to
	247	pose an implicit conflict of interest to all PC members.
	248	\end{itemize}
	249	The special policy for ERC reviews of non-PC papers was realized via a more
	250	complex paper-side gadget, not described here.
	251	Based on an evaluation tool he wrote as well as manual inspection,
	252	Dr.~Hicks was satisfied that the decisions made by the matching tool
	253	closely reflected the best he could have done manually.
	254
	255	We are looking for additional opportunities to apply the matching tool.
	256	Anyone interested is invited to contact us so we can help adapt it
	257	to the scenario and document the experience gained.
	258
	259	%Waiting for data from NSF.
	260
	261	%Synthetic Data.
	262
	263	\section{Getting the Tool}
	264
	265	A distribution containing the source code for the matching tool as well as this
fcdd42c0	266	document may be browsed or downloaded at:
17e0995d MM	267	\[\hbox{\url{https://mattmccutchen.net/match/}}\]
	268	There are currently two branches:
	269	\begin{itemize}
	270	\item \code{master} has the tool as originally designed for NSF, with no
08f65a95	271	distinction between preference and expertise.
17e0995d	272	\item \code{popl2012} is the basis of the version used for POPL 2012. The main
08f65a95	273	differences are that it has separate preference and expertise, support for
17e0995d MM	274	``fixing'' previously chosen reviewer-paper pairs (buggy, however),
	275	and the special ERC gadget.
	276	\end{itemize}
	277	We regret that we do not currently have a single well-maintained version of the
	278	tool to recommend.
f3084293 MM	279
	280	\begin{thebibliography}{99}
	281
	282	\bibitem{Flow}
	283	R. Ahuja, T. Magnanti and J. Orlin.
	284	Network Flows: Theory and Applications.
	285	{\em Prentice Hall}.
	286
	287
	288
	289	\end{thebibliography}
	290
	291	\end{document}