Wenhao Jia

HotCRP Beamer Generator

Download the code here and try it on your own HotCRP site. Read on for some background and detailed instructions.

Introduction

HotCRP is a widely used software package for setting up and managing peer-reviewed academic conference websites. It is maintained by Professor Eddie Kohler.

In 2013, while serving as the submission co-chair for the 40th International Symposium on Computer Architecture (ISCA 2013), I wrote a few scripts that automatically generate slides from HotCRP webpages for the program committee (PC) meeting. The slides carry conflict-of-interest information: each paper submission produces two slides. The first lists the PC members who have a conflict with the paper about to be discussed (and who should therefore leave the meeting room). The second, shown after the conflicted PC members have left, presents various statistics about the submission. An example of the slides can be found here. They are made using LaTeX's Beamer class.
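
As a rough illustration of the two-slide layout described above, the Python 3 sketch below emits two Beamer frames for one paper. It is not the code from beamer.py: the function name paper_frames, its parameters, and the frame titles are assumptions made for illustration only.

```python
def paper_frames(paper_id, conflicts, stats):
    """Return LaTeX source for the two frames of one paper.

    Hypothetical layout: conflicts is a list of PC member names and
    stats is a non-empty list of (label, value) pairs.
    """
    # Frame 1: PC members who should leave the room before the discussion.
    # An itemize environment needs at least one \item, hence the fallback.
    lines = [r"\begin{frame}{Conflicts for the next paper}", r"\begin{itemize}"]
    lines += [r"  \item " + name for name in (conflicts or ["(none)"])]
    lines += [r"\end{itemize}", r"\end{frame}", ""]
    # Frame 2: statistics, shown once the conflicted members have left.
    lines += [r"\begin{frame}{Paper \#%d}" % paper_id, r"\begin{itemize}"]
    lines += [r"  \item %s: %s" % (label, value) for label, value in stats]
    lines += [r"\end{itemize}", r"\end{frame}"]
    return "\n".join(lines)
```

Calling paper_frames(42, ["Alice Example"], [("Reviews", 4)]) returns a string with two frame environments. Names and values that contain LaTeX special characters would need to be escaped before being passed in.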

In response to popular demand, I'm releasing the scripts and the corresponding directions below. These scripts have been tested with HotCRP v2.52. However, they are quite fragile because they depend heavily on the specific structure of HotCRP's HTML pages. If a future HotCRP release breaks compatibility, please drop me a line and I'll try to fix it.

The steps below may look daunting, but they are quite easy to follow. The most time-consuming part is the preparatory steps.

NOTE: I forgo all copyrights of the code. You are free to make any modifications to the code and/or use it in your own projects. However, I would appreciate your mentioning my name or this webpage if you use the code in a conference PC meeting. Thank you!

Preparatory Steps

  1. Download the code and extract the content into a work directory. For this tutorial, the directory is named isca40.
  2. In your web browser, log in to the HotCRP website as an administrator. On the main page, use the search bar to list all needed papers (e.g. all Submitted Papers). Click the subsequent table column headers to sort papers into the display order you want in the generated slides (e.g. by decreasing Overall Metric). Save that web page as papers.html into the isca40 folder. Just saving the page source is fine; there is no need for images or scripts.
  3. In your web browser, log in to the HotCRP website as an administrator. On the main page, click Users in the right-hand side Administration toolbox. Save that web page as users.html into the isca40 folder. Just saving the page source is fine; there is no need for images or scripts.
  4. Customize the configuration if needed.
    1. (Required) Open makefile and replace isca40.cs.princeton.edu with your HotCRP website address at the beginning of the file.
    2. (Optional) Sometimes, PC members submit papers that are registered under their non-primary HotCRP accounts. The automated scripts may then fail to recognize the conflict of interest. If this happens, open pc.py and change the first variable definition to add these exceptions manually.
    3. (Optional) If some PC members are absent from a PC meeting and cannot lead discussions, add their names to the beginning of beamer.py. The script will pick a different discussion leader for relevant papers.
    4. (Optional) Normally, the PC member reviewer who gives the highest score to a paper is chosen as the discussion lead. However, if for some papers certain PC members must be hard-coded to lead the discussions, add those rules to the beginning of beamer.py.
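
The discussion-lead rules in the optional steps above can be sketched as follows. This is a minimal Python 3 illustration, not the logic in beamer.py: the names ABSENT, FORCED_LEADS, and pick_lead, the data layout, and the tie-breaking behavior are all assumptions.

```python
ABSENT = {"Carol Example"}            # hypothetical: PC members who cannot lead
FORCED_LEADS = {17: "Dave Example"}   # hypothetical: paper ID -> hard-coded lead

def pick_lead(paper_id, scores):
    """Pick a discussion lead from a {PC reviewer name: score} mapping."""
    # A hard-coded rule takes priority over everything else.
    if paper_id in FORCED_LEADS:
        return FORCED_LEADS[paper_id]
    # Drop reviewers who are absent from the meeting.
    present = {name: s for name, s in scores.items() if name not in ABSENT}
    if not present:
        return None  # no eligible reviewer; assign a lead by hand
    # Highest score wins; on a tie, the first reviewer in the mapping is kept.
    return max(present, key=present.get)
```

For example, with the settings above, pick_lead(3, {"Alice Example": 4, "Carol Example": 5}) skips the absent reviewer and returns "Alice Example".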

Main Steps

The steps below have been tested under OS X and Ubuntu. They assume some minimal proficiency with command line tools.

  1. In a command line window, go to the work directory (isca40 for this tutorial) and execute the following command.
    make
    That's all you need to do to get the final Beamer file, except for some interactive questions if this is the first time you invoke this command. Read on to find out what the command does.
    1. It invokes id.py to make a list of paper IDs from papers.html.
    2. You may then be asked for a HotCRP website administrator's login email and password. This information is used by cookies.sh to save the login cookies for later use. Don't worry: your password is discarded immediately after use and is never saved to disk.
    3. It executes papers.sh to download every paper's information. This may take a few minutes. Very rarely, due to poor network connections, the crawler may hang forever on a particular paper, e.g. papers/i.html. If this happens, break out of the script with Ctrl + C, delete the partially downloaded papers/i.html, and rerun the make command above. The crawler will resume from papers/i.html onward.
    4. The Python script pc.py is executed to parse users.html and generate a list of PC members and papers authored by them in pc.csv. You can open or print this file with Excel to track PC conflicts during a PC meeting.
    5. The Python script beamer.py is invoked to generate the final Beamer file (beamer/beamer.tex), a papers.csv file (a sort of reverse index of pc.csv), and an authors.csv file (a list of every paper's full author information). Like pc.csv, papers.csv and authors.csv can be used during a PC meeting for tracking purposes.
  2. Open the generated Beamer file, beamer/beamer.tex, and make whatever custom changes you want. When all is done, compile the Beamer file with the following command.
    cd beamer; pdflatex beamer.tex; pdflatex beamer.tex
    CAUTION: If the Beamer file fails to compile, the cause might be the limited character set LaTeX supports. If so, add rules to the escape() function in beamer.py to escape non-ASCII HTML characters.
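
For reference, a rule-based escape function of the kind mentioned in the CAUTION above might look like the Python 3 sketch below. The actual escape() in beamer.py may be organized differently; the rule table here is an assumption and covers only a handful of characters.

```python
import html

# Hypothetical rule table: LaTeX special characters plus a few non-ASCII ones.
RULES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
    "\u00e9": r"\'e",   # e with acute accent
    "\u00fc": r'\"u',   # u with umlaut
    "\u2019": "'",      # right single quotation mark
}

def escape(text):
    """Decode HTML entities, then map each character through RULES."""
    text = html.unescape(text)
    # One pass per character, so a replacement is never escaped a second time.
    return "".join(RULES.get(ch, ch) for ch in text)
```

With this table, escape("R&amp;D 100%") yields R\&D 100\%. A character that still breaks compilation would get one more entry in the table.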

It is worth knowing that the beamer.py script caches parsing results in a beamer.pickle file. Deleting that file forces the script to re-parse all HTML files in the papers folder. However, if you ever need to regenerate the Beamer file, the safest way to ensure up-to-date results is probably to run make clean, which removes all intermediate files. This deletes every non-script file and resets the directory to a pristine state, with the exception of the manually saved papers.html and users.html files.
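
The caching behavior described above follows a common pattern, sketched below in Python 3. This is not the code from beamer.py: load_or_parse and the parse_all callback are hypothetical names, and only the file name beamer.pickle comes from the text.

```python
import os
import pickle

def load_or_parse(parse_all, cache_path="beamer.pickle"):
    """Return cached parse results, or call parse_all() and cache its result."""
    # Reuse the cache if it exists; no check is made for newer HTML files.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    results = parse_all()  # e.g. parse every HTML file in the papers folder
    with open(cache_path, "wb") as f:
        pickle.dump(results, f)
    return results
```

Because a cache like this does not notice when the underlying HTML files change, removing it is the only way to force a fresh parse.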

Final Output

The main output is a LaTeX Beamer source file. There are also a few useful CSV files.

  • beamer/beamer.tex: Use LaTeX to compile this file to get the final slides.
  • pc.csv: A list of PC members and IDs of the papers they have authored.
  • papers.csv: A list of all papers and the number of PC member authors.
  • authors.csv: For each paper, the complete author information including full names and affiliations.
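
To illustrate how papers.csv relates to pc.csv, the Python 3 sketch below inverts a PC-member-to-papers mapping into per-paper counts. The column layout of the real CSV files is not documented here, so the in-memory format and the function name count_pc_authors are assumptions.

```python
from collections import Counter

def count_pc_authors(pc_papers, all_paper_ids):
    """Invert {PC member: [paper IDs]} into {paper ID: number of PC authors}."""
    counts = Counter()
    for member, paper_ids in pc_papers.items():
        counts.update(set(paper_ids))  # count each member once per paper
    # Include papers without any PC author, with a count of zero.
    return {pid: counts.get(pid, 0) for pid in all_paper_ids}
```

For instance, if one member authored papers 12 and 30 and another authored paper 30, the result for the paper list [12, 30, 41] maps 12 to 1, 30 to 2, and 41 to 0.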

Acknowledgments

I learned some neat shell scripting tricks from my colleague, Yavuz Yetim, in the process of making these scripts, and I appreciate his help. I would also like to thank my advisor, Professor Margaret Martonosi, for giving me the opportunity to contribute to the organization of an academic conference. Finally, much kudos to Professor Eddie Kohler for having made HotCRP such great open-source software!

This article was last updated on 2/12/2014.