I’m working on the literature review chapter of the dissertation and it’s gotten me thinking. It’s a real pain to put together a good survey. It’s hard to know what papers are out there, what they say, and what’s notable about them. I’ve been using a few tools for this, but there’s a lot of room for improvement.
I’ve been using CiteULike for a while, and it’s great for scraping IEEE and JASA abstracts. It can import BibTeX files, but those entries are harder to link to PDFs and you might end up with a lot of duplicates if you’re not careful. On Neeraj’s recommendation, I’ve been trying out Mendeley and it’s a bit cooler. It can read in a directory of PDFs and figure out to some extent what they are. This is most useful with popular papers, because Mendeley has some sort of fingerprinter that recognizes the same PDF from multiple users and matches up the metadata. That way, only one person has to correct each entry and others can benefit. I’m not sure whether it can recognize when a PDF is the same as a BibTeX entry, but it might be able to. They also seem to have a very responsive feedback system using UserVoice. And of course there’s always Google Scholar to actually find these papers.
But, I think these apps could be a lot more useful. Instead of just linking to the papers that cite a paper, a lot could be gained by keeping track of the “anchor text” that does the linking. This means not only noting that paper X cites paper Y, but that paper X describes paper Y in this way, uses this information from it, cites it in this context, etc.
The first thing that this would enable would be the annotation of a paper’s bibliography with the relevant parts of its text. These are all of the outgoing links from a paper. By analyzing the paper, all of the [22]s or the (Mojo, 1987)s could be associated with the right entry in the bibliography. It would give the references some much-needed context. It would also show which references in a bibliography were actually discussed and which were just mentioned in passing. This could be an application by itself. And while we’re linking the references to the bibliography, it could put in some hyperlinks, like the hyperref package in LaTeX does, but after the fact.
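Here’s a rough sketch of what that matching might look like in Python. It only handles numeric markers like [22] and splits sentences very naively, so (Mojo, 1987)-style references and real PDF text would need more work. All of the names here (`annotate_bibliography`, `split_sentences`, `MARKER`) are made up for illustration and don’t come from any existing tool.

```python
import re
from collections import defaultdict

# Numeric citation markers such as [22] or [2, 3].
# Assumption: author-year styles like (Mojo, 1987) are not handled here.
MARKER = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def annotate_bibliography(body, bibliography):
    """Return {bibliography key: [sentences in the body that cite it]}.

    `bibliography` maps marker numbers (as strings) to reference text.
    Entries that are never cited in the body get an empty list.
    """
    contexts = defaultdict(list)
    for sentence in split_sentences(body):
        for match in MARKER.finditer(sentence):
            for key in re.split(r"\s*,\s*", match.group(1)):
                if key in bibliography:
                    contexts[key].append(sentence)
    return {key: contexts.get(key, []) for key in bibliography}
```

The number of sentences attached to each entry gives a crude measure of whether a reference was actually discussed or just mentioned in passing, and an entry with an empty list was never cited in the text at all.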
The second thing that it would enable would be the annotation of a paper with all of the things that other papers have said about it. These would be all of the incoming links from other papers. It would give you some context on a new paper that you had just come across in addition to what the authors wrote in the abstract. If you wanted to be really fancy, each reader could have trusted sources of these incoming links. These opinions are like little mini reviews or summaries that have already been published, so there’s no need to solicit readers’ opinions. Instead of the first few sentences on a “papers that cite X” page on Google Scholar, you’d get a page of summaries, reviews, and extracts.
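As a sketch of how the incoming side could work, assume each paper’s citing sentences have already been extracted and its bibliography entries resolved to some shared paper identifier (which is the hard part, and is not shown). Then the incoming view is just the outgoing data inverted, with an optional filter for a reader’s trusted sources. The function names and the data layout below are hypothetical.

```python
from collections import defaultdict


def incoming_contexts(outgoing):
    """Invert outgoing citation contexts into incoming ones.

    `outgoing` has the assumed shape
        {citing_paper: {cited_paper: [sentence, ...]}}
    and the result has the shape
        {cited_paper: [(citing_paper, sentence), ...]}.
    """
    incoming = defaultdict(list)
    for citing, cited_map in outgoing.items():
        for cited, sentences in cited_map.items():
            for sentence in sentences:
                incoming[cited].append((citing, sentence))
    return dict(incoming)


def summaries_for(paper, incoming, trusted=None):
    """List what other papers said about `paper`.

    If `trusted` is given, keep only remarks from those citing papers.
    """
    entries = incoming.get(paper, [])
    if trusted is not None:
        entries = [(citing, s) for citing, s in entries if citing in trusted]
    return entries
```

Calling `summaries_for("Y", incoming_contexts(outgoing))` would give the page of extracts for paper Y, and passing `trusted={"X"}` would restrict it to what paper X had to say.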
Both of these features would make it much easier to get introduced to a new field or to write a more balanced review of a familiar or semi-familiar field. I know it’s tough to match up bibliography entries with references and with papers themselves, and that there are some user interface issues to work out here, but it shouldn’t be that hard. Maybe crowdsourcing could help if necessary. Hopefully all of this would help allay that niggling fear of missing an important paper.

I’d like to have a Firefox memory monitor, like the Unix program top. It would show a list of all of the web pages currently open in different tabs and windows and how much of my system resources they’re each using. At the least, I’d like to know how much memory and CPU each is using, but other things like network connections, bandwidth, etc. would be nice to know as well.