The actual site: Project Gutenberg
I haven't been actively helping out much lately with Distributed Proofreaders or PG, but I did recently send them a complete list of problematic entries in their machine-readable catalog. The catalog is an XML document with bibliographic and file entries. One bibliographic entry may correspond to multiple files, either different formats or separate parts of the same text. However, there were quite a few file entries for which the bibliographic entry was missing entirely (someone had been getting sloppy). The five missing entries for existing texts were all in the last 3000 or so of nearly 30000 texts. I discovered the problem, and compiled the list of missing entries, while trying to track down a bug in GutenPy that shows up when it parses a catalog with such gaps. GutenPy is a nice little browser and reader for PG that has apparently been abandoned by its author. Aside from some minor bugs such as this one, which could be shaken out pretty easily, it has some very nice features like bookmarking.
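The check for file entries without a bibliographic entry is easy to reproduce. Here is a minimal Python sketch of that kind of check, not the script actually used for the report. The element and attribute names (`etext`, `file`, `id`, `name`, and the `etext` reference attribute) and the sample data are invented for illustration; the real catalog's schema may well differ, so treat them as placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified catalog layout for illustration only.
# The real catalog's element and attribute names may differ.
SAMPLE = """\
<catalog>
  <etext id="etext100"><title>First Book</title></etext>
  <etext id="etext101"><title>Second Book</title></etext>
  <file name="100.txt" etext="etext100"/>
  <file name="100-h.htm" etext="etext100"/>
  <file name="101.txt" etext="etext101"/>
  <file name="102.txt" etext="etext102"/>
</catalog>
"""


def find_orphaned_files(xml_text):
    """Map each missing bibliographic id to the files that reference it."""
    root = ET.fromstring(xml_text)

    # Collect the ids of all bibliographic entries that actually exist.
    known = {entry.get("id") for entry in root.iter("etext")}

    # Group file entries whose referenced bibliographic entry is absent.
    orphans = {}
    for file_entry in root.iter("file"):
        ref = file_entry.get("etext", "")
        if ref not in known:
            orphans.setdefault(ref, []).append(file_entry.get("name"))
    return orphans


if __name__ == "__main__":
    for etext_id, names in sorted(find_orphaned_files(SAMPLE).items()):
        print(etext_id, names)
```

On the sample above this reports `102.txt` as pointing at the nonexistent `etext102`. Grouping by the missing id rather than listing files one by one keeps the report short when a single missing bibliographic entry has several files attached to it.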