Code Challenge
Update: the winner of the Code Challenge is
From the application home page:
This application finds all the different names for a particular species, and visualizes how often those names were published through history. You can see how the use of names grows and changes. The names are provided by the Encyclopedia of Life and the publications are from the Biodiversity Heritage Library.
Ryan also provided us with a statement about his application:
The graphs are intentionally a bit obtuse. My idea was to use heavy
smoothing and symmetry to create graphs that felt very organic—as
organic as the organisms they represent. I also wanted these blobby
graphs to represent the "flow" of time and knowledge. While it's
true that these graphs are a bit self-indulgent and do not show
individual data points as well as a line graph with annotated maxima,
they put a unique spin on biodiversity data that perhaps reveals large
trends more effectively than a more traditional line graph. I think
it's valuable to have both numerically rigorous graphs as well as a
more artistically expressive interpretation of the same data; Rod Page
et al. do a great job of the former, and I'm happy to provide the
latter.
One consequence of these graphs is that they make the copyright issue
in the BHL corpus immediately apparent. One possible next direction
would be to normalize the number of publications against the total
number of works in BHL. Thus instead of the graphs displaying the raw
counts of publication, they would display the percentage of the BHL
corpus in which a species name occurs. This would be a good experiment
to see how normalization affects the shape of the graphs.
I made the synynyms application completely from scratch for the Life
and Literature conference, based on an idea I had while working for
Holly Miller at the MBL. When the Life and Literature conference
announced its code challenge, I knew that it was the perfect
opportunity to turn my idea into an application. I am currently
available for freelance design, development, and data visualization
projects.
Congratulations, Ryan! Thanks as well to our other entrants.
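The normalization Ryan suggests in his statement can be sketched in a few lines of Python. This is only an illustration and not part of his application: it assumes the counts are grouped by publication year, and the function name and all numbers below are made up for the example.

```python
from typing import Dict


def normalize_counts(name_counts: Dict[int, int],
                     corpus_totals: Dict[int, int]) -> Dict[int, float]:
    """Convert raw per-year counts into a percentage of the corpus.

    name_counts   -- year -> number of works mentioning the species name
    corpus_totals -- year -> total number of works in the corpus that year
    Both inputs are hypothetical structures chosen for this sketch.
    """
    percentages = {}
    for year, count in sorted(name_counts.items()):
        total = corpus_totals.get(year, 0)
        if total == 0:
            # No works recorded for this year; skip it instead of dividing by zero.
            continue
        percentages[year] = 100.0 * count / total
    return percentages


# Invented numbers: the raw count collapses in 1950, but the share of the
# (much smaller) corpus for that year actually rises.
name_counts = {1900: 12, 1910: 30, 1950: 2}
corpus_totals = {1900: 400, 1910: 600, 1950: 20}

normalized = normalize_counts(name_counts, corpus_totals)
for year, pct in normalized.items():
    print(year, pct)
```

With raw counts, a thin year in the corpus looks like a drop in the use of a name; dividing by the yearly total separates the two effects.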
As part of the Life and Literature conference we are holding a code challenge to find new, innovative ways to use, disseminate or display BHL (Biodiversity Heritage Library) data.
Details
The Biodiversity Heritage Library (BHL) is a consortium of 12 natural history and botanical libraries that cooperate to digitize and make accessible the legacy literature of biodiversity held in their collections and to make that literature available for open access and responsible use as a part of a global “biodiversity commons.” BHL also serves as the foundational literature component of the Encyclopedia of Life (EOL). BHL content may be freely viewed through the online reader or downloaded in part or as a complete work in PDF, OCR text, or JPEG 2000 file formats.
Your challenge is to provide:
a new, innovative way to use, disseminate or display BHL data
a description of what your project is trying to accomplish
the source code to reproduce the application
any libraries or supporting code needed to reproduce the application
any build instructions or scripts needed to build the application, or instructions on how to run it
any notes about your experience implementing this code: how you came up with your design, blind alleys you went up, or surprising problems you ran into or anything else you want to share.
The dataset
Through local and global digitization efforts, BHL has digitized over 32 million pages of taxonomic literature, representing over 45,000 titles and 87,000 volumes (January 2011). The entire dataset is freely available and accessible via many open methods (see the next section for details).
Methods to access the BHL dataset
APIs - SOAP or simple HTTP Query (REST-like). The HTTP Query interface returns JSON or XML.
Data exports - PDF, MODS XML, text files, EndNote, BibTeX
Scientific name services using TaxonFinder
Stable URLs
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting)
Full documentation on all of these methods is available on our wiki.
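As a rough illustration of the OAI-PMH option, the sketch below builds a standard `ListRecords` request and reads Dublin Core titles out of a response. The verb, the `oai_dc` metadata prefix, the `resumptionToken` mechanism and the XML namespaces come from the OAI-PMH 2.0 protocol itself; the base URL, the helper names and the sample response are placeholders invented for this example, so check the wiki for the actual BHL endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (titles, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iterfind(".//dc:title", NS)]
    token_el = root.find(".//oai:resumptionToken", NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token


# Hand-written sample response, not real BHL data.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

# "https://example.org/oai" is a placeholder base URL.
first_url = list_records_url("https://example.org/oai")
titles, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
print(first_url)
print(titles, token)
print(next_url)
```

A harvester would fetch `first_url`, parse the response, and keep requesting with the returned token until the response no longer contains one.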
Rules
Any programming language is acceptable for any hardware platform and OS, mobile phones and web applications included.
The program or application can be a command-line tool, a GUI, a mobile application, a web application, or (for languages that provide some kind of interactive environment) a function that can be called with the name of the file to parse
The use of the APIs is restricted as defined in the Terms of Service for BHL Application Programming Interfaces (APIs)
Developers may collaborate with others to develop the best apps to enhance the experience
Developers can retain full IP rights to their submissions and can host their apps wherever they'd like, or make their projects open source under the license of their choosing