Zooming out: New tools for probing the historical record and the human genome
New structures often emerge when we explore a known phenomenon from a more global vantage point. For instance, the local structure of DNA is a double helix. But if DNA did not fold further, the human genome, which is two meters long, could never fit inside the nucleus of a cell. How does it fold? Or: any given book can be read and comprehended. But what happens when we try to read all the books at once? This talk will focus on the extraordinary potential of technologies that enable us to zoom out, in the process transforming familiar concepts, like the shape of DNA or the contents of a book, into new research horizons.
First, I will describe Hi-C, a novel technology for probing the three-dimensional architecture of whole genomes. Developed together with collaborators at the Broad Institute and UMass Medical School, Hi-C couples proximity-dependent DNA ligation and massively parallel sequencing. My lab employs Hi-C to construct spatial proximity maps of the human genome. Hi-C maps have revealed that active and inactive portions of the human genome are spatially segregated, i.e., that cells employ a sort of 'regulatory origami' as they turn genes on and off. At the megabase scale, the genomic fold is consistent with a fractal globule, a knot-free conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus.
Next, I will describe efforts, together with my collaborator Jean-Baptiste Michel and Google, to create tools for the quantitative analysis of a significant portion of the historical record. We began by constructing a reliable corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Such analyses are intuitive and addictive: the Google Ngram Viewer, a simple web-based tool we released for the analysis of this corpus, was used over a million times in the first 24 hours. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
Speaker: Erez Lieberman-Aiden, Google / Boston & Harvard Society of Fellows
Room 430
Monday, 04/23/12
Cost: Free
