The IPython notebook 'Project 1948 - Language detection and conversion.ipynb' describes how the data is extracted. Because the data is stored in Google Drive, a client-secrets file is needed to run it.
The IPython notebook 'Using Local Copy of Project 1948' describes how the data from local copies of the text files is loaded into strings.
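
As a rough illustration of the local-copy approach, the sketch below reads every text file in a folder into a dictionary of strings keyed by file name. It is not taken from the notebook: the function name `load_text_files`, the `*.txt` pattern, and the UTF-8 encoding are all assumptions made for the example.

```python
from pathlib import Path


def load_text_files(directory, pattern="*.txt", encoding="utf-8"):
    """Read each matching file in `directory` into a string.

    Hypothetical helper: the name, the pattern, and the encoding are
    assumptions, not details taken from the notebook.
    """
    texts = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            # errors="replace" keeps the load going if a file contains
            # bytes that are not valid in the assumed encoding.
            texts[path.name] = path.read_text(encoding=encoding, errors="replace")
    return texts
```

Calling `load_text_files("path/to/local/copy")` (a placeholder path) would return a mapping such as `{"file1.txt": "...", "file2.txt": "..."}`, which could then be passed on to the language detection step.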