As the City embarks on implementing Intro 363-2014 and unlocking its daily actions, we are building a public workgroup to unlock the decades of historical information and make it accessible to all, at no charge.
Our group’s goal is to disassemble digital copies of the City Record and convert them into usable notifications, words, dates, and events. We want to turn solicitation and procurement notices and awards, public hearings, meetings, court notices, property dispositions, agency public hearings, agency rules, and changes in personnel into a powerful archive for all.
This project will start by converting the City Record PDFs, with more than 15 years of data, into usable information. Facilitated by BetaNYC and Socrata, we will turn these files into a first-class collection of information that builds a smarter city. Through this process, businesses, community groups, academics, and the public will learn how their City government works. This unique collaboration of government, industry, hackers, and advocates illustrates that opening up data isn't just about transparency but about building smarter, more inclusive, and more resilient governance.
- City of New York
- BetaNYC
- Citizens Union
- Dev Bootcamp
- Ontodia
- Socrata
- Sunlight Foundation
Currently, there are four things you can do.
- Join the City Record Online Working Group (CROW) discussion list.
- Help download and share PDFs
- Help document schemas
- Help document tools to help scrape
We are currently developing a number of outlets for downloading this treasure trove of information. In total, there are 16.3 GB of archival PDFs. If you have any problems downloading these files, report an issue on GitHub or the discussion list.
- 1998 to 2008 are scanned documents
- March 2008 till present are 'text selectable'
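Because the two eras need different handling (scanned pages need OCR, later files already carry text), a conversion script might branch on the issue date. This is a minimal sketch; the exact day of the cutover is an assumption, since only "March 2008" is known, and `needs_ocr` is a hypothetical helper name.

```python
from datetime import date

# Assumed cutoff: only "March 2008" is stated, so the first of the
# month is a guess, not a documented boundary.
TEXT_SELECTABLE_FROM = date(2008, 3, 1)


def needs_ocr(issue_date: date) -> bool:
    """Return True when an issue predates the text-selectable PDFs."""
    return issue_date < TEXT_SELECTABLE_FROM
```

For example, `needs_ocr(date(2001, 5, 14))` is true, while `needs_ocr(date(2010, 1, 4))` is false.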
If you have a tool that will crawl and download websites, you can download all of the PDFs from the City.
- Browse All City Records
- Download XML Listing for All City Records
- Suggested tools: SiteSucker (Mac OS X). We still need a suggestion for a Windows tool.
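If you would rather script the download than use a crawler, the XML listing can be mined for PDF links. The sketch below makes no claim about the listing's actual schema: it simply walks every element and keeps any text or attribute value ending in `.pdf`. The sample XML, its tag names, and the `example.org` URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET


def pdf_links(xml_text: str) -> list[str]:
    """Collect every element text or attribute value that ends in .pdf."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        candidates = [element.text or ""] + list(element.attrib.values())
        for value in candidates:
            value = value.strip()
            # Keep first-seen order and drop duplicates.
            if value.lower().endswith(".pdf") and value not in found:
                found.append(value)
    return found


# Hypothetical sample; the real listing's tags and URLs may differ.
sample = """<records>
  <record href="http://example.org/cityrecord/2008-03-03.pdf"/>
  <record><url>http://example.org/cityrecord/2008-03-04.pdf</url></record>
  <record><title>Not a PDF</title></record>
</records>"""

links = pdf_links(sample)
```

The resulting list can then be handed to whatever downloader you prefer.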
We have shared the complete collection of files via Dropbox. You can download them individually or you can add the primary folder and sync to a local storage device. (shareable link bit.ly/dropbox-crow)
As a bit of an experiment, we are using the BitTorrent protocol. This is the complete 16.3 GB collection. Please help by downloading and sharing these PDFs. You will need to download BitTorrent Sync and use the following read-only 'secret' passcode "BDGM4KAQHZ6XII2JNJREDX6VDN3QTLI7G" (shareable link bit.ly/bts-crow)
BetaNYC is hosting an FTP server with all of the PDFs. These files can be fetched anonymously via files.betanyc.us.
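For those who prefer a script to an FTP client, here is a rough sketch using Python's standard `ftplib`. The directory layout on the server is not documented here, so `remote_dir` and the local `dest` folder are assumptions you will likely need to adjust; `only_pdfs` and `fetch_pdfs` are hypothetical helper names.

```python
from ftplib import FTP
from pathlib import Path


def only_pdfs(names):
    """Keep just the entries whose names end in .pdf (case-insensitive)."""
    return [name for name in names if name.lower().endswith(".pdf")]


def fetch_pdfs(host="files.betanyc.us", remote_dir=".", dest="city-record"):
    """Download every PDF in remote_dir over anonymous FTP.

    Files that already exist locally are skipped, so delete any
    partial download before re-running.
    """
    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)
    with FTP(host) as ftp:
        ftp.login()  # no arguments means an anonymous login
        ftp.cwd(remote_dir)  # assumed location of the PDFs
        for name in only_pdfs(ftp.nlst()):
            local = target / Path(name).name
            if local.exists():
                continue
            with open(local, "wb") as handle:
                ftp.retrbinary(f"RETR {name}", handle.write)
```

Calling `fetch_pdfs()` with no arguments would try the server's login directory; pass `remote_dir` if the files live in a subfolder.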
If you have Google Drive installed on your computer, you can sync all of the PDFs to local storage. We are still uploading the documents, so for now you can only access the text-selectable documents (March 2008 till present) via http://bit.ly/gdocs-crow