Tyler A. Green

In Transit

Tag: ruby

Updated: New York City Transit Frequency Visualization

Since I detailed my New York City transit frequency visualization project last month, there have been a few updates. Check out the web tool to view the changes!

What’s new?

  • The frequency buckets have been realigned to better parallel the psychology of how we use transit. The bins now group edges with fewer than 4 trips per hour, 4 to 8 trips per hour, and more than 8 trips per hour. Below 4 trips per hour, riders should generally consult a schedule before waiting on a curb, so it was important to separate these edges visually. The thickness of each edge now also increases with frequency.
  • There is now much more coverage in Queens bus data. No, MTA did not see my first update and decide to expand Queens service, though that would be awesome! I communicated with the Transitland team and my tool helped them discover they were previously missing the feed for the MTA Bus Company. It was historically a separate company and still has its own GTFS feed. I came up with some wild conclusions in my previous post on this project, several of which were rendered invalid by the completion of the data set.
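The new buckets can be sketched in a few lines of Ruby. This is only an illustration, not the code from my project: the method name, the bucket labels, the stroke widths, and the choice to treat exactly 4 and exactly 8 trips per hour as the middle bucket are all assumptions.

```ruby
# Hypothetical helper: the name, labels, and boundary handling are assumptions.
def frequency_bucket(trips_per_hour)
  if trips_per_hour < 4
    :low      # riders should probably check a schedule first
  elsif trips_per_hour <= 8
    :medium
  else
    :high
  end
end

# Assumed widths, chosen only to show thickness growing with frequency.
STROKE_WIDTHS = { low: 1, medium: 3, high: 5 }.freeze

[2, 4, 8, 12].each do |tph|
  bucket = frequency_bucket(tph)
  puts "#{tph} trips/hour -> #{bucket} (width #{STROKE_WIDTHS[bucket]})"
end
```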

What’s up next?

I’d still like to filter the express bus routes, provide finer-grained sorting by mode, and increase the dynamic nature of the tool in general. I’ve been working on an updated Ruby client to pair with the Transitland datastore, and have already updated my project source with the new interface. I’ve also begun dabbling with GTFS-realtime and plan to build a project with this specification soon.

We’re all #InTransit every day and I hope to have many more updates soon!

What kind of things are you working on? Let me know in the comments below!

The frequency data for subway routes on a Friday morning in New York City. The darker the color, the higher the frequency!

A Ruby Gem for GTFS to GeoJSON Conversion

I published my first Ruby gem: gtfs-geojson! You can view the source on GitHub. gtfs-geojson is a Ruby utility to convert a GTFS feed to a GeoJSON file. It’s a simple endeavor, for sure, but I’m pleased with what I learned along the way.

Let’s start out with some before-and-after views of the data. These images were created using QGIS, OpenStreetMap, Transfort’s GTFS feed, and the gtfs-geojson library.

The Transfort GTFS data loaded in QGIS before applying the Ruby gem for GTFS to GeoJSON conversion.

This map displays the shapes.txt file from Transfort’s GTFS feed loaded into QGIS. The seemingly-inconsistent shading on the lines is because there are no lines at all; each “line” is made up of a sequence of points. Each point contains a route ID and is ordered relative to the other points in its route by a point sequence value.

The Transfort GTFS data loaded in QGIS after applying the Ruby gem for GTFS to GeoJSON conversion.

After running the GTFS feed through gtfs-geojson, you now have a GeoJSON file with one feature for each route in the original feed. I used “Categorized” styles in QGIS to quickly apply a unique color to each route.

As with most transit projects, the input to gtfs-geojson is a GTFS feed. GTFS is the standard format published by transit agencies worldwide to make their routes, stops, and even fares usable by developers. The data is a series of comma-separated text files. To validate a GTFS feed, I used an existing gem, gtfs. It will fail gracefully if the shapes.txt file is not present, and shapes.txt is the only file I actually need for the conversion to GeoJSON.

gtfs-geojson implements the same algorithm as the “Points to path” QGIS tool I used when looking at Transfort bus data. The main trick is that the points within each route ID must be sorted by their point sequence value. Several other QGIS plugins I tried did not do this correctly, so don’t forget this if implementing this yourself!
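Here is a minimal sketch of that grouping-and-sorting step. It is not the gem's actual source: the method name is hypothetical, and it assumes the rows of shapes.txt have already been parsed into hashes keyed by the column names from the GTFS specification (shape_id, shape_pt_lat, shape_pt_lon, shape_pt_sequence), with every value still a string.

```ruby
require "json"

# Hypothetical method, not part of gtfs-geojson's API.
# rows: an array of hashes, one per line of shapes.txt, with string values.
def shapes_to_geojson(rows)
  features = rows.group_by { |row| row["shape_id"] }.map do |shape_id, points|
    # The main trick: sort by sequence, numerically. As strings, "10" sorts before "2".
    ordered = points.sort_by { |row| row["shape_pt_sequence"].to_i }
    {
      type: "Feature",
      properties: { shape_id: shape_id },
      geometry: {
        type: "LineString",
        # GeoJSON positions are [longitude, latitude], in that order.
        coordinates: ordered.map { |row| [row["shape_pt_lon"].to_f, row["shape_pt_lat"].to_f] }
      }
    }
  end

  JSON.generate({ type: "FeatureCollection", features: features })
end

# Made-up sample points, deliberately out of order.
sample_rows = [
  { "shape_id" => "A", "shape_pt_lat" => "40.58", "shape_pt_lon" => "-105.08", "shape_pt_sequence" => "10" },
  { "shape_id" => "A", "shape_pt_lat" => "40.57", "shape_pt_lon" => "-105.07", "shape_pt_sequence" => "2" },
  { "shape_id" => "B", "shape_pt_lat" => "40.50", "shape_pt_lon" => "-105.00", "shape_pt_sequence" => "1" },
  { "shape_id" => "A", "shape_pt_lat" => "40.56", "shape_pt_lon" => "-105.06", "shape_pt_sequence" => "1" }
]

puts shapes_to_geojson(sample_rows)
```

Converting the sequence value to an integer before sorting matters here; a plain string sort would put point 10 ahead of point 2 and scramble the line.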

While QGIS tools output shapefiles, gtfs-geojson produces a GeoJSON file, which is JSON that describes geospatial points and polylines in a standard format. I have previously loaded GeoJSON files in Mapbox applications, and they are also useful in a GIS context. The following three lines will load the library, validate the GTFS feed, convert its shapes.txt file to GeoJSON format, and write the GeoJSON to a file.

require 'gtfs-geojson'
geojson = GTFS::GeoJSON.generate("gtfs.zip")
File.open("gtfs.geojson", "w") { |f| f.write(geojson) }

That’s it! Let me know if you have any suggestions! The README on the GitHub repo gives installation instructions.

The most valuable tip I learned while creating this gem was the use of the $RUBYLIB environment variable. This isn’t necessary once a gem is installed on your system (with Bundler, for example), but it is extremely helpful during development. $RUBYLIB lets you specify extra directories to be searched when the require method is called. To add paths dynamically from inside a Ruby program, you can push items onto the $: array, which is shorthand for $LOAD_PATH. My require_relative days are over!
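As a small sketch of the in-program approach, the snippet below puts a lib directory on the load path. The directory name and its location next to the script are assumptions based on the conventional gem layout, not something specific to gtfs-geojson.

```ruby
# Assumes a conventional layout with a lib/ directory next to this script.
lib_dir = File.expand_path("lib", __dir__)

# $: and $LOAD_PATH refer to the same array, so either name works here.
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)

# From here on, require looks in lib_dir before the rest of the load path.
puts $LOAD_PATH.first
```

The environment variable gives the same effect without touching the code, for example by running the script as `RUBYLIB=lib ruby script.rb` (script.rb being a placeholder name).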

If you are considering writing your own gem, I highly recommend RubyGems.org’s “Make Your Own Gem” guide. It is comprehensive and just generally fantastic.

I plan to use gtfs-geojson in a Rails project in the future. And speaking of gems, I’ve also been working on a Ruby API client for Transitland. I hope to have more to share on both fronts soon!

Until then, ride on!

Have any transit projects to share? Let me know!

New York City Transit Depicted With (A New Set Of) Colorful Lines

Update 3/29/16: The transit visualization has been updated! The technical details in this post are still relevant, but some of the conclusions are no longer valid. Read about the updates here!

Stop the buses! Hold the phone! I now have visual proof that buses and subways in the Big Apple run more often on Fridays than Saturdays. How insightful, right? Okay, so maybe not, but I still enjoyed making a New York City transit frequency visualization using Transitland and Mapbox.

VIEW THE TOOL HERE. Try hovering over each route and turning on different days and modes (subway versus bus) of service.

Below are a few images showing the difference in frequency of transit service on Friday and Saturday, followed by a discussion of each component of the project.

Friday service in a New York City Transit Visualization

Friday morning subway and bus frequency. The coverage and frequencies are impressive!

Saturday service in a New York City Transit Visualization

Saturday morning bus and subway service. As expected, the coverage is similar to Friday’s, but the frequencies drop significantly.

What can we learn from this frequency visualization of New York City transit?

Some items this visualization illustrates are to be expected:

  • Transit runs with higher frequencies during the week.
  • Transit runs with higher frequencies in denser areas (Manhattan, Brooklyn) than less dense areas (Staten Island).

A few things made sense after seeing them, but were ideas I had not anticipated:

  • Even in dense areas, bus frequencies are higher in areas that have less subway service, and vice versa. While this is true in Manhattan (more subways and subway frequency) and Brooklyn (more buses and bus frequency), it is quite noticeable in Queens. When you turn the subway layer off, western Queens appears almost devoid of transit. While its subway connections do not reach to the eastern edge of Queens, they do begin to make up for a lack of bus routes in western Queens. A few images below show this.
  • The inter-borough connections between Queens and Brooklyn that are notoriously absent in all heavy rail maps of the area are almost as weak even when viewing bus data. It just isn’t easy to travel between Long Island’s two boroughs. Maybe the planned streetcar will finally help this.

One thing to keep in mind: the trips per hour numbers that appear when you hover over lines on the map are not specific to a transit route. They encompass all transit services, potentially multiple routes and even modes, between the two stops that create an edge.

Queens bus routes in a New York City Transit Visualization

Bus routes in western Queens. Doesn’t this seem like it’s missing something?

Queens subway routes in a New York City Transit Visualization

Bus and subway routes in western Queens. That’s a bit better.

The Data

Transitland is an open source project that aggregates transit feeds from across the world. You can query its JSON API to create apps and visualizations more easily than by directly crunching the underlying GTFS data.

I was inspired to dig into Transitland by this similar frequency visualization for San Francisco. We both use the stops and schedule_stop_pairs API endpoints to calculate how often the “edge” between any two consecutive transit stops is visited in a given time frame.

I chose an appropriate bounding box to encompass all the transit stops operated by MTA and picked a window of 7:30am to 8:00am on the mornings of Friday, January 22, 2016, and Saturday, January 23, 2016. In addition to buses and subways, ferry service is also returned by Transitland in this bounding box, which explains the trips to Staten Island and oddly-direct routes to New Jersey.

The data returned by Transitland is not real-time data of actual transit performance, only the scheduled service times on those dates. I was able to extrapolate a “trips per hour” frequency metric by dividing the edge weight by the length of my query’s time frame.
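To make that arithmetic concrete, here is a minimal sketch. The method name and the input shape are assumptions: each scheduled trip in the window is reduced to a hypothetical [origin stop, destination stop] pair, rather than the actual records returned by Transitland.

```ruby
# Hypothetical helper: counts how many scheduled trips traverse each edge,
# then scales the count by the length of the query window in hours.
def trips_per_hour(stop_pairs, window_hours)
  stop_pairs.tally.transform_values { |count| count / window_hours.to_f }
end

# Made-up trips scheduled between 7:30am and 8:00am (a 0.5 hour window).
pairs = [
  ["stop_a", "stop_b"],
  ["stop_a", "stop_b"],
  ["stop_a", "stop_b"],
  ["stop_b", "stop_c"]
]

trips_per_hour(pairs, 0.5).each do |(origin, destination), frequency|
  puts "#{origin} -> #{destination}: #{frequency} trips per hour"
end
```

With a half-hour window, three scheduled trips over the same edge work out to 3 / 0.5 = 6 trips per hour.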

The Map

I considered publishing a map using QGIS, but I was fortunate enough to stumble upon Mapbox. Mapbox does not have the analytical tools that QGIS does, but its ease of creating interactive web-based maps is impressive.

GeoJSON is a standard JSON variant that holds geographical information, such as points and line segments. In addition to its required fields, I loaded the GeoJSON output files with styling from Mapbox’s simplestyle-spec based on the frequency for that line segment. Mapbox interprets these “properties” fields when it displays the data on a map.
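A styled edge might look roughly like the sketch below. The property keys (stroke, stroke-width, stroke-opacity, title) come from simplestyle-spec, but the method name, the colors, the widths, and the coordinates are made up for illustration and are not taken from my project.

```ruby
require "json"

# Hypothetical helper: builds one GeoJSON feature for the edge between two stops.
# from and to are [longitude, latitude] pairs.
def styled_edge(from, to, trips_per_hour, color:, width:)
  {
    type: "Feature",
    geometry: { type: "LineString", coordinates: [from, to] },
    properties: {
      "stroke" => color,          # line color
      "stroke-width" => width,    # line thickness
      "stroke-opacity" => 0.8,
      "title" => "#{trips_per_hour} trips per hour"
    }
  }
end

# Illustrative values only.
edge = styled_edge([-73.99, 40.75], [-73.98, 40.76], 12, color: "#08306b", width: 5)

puts JSON.pretty_generate({ type: "FeatureCollection", features: [edge] })
```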

A good tool should be simple enough to let you spend time solving real problems, and I found Mapbox to reach this goal swimmingly (is there a similar term for transit??). It took remarkably little code to plot four GeoJSON files, toggle between them, show a map legend, and allow zooming and a loading screen, all on top of a satisfactory OpenStreetMap base map. I will most definitely be using Mapbox for future transit projects!

The Code

As the Transitland JSON interface is language-agnostic, any scripting language could be used. Ruby is by far my favorite, so I stuck with what I know. You can view the visualization in my GitHub repository. The code is divided into an HTML front-end and Ruby back-end, though they do not connect directly. A few ideas I have for the future of this project:

  • The TransitlandAPIReader class could be generalized into a gem with a decent test suite, similar to one Transitland used to maintain and intends to bring back.
  • The run.rb script could take a job spec input to produce GeoJSON files for multiple days and cities in a single run.
  • The Mapbox front-end could be used to visualize any arbitrary transit system’s GTFS shape data. This would likely be done using a Rails back-end, rather than the offline Ruby script I am currently using.

Other News

I spent another few hours this week getting lost reading about the Cincinnati subway. If you haven’t dived into that tunnel of information before, I’d highly recommend it. Something about using an old canal which had become economically unfeasible due to competition from railroads to build a tunnel system that was halted due to a moratorium on capital bonds during World War I and never successfully revived just fascinates me. Seriously, any single part of that last sentence would make for a good story, but all those together create a sort of transit tragedy worthy of a Shakespearean drama.

In the bed of the canal née Erie

doth thou venture to lay parallel rails.

To endure and inspire they began,

ere citizens above were admonished

their Sisyphean ambitions would fail.

I’m getting cold shivers just imagining a chorus reciting that at the opening of a transit conference. Please let me know of any other examples of transit stories told in iambic pentameter.

Until next time, ride on!