Metra

Chicago commuters now track live train arrivals with a cloud, CDN, and DevOps environment architected by Deft.

Metra is a commuter railroad in the Chicago metropolitan area. It is the largest and busiest commuter rail system outside New York City.

INDUSTRY: Transportation
LOCATIONS: Chicago, IL
CUSTOMER SINCE: 2015

We had the incredible opportunity to work with Clarity Partners on an exciting large-scale project to update, expand, and revamp the backend of Metra’s website.

The challenge

With the advent of new technologies and standards, Metra wanted to enhance their rider experiences by developing a new website that made getting pertinent information quick and easy. Inclement weather, track and station construction, heavy usage during peak hours, and special events contributed to an ever-growing frustration amongst Metra riders because of the inability to get accurate and real-time information on trains and their locations.

“Our goal was to create a customer-friendly website that presents information in a logical and intuitive way. We hope these changes will make using our website—and using our train service—an even more satisfying experience.”

Don Orseno

Executive Director/CEO, Metra

“Metra debuts new and improved metrarail.com”

metrarail.com

A significant problem with the Metra website, and its infrastructure, was the way in which GTFS data were delivered to riders via a browser-based pull mechanism, rather than modern server-based push technology such as WebSockets. Alerts were not presented within the context of their respective train or route and were, instead, provided in a manner that caused riders to scroll through an often lengthy list of notices to find any that may have impacted their particular train or route.

The new Metra system needed to be able to handle massive spikes in traffic to their website during special events and severe weather. Scaling of the site could be scheduled in advance, but unforeseen events, such as inclement weather or mechanical failures, left them vulnerable to unpredictable spikes. Due to the elastic nature of Metra’s ridership, the Metra project was a perfect opportunity to utilize much of the functionality already available in cloud offerings.

Deft was brought in to leverage our expertise in AWS’ vast functionality, implementation, and management. We needed to architect an environment that would expand and contract automatically with demand and load.

Designing this type of environment can be challenging. It requires developers to understand the elastic principles of the cloud and to create software that can handle the variable nature of auto-scaling, in which servers appear and disappear as the number of cloud server instances ramps up or down.

Because a website crash would be disastrous, the system needed to be highly available and fault tolerant.

It was also vital that development and deployment were straightforward, consistent, reliable, and non-impactful to site availability.

Once complete, we needed to be able to hand off the environment to Metra, ensuring that all involved software and systems engineers had the ability to update code, content, and live GTFS feed data promptly to improve rider communication.

Finally, because the website would process credit cards for ticket sales, we collaborated with Clarity Partners to achieve Payment Card Industry (PCI) compliance. PCI compliance verifies, via audit, that Metra and its partners take all available precautions, applying security standards and multiple layers of defense, to protect sensitive consumer data.

The solution

Because websites and the infrastructures that run them do not scale to meet demand automatically and are not fault-tolerant by default, the environment was designed with multiple availability zones.

This meant hosting the site across multiple physical data centers.

The solution required the use of both a CDN and elastic load balancers.

A well-architected auto-scaling policy was also critical, as the environment needed to grow and shrink based on utilization.

Using auto-scaling to cut costs

Auto-scaling helps maintain application availability and allows you to scale capacity up or down automatically according to conditions you define. Auto-scaling can be used to help ensure that you are running a desired number of cloud instances. It can automatically increase the number of instances during demand spikes, and decrease capacity during lulls.

In addition to overall management of the Metra project, Clarity Partners was responsible for the site front-end design and architecture of the Drupal components, ticket sales e-commerce integration, and supporting the PCI compliance initiative.

As a DevOps-focused company, we guided the Clarity development team on how to deploy, run, and code Drupal for an auto-scaling environment. We also helped design and deploy the Continuous Integration (CI) jobs and processes and led triage and management of all issues.

[Screenshots: detailed service alert data; Transit Tracker app preview; Find A Metra Train mobile app feature; live transit tracker view in mobile app; pop-up alert in mobile app]

The hosting infrastructure now consists of development, staging, and production environments. Through CI tools and processes, Metra and Clarity can test new code and ideas in the development and staging environments before pushing them to production with confidence. The DevOps-oriented process also allows multiple development teams, including Metra’s internal development team, to contribute to the code base.

To address the lack of real-time push capability of GTFS data, we built and continue to maintain the real-time data feed for alerts, train positions, and schedule data. Our system makes this data available via WebSockets and a standard JSON-based REST API.
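The difference between the old pull model and a push model can be illustrated with a minimal in-process sketch. This is not the production feed (which the text describes as WebSockets plus a JSON REST API); the AlertHub class, the alert fields, and the sample route IDs and messages are all hypothetical, chosen only to show subscribers receiving alerts for their own route without polling or scrolling through unrelated notices.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Alert = Dict[str, str]
Subscriber = Callable[[Alert], None]


class AlertHub:
    """Toy publish/subscribe hub keyed by route ID (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, route_id: str, callback: Subscriber) -> None:
        # Register a callback that is invoked for every alert on this route.
        self._subscribers[route_id].append(callback)

    def publish(self, alert: Alert) -> int:
        """Push an alert to every subscriber of its route; return the delivery count."""
        callbacks = self._subscribers.get(alert["route_id"], [])
        for callback in callbacks:
            callback(alert)
        return len(callbacks)


hub = AlertHub()
received: List[Alert] = []

# A rider's client subscribes once to the route it cares about.
hub.subscribe("BNSF", received.append)

# The server pushes alerts as they occur; only matching routes are delivered.
hub.publish({"route_id": "BNSF", "text": "Sample alert: train delayed 10 minutes"})
hub.publish({"route_id": "UP-N", "text": "Sample alert: track work near station"})

for alert in received:
    print(alert["route_id"], "-", alert["text"])
```

In a real deployment the callback would be replaced by a network connection to the rider's browser or app, but the routing idea is the same: the server decides who needs each message and sends it immediately.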

Additionally, we provide direct access to the raw GTFS data feed through our system, which protects Metra’s GTFS source from being overloaded by too many requests.

In addition to providing data to riders via the website, Metra provides GTFS data to major providers such as Google, Microsoft, Yahoo, and others for their mapping and routing applications.

The new real-time messaging platform, based on Node.js and PubNub, generates and delivers millions of messages a day to website users and other connected devices. To be useful, these messages need to be delivered in a reliable, scalable, and timely fashion.

 

Figure 6: Real-Time GTFS Messages – 3 Months Trailing (Spike on 11/4 was for the Chicago Cubs World Series parade and rally)

Metra now enjoys a fully managed environment and automated code deployments throughout the development cycle.

Managing technology costs with automation

CDNs accelerate delivery of websites, APIs, video content or other web assets. With a CDN, you don’t need to worry about maintaining expensive web server capacity to meet the demand for content from potential traffic spikes.

The service automatically responds as demand increases or decreases, without any intervention. It allows us to serve cached data to metrarail.com end users rather than passing every client request to the web and application servers. This significantly increases the efficiency of the Metra site and ultimately reduces the instances and resources needed to serve riders.

The positive impact on site performance was significant and immediate: at one point, over 5.5 TB of data was pushed from the CDN, yet only 61 GB of data had to be served from the web servers.
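To put those two figures in proportion, here is a rough calculation of the cache offload they imply. It assumes both numbers cover the same period, that the 61 GB is the origin traffic behind the 5.5 TB delivered, and that decimal units apply; none of this is stated in the source, and "over 5.5 TB" makes the result approximate. The function name is illustrative.

```python
def cache_offload_ratio(cdn_bytes: float, origin_bytes: float) -> float:
    """Fraction of CDN-delivered bytes that did not have to come from the origin."""
    if cdn_bytes <= 0:
        raise ValueError("cdn_bytes must be positive")
    return 1.0 - origin_bytes / cdn_bytes


TB = 10**12  # decimal terabyte (assumption)
GB = 10**9   # decimal gigabyte (assumption)

ratio = cache_offload_ratio(5.5 * TB, 61 * GB)
print(f"Approximate cache offload: {ratio:.1%}")  # prints "Approximate cache offload: 98.9%"
```

Under these assumptions, roughly 1 byte in 90 delivered to riders had to be produced by the web servers.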

Running far fewer cloud server instances and other infrastructure to service end users helps Metra manage their server costs.

Figure 7: Benefits of CloudFront, Highlighted (The spike to 500 GB on 11/4/16 was in support of the Chicago Cubs World Series parade & rally)

The auto-scaling solution we developed with Clarity has allowed the site to scale and respond to demand in three fundamental ways:

First, we programmed scheduled scale-up and scale-down scenarios that increase the size of the cluster before rush hour, when the site is in highest demand, and decrease it again afterward. If full capacity is not required 24×7, it’s better to run cloud instances at peak capacity for only 8 hours a day.

Second, the site can also auto-scale itself if the load on servers or the page response times get too high. This ability to scale dynamically means that Metra only runs servers when they’re needed, automatically adjusting to peak and off-peak times, which saves Metra a considerable amount of money in hosting costs. Metra also doesn’t need to try to plan for unforeseen spikes, since the site will auto-scale up and handle them as the need arises.

Finally, we are able to proactively scale up the Metra environment in anticipation of supporting traffic spikes related to Chicago events, like Lollapalooza and Taste of Chicago. We even managed the Metra environment through surges in ridership after the Cubs won the World Series. 
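The three scaling modes above can be sketched as one decision function. This is a simplified illustration, not the actual AWS policy configuration used for Metra: every number (server counts, CPU thresholds, rush-hour windows), the ScalingConfig type, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class ScalingConfig:
    # All values are illustrative assumptions, not Metra's real settings.
    min_servers: int = 2
    rush_hour_servers: int = 6
    max_servers: int = 20
    cpu_scale_up: float = 70.0    # percent; add a server above this
    cpu_scale_down: float = 30.0  # percent; remove a server below this


# Assumed rush-hour windows, 8 hours in total.
RUSH_WINDOWS = ((time(5, 30), time(9, 30)), (time(15, 30), time(19, 30)))


def scheduled_floor(now: datetime, cfg: ScalingConfig) -> int:
    """Mode 1: scheduled scaling sets a higher minimum during rush hour."""
    in_rush = any(start <= now.time() < end for start, end in RUSH_WINDOWS)
    return cfg.rush_hour_servers if in_rush else cfg.min_servers


def desired_servers(
    now: datetime,
    current: int,
    avg_cpu: float,
    cfg: ScalingConfig = ScalingConfig(),
    event_floor: Optional[int] = None,
) -> int:
    """Combine scheduled, load-based, and event-based scaling into one target."""
    floor = scheduled_floor(now, cfg)
    if event_floor is not None:
        # Mode 3: operators raise the minimum ahead of a known event.
        floor = max(floor, event_floor)
    floor = min(floor, cfg.max_servers)

    # Mode 2: react to load by stepping one server up or down.
    if avg_cpu > cfg.cpu_scale_up:
        target = current + 1
    elif avg_cpu < cfg.cpu_scale_down:
        target = current - 1
    else:
        target = current

    return max(floor, min(cfg.max_servers, target))


# Morning rush with high load: step up from 6 to 7 servers.
print(desired_servers(datetime(2016, 11, 1, 7, 0), current=6, avg_cpu=85.0))
# Late evening with low load: step down from 6 to 5 servers.
print(desired_servers(datetime(2016, 11, 1, 23, 0), current=6, avg_cpu=10.0))
# Midday on a planned event day: the event floor of 12 applies.
print(desired_servers(datetime(2016, 11, 4, 12, 0), current=6, avg_cpu=50.0, event_floor=12))
```

Treating the schedule and planned events as a floor, rather than a fixed size, lets load-based scaling still add capacity on top when demand exceeds the forecast.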

Metra Executive Director/CEO Don Orseno accurately predicted that the day of the Cubs World Series parade and rally would be the day of highest ridership in Metra history.

Though the streets, train stations, and sidewalks of Chicago struggled to accommodate the physical demands of nearly 5 million Cubs fans attending the celebration, we ensured the Metra environment kept the site live with rapid data flows to provide all riders with accurate schedules and timely alerts.

Figure 8: Auto Scaling Graph for Two Weeks (# of active web servers in the cluster over time) (Spikes on 11/4 and 11/5 were for the Chicago Cubs World Series parade and rally)

Our proprietary mix of tools and processes was used from the start, along with our DevOps philosophy and approach to environment architecture. From infrastructure build and deployment to application deployment and Continuous Integration (CI), we always operate in a way that is consistent with current best practices in the DevOps space. This approach allows for much greater efficiency, consistency, and repeatability within the environments we manage, and the Metra project was no exception. We used tools like Ansible, CloudFormation, Git, Jenkins, and others to make sure all changes to the environment were vetted, self-documenting, well orchestrated, and easily rolled back in case of issues. As a result, Metra benefits from quicker, cleaner, and more seamless deployment of their applications and cloud infrastructure.

The new Metra environment has more than 28 different CI jobs that support three different teams of developers. Coordinating deployments and pushes across that many jobs and teams would typically be complicated and error-prone. Because of how we implemented CI, each development group can take control of its own deployments in a consistent fashion.

What does all this mean for Metra and its riders? Rapid deployments, faster fixes, and accurate data reaching riders. There will also be more frequent feature releases and a more stable environment overall.

The DevOps approach also contributes to PCI compliance because the automation process documents releases and deployments, and helps account for changes to the environment. With traditional, old-school deployments, systems and software engineers may have had to log in to servers and manually deploy/install new versions of custom software. By using a DevOps approach and Continuous Integration tools, we eliminated the need for much of this, as Metra’s developers and engineers can simply rely on external processes to deploy new versions of software. Building solutions that keep these developers and engineers from logging into servers directly reduces the PCI burden and takes many of the access concerns out of scope.

Saving Metra 50% over their previous contract

As of the launch, Metra is expected to save 50 percent, or about $400,000 a year, over their previous contract.

Development costs are also lower, and the open-source platform means that Metra can perform both support and development in-house, saving money and ensuring timely updates to the site and its content.

“Metra is proud of our new site. We’ve included enhancements that directly improve our customers’ ability to make the best travel decisions for themselves. For instance, the schedule finder tool has been upgraded to provide more information: customers can decide whether to view the schedule between two stops or the whole schedule for the line, and the results will show if the train is running behind schedule or if there are any other service changes affecting that train, such as a decision to add or skip stops. Innovations like these are possible because of the technical choices we’ve made, and those same choices will allow us to continue to innovate for our customers.”

Cherie Kizer

CIO, Metra

 

Deft, a Summit company
2200 Busse Rd.
Elk Grove Village, IL 60007
+1 (312) 829-1111