API Orchestration At Tapjoy

by Matt Carbone


At Tapjoy, we’re currently working on a large-scale migration to a Service Oriented Architecture (SOA) for all of our core applications. Over the last few years, our core platform has evolved into what has been dubbed the “monorail” - a large, monolithic Rails application. We’re now at a point where we’re seriously feeling some of the negative aspects of the monolith, mostly excessive complexity and the difficulty of testing and refactoring, which reduce our speed and agility when building out new features.

As we work towards extracting existing functionality from large applications and writing new functionality as discrete services, we’re planning to invest heavily in designing and building out the necessary infrastructure. This collection of technologies, which supports our SOA effort and will be used across all engineering teams, will be called our service “Orchestration Layer” (OL), a phrase coined by Netflix engineering director Daniel Jacobson.

The Problem

As we move to a world where many services communicate with each other bidirectionally, we’ll be introducing a new set of problems even as we solve others. At a high level, the primary goals of the OL are to enforce consistency, reduce complexity, and reduce development time as our teams construct new services.

One of the biggest problems we’re facing is how we configure services to connect with each other. Right now, we have some applications that function like services, but all of the endpoint configuration is static and managed locally in each application. This is neither scalable nor fault tolerant. Also, since we’re in an AWS environment, using ELBs in front of internal service clusters is not an option.

Communication between services also adds complexity to application-level feature development. Concerns like network calls, caching, and security should be abstracted away: service implementors shouldn’t have to worry about the details of how requests are made and processed, or the lower-level details of how data is formatted.

Inconsistency between service implementations is another huge problem. It’s important for us to avoid ending up in a situation where services are solving the same problems with different solutions. Treating our SOA infrastructure as its own project allows us to enforce consistency and provide a well-tested solution to the common set of challenges that any service will face.

Components

Service Discovery

Service discovery is the component that will solve our problems related to configuration management for services to connect with each other. There are three primary subcomponents; a rough sketch of how they fit together follows the list:

Service Registry - A central store for service endpoint locations. We’re evaluating distributed communication management systems, such as Netflix’s Eureka.

Service Registry Agent - Sends config info to the registry. Likely a daemon that’s deployed with each server in a service.

Service Registry Consumer - Pulls config info from the registry for any given “app” that will talk to remote services.
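
As a rough sketch of how these pieces might interact (the registry’s HTTP API, paths, and service names below are hypothetical, not Eureka’s actual interface):

```ruby
require 'net/http'
require 'json'

# Hypothetical client for a registry's HTTP API; the endpoint
# paths and payload shape are illustrative only.
class ServiceRegistry
  def initialize(registry_url)
    @base = registry_url.chomp('/')
  end

  # Agent side: register this server as an instance of a service.
  def register(service_name, host:, port:)
    uri = URI("#{@base}/services/#{service_name}/instances")
    Net::HTTP.post(uri, { host: host, port: port }.to_json,
                   'Content-Type' => 'application/json')
  end

  # Consumer side: fetch the currently known endpoints for a service.
  def endpoints(service_name)
    JSON.parse(Net::HTTP.get(URI("#{@base}/services/#{service_name}/instances")))
  end
end

registry = ServiceRegistry.new('http://registry.internal:8761')

# Agent daemon running alongside each server in the service:
registry.register('offer-service', host: '10.0.1.23', port: 8080)

# Consumer in an app that talks to remote services:
registry.endpoints('offer-service')
# => [{"host"=>"10.0.1.23", "port"=>8080}, ...]
```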

JSON Schema and JSON Hyper-Schema

We’re using JSON Schema and JSON Hyper-Schema to describe our APIs’ available functionality and data formats. This is a JSON-based approach to what has traditionally been done with XML and WSDL in SOA environments.

With Hyper-Schema, we’re able to ingest a valid JSON Schema for a web service and wrap all the resources available in that schema into a client that handles all the remote calls. Since we’re dynamically generating our client objects, we’re able to greatly reduce the amount of work needed to serve and consume data, and even to add additional functionality.
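
As a simplified sketch of that idea (the schema, class, and transport here are hypothetical, not our actual library): given a Hyper-Schema whose links describe the available resources, we can define one client method per link at runtime.

```ruby
require 'json'

# A trimmed Hyper-Schema for a hypothetical user service.
# Each entry in "links" describes one resource the service exposes.
schema = JSON.parse(<<~SCHEMA)
  {
    "title": "User Service",
    "links": [
      { "rel": "show",   "method": "GET",  "href": "/users/{id}" },
      { "rel": "create", "method": "POST", "href": "/users" }
    ]
  }
SCHEMA

# Wraps every link in the schema in a generated client method.
# The HTTP call itself is elided; a real client would also layer
# in discovery, retries, and error handling.
class SchemaClient
  def initialize(schema, &transport)
    @transport = transport
    schema['links'].each do |link|
      define_singleton_method(link['rel']) do |params = {}|
        # Fill href templates like "/users/{id}" from the params.
        href = link['href'].gsub(/\{(\w+)\}/) { params.fetch(Regexp.last_match(1).to_sym) }
        @transport.call(link['method'], href, params)
      end
    end
  end
end

client = SchemaClient.new(schema) do |method, path, params|
  puts "#{method} #{path}"    # stand-in for the real HTTP request
end

client.show(id: 42)           # GET /users/42
client.create(name: 'matt')   # POST /users
```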

Service Communication Library

We’re working on a set of libraries to consume, modify, and expose data. This will help ensure that services are implemented in a consistent way while eliminating repeated boiler-plate code.

Our client module uses Hyper-Schema to generate an object that represents the given remote API. Service implementors can write code using the existing patterns that apply to local datastores, with the remote aspects (configuration, network communication, retry logic, and error handling) abstracted away.
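
As an illustration, here’s a minimal sketch of the kind of retry wrapper such a client might hide behind its generated methods; the policy, the rescued errors, and the `ServiceUnavailable` exception are all assumptions for this example, and real logic would also want backoff and circuit breaking.

```ruby
require 'timeout'

class ServiceUnavailable < StandardError; end

# Retries transient network failures so service implementors
# never see them; the attempt count and rescued errors are illustrative.
module RemoteCall
  MAX_ATTEMPTS = 3

  def self.with_retries
    attempts = 0
    begin
      attempts += 1
      yield
    rescue Timeout::Error, Errno::ECONNREFUSED => e
      retry if attempts < MAX_ATTEMPTS
      raise ServiceUnavailable, "gave up after #{attempts} attempts: #{e.message}"
    end
  end
end

# Usage, with the generated client from the sketch above:
#   RemoteCall.with_retries { client.show(id: 42) }
```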

The server module provides all of the generic functionality for exposing web services (validations, verifying signatures, CRUD helpers, etc.). It also provides helpers for creating JSON Schema service descriptors and exposing them. Custom responders build responses from the provided schemas without any manual formatting. Caching lives in front of our service boundaries and is also managed by our communication libraries: we’re using a standard read-through caching strategy, with the majority of calls reading directly from cache, avoiding any significant performance impact.
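
The read-through part of that strategy maps directly onto Rails’ built-in cache API: `Rails.cache.fetch` returns the cached value on a hit and otherwise runs the block and stores its result. A minimal sketch, with an illustrative cache key and TTL:

```ruby
# Read-through cache in front of a service call: most requests are
# served straight from cache; a miss triggers the remote call and
# stores the response for subsequent readers.
def cached_user(id)
  Rails.cache.fetch("user_service/users/#{id}", expires_in: 5.minutes) do
    client.show(id: id)   # the generated client call; only runs on a cache miss
  end
end
```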

These libraries are delivered via a standard Ruby gem, and a Rails Engine supports easy integration with any Rails-based application that interacts with services and/or exposes one.
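
Integration then comes down to a couple of lines in the host application. A sketch, with hypothetical gem and engine names:

```ruby
# Gemfile of a Rails app that consumes and/or exposes services
gem 'tapjoy-service-layer'   # hypothetical gem name

# config/routes.rb - mount the engine to expose the generic
# service endpoints (schema descriptors, health checks, etc.)
Rails.application.routes.draw do
  mount ServiceLayer::Engine => '/service'
end
```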

Service Containers

Service container is a bit of an overloaded term, but in our case it really means a preconfigured Ruby application. We’ll likely have a few of these, each providing a specific template for a given type of service. With containers, we’ll be able to ensure that critical configuration is consistent across all the services we create. Most of all, the biggest value-add is reducing the time it takes to get up and running with a new service.
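
As a rough, Rails-free sketch of the idea (all names and defaults here are hypothetical): the container pins the critical defaults, and a new service only supplies what’s specific to it.

```ruby
# The container bakes in the consistency-sensitive settings so every
# new service shares them rather than configuring from scratch.
module ServiceContainer
  class Base
    DEFAULTS = {
      registry_url: 'http://registry.internal:8761',
      cache_ttl:    300,      # seconds
      log_format:   :json
    }.freeze

    attr_reader :config

    def initialize(overrides = {})
      @config = DEFAULTS.merge(overrides)
    end
  end
end

# A new service inherits consistent settings and is running immediately:
offer_service = ServiceContainer::Base.new(service_name: 'offer-service')
offer_service.config[:registry_url]   # => "http://registry.internal:8761"
```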

Summary

We’re treating our SOA infrastructure as its own project and we’re investing heavily in it in 2014. It’s definitely something that we’re excited about and will continue to share details on as it progresses.


Think this is interesting? We're hiring! See our current openings here.