Using Open Source Methods to Integrate Enterprise Data with Drupal at OSU

Experience level
Intermediate

At Ohio State, dozens of small, independently operating web development groups are spread across college, department, and business unit boundaries. When building websites, we often want to integrate information from a variety of discrete data silos: human resources, student information systems, faculty tenure databases, media collections, and more.

However, getting access to these systems for web development can be prohibitively difficult politically, procedurally, and programmatically. We found a solution by pooling resources across units and using open source development methods to:

  1. Build centralized web services connecting the data silos.
  2. Build easy ways to consume the data in Drupal.

Since this is a DrupalCamp, this talk will focus primarily on some of the open source methods we've used and on the method we're using to integrate enterprise data.

Open Source Methods & Tools

  • Drupal Project Repository (with issue tracking and version control)
  • Git, Gitosis, and GitWeb
  • Monthly Campus Drupal Meetup
  • Weekly Project Meetings
  • Mailing Lists
  • Campus Talks

Data Integration

Our first end-to-end project was to build reusable directories and faculty profiles based on a centralized web service providing data from several systems.

We wanted:

  • a turnkey solution
  • reusable components
  • an API at every level
  • customizable caching
  • on-demand updates
  • overrides of any data
  • a solution that worked with CCK, Views, etc.

 

When most people think about integrating data with Drupal, they immediately and rightly think "Feeds", but Feeds doesn't map easily onto this type of model.

Instead of using Feeds, we looked to the architecture of the Amazon module on drupal.org, which implements:

  • a Drupal API for accessing Amazon product data
  • a Views-enabled caching mechanism
  • a CCK autocomplete field that references a product (and uses the API to pull in data)

The advantage of this approach is that it makes it ridiculously easy to build applications that use Amazon data. Since it integrates with CCK, you don't even need to write any code.

We embraced and extended this model, implementing a:

  • central web service for information on people
  • OO PHP client for consuming information on people
  • procedural Drupal API for consuming information on people
  • CCK field to reference a person
  • feature/module to implement a directory
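
The layering above (web service, OO client, procedural API) can be sketched roughly as follows. This is an illustrative sketch in Python rather than PHP, and every name in it (PersonClient, person_set_client, person_load, the "doe.1" identifier) is hypothetical, not taken from the kmdata code. The `fetch` callable stands in for an HTTP request to the central web service.

```python
class PersonClient:
    """Hypothetical OO client for a central 'people' web service."""

    def __init__(self, fetch):
        # `fetch` is any callable taking a person id and returning a dict
        # (or None when the person is unknown). In a real client this would
        # wrap an HTTP request to the web service.
        self._fetch = fetch

    def get_person(self, person_id):
        record = self._fetch(person_id)
        if record is None:
            raise KeyError(person_id)
        return dict(record)  # return a copy so callers cannot mutate the source


# Procedural layer: thin module-level functions over a shared client,
# in the style of a Drupal module's API functions.
_default_client = None


def person_set_client(client):
    """Register the client that the procedural functions should use."""
    global _default_client
    _default_client = client


def person_load(person_id):
    """Return the person's record as a dict, or None if not found."""
    if _default_client is None:
        raise RuntimeError("no client configured; call person_set_client() first")
    try:
        return _default_client.get_person(person_id)
    except KeyError:
        return None


if __name__ == "__main__":
    # A dict's .get method stands in for the remote service.
    fake_service = {"doe.1": {"name": "Jane Doe", "title": "Professor"}}
    person_set_client(PersonClient(fake_service.get))
    print(person_load("doe.1"))
```

Keeping the procedural functions as thin wrappers means a field or directory module can call one function without knowing how the client reaches the service.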

Along the way we abstracted and added a variety of features:

  • caching and Views-enabled caches
  • customizable update intervals for caches
  • the ability to override any piece of data
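
One way these three features could fit together is sketched below: a cache with a configurable maximum age, an on-demand refresh, and local overrides applied on top of the cached record. It is a sketch under assumptions, written in Python for illustration. The class and method names (CachedPeople, override, refresh, get) are hypothetical, and the loader is assumed to always return a dict.

```python
import time


class CachedPeople:
    """Hypothetical cache of person records with local field overrides."""

    def __init__(self, loader, max_age_seconds, clock=time.time):
        self._loader = loader            # callable: person_id -> dict
        self._max_age = max_age_seconds  # customizable update interval
        self._clock = clock              # injectable so tests can control time
        self._cache = {}                 # person_id -> (fetched_at, record)
        self._overrides = {}             # person_id -> {field: value}

    def override(self, person_id, field, value):
        """Locally replace one field; the upstream data is left untouched."""
        self._overrides.setdefault(person_id, {})[field] = value

    def refresh(self, person_id):
        """On-demand update: refetch now, regardless of the entry's age."""
        record = self._loader(person_id)
        self._cache[person_id] = (self._clock(), record)
        return record

    def get(self, person_id):
        entry = self._cache.get(person_id)
        if entry is None or self._clock() - entry[0] >= self._max_age:
            record = self.refresh(person_id)
        else:
            record = entry[1]
        merged = dict(record)  # copy so overrides never alter the cached record
        merged.update(self._overrides.get(person_id, {}))
        return merged


if __name__ == "__main__":
    people = CachedPeople(lambda pid: {"name": "Jane Doe", "title": "Professor"}, 3600)
    people.override("doe.1", "title", "Department Chair")
    print(people.get("doe.1"))
```

Storing overrides separately from the cache means a refresh never discards a site's local edits, and removing an override restores the upstream value.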

The results can be found in our project repository under kmdata.

Status
Accepted