This article discusses the importance of a data services layer built upon an enterprise data integration (EDI) platform for service-oriented architecture (SOA).
A large company found itself handicapped by an ornery snarl of siloed applications that compromised its agility, performance and profitability. Its IT department was constantly behind schedule and over budget in hand-coding point-to-point connectivity among supply chain, financials, CRM, and other packaged and custom-built legacy applications.
The solution: Integrating critical business processes and applications by adopting a service-oriented architecture, or SOA. Internal IT personnel and consultants engineered a loosely coupled infrastructure, with reusable services based on XML and standard Web services protocols such as SOAP and WSDL. Once the system went live, the CFO ran a routine query through his dashboard. The answer came back:
You forgot the data.
It's a playful fiction, of course, but it illustrates the perils of a service-oriented architecture without enterprise data integration (EDI): one that focuses only on the business process interactions and application interfaces, and neglects the devilish details of data-level incompatibility among the disparate IT systems participating in those processes, including varying formats, semantics and hierarchies.
Our hypothetical company based its SOA on a Web services-based enterprise application integration (EAI) engine. The technology worked flawlessly in enabling high-level application integration and orchestrating business processes - but it was not designed to deal with the complexities of heterogeneous, inconsistent, dirty data that lie fragmented across the enterprise.
The result: Costly and time-consuming hand-coding to resolve these data inconsistencies in the SOA implementation, thus violating the very promise of reusability and interoperability that is driving the movement towards SOA. The missing ingredient in this company's SOA was a Data Services layer built upon an enterprise data integration platform.
The SOA opportunity
The buzz around SOA has been fast and furious. It's no wonder. Organizations recognize an opportunity to slash the cost of application and middleware development and accelerate time to market by "loosely coupling" siloed applications using open standards such as Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), and Web Services Description Language (WSDL).
The widespread adoption of these standards by IT organizations and vendors alike paves the way to expose applications as component-based services for delivery over the Web. By abstracting the underlying business logic, SOA enables services to be wrapped, re-used, and orchestrated to give both IT and business far greater responsiveness, flexibility, and speed of execution.
Many early SOA-based implementations have been built on EAI and on J2EE- and .NET-based middleware, including message brokers, application servers, and enterprise service buses. But increasingly, data integration has become a primary objective. Some 76 percent of AMR Research respondents using or planning to use an SOA named process or data integration as the leading initiative, according to the August 2005 AMR Research report, "Service-Oriented Architecture: Survey Findings on Deployment and Plans for the Future." The findings reflect a growing awareness that a data integration platform can -- and should -- enrich an SOA with sophisticated data services beyond the scope of application integration-centric technologies.
In other words, to realize the full potential of SOA, including loose coupling and reusability, it's critical that client applications access business-relevant data wherever it resides, in whatever form they require, in a consistent and accurate manner.
Ready for prime time: Service-Oriented data integration
EDI technologies are ready to help SOA become a transformative force for IT. Over the past several years, EDI technology has been enhanced with built-in support for XML transformations, Web services protocols, JDBC connectivity, and Java Message Service (JMS) connectivity. Advanced data integration platforms also feature metadata capabilities driving the core of their development and run-time infrastructure. This metadata abstracts the business logic from the technical implementation, and enables these platforms to deliver advanced data integration functionality over a data services layer to the myriad components in the SOA.
For too many years, EDI initiatives, undertaken without the foundation of a data services layer, have resulted in a further proliferation of the siloed systems they were meant to integrate. For instance, a retailer might have deployed an extraction, transformation, and loading (ETL) tool to synchronize point-of-sale data from retail outlets into an SAP financials application. A second instance of the tool might serve to move SAP financials information into a DB2 data warehouse for analysis. And a third instance might work on the front end of the value chain to feed product procurement data to an operational data store.
So while the retailer will have achieved EDI among targeted applications, it's still several steps removed from realizing a fluid, end-to-end data ecosystem. SOA removes these barriers of siloed development.
In a modular SOA, a data integration platform serves as another component-based service. Its functionality can be packaged and reused across multiple projects to reduce development and deployment costs. It can help an organization leverage data assets currently locked in mainframe, packaged, and homegrown systems through open standards. It can eliminate the need to hand-code data integration connectivity, and enable businesses to realize rapid time to value.
That's what SOA offers EDI technology. Now let's look at the flip side - what EDI does for SOA.
Data components and services in an SOA
The most advanced SOA deployments will take advantage of both EAI and EDI technologies. SOA provides an ideal framework for these two technologies to complement one another, with EAI managing transactions and processes among applications, and the EDI platform performing the atomic-level data processing that is generally beyond the scope of EAI systems.
In fact, a common use case is a company deploying an EAI bus and an EDI platform in an SOA to support master data management initiatives, such as customer data integration (CDI). The EAI bus drives business processes and checks customer records in the master data repository. The EDI platform creates the master data repository, and populates back-end ERP systems with updated customer information transformed to the appropriate format and semantic definition. In strategizing options and objectives for an SOA, organizations should assess and understand the functional distinctions between the two technology sets. Let's take a look at three functional components that are the exclusive province of EDI technology - universal data access, a metadata repository and services, and an EDI engine.
Universal data access: Scope of data
EDI extends the reach of the SOA and its constituent applications into virtually any data source. Prebuilt connectivity and a visual mapping environment provide IT architects and developers with a mechanism to tap into information from a variety of sources, including packaged and homegrown applications such as SAP, mainframe and midrange systems such as IMS and VSAM, relational databases such as Oracle and Sybase, and unstructured and semi-structured data.
Organizations can use EDI to reach into multiple systems to fetch data, cleanse and transform it into the appropriate formats and semantic definitions, and propagate it across multiple distributed systems. Its service may be invoked by, for instance, an online customer order application to trigger event-driven, read/write data updates across financials, manufacturing, and distribution in near-real time.
Metadata repository and services: Meaning of data
A metadata repository provides the SOA with an underlying foundation to understand the lineage of data, the ripple effects of changes, and data-related deficiencies in the architecture. The repository provides a data interaction framework to store and manage data models, transformations, workflows, and dependencies; this metadata describes the data logic and its meaning. Through metadata services, EDI technology provides a means to reconcile data semantics across disparate systems; improve reporting, auditing, and data governance; and enable reuse to streamline development and accelerate deployment.
Metadata is also key in equipping organizations with an auditable record of data lineage covering all data resources, providing an important tool in meeting the compliance requirements of Sarbanes-Oxley and other regulations.
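As a rough illustration of how a metadata repository could answer the "ripple effect" question, the Python sketch below stores lineage as a dependency graph and walks it to find everything downstream of a changed field. The system and field names are hypothetical, and a real repository would hold far richer metadata than this simple mapping.

```python
from collections import deque

# Hypothetical lineage metadata: each source field maps to the
# downstream fields that are derived from it.
LINEAGE = {
    "pos.sale_amount": ["sap.revenue"],
    "sap.revenue": ["warehouse.fact_revenue", "ods.daily_revenue"],
    "warehouse.fact_revenue": ["dashboard.cfo_revenue"],
}


def downstream_of(field, lineage):
    """Return every field affected by a change to `field`, breadth-first."""
    affected = []
    seen = {field}
    queue = deque([field])
    while queue:
        current = queue.popleft()
        for target in lineage.get(current, []):
            if target not in seen:
                seen.add(target)
                affected.append(target)
                queue.append(target)
    return affected


if __name__ == "__main__":
    # Which fields would a change to the point-of-sale amount touch?
    for name in downstream_of("pos.sale_amount", LINEAGE):
        print(name)
```

Reversing the edges of the same graph would give the audit view instead: where a reported figure came from, rather than what a change will affect.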
EDI engine: Value of data
At the core of EDI is an engine that provides organizations a host of options for moving, integrating, and delivering data among various consumers in an SOA. Its flexibility is important in letting IT professionals architect a system optimized for "right time" data delivery, including high-volume batch data movement, near-real time capture and movement, and changed data capture, which moves only the data updated since the last service invocation.
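Changed data capture is the easiest of these delivery modes to sketch. The Python example below is a minimal, timestamp-based version: each invocation returns only the rows modified since the previous invocation's watermark. The row layout and the `updated_at` field are assumptions made for illustration; production systems commonly read database logs rather than compare timestamps.

```python
from datetime import datetime

# Hypothetical source rows, each stamped with its last modification time.
CUSTOMERS = [
    {"id": 1, "name": "Acme", "updated_at": datetime(2005, 8, 1, 9, 0)},
    {"id": 2, "name": "Globex", "updated_at": datetime(2005, 8, 2, 14, 30)},
    {"id": 3, "name": "Initech", "updated_at": datetime(2005, 8, 3, 8, 15)},
]


def capture_changes(rows, last_run):
    """Return the rows updated after `last_run`, plus the new watermark."""
    changed = [row for row in rows if row["updated_at"] > last_run]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max((row["updated_at"] for row in changed), default=last_run)
    return changed, new_watermark


if __name__ == "__main__":
    watermark = datetime(2005, 8, 2, 0, 0)
    changed, watermark = capture_changes(CUSTOMERS, watermark)
    print([row["id"] for row in changed])  # rows 2 and 3 changed after the watermark
    changed, watermark = capture_changes(CUSTOMERS, watermark)
    print(changed)  # empty: nothing has changed since the previous call
```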
EDI also offers functionality to help "future-proof" an SOA against rising data volumes, requirements for reduced data latency, and demands for toughened security and privacy. For example, EDI supports partitioning to optimize parallel processing on multi-CPU hardware; deployment on multi-node server grids for distributed workflow execution, fault tolerance, and failover; and fortified security through authentication, authorization, and encryption.
At your service: Data dividends in an SOA
So those are the core EDI components in an SOA. Now let's examine the services and benefits that they deliver for an SOA - data profiling, cleansing, transformation, movement, and auditing.
Data profiling: Data profiling is the process of assessing and understanding the content, quality, and structure of enterprise data. It is an essential step in reconciling semantic differences in common business vocabulary such as customer, address, and product that varies among applications, and which, if unaddressed, results in contradictory information across the enterprise.
Data cleansing: Once data is profiled, an EDI platform can execute data cleansing functions to ensure the validity and consistency of information. It standardizes name, address, and other values; resolves missing data fields; parses data elements; and corrects poorly formatted or conflicting data.
Data transformation: Data transformation services enable data to be transformed from one form to another to allow reconciliation between data elements residing in different information sources. The transformation services leverage pre-built and customized mappings that take into account complex data hierarchies and relationships.
Data movement: EDI offers flexible mechanisms for "right-time" data delivery in an SOA, including high-volume bulk data movement, near-real time capabilities, data federation, and changed data capture that handles only information that has been updated to accelerate load times and minimize operational impact.
Data auditing: EDI provides in-depth lineage of data - when it was changed, how, by whom, and across which applications - to enable auditing, reporting, and analysis essential to meeting the demands of legislated regulations and internal/external auditors.
The data services provided by the EDI platform can be accessed by other components in the SOA via Web Services protocols such as SOAP, messaging systems such as MQSeries or JMS, and programmatic approaches such as JDBC and ODBC.
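To make the profiling, cleansing, and transformation services described above more concrete, here is a minimal Python sketch that runs a handful of customer records through all three. The field names, the state lookup, and the target layout are all hypothetical; an EDI platform would typically drive these steps from metadata and prebuilt mappings rather than hand-written functions.

```python
# Hypothetical customer records with inconsistent formatting across sources.
RECORDS = [
    {"cust_name": "  acme corp ", "state": "calif.", "phone": "415-555-0100"},
    {"cust_name": "Globex", "state": "NY", "phone": None},
    {"cust_name": "INITECH", "state": "ny", "phone": "(212) 555-0199"},
]

# Assumed lookup table for standardizing state values.
STATE_CODES = {"calif.": "CA", "ca": "CA", "ny": "NY"}


def profile(records):
    """Profiling: count missing and distinct values per field."""
    report = {}
    for field in records[0]:
        values = [r[field] for r in records]
        present = [v for v in values if v is not None]
        report[field] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


def cleanse(record):
    """Cleansing: trim names, standardize states, reduce phones to digits."""
    state = record["state"].strip().lower()
    phone = record["phone"]
    digits = "".join(ch for ch in phone if ch.isdigit()) if phone else None
    return {
        "cust_name": record["cust_name"].strip().title(),
        "state": STATE_CODES.get(state, state.upper()),
        "phone": digits,
    }


def transform(record):
    """Transformation: map cleansed source fields onto the target layout."""
    return {
        "CustomerName": record["cust_name"],
        "Region": record["state"],
        "Phone": record["phone"] or "UNKNOWN",
    }


if __name__ == "__main__":
    print(profile(RECORDS))
    for rec in RECORDS:
        print(transform(cleanse(rec)))
```

Profiling the raw records reports three distinct state values; after cleansing there are only two, which is the kind of semantic reconciliation the profiling step is meant to surface.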
Where do we go from here?
SOA may still be in its early phases, but the time is right to take a hard look at data-related business objectives and IT resources in your service-oriented architecture blueprint. One key to success is an iterative approach that focuses first on targeted projects with quantifiable business value that are relatively easy to implement. SOA, after all, is a matter of architecture, and no organization is going to rearchitect its systems overnight.
By implementing a data integration platform at the ground level, you can ready your IT systems to fully leverage that most valuable of business assets - data - without re-engineering, hand-coding, or worrying about data quality problems down the line. Plus, you'll never worry about receiving another response that says, "You forgot the data."
- For more information on enterprise data integration and service-oriented architectures, listen to the podcast Data discussion: SOA and EDI, with Rick Sherman.