Enterprise Data Integration strategies and solutions
At the core of most IT integration projects is data integration. That does not mean it should always be the starting point of an integration project, but it is the topic I focus on later in this article. Before that, I’d like to mention the different levels of integration, each of which carries a different level of semantics. These are:
Process integration is optional. It is used when the data exchange depends on a specific sequence of actions, which in turn depends on business or technical events. “Workflow”-type integrations also fall into this category.
Data integration is always used. It is the top level for information exchange patterns that need few or no specific rules.
Infrastructure integration or technical integration
This must always be addressed to ensure that data is exchanged appropriately. Data volumes, frequency of exchange and real-time requirements all need to be considered here, and each combination of these parameters calls for a specific technical solution. For example, enterprise application integration (EAI) technology is appropriate for publish-and-subscribe patterns, while ETL tools are appropriate for batch or bulk data exchange. Technical integration also needs to cover security aspects such as authentication and authorization. Usually this has to happen both at the database level (e.g. segregating access to tables) and at the level of access to the databases (e.g. gateways that prevent SQL injection).
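To make the contrast between the two exchange styles concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the `Broker` class and the `batch_load` function are hypothetical names, everything runs in memory, and no real EAI or ETL product works exactly like this.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Toy in-memory publish/subscribe hub (hypothetical, not a real EAI product)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        """Register a handler that is called for every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to each subscriber right away; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


def batch_load(rows: List[dict], target: List[dict], chunk_size: int = 2) -> int:
    """ETL-style bulk transfer: move rows in fixed-size chunks; return the chunk count."""
    chunks = 0
    for start in range(0, len(rows), chunk_size):
        target.extend(rows[start:start + chunk_size])
        chunks += 1
    return chunks


# Event-driven: the subscriber sees each order as soon as it is published.
broker = Broker()
received: List[dict] = []
broker.subscribe("orders", received.append)
broker.publish("orders", {"id": 1, "amount": 40})

# Batch: the same kind of data is moved later, in bulk.
warehouse: List[dict] = []
batch_load([{"id": n} for n in range(5)], warehouse)
```

The publish path favours low latency per message, while the batch path favours throughput for large volumes. That is the trade-off the parameters above (volume, frequency, real-time needs) are meant to decide.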
Data integration has many forms and solutions, from simple file transfers, database replication systems and data warehouses to real-time data integration platforms, master data management (MDM) systems and virtual databases. However, the main challenge stays the same: the number, complexity and types of IT systems to be integrated. Enterprises should address this key technology issue through the following action points:
– Support a technology architecture / standard (e.g. J2EE, XML, SOAP), for example within an “internal cloud”
– Eliminate / reduce redundant technology / systems / applications
– Unify multiple data structures / formats
– Provide a common user interface across applications
– Integrate heterogeneous systems and source code
Once these points have been considered, a decision should be made on one of the following integration strategies or solutions:
A consolidated data integration solution moves all data into a single database and manages it in a central location. Doing so requires an understanding of how the database mechanisms involved differ from one another. While the consolidated solution has many obvious benefits, it is not practical for an organization that must deal with legacy systems or integrate with data it does not own. Oracle, for example, has extensive support for consolidated data integration.
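As a rough sketch of the consolidated approach, the following Python example copies rows from two source databases into one central database. The `customers` table, the source names and the helper functions are assumptions made purely for illustration, and SQLite stands in for whatever databases are actually involved.

```python
import sqlite3
from typing import Dict, List, Tuple


def make_source(rows: List[Tuple[int, str]]) -> sqlite3.Connection:
    """Create an in-memory source database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


def consolidate(sources: Dict[str, sqlite3.Connection]) -> sqlite3.Connection:
    """Copy every source's rows into one central database, tagging their origin."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE customers ("
        "source TEXT, id INTEGER, name TEXT, PRIMARY KEY (source, id))"
    )
    for source_name, conn in sources.items():
        rows = conn.execute("SELECT id, name FROM customers").fetchall()
        central.executemany(
            "INSERT INTO customers VALUES (?, ?, ?)",
            [(source_name, cid, cname) for cid, cname in rows],
        )
    central.commit()
    return central


crm = make_source([(1, "Ada"), (2, "Grace")])
billing = make_source([(1, "Linus")])
central = consolidate({"crm": crm, "billing": billing})
```

Keeping the source name in the primary key is one simple way to avoid collisions when two systems reuse the same identifiers. After the copy, the central database is the only place the data is managed.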
A federated data integration solution leaves data in the individual data sources, where it is normally maintained and updated, and simply consolidates it on the fly as needed. In this case, multiple data sources appear to be integrated into a single virtual database, which masks the number and kinds of databases behind the consolidated view. These solutions can work bi-directionally. Federated data integration can be very complicated, especially in distributed environments where several heterogeneous remote databases have to be synchronized using two-phase commit.
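The read side of a federated solution can be sketched as follows. This is a simplified illustration under my own assumptions: `FederatedCustomers` and `open_source` are hypothetical names, SQLite stands in for the remote databases, and the hard part mentioned above (coordinated writes with two-phase commit) is deliberately left out.

```python
import sqlite3
from typing import Dict, Iterator, List, Tuple


def open_source(rows: List[Tuple[int, str]]) -> sqlite3.Connection:
    """Create an in-memory source database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


class FederatedCustomers:
    """Virtual view over several sources: it holds connections, never copies of data."""

    def __init__(self, sources: Dict[str, sqlite3.Connection]) -> None:
        self._sources = sources

    def rows(self) -> Iterator[Tuple[str, int, str]]:
        """Query every source at call time and merge the results on the fly."""
        for source_name, conn in self._sources.items():
            query = "SELECT id, name FROM customers ORDER BY id"
            for cid, cname in conn.execute(query):
                yield (source_name, cid, cname)


crm = open_source([(1, "Ada")])
billing = open_source([(7, "Linus")])
view = FederatedCustomers({"crm": crm, "billing": billing})

# The data stays in its source, so a change made there shows up on the next read.
crm.execute("INSERT INTO customers VALUES (2, 'Grace')")
crm.commit()
current = list(view.rows())
```

Because nothing is copied, the view is always as fresh as the sources, but every read pays the cost of querying each of them.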
A shared data integration solution actually moves data and events from one or more source databases to a consolidated resource, or queue, created to serve one or more new applications. Data can be maintained and exchanged using technologies such as replication, message queuing, transportable tablespaces and FTP. In this way, data, transactions and events are shared among the various applications in an organization. The exchange can happen within seconds or overnight, depending on the requirement, and it can be built up in incremental steps over time, as individual one-off implementations are required. Such an environment needs to include a rules-based engine, support popular development languages and comply with open standards.
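A minimal sketch of the shared approach, assuming a simple in-process queue in place of a real message broker or replication tool: the source publishes only rows it has not shared yet, and a consuming application drains the queue into its own copy. The function names and the event format are hypothetical.

```python
import queue
from typing import Dict, List


def capture_changes(source_rows: List[dict], last_seen_id: int, out: queue.Queue) -> int:
    """Put rows newer than last_seen_id on the queue; return the new high-water mark."""
    for row in sorted(source_rows, key=lambda r: r["id"]):
        if row["id"] > last_seen_id:
            out.put({"event": "upsert", "row": row})
            last_seen_id = row["id"]
    return last_seen_id


def apply_changes(inbox: queue.Queue, replica: Dict[int, dict]) -> int:
    """Drain the queue into the consumer's own copy; return the number of events applied."""
    applied = 0
    while True:
        try:
            event = inbox.get_nowait()
        except queue.Empty:
            return applied
        replica[event["row"]["id"]] = event["row"]
        applied += 1


shared_queue: queue.Queue = queue.Queue()
orders = [{"id": 1, "total": 40}, {"id": 2, "total": 15}]
reporting_copy: Dict[int, dict] = {}

# First run shares everything; later runs share only what is new.
mark = capture_changes(orders, 0, shared_queue)
apply_changes(shared_queue, reporting_copy)

orders.append({"id": 3, "total": 99})
mark = capture_changes(orders, mark, shared_queue)
apply_changes(shared_queue, reporting_copy)
```

Whether the two functions run seconds apart or once a night is a scheduling decision, which is why the same pattern covers both near-real-time and overnight sharing.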
In the next articles I’ll describe the solutions that Oracle and other vendors offer to address these data integration issues.