Master data management (Ofer Abarbanel online library)

In business, master data management (MDM) is a method used to define and manage the critical data of an organization to provide, with data integration, a single point of reference.[1] The data that is mastered may include reference data – the set of permissible values, and the analytical data that supports decision making.[2]

In computing, a master data management tool can be used to support master data management by removing duplicates, standardizing data (mass maintaining),[3] and incorporating rules that prevent incorrect data from entering the system, in order to create an authoritative source of master data. Master data are the products, accounts and parties for which business transactions are completed. The root cause of the problem is business unit and product line segmentation, in which the same customer is serviced by different product lines and redundant data about the customer (that is, the party in the role of customer) and the account is entered in order to process each transaction. The redundancy of party and account data is compounded across the front-to-back-office life cycle, where an authoritative single source for party, account and product data is needed but the data is often once again redundantly entered or augmented.
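The three tool functions named above (standardizing data, applying rules that reject incorrect data, and removing duplicates) can be illustrated with a minimal Python sketch. The field names, the validation rules and the choice of e-mail address as the duplicate key are assumptions made for this example only; real MDM tools use far richer rules and matching logic.

```python
import re


def standardize(record):
    """Return a copy of the record with name, email and phone in a canonical form."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
        "phone": re.sub(r"\D", "", record.get("phone", "")),  # keep digits only
    }


def validate(record):
    """Return a list of rule violations; an empty list means the record is accepted."""
    errors = []
    if not record["name"]:
        errors.append("name is required")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", record["email"]):
        errors.append("email is malformed")
    if len(record["phone"]) < 7:
        errors.append("phone has too few digits")
    return errors


def load_master(raw_records):
    """Standardize, validate and deduplicate records.

    The email address is used as a hypothetical duplicate key.
    """
    master, rejected = {}, []
    for raw in raw_records:
        rec = standardize(raw)
        errors = validate(rec)
        if errors:
            rejected.append((raw, errors))  # incorrect data never enters the master
        else:
            master.setdefault(rec["email"], rec)  # first valid record wins
    return master, rejected
```

Calling `load_master` on a list of raw customer dictionaries returns the accepted master records keyed by e-mail address, plus the rejected inputs with the rules they violated.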

Master data management has the objective of providing processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing such data throughout an organization to ensure a common understanding, consistency, accuracy and control[4] in the ongoing maintenance and application use of this information.

The term recalls the concept of a master file from an earlier computing era.


MDM is a comprehensive method of enabling an enterprise to link all of its critical data to one file, called a master file, that provides a common point of reference. When properly done, master data management streamlines data sharing among personnel and departments. In addition, master data management can facilitate computing in multiple system architectures, platforms and applications.[5]

At its core, MDM can be viewed as a “discipline for specialized quality improvement”[6] defined by the policies and procedures put in place by a data governance organization. The ultimate goal is to provide the end-user community with a “trusted single version of the truth” on which to base decisions.


At a basic level, master data management seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. A typical example of poor master data management is the scenario of a bank at which a customer has taken out a mortgage and the bank begins to send mortgage solicitations to that customer, ignoring the fact that the person already has a mortgage account relationship with the bank. This happens because the customer information used by the marketing section within the bank lacks integration with the customer information used by the customer services section of the bank. Thus the two groups remain unaware that an existing customer is also considered a sales lead. The process of record linkage is used to associate different records that correspond to the same entity, in this case the same person.
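As a rough illustration of record linkage in the bank scenario, the Python sketch below pairs marketing leads with customer-service records. It assumes hypothetical record layouts (`lead_id`, `customer_id`, `name`, `postcode`), uses the postcode as a simple blocking key, and scores names with the standard library's `difflib.SequenceMatcher`; the 0.85 threshold is an arbitrary choice for the example, and production systems typically use more robust probabilistic matching.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case the text, drop simple punctuation and collapse whitespace."""
    return " ".join(text.lower().replace(".", "").replace(",", "").split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(leads, customers, threshold=0.85):
    """Pair each lead with its best-matching customer when the score reaches the threshold."""
    links = []
    for lead in leads:
        best, best_score = None, 0.0
        for cust in customers:
            if lead["postcode"] != cust["postcode"]:
                continue  # blocking: only compare records that share a postcode
            score = similarity(lead["name"], cust["name"])
            if score > best_score:
                best, best_score = cust, score
        if best is not None and best_score >= threshold:
            links.append((lead["lead_id"], best["customer_id"], round(best_score, 2)))
    return links
```

A lead that links to a customer who already holds a mortgage could then be excluded from the mortgage solicitation list.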

Other problems include issues with data quality, consistent classification and identification of data, and data reconciliation. Master data management of disparate data systems requires data transformations, as the data extracted from each disparate source system is transformed and loaded into the master data management hub. To synchronize the disparate source master data, the managed master data extracted from the hub is again transformed and loaded back into the source systems as the master data is updated. As with other extract, transform, load (ETL)-based data movement, these processes are expensive and inefficient to develop and maintain, which greatly reduces the return on investment for the master data management product.

One of the most common reasons some large corporations experience massive issues with master data management is growth through mergers or acquisitions. Organizations that merge typically create an entity with duplicate master data, since each likely had at least one master database of its own prior to the merger. Ideally, database administrators resolve this problem through deduplication of the master data as part of the merger. In practice, however, reconciling several master data systems can present difficulties because of the dependencies that existing applications have on the master databases. As a result, more often than not the systems do not fully merge but remain separate, with a special reconciliation process defined to ensure consistency between the data stored in them. Over time, as further mergers and acquisitions occur, the problem multiplies: more and more master databases appear, and data-reconciliation processes become extremely complex and, consequently, unmanageable and unreliable. Because of this trend, one can find organizations with 10, 15, or even as many as 100 separate, poorly integrated master databases, which can cause serious operational problems in the areas of customer satisfaction, operational efficiency, decision support, and regulatory compliance.

Another problem concerns determining the proper degree of detail and normalization to include in the master data schema. For example, in a federated HR environment, the enterprise may focus on storing people data as a current status, adding a few fields to identify date of hire, date of last promotion, etc. However, this simplification can introduce business-impacting errors into dependent systems for planning and forecasting. The stakeholders of such systems may be forced to build a parallel network of new interfaces to track onboarding of new hires, planned retirements, and divestment, which works against one of the aims of master data management.

However, master data management can suffer in its adoption within a large organization if the “single version of the truth” concept is taken too far and becomes overly restrictive. Different departments within a large company often require different versions of a single master data element or hierarchy of elements in order to accomplish their objectives. For example, the product hierarchy needed to manage inventory may be (and most likely will be) entirely different from the product hierarchies needed to support marketing efforts or pay sales reps. A robust (or agile) MDM process will allow multiple versions of the truth to exist, but will provide simple, transparent ways to reconcile the necessary differences. Without this flexibility, users who need the alternate versions will simply “go around” the official MDM processes, thus reducing the effectiveness of the company’s overall MDM program.
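One way to picture “multiple versions of the truth” that remain reconcilable is to keep a single product master and attach several named hierarchies to it. The Python sketch below is only an illustration: the product identifiers, the hierarchy names and the category paths are invented for the example.

```python
# One shared product master (hypothetical identifiers and descriptions).
PRODUCTS = {
    "P1": "Espresso machine",
    "P2": "Coffee beans 1kg",
}

# Each department keeps its own hierarchy over the same product identifiers.
HIERARCHIES = {
    "inventory": {
        "P1": ("Appliances", "Small kitchen"),
        "P2": ("Grocery", "Dry goods"),
    },
    "marketing": {
        "P1": ("Coffee lovers", "Equipment"),
        "P2": ("Coffee lovers", "Consumables"),
    },
}


def path(product_id, view):
    """Return the category path of a product in the given departmental view."""
    return HIERARCHIES[view][product_id]


def crosswalk(view_a, view_b):
    """Map each path in view_a to the set of view_b paths holding the same products."""
    mapping = {}
    for product_id in PRODUCTS:
        mapping.setdefault(path(product_id, view_a), set()).add(path(product_id, view_b))
    return mapping
```

Because both views refer to the same product identifiers, `crosswalk("marketing", "inventory")` gives a transparent mapping between the two hierarchies without forcing either department to adopt the other's structure.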


Processes commonly seen in master data management include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, data classification, taxonomy services, item master creation, schema mapping, product codification, data enrichment, hierarchy management, business semantics management and data governance.

The selection of entities considered for master data management depends somewhat on the nature of an organization. In the common case of commercial enterprises, master data management may apply to such entities as customer (customer data integration), product (product information management), employee, and vendor. Master data management processes identify the sources from which to collect descriptions of these entities. In the course of transformation and normalization, administrators adapt descriptions to conform to standard formats and data domains, making it possible to remove duplicate instances of any entity. Such processes generally result in an organizational master data management repository, from which all requests for a certain entity instance produce the same description, irrespective of the originating sources and the requesting destination.
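The consolidation step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each source record carries hypothetical `source`, `tax_id`, `name`, `address` and `updated` fields, the normalized tax ID serves as the match key, and the survivorship rule is "the most recently updated non-empty value wins". Real repositories use more elaborate matching and survivorship policies.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical match key: the tax ID with hyphens and surrounding spaces removed."""
    return record["tax_id"].replace("-", "").strip()


def build_golden_records(source_records):
    """Group source records by match key and merge each group into one description."""
    groups = defaultdict(list)
    for rec in source_records:
        groups[match_key(rec)].append(rec)

    repository = {}
    for key, recs in groups.items():
        golden = {"tax_id": key, "sources": sorted({r["source"] for r in recs})}
        # ISO dates sort correctly as strings; process the oldest record first
        # so that newer non-empty values overwrite older ones.
        for rec in sorted(recs, key=lambda r: r["updated"]):
            for field in ("name", "address"):
                if rec.get(field):
                    golden[field] = rec[field]
        repository[key] = golden
    return repository
```

Every request for the entity with a given key then reads the same merged description from `repository`, regardless of which source system originally supplied each attribute.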

The tools include data networks, file systems, a data warehouse, data marts, an operational data store, data mining, data analysis, data visualization, data federation and data virtualization. One of the newest tools, virtual master data management, utilizes data virtualization and a persistent metadata server to implement a multi-level automated master data management hierarchy.

Transmission of master data

There are several ways in which master data may be collated and distributed to other systems.[7] These include:

  • Data consolidation – The process of capturing master data from multiple sources and integrating it into a single hub (operational data store) for replication to other destination systems.
  • Data federation – The process of providing a single virtual view of master data from one or more sources to one or more destination systems.
  • Data propagation – The process of copying master data from one system to another, typically through point-to-point interfaces in legacy systems.
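The difference between the three approaches can be sketched with a toy Python model. The `SourceSystem` class and the function names are hypothetical and exist only for this illustration; the point is that consolidation copies data into a hub, federation reads the sources at query time without copying, and propagation copies a record directly from one system to another.

```python
class SourceSystem:
    """A toy system holding master records keyed by entity identifier."""

    def __init__(self, name, records=None):
        self.name = name
        self.records = dict(records or {})


def consolidate(sources):
    """Consolidation: copy master data from every source into a single hub."""
    hub = {}
    for src in sources:
        for key, rec in src.records.items():
            hub.setdefault(key, {}).update(rec)
    return hub


def replicate(hub, destination):
    """Replicate the consolidated hub contents to a destination system."""
    destination.records = {key: dict(rec) for key, rec in hub.items()}


class FederatedView:
    """Federation: a virtual view that reads the sources at query time."""

    def __init__(self, sources):
        self.sources = sources

    def get(self, key):
        merged = {}
        for src in self.sources:
            merged.update(src.records.get(key, {}))
        return merged or None


def propagate(origin, destination, key):
    """Propagation: point-to-point copy of one record between two systems."""
    destination.records[key] = dict(origin.records[key])
```

In this sketch a later change in a source system is visible immediately through `FederatedView.get`, whereas the consolidated hub and any propagated copies keep their old values until the copy is run again.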


  1. Rouse, Margaret (2018-04-09). “Definition from”. SearchDataManagement. Retrieved 2018-04-09.
  2. Loshin, David (May 2006). “Defining Master Data”. BeyeNETWORK. Retrieved 2018-04-09.
  3. Jürgensen, Knut (2016-05-16). “Master Data Management (MDM): Help or Hindrance?”. Simple Talk. Retrieved 2018-04-09.
  4. “Learn how to create a MDM change request – LightsOnData”. LightsOnData. 2018-05-09. Retrieved 2018-08-17.
  5. “Master data management”. IBM.
  6. DAMA-DMBOK Guide, 2010, DAMA International.
  7. “Creating the Golden Record: Better Data Through Chemistry”, DAMA, slide 26, Donald J. Soulsby, 22 October 2009.
