Quality of Data (QoD) (Ofer Abarbanel online library)

Quality-of-Data (QoD) is a designation coined by L. Veiga that specifies the Quality of Service required of a distributed storage system with respect to the consistency of its data. It can be used to support Big Data management frameworks, workflow management, and HPC systems, mainly for data replication and consistency.

It takes data semantics into account along three dimensions: Time, the interval within which data must be refreshed; Sequence, the tolerable number of outstanding versions of the data that may be read before a refresh; and Value, the divergence allowed before the data is displayed. It was initially based on a model from earlier research on Vector-Field Consistency,[1] which was awarded the best-paper prize at the ACM/IFIP/Usenix Middleware Conference 2007, and was later enhanced for increased scalability and fault tolerance.[2]

This consistency model has been successfully applied and proven in the Big Data key/value store Apache HBase,[3] where it was initially designed as a middleware[4] module sitting between clusters in separate data centres. The HBase-QoD coupling[5] minimises bandwidth usage and optimises resource allocation during replication, achieving the desired consistency at a more fine-grained level.
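To illustrate how deferring replication until a bound is reached can reduce traffic between data centres, here is a minimal Python sketch. It is not the HBase-QoD API: the class `DeferredReplicator`, its `max_pending` parameter, and the `send` callback are hypothetical names, and the sketch assumes that only the latest value of a key needs to be shipped once the Sequence bound is exceeded.

```python
class DeferredReplicator:
    """Hypothetical buffer sitting between two clusters (illustration only)."""

    def __init__(self, max_pending, send):
        self.max_pending = max_pending  # Sequence bound: updates tolerated per key
        self.send = send                # callable(key, value) that ships to the remote cluster
        self.pending = {}               # key -> (number of buffered updates, latest value)

    def put(self, key, value):
        """Buffer an update; ship the latest value once the bound is exceeded."""
        count, _ = self.pending.get(key, (0, None))
        count += 1
        if count > self.max_pending:
            # Assumption: intermediate versions are coalesced into the latest one.
            self.send(key, value)
            self.pending.pop(key, None)
        else:
            self.pending[key] = (count, value)

    def flush(self):
        """Ship everything still buffered, e.g. on shutdown."""
        for key, (_, value) in list(self.pending.items()):
            self.send(key, value)
        self.pending.clear()


shipped = []
replicator = DeferredReplicator(max_pending=2, send=lambda k, v: shipped.append((k, v)))
for version in range(1, 7):
    replicator.put("row1", version)
print(shipped)  # [('row1', 3), ('row1', 6)]
```

In this run six local updates result in two transfers. A real deployment would also need the Time and Value dimensions and a policy for which keys get which bound.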

QoD is defined by the three-dimensional vector k = (θ, σ, ν), where θ bounds time, σ sequence, and ν value, but it takes a broader view of the issue and also applies to large-scale data management techniques with regard to their timely delivery.[6]
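As a rough sketch of how such a vector might be checked, the Python fragment below takes θ, σ, and ν to bound time, sequence, and value in that order. The names `QoDBound` and `must_refresh` are hypothetical, as are the units (seconds for θ, a relative fraction for ν) and the rule that exceeding any single dimension forces a refresh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoDBound:
    """Hypothetical container for the vector k = (theta, sigma, nu)."""
    theta: float  # Time: max seconds a replica may go without a refresh (assumed unit)
    sigma: int    # Sequence: max number of updates not yet propagated
    nu: float     # Value: max relative divergence as a fraction, 0.10 = 10% (assumed unit)


def must_refresh(bound, elapsed_s, pending_updates, replica_value, primary_value):
    """Return True as soon as any one of the three dimensions is exceeded."""
    if elapsed_s > bound.theta:
        return True
    if pending_updates > bound.sigma:
        return True
    # Relative divergence; use a base of 1.0 when the replica value is 0.
    base = abs(replica_value) if replica_value != 0 else 1.0
    divergence = abs(primary_value - replica_value) / base
    return divergence > bound.nu


k = QoDBound(theta=30.0, sigma=5, nu=0.10)
print(must_refresh(k, 10.0, 2, 100.0, 104.0))  # False: all three within bounds
print(must_refresh(k, 10.0, 2, 100.0, 120.0))  # True: value diverged by 20%
print(must_refresh(k, 31.0, 0, 100.0, 100.0))  # True: time bound exceeded
```

Treating the three limits as independent triggers keeps the check cheap. How divergence is measured for non-numeric data is left open here.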

Other descriptions

Quality-of-Data should not be confused with other definitions of data quality,[7][8] such as:

– Completeness
– Validity
– Accuracy

References

  1. Nuno Santos; Luís Veiga; Paulo Ferreira (2007). “Vector-Field Consistency for Adhoc Gaming” (PDF). ACM/IFIP/Usenix Middleware Conference 2007.
  2. Luís Veiga; André Negrão; Nuno Santos; Paulo Ferreira (2010). “Unifying Divergence Bounding and Locality Awareness in Replicated Systems with Vector-Field Consistency” (PDF). Journal of Internet Services and Applications (JISA), Volume 1, Number 2, 95-115. Springer.
  3. Welcome to Apache HBase™
  4. Sergio Estéves; João Silva; Luís Veiga (2012). “Quality-of-service for consistency of data geo-replication in cloud computing” (PDF). Euro-Par 2012 Parallel Processing, 285-297. Springer Berlin Heidelberg.
  5. Álvaro García-Recuero; Sergio Estéves; Luís Veiga (2013). “Quality-of-Data for Consistency Levels in Geo-replicated Cloud Data Stores” (PDF). IEEE CloudCom 2013.
  6. Data Quality. Published by IBM.
  7. Richard Y. Wang (1992). “Toward quality data: an attribute-based approach” (PDF). Decision Support Systems 13, MIT.
  8. George A. Mihaila; Louiqa Raschid; María-Esther Vidal (2000). “Using Quality of Data Metadata for Source Selection and Ranking”. CiteSeerX 10.1.1.34.9361.
