Are you worried about the quality of your data? - The AI Company - 0 views
www.autolearn.ai/...about-the-quality-of-your-data
AI company autolearn Business innovation technology digital journey strategy data transformations
shared by pintadachica on 20 Dec 18
Data quality, or the lack thereof, can be a major contributing factor to a fractured and sluggish digital journey in which ROI is hard to achieve and results are in short supply. The quality of data directly affects the enterprise's ability to become aware of relevant events, as well as its reaction time, decision time and action time. A clear and concerted effort is required to measure and improve data quality in order to drive better decisions and actions.

Common Quality Issues

The following are the most common quality issues.

Comprehensiveness

Comprehensiveness quality issues refer to key attributes or data points missing from the data collected by the enterprise. This can occur when the data-producing systems or the data delivery networks glitch, malfunction, or are incorrectly configured, so that entire rows of data or individual attributes are dropped.

Integrity

Integrity quality issues refer to the corruption of the values of key attributes so that they contain unidentifiable or unreadable data. This happens when key attributes are empty or null even though, by design, they are not allowed to be, or when an attribute contains a value that does not meet the specification of its type: for example, a string column contains an integer, or a timestamp column contains a string that cannot be parsed into a timestamp. Data integrity must be verified before data is included in a data set used to drive analysis, decisions and actions.

Sampling

Sampling quality issues refer to the inclusion or exclusion of a certain percentage of the records in a data set on the assumption that the remaining records are a good, representative sample of the original data set. Bad or inaccurate sampling can lead to a distorted view of reality, which in turn can lead to bad decisions. In addition, sampling itself can make the data set inappropriate for certain types of analysis that require the entire data set to be used for training.
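The comprehensiveness and integrity checks described above could be measured along the following lines. This is only a minimal sketch: the `SCHEMA` dictionary, its attribute names, and the `check_record` helper are hypothetical and invented for illustration, and it assumes timestamps arrive as ISO 8601 strings.

```python
from datetime import datetime

# Hypothetical schema (an assumption for this sketch):
# attribute name -> (expected type, whether null/empty is allowed)
SCHEMA = {
    "order_id": (int, False),
    "customer": (str, False),
    "created_at": ("timestamp", False),
    "note": (str, True),
}


def is_timestamp(value):
    """Return True if value is a string that parses as an ISO 8601 timestamp."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def check_record(record, schema=SCHEMA):
    """Return a list of (attribute, issue) pairs found in one record."""
    issues = []
    for name, (expected, nullable) in schema.items():
        # Comprehensiveness: the attribute is absent from the record.
        if name not in record:
            issues.append((name, "missing"))
            continue
        value = record[name]
        # Integrity: empty or null where the design does not allow it.
        if value is None or value == "":
            if not nullable:
                issues.append((name, "null"))
            continue
        # Integrity: the value does not match the attribute's declared type.
        if expected == "timestamp":
            ok = is_timestamp(value)
        else:
            ok = isinstance(value, expected)
        if not ok:
            issues.append((name, "wrong type"))
    return issues


good = {"order_id": 1, "customer": "acme",
        "created_at": "2018-12-20T10:00:00", "note": None}
bad = {"order_id": "1", "customer": None, "created_at": "yesterday"}

print(check_record(good))
print(check_record(bad))
```

Counting the issues per attribute across a whole data set would give a simple quality metric to track over time; what threshold counts as acceptable is a decision for the enterprise and is not something this sketch prescribes.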
Filtering

An upstream filtering scheme can end up removing too many or