Stuart Madnick, MIT, at the 7th annual MIT Chief Data Officer and Information Quality (CDOIQ) Symposium with Dave Vellante and Jeff Kelly.
From Silicon Angle’s summary of the interview: “[Dave] Vellante asks if Madnick sees a ‘hard connection between big data and data quality?’ Madnick suggests this is an important issue as, ‘having a lot of data doesn’t mean anything if it isn’t any good.’ While some tend to believe that having enough data will ‘wash out the bad data’… Madnick says this may not typically be the case. The key is to fully understand data to mitigate against faulty interpretations. Madnick notes that such an error occurred amidst the housing crisis, when statistics were interpreted to suggest housing sales had increased. In reality, the conclusion was drawn from the registry of deeds, which merely counted how many deeds had been filed that month. The large quantity of deeds was due to ownership being transferred to banks because of increased foreclosures. In light of such cases, Madnick stresses the importance of understanding the implications of data.
Vellante notes that as more people gain access to data, low-quality data is also circulating in large amounts. He asks Madnick, ‘What gives you hope that this new big data theme is not going to overwhelm us with bad data?’ Madnick suggests bringing discipline to the field involves taking both a bottom-up and a top-down approach. This involves analyzing data (bottom-up) and developing a narrative or theory to make sense of the data (top-down). Narratives help contextualize and put parameters around what can be drawn from the data, just as data has a similar effect on narratives.”