December 2019 marked the inaugural run of the MADS (Marketing and Data Science) East conference, following four years of whirlwind success with MADS West conferences in San Francisco. MADS East was held right in the beating heart of the Big Apple itself, attracting thought leaders from Analytics, Data Science, Marketing, R&D, Insights, Business Intelligence and Operations roles. What makes MADS so exciting and unique is the sheer breadth of sectors represented: ranging from philanthropic start-ups such as Jukay Hsu's company Pursuit, which is transforming lives and creating future leaders from underprivileged backgrounds; all the way through to monolithic Fortune 500 companies such as Alphabet, J&J, and IBM, which are leading the way in modelling the importance of analytics and data science in the data-rich world we live in.
Day one of the conference kicked off with a bang thanks to the keynote from Seth Stephens-Davidowitz (former data scientist at Google and best-selling author of ‘Everybody Lies’), who took us all on a whistle-stop tour of the human psyche using search engine data as his guide. He elegantly used data to hit home the point that much of what we believed from inside our bubble of traditional offline media sources has been totally wrong, and dangerously misleading, due to phenomena such as social desirability bias (the tendency of respondents to avoid giving responses that, while true, could reveal socially unacceptable attitudes or behaviours). Specifically, he focused on the rise of racism in the US (it doesn’t follow the archaic North / South divide many of us perceive to be the case) and how this online activity acted as a more reliable predictor of the 2016 presidential election than any polls! Beyond this he also went on to state three guiding principles for any practitioner of data science:
- Data newness is far more important than data size
- Effective application of data requires an entrepreneurial approach to its utility
- You need to be ready for a lot (and we mean a LOT!) of failed attempts to put your data to use before you identify something that is a game changer
This keynote really set the tone for what was in store over the next two days, with marketeers and data scientists alike able to ‘geek out’ over some of the bleeding-edge applications of data in our industries. The talks were numerous and diverse, but I would distil the key learnings for our industry into two themes:
The quality of our insights is reliant on the quality of our data
This may seem like a fairly obvious point, but it is one which I agree is very often underappreciated, despite being fundamentally linked to the utility and actionability of insights generated from research. In order to effectively leverage datasets and create actionable models, ‘extreme feature engineering’ is required. In practice this means understanding your data; appreciating the biases present in it; taking steps to use secondary and tertiary data sources to enrich your dataset; and gravitating towards hypothesis-led variable testing to create new derived variables or variable interactions.
This is something that we take very seriously at HRW, and we actively encourage our clients to become “thought partners” with us, as opposed to simple commissioners of research. By working with one another to better understand the business needs and the intended use of models, we can better execute extreme feature engineering in our analysis and ensure that the models we produce are both actionable and adaptable to our clients’ business needs.
Furthermore, within the healthcare research space we need to be even more vigilant about data quality given the smaller sample sizes we typically deal with. A point which rather eloquently hit home on this was raised by Elisha Heaps (Principal Data Scientist at Staples), who stated that “Small data is only small when viewed in isolation”; she made the argument that we need to be entrepreneurial with our data, amplifying it with covariates drawn from across the dataset. Amplifying and blending data in this manner assists with the application of more sophisticated machine learning algorithms and heightens the accuracy of applied predictive analytics.
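To make the enrichment and derived-variable steps a little more concrete, here is a minimal sketch in plain Python. It is purely illustrative and not a description of any HRW pipeline: the respondent fields, the regional lookup table and the helper names (`enrich`, `add_interactions`) are all hypothetical.

```python
from itertools import combinations

def enrich(rows, secondary, key):
    """Blend a secondary data source into each row via a shared key."""
    # Rows whose key is missing from the secondary source are left unchanged.
    return [{**row, **secondary.get(row[key], {})} for row in rows]

def add_interactions(rows, numeric_keys):
    """Add a pairwise product (interaction term) for each pair of numeric keys."""
    enriched = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        for a, b in combinations(numeric_keys, 2):
            new_row[f"{a}_x_{b}"] = row[a] * row[b]
        enriched.append(new_row)
    return enriched

# Hypothetical survey respondents (primary data)
respondents = [
    {"region": "north", "years_in_practice": 5, "patients_per_week": 40},
    {"region": "south", "years_in_practice": 20, "patients_per_week": 10},
]

# Hypothetical secondary source keyed by region
regional_data = {
    "north": {"clinics_per_100k": 12},
    "south": {"clinics_per_100k": 7},
}

blended = enrich(respondents, regional_data, key="region")
features = add_interactions(blended, ["years_in_practice", "patients_per_week"])
```

Each new column should be tied to a hypothesis (here, that experience and caseload matter jointly rather than separately), which is the spirit of hypothesis-led variable testing rather than generating every possible combination.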
Are we using the data we have effectively enough?
Another standout quote from the conference came from MADS veteran and Direct Marketing Association Hall-of-Famer Ernan Roman, when discussing the need for both human intelligence and data science to effectively serve customers: “The chasm between brand fantasies and customer realities is growing”. His point was that customers are constantly donating their preference data in the expectation of reciprocity, but we are failing to hold up our end of this exchange and provide a heightened experience to the customer. Given this chasm, we need to ensure that we are taking research data from respondents willing to provide it and integrating it back into the system as effectively as possible. Once again this is an opportunity for our clients (and their extended teams) to act as thought partners so we can solve this problem together: How can we better tailor a typing tool to serve the sales reps who need to use it? Can we integrate an algorithm into your CRM to prevent you having to type these people manually? How can you effectively leverage the segment allocation to demonstrate value to your customer base beyond just knowing how to sell to them?
Another thought which has been evolving within HRW, alongside the drive for more agile research, is the potential for progressive profiling in customer segmentations to get to the heart of online honesty a little more than single-survey segmentations are able to. Similar to how people use online dating services, we are willing to offer up more and more personal information once we see a tangible benefit for ourselves in return. We’ve seen many patients and physicians enjoy learning about themselves and their peers through this type of co-creative approach, and it offers greater potential for iterative creation of a more robust profile.
Given the small universes that we find ourselves dealing with in healthcare, ensuring that we are using innovative approaches and making the most of the data we have is even more vital for us than for the rest of the analytical world. This can take time, and therefore many of our efforts to address this can go unnoticed: at HRW we have been working on a silent evolution of decision trees using advanced machine learning algorithms such as Random Forests and Boosted Trees like XGBoost. These newer techniques are much more stable and repeatable than the once-loved but now archaic CART and CHAID models. They do however come with their own pitfalls around interpretability: increasingly sophisticated algorithms such as these tree ensembles or neural nets are inherently more opaque in how they function. This is typically termed ‘black box analysis’, a term I personally despise as it makes the lack of transparency seem less dangerous than it is. We cannot always see how an algorithm comes to a decision, but as researchers we should be questioning the ‘insights’ it generates. AI has evolved to a point where we should not have to defer to ‘black box’ analysis methods; at HRW we are focused on making AI trustworthy, reliable, and explainable to humans without sacrificing the algorithm’s performance. This can be achieved either by using machine learning approaches which are inherently explainable with clear visual maps, or by developing new algorithms that explain the decision making of existing complex algorithms, known as XAI (explainable artificial intelligence).
Overall the MADS conference provided a fantastic platform for discussion and debate amongst data science’s foremost experts on what place data science and analytics have within today’s customer paradigm. In turn it provided me with some clear directives on how we can be making more of the work we are conducting within healthcare research. Whilst we are continually testing and implementing new machine learning methods at HRW, this space is growing ever more rapidly and requires knowledge sharing and thoughtful design to ensure that these methods most effectively serve us.
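As one illustration of the second route (explaining an existing model from the outside), the sketch below implements permutation importance in plain Python: shuffle one input at a time and measure how much accuracy drops. This is a generic, well-known technique chosen for illustration, not necessarily the method used at HRW, and the `black_box` function is a hypothetical stand-in for a trained Random Forest or boosted-tree model.

```python
import random

def accuracy(model, X, y):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)  # break the link between this feature and y
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical opaque model: in reality this would be a fitted ensemble.
# It secretly relies on feature 0 only and ignores feature 1.
def black_box(row):
    return int(row[0] > 0.5)

# Synthetic data, labelled by the model itself purely for demonstration
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [black_box(row) for row in X]

importances = permutation_importance(black_box, X, y)
```

Shuffling the feature the model ignores leaves accuracy untouched, so its importance is exactly zero, while shuffling the feature it depends on produces a large drop. The appeal of this kind of check is that it needs no access to the model's internals, only its predictions.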
To find out more about our approaches or exploratory initiatives please get in touch.
By Jaz Gill