Human Centered Data Science
What if we approached data problems as people problems too?
On the surface, data science and user research don't seem to have much in common. User research focuses on users and on demonstrating empathy for them, with the aim of creating joyful and satisfying user experiences in the form of effective products. Data, on the other hand, is perceived to be rational, and numbers to be irrefutable. Data science is seen as revealing the truth: a quantitative, scientific endeavour that seeks to explain the world around us.
In this article, I explain why data science needs to be informed by user research in order to deliver successful analytical projects. But before we dig into the why, let's think about the what.
Human Centered Design
IDEO (a global design and innovation company) defines human-centred design (HCD) as:
A creative approach to problem-solving that starts with people and ends with innovative solutions that are tailor-made to suit their needs.
In their Field Guide to Human-Centered Design, IDEO say:
When you understand the people you're trying to reach - and then design from their perspective - not only will you arrive at unexpected answers, but you'll come up with ideas that they'll embrace.
This is the central philosophy that human-centered design revolves around - the belief that the best solutions to a problem are desirable in that someone wants to use them. This is where User Research comes in - it enables a team to understand people, make complicated things simple, inform design based on data, and tell stories that people will believe in.
A great introduction to HCD is found through studying the definitions from Don Norman's extensive work on the subject:
- Discoverability - "Ensures that users can find out and understand what the system can do."
- Affordances - "A relationship between the properties of an object and the capabilities of the agent that determine just how the object could possibly be used."
- Signifiers - "Affordances determine what actions are possible. Signifiers communicate where the action should take place."
- Mappings - "Spatial correspondence between the layout of the controls and the devices being controlled."
- Feedback - Immediate reaction and appropriate amount of response
User Research is not about predicting the future, pitching ideas or solutions, gathering opinions, nor testing the knowledge of users. Instead, User Research helps teams learn about users to create services that meet their needs, understand the variety of user types and the context in which people live their lives. User research in government is about understanding your users, making research inclusive and finding out what works. User research is a continuous and iterative process, involving wider teams where findings are proactively shared and inform iterative project planning and design.
Design Council's Double Diamond clearly conveys a design process for designers and non-designers alike. The two diamonds represent a process of exploring an issue more widely and deeply (divergent thinking) and then taking focused action (convergent thinking).
- Discover - The first diamond helps people understand, rather than simply assume, what the problem is. It involves speaking to and spending time with people who are affected by the issues.
- Define - The insight gathered from the discovery phase can help you to define the challenge in a different way.
- Develop - The second diamond encourages people to give different answers to the clearly defined problem, seeking inspiration from elsewhere and co-designing with a range of different people.
- Deliver - Delivery involves testing different solutions at small-scale, rejecting those that will not work and improving the ones that will.
What is Data Science?
At a high level, data science is a set of applied mathematical and computer science techniques that support and guide the principled extraction of information and knowledge from data. It's the ability to take data and to process it, to extract value and actionable insights from it, to visualize it and to ultimately communicate your understanding of it.
Monica Rogati's Data Science Hierarchy of Needs is a good place to start in order to understand the different applied techniques that constitute data science.
How do we tie Data Science to User Research?
In this section, I combine the Data Science Hierarchy of Needs with Design Thinking & the Double Diamond from Design Council resulting in a series of concrete considerations and actions.
DISCOVERABILITY / DISCOVER
At the bottom of the pyramid is data collection. What data do you need, and what's available? For digital products, we may ask whether relevant user interactions are being logged. While the nitty gritty of this falls into the purview of a Business Analyst, by taking a human-centred approach using user research methods we start to unearth how people are interacting with this data. The question stops being 'what data is needed?' and becomes 'why is this data needed?' - what's the impact of this person not having access to this data?
AFFORDANCES / SIGNIFIERS / DEFINE
Next, how does the data flow through the system? What reliable data streams or pipelines are in place? Where is the data stored, and how easy is it to access and analyse? How are people able to find data? It's one thing to look at data as a single asset, but it means nothing if no one can find it or use it. Only when data is accessible can you then explore and transform it. You can't achieve much or derive much value from messy data. Here, data cleaning and data wrangling are priorities, but so is understanding how they are done.
Next, Business Intelligence (BI) and Data Analysis are generally undertaken using cleansed data. In Data Science, training data is prepared by generating labels, either automatically or with humans in the loop. And humans should always be in the loop. We've all heard of dashboards that are built, only for no one to know how to use them or what has even gone into populating the different fields.
The next stage involves experimentation. Here, A/B testing or experimentation models are put in place so that features can be deployed incrementally to get a rough estimate of the effects before they scale. This is a crucial stage in which to gather feedback from people: not just through A/B testing, but by testing the workflow, to see how what you've built fits into the work that people (the people you want to use your service or product) are doing or intending to do.
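To make the A/B testing step concrete, here is a minimal sketch of how the effect of an incrementally deployed feature might be estimated. It assumes a simple two-proportion z-test on conversion counts; the function name `two_proportion_z_test` and the example numbers are hypothetical and are not taken from any of the sources cited in this article.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant A (control) and variant B.

    Returns (z, p_value), where z is the standardised difference B - A
    and p_value is the two-sided p-value under the normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one user")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    if std_err == 0:
        # No variation at all (everyone or no one converted): no evidence of a difference.
        return 0.0, 1.0

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 1,000 users per variant.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that behaviour differed between the two variants, but it says nothing about why. That question still has to be answered by watching and talking to the people using the feature.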
Finally, with data that's been collected, parsed efficiently and cleansed, and with baseline algorithms defined, you're able to run complex machine learning algorithms and implement artificial intelligence techniques. But when User Research informs every single step, you end up with data and a service or product that has been created with your users in mind. This means that it is more likely to be used, and not just another poorly-researched project that is created top-down with no consideration for the humans who will use it.
When numerical data and qualitative user understanding are combined, we can prove user behaviour hypotheses right or wrong. After thousands of hypotheses, tests and improvement measures, we come to know an astonishing amount about the desires, habits and behaviour of our users.
Data scientists must ensure discoverability in products. There is often a disconnect here, because of a belief that insights derived from statistical models or advanced analytics are in and of themselves discoveries and are therefore already discoverable. This belief is incorrect, because insights are only as valuable as they are applicable to the business or user. We must therefore articulate what it means to deliver data science products that are more discoverable. This covers all the elements of discoverability: identifying affordances, signifiers, mappings, and opportunities for feedback.
A data science product is delivered in the context of interacting humans and is thus only as good as it allows users to discover how its affordances improve their experiences. An affordance relates to the relationship between the user and the product. If a data science classification model replaces the need for someone to click through thousands of documents to find information, then its affordances are time saved, improved quality, and augmented performance. These should be clearly discoverable in the way the product is delivered, through documentation and signifiers.
Data science may provide an insight into what's happened and what will happen, but it's User Research methods that help us to truly understand why those things happen, and what should be done about it.
- Cosley, Brandon - Human-Centred Data Science
- Design Council - What is the framework for innovation? Design Council's evolved Double Diamond.
- IDEO - Design Kit: The Human-Centered Design Toolkit
- Olusesan, Peter - On data science in human-centered design
- Rogati, Monica - Data Science Hierarchy of Needs
- Valmari, Jarmo and Kuosmanen, Samu - Data Science and Design – Why Combine Them?