Unlocking the power of equalities data: benefits, challenges and the way forward

Zainab Hashmi | Average reading time 7 minutes

11 Aug 2025

Master's student Zainab Hashmi walks us through her research project on the benefits, challenges, and potential way forward for using equalities data in research.

Over the past two years I have been studying part-time towards a Master's degree in Data, Inequality and Society at the Edinburgh Futures Institute, a new, future-focused school at the University of Edinburgh. Through my studies, I have learnt about various types of inequality, how data can be used as a tool for good, and what risks and limitations the data can pose.

So, when I met with Research Data Scotland (RDS) to discuss working with them on my final research project, I was very interested when they proposed a piece of research focused on the benefits, risks and practice around equalities data.

RDS has been exploring the use of data on protected characteristics in research for the public good. This covers information on age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, and sexual orientation. They wanted an independent researcher to help produce guidelines on engaging with this type of data and to explore the potential benefits and risks it carries. My role was to run a survey and a focus group with individuals working in data-focused roles at various organisations across Scotland (and a couple from beyond). I investigated their perspectives on the benefits, challenges, and current practices of using protected characteristics data, and on what could change in current approaches to better enable the use of data for reducing inequality.

The benefits of protected characteristics data

Data about groups that face inequality is seriously lacking. There are myriad reasons for this, but without this data, marginalised groups risk not being fully represented in research. This can limit the policies or other measures that can be put in place, potentially deepening inequality instead of reducing it.

Identifying inequalities and impacts: Participants in this project saw identifying inequalities as one of the main benefits of collecting and sharing data on protected characteristics, including intersectional analysis. This works alongside the ability to identify different population groups’ needs and to assess the impact that policies have on different groups.

Improving outcomes for people: The main benefit cited by participants was acting on inequalities once they have been identified, to advance equality and improve outcomes for people with protected characteristics, especially those who may be marginalised. Examples included informing fair policies and improving services.

Benefits of using administrative data: Administrative data can give population-level coverage, helping to fill gaps in the data, especially for groups that are currently underrepresented. Using existing data can also reduce the resources needed to undertake data collection.

Risks, challenges and limitations

Protected characteristics data can improve lives, but only if collected and used accurately and responsibly. Participants described a range of technical and ethical challenges, risks and limitations they face in the collection, sharing and use of protected characteristics data.

Issues with data collection: Technical and resource constraints can limit the collection of this type of data. So can a lack of understanding, among those providing or collecting the data, of why accurate recording matters. For example, data collectors may input incorrect data if they make assumptions about an individual’s identity, rather than allowing them to self-report.

Missing data: These issues with data collection result in missing data on protected characteristics, a frequent concern among respondents. Participants found that data on some protected characteristics was particularly lacking, such as data on ethnicity, gender identity and disability. Limited representation risks making the needs of some groups invisible, and these groups often overlap with populations that are already marginalised.

Poor quality data: Data on marginalised groups is not only more likely to be missing, but also of poorer quality. This risks misrepresenting the reality of the groups about whom the data is collected. The data quality issues that participants cited include inaccurate protected characteristics data, inconsistency between data sources, and biased data.

Issues with data sharing: Balancing data utility and privacy was a major challenge, often slowing research, reducing granularity, or preventing data release.

Issues with analysis: Combining groups with small counts risks these groups being underrepresented or invisible in research. For example, grouping ethnic categories to reduce the risk of re-identification of individuals can obscure important differences between groups.
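To make this trade-off concrete, here is a minimal sketch in Python. It is not drawn from the research itself: the group names, the counts and the minimum cell size of 10 are all invented for illustration, and real disclosure control rules vary between organisations. It shows how merging small categories into a single "Other" group can hide a large difference in outcome rates.

```python
# Illustrative only: all counts and the threshold below are hypothetical.
# Each row is (group label, number of people, number with some outcome).
ROWS = [
    ("Group A", 900, 90),
    ("Group B", 60, 6),
    ("Group C", 8, 4),
    ("Group D", 6, 1),
]

THRESHOLD = 10  # assumed minimum group size before results can be released


def merge_small_groups(rows, threshold=THRESHOLD, label="Other"):
    """Fold every group smaller than `threshold` into one combined group."""
    merged = {}
    for group, people, with_outcome in rows:
        key = group if people >= threshold else label
        total_people, total_outcome = merged.get(key, (0, 0))
        merged[key] = (total_people + people, total_outcome + with_outcome)
    return merged


def outcome_rates(table):
    """Return the share of people with the outcome in each group."""
    return {g: with_outcome / people for g, (people, with_outcome) in table.items()}


if __name__ == "__main__":
    before = outcome_rates({g: (n, k) for g, n, k in ROWS})
    after = outcome_rates(merge_small_groups(ROWS))
    for name, table in (("Before merging", before), ("After merging", after)):
        print(name)
        for group, rate in table.items():
            print(f"  {group}: {rate:.0%}")
```

In this made-up example, Group C and Group D have very different outcome rates, but once they are merged only a single blended rate for "Other" is reported. The merged table is safer to release, yet the difference between the two groups is no longer visible.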

Risks of using and interpreting protected characteristics data: Respondents expressed concerns that equalities data could be misused or misrepresented in ways that reinforce stigma, misrepresent certain groups, or lead to discriminatory outcomes. If these harms occur, the trust of groups or individuals in the organisations that collect and store their data can be lost, impacting their willingness to contribute data in future and further limiting representation within the data. Risks are heightened for small or marginalised groups, where poor interpretation can further exclude them.

Discussions in the focus group demonstrated that there is no single, clear-cut solution to these difficult trade-offs. The risk of doing the research, with its potential negative impacts on communities, must be weighed against the risk of not doing the research and losing the opportunity to reveal and address societal inequalities.

Current practice

As well as the benefits and challenges of using data on protected characteristics, the research explored what practices are currently in place around this type of data, and what should be changed or added.

Organisations generally treated protected characteristics similarly to other types of sensitive, personal data, such as information about people’s health.

Examples of how this type of data is treated include: requiring researchers to explicitly justify their use of it, additional requirements around public engagement and training, stricter reviews, limited sharing, and extra privacy safeguards. Some participants also indicated that their organisations had enhanced review and controls for what they described as more sensitive protected characteristics, such as sexuality.

There were some clear principles guiding participants’ approach to protected characteristics data, such as confidentiality, safety, agency and public benefit. Collaboration with stakeholders was another key principle, involving data practitioners, third-sector organisations, and members of the public.

Next steps

Ethical guidelines: When asked what, if anything, could be changed or expanded upon in current practice around protected characteristics, participants largely felt that the technical practices were sufficient to safeguard against many risks. What they wanted was more standardised ethical guidelines for handling this kind of data that were also practical to use.

Clarity and sharing of existing resources: While guidelines and resources exist on this topic, they need to be made more accessible. Participants wanted to see a centralised place for storage, sharing, and access, with clarity provided on how existing safeguarding frameworks should be applied to protected characteristics data. There was also interest in having more opportunities for data practitioners to come together and share their learning.

Empowering researchers to handle this data responsibly: Participants wanted the focus to shift towards empowering researchers to use and represent data on protected characteristics responsibly, rather than discouraging them from using it at all. For example, researchers could learn from other people who have experience of using this type of data.

Public involvement: While there is a range of public involvement practices in place, participants were keen to see more resources on engaging the public effectively, involving peer researchers, and giving people who are asked to share their data more information and agency.

Communication: The importance of clear and accurate communication about this type of data and what we do with it was also emphasised by participants. The public should be empowered to understand how their data is collected and used in research.

As this research has shown, equalities data holds enormous potential to improve lives, but only if it's used responsibly, thoughtfully, and collaboratively. The voices of those represented in the data must be part of shaping how it's collected and used.

We need clearer ethical frameworks that are grounded in practice, not just theory; we need more opportunities to learn from one another and share best practice; and we need to build public trust in the use of protected characteristics data through transparency and inclusion.

RDS will continue bringing together relevant information, individuals and organisations, including members of affected communities, to take forward this work. They will work collaboratively to improve awareness of the issues, improve practice around the use of this data and, ultimately, improve outcomes for individuals experiencing inequality.
