Interview with an analyst: Pupil Census

21 Oct 2025

Get to know the Pupil Census dataset with Data Analysts Stella Telford and Jen Muir.

With a whole host of new and updated Education datasets now available within the National Safe Haven, Research Data Scotland interviewed Data Analysts Jen Muir and Stella Telford about their experience with the Pupil Census dataset. They worked on it while seconded into the ADR Scotland team within Scottish Government, in close collaboration with their Government counterparts. This was foundational work to enable data linkage across a wide range of education datasets and beyond.

Can you tell us some of the key information about this dataset, and what makes it special?


Stella Telford, Data Analyst: The Pupil Census is a yearly survey of every pupil in publicly funded schools in Scotland. 

The information is collected and submitted by the schools themselves, using their attendance roll at the start of each academic year. ADR Scotland’s latest update to the Pupil Census dataset available for research covers academic years 2020/21 through to 2022/23. The dataset details the demographic data of all individual pupils in Scotland who attend publicly funded schools, as well as valuable class and school level data. In a way, it’s a school version of the Scottish census, giving us a snapshot of the pupils within the publicly funded school system at a specific time.

Jen Muir, Senior Data Analyst: It’s this census-style information that makes this dataset so valuable. In fact, the Pupil Census is the backbone for research linkage when it comes to Education datasets in the National Safe Haven. Linking it to other datasets, such as Achievement of Curriculum for Excellence Levels (ACEL) or Exclusions, broadens the scope of research possible with those datasets, which on their own collect few personal or contextual details.

The demographic information held within the Pupil Census can give valuable insight into the context of this other data, by providing information on variables such as protected characteristics, national identities, access to curriculum adaptations and geographic information. 

Stella: The possibilities created through linking the Pupil Census to other datasets in the National Safe Haven are impressive. Beyond other Education datasets, it’s also possible to link it to many other datasets which cover policy priority areas such as health and social care, and further education and employment. 
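
To make the idea of linkage concrete, here is a minimal sketch of how records from a dataset with little context (such as Exclusions) might be enriched with Pupil Census fields through a shared identifier. It is written in Python purely for illustration and is not the team's code. The field names (`pupil_id`, `school_id`, `stage`, `exclusion_days`), the sample records and the `link_records` helper are all hypothetical; real linkage in the National Safe Haven follows its own governance and indexing processes.

```python
# Hypothetical records: identifiers and field names are invented for illustration.
pupil_census = [
    {"pupil_id": "P001", "school_id": "S10", "stage": "P4"},
    {"pupil_id": "P002", "school_id": "S10", "stage": "P7"},
    {"pupil_id": "P003", "school_id": "S22", "stage": "S3"},
]
exclusions = [
    {"pupil_id": "P002", "exclusion_days": 2},
    {"pupil_id": "P003", "exclusion_days": 1},
    {"pupil_id": "P003", "exclusion_days": 3},
]


def link_records(events, census, key="pupil_id"):
    """Attach census context to each event record that shares the key."""
    census_by_key = {row[key]: row for row in census}
    linked = []
    for event in events:
        context = census_by_key.get(event[key])
        if context is None:
            continue  # no matching census record, so the event is left out
        # Census fields first, then the event's own fields on top.
        linked.append({**context, **event})
    return linked


linked_exclusions = link_records(exclusions, pupil_census)
for row in linked_exclusions:
    print(row)
```

Each exclusion record now carries the school and stage recorded in the census, which is the kind of added context described above.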


It sounds like this is a crucial dataset for Education research in Scotland. Can you tell us more about Research Data Scotland’s role in the update, and the work carried out since we joined ADR Scotland as a delivery partner?

Jen: Research Data Scotland is supporting Scottish Government with delivering elements of the ADR Scotland programme. As part of this work, some analysts within our Data Team were seconded to Scottish Government to carry out secure data processing work. 

This was an opportunity to refresh and speed up the process of preparing these datasets for research use in Scotland’s National Safe Haven. For us, it was also a unique chance to gain valuable insights into the workings of Scottish Government and a better understanding of the needs of data controllers, analytical areas and wider user groups within Scottish Government.

The Pupil Census was the first ADR Scotland dataset that we began working on. This worked well because it is a large dataset, which made it a great opportunity to get to grips with the data processing procedures. The oversight provided by colleagues within the Scottish Government (in particular Education Analytical Services, the ADR Scotland SG Research Engagement Team Leader and Assistant Statistician) was key to driving our work and ensuring compliance with Scottish Government data governance.

Stella: The original ingest of the Pupil Census was carried out in SQL – a common query language – but from our research into different options, the R coding language and RStudio emerged as the best option for processing these datasets. The Pupil Census then became the first dataset we worked on using R, which not only helped us to understand the language and streamline the processing, but ultimately prompted our work on a reproducible R pipeline. The creation of this Reproducible Analytical Pipeline (RAP) in R has helped us speed up the processing work on datasets, not just Pupil Census but several other Education datasets that have since been ingested.

Jen: The RAP has been instrumental in delivering ADR Scotland datasets. Usually when analysts work on a data ingest, they have to write code from scratch. With the RAP, a large portion of the coding work has already been completed, and analysts will make some changes to this to fit the dataset they are working on. This standardises and speeds up data processing, which means our analysts have more time to process more datasets. In its current form, the RAP is most useful for Education datasets, but it’s continuing to undergo development. I’m really excited to see how it grows and changes as the data we process continues to vary. 
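
As a rough illustration of the pattern Jen describes, where shared code is written once and only adapted per dataset, here is a minimal sketch. The actual RAP is written in R and its structure is not described in this post, so this Python version is an assumption about the general shape of such a pipeline, not a copy of it. The step functions, configuration keys and sample rows are all hypothetical.

```python
def standardise_names(rows, config):
    """Rename source columns to the agreed output names."""
    renames = config["rename"]
    return [{renames.get(k, k): v for k, v in row.items()} for row in rows]


def drop_incomplete(rows, config):
    """Keep only rows where every required field is present and non-empty."""
    required = config["required"]
    return [r for r in rows if all(r.get(f) not in (None, "") for f in required)]


# The shared steps are written once and reused for every dataset.
PIPELINE_STEPS = [standardise_names, drop_incomplete]


def run_pipeline(rows, config, steps=PIPELINE_STEPS):
    """Apply each shared step in order, driven by a per-dataset config."""
    for step in steps:
        rows = step(rows, config)
    return rows


# Only the configuration changes from one dataset to the next (hypothetical values).
census_config = {
    "rename": {"PupilID": "pupil_id", "Stage": "stage"},
    "required": ["pupil_id", "stage"],
}
exclusions_config = {
    "rename": {"PupilID": "pupil_id", "Days": "exclusion_days"},
    "required": ["pupil_id"],
}

raw_census = [{"PupilID": "P001", "Stage": "P4"}, {"PupilID": "P002", "Stage": ""}]
raw_exclusions = [{"PupilID": "P002", "Days": 2}]

clean_census = run_pipeline(raw_census, census_config)
clean_exclusions = run_pipeline(raw_exclusions, exclusions_config)
print(clean_census)
print(clean_exclusions)
```

Keeping dataset-specific details in a small configuration, rather than in the processing code itself, is one way a pipeline can standardise work across datasets while still leaving room for per-dataset changes.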

Is there anything else you think people should know about the Pupil Census? 

Jen: It really is all about the additional linkage opportunities that this dataset enables. The Pupil Census is the key to so much research potential right across datasets within the National Safe Haven, but particularly within Education.

The Pupil Census data now covers 2007/08 to 2022/23 inclusive, and there are several datasets already within the National Safe Haven, or about to be newly ingested or updated, to which the Pupil Census is directly relevant.

Find out more

Researchers interested in learning more about the Pupil Census or other ADR Scotland datasets can find the metadata in our ADR Scotland Data Catalogue.
