Technical notes

This FAQ page provides further technical details about the KEF. If your question is not answered below, please let us know by email.

Frequently asked questions

Metric source data

Where does the data for the KEF come from?

The majority of data is currently derived from the Higher Education Business and Community Interactions (HE-BCI) survey. This annual survey is run by HESA and completed by all English higher education providers registered with the Office for Students as ‘Approved Fee-Cap’ and across the UK by all higher education providers regulated by the Scottish Funding Council, Higher Education Funding Council for Wales (HEFCW) and Department for the Economy, Northern Ireland (although note that the KEF only includes English institutions at present).

The KEF also includes additional data provided by Innovate UK (Working with business) and Elsevier (Co-authorship in Research Partnerships). Full details of the source data for each metric are available as a downloadable Excel file named ‘KEF metrics – data sources table’, published alongside the January 2020 KEF decisions report.

The Public & Community Engagement score is derived from a self-assessment – see ‘Public & Community Engagement self-assessment’ for further information.

Further information on the co-authorship metric

Elsevier is a global leader in information analytics and has supported the development of the metric for co-authorship with non-academic partners. The metric, and its underlying data, is generated using Elsevier’s SciVal and Scopus tools.

Elsevier took the list of KEF eligible institutions and identified all known affiliated organisations. From a data extract of 1st February 2021, and covering the three calendar years 2017 to 2019, the following outputs were collected for each institution and its affiliates: articles; conference papers; reviews; books; book chapters. The collected outputs were then analysed for the presence of non-academic authors, enabling the proportion of outputs involving non-academic co-authorship to be calculated. Further details on the method used and how to access the underlying data are given below.

How was the metric for Co-authorship with non-academic partners produced and what database was used?

The co-authorship metric was calculated using Elsevier’s SciVal system and the Scopus database. SciVal is an analytical tool enabling research activity and performance to be systematically evaluated. SciVal has a global reach covering more than 20,000 research institutions and their associated researchers across more than 230 nations. Elsevier’s Scopus database is the world’s largest curated abstract and citation database. Scopus is source-neutral and covers outputs from over 5,000 publishers, drawing from some 24,600 serials, over 101,000 conferences and over 231,000 books. Updates occur daily, with some 10,000 articles indexed per day. As at December 2020, the database included over 81 million documents.

What parameters were employed for generating the co-authorship metric?

A snapshot of data from the Scopus database was taken as at 01 February 2021. This has been analysed by calendar year for the three years 2017, 2018 and 2019. Analysis of the snapshot is focused on the following five output types: Article; Conference Paper; Review; Book Chapter; Book. This focus mirrors the methodology Elsevier uses in the Times Higher Education (THE) World University Rankings. Similarly, the list of organisational affiliations used to generate the KEF metric began from the affiliations employed for the THE 2021 rankings and was then expanded to reflect affiliation changes that Elsevier had subsequently recorded.

How are non-academic co-authors identified in the analysis?

The Scopus database employs a combination of automated tagging, manual curation and feedback from its user community to classify organisations and to generate affiliations. The database includes over 90,000 organisations which, alongside a range of other metadata, are classified by function (e.g. research institutes, policy institutes, charities). Within the analysis, having excluded single-author outputs from the co-author set, SciVal was employed to identify all non-academic co-authors through their affiliation to relevant organisations, including organisations that are not UK based. Organisations (and hence non-academic co-authorships) were classified as “medical”, “corporate”, “government” or “other”.

How are outputs attributed? Can an output appear more than once within the analysis, for example if the output involves an academic collaboration between a number of KEF eligible institutions?

SciVal has been used to show an output involving collaboration between multiple KEF eligible institutions as an output for each of those institutions. While an output is attributed to each relevant KEF institution, it is recorded only once for each institution: an output appears once for each eligible KEF institution involved no matter how many non-academic collaborators there are, even if those collaborators are from different sectors or countries.
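The attribution rule can be illustrated with a minimal Python sketch. It is not Elsevier's SciVal implementation: the record layout, the institution names and the function `coauthorship_proportions` are hypothetical. It also assumes that the denominator is every output attributed to an institution and that single-author outputs can never count towards the numerator; the FAQ does not state either point explicitly.

```python
from collections import defaultdict

# Sector labels taken from the FAQ; everything else here is illustrative.
NON_ACADEMIC = {"medical", "corporate", "government", "other"}

# Hypothetical records: one entry per output, listing the KEF eligible
# institutions involved and the sector of each author's affiliation.
example_outputs = [
    {"id": "out-1", "institutions": ["Uni A", "Uni B"],
     "author_sectors": ["academic", "academic", "corporate", "government"]},
    {"id": "out-2", "institutions": ["Uni A"],
     "author_sectors": ["academic", "academic"]},
    {"id": "out-3", "institutions": ["Uni A"],
     "author_sectors": ["academic"]},  # single-author output
    {"id": "out-4", "institutions": ["Uni B"],
     "author_sectors": ["academic", "medical"]},
]


def coauthorship_proportions(outputs):
    """Return, per institution, the share of its outputs with a non-academic co-author."""
    all_outputs = defaultdict(set)
    non_academic_outputs = defaultdict(set)
    for out in outputs:
        co_authored = len(out["author_sectors"]) > 1
        has_non_academic = co_authored and any(
            sector in NON_ACADEMIC for sector in out["author_sectors"]
        )
        # Sets ensure an output is recorded once per institution, however
        # many non-academic collaborators or sectors are involved.
        for institution in set(out["institutions"]):
            all_outputs[institution].add(out["id"])
            if has_non_academic:
                non_academic_outputs[institution].add(out["id"])
    return {
        institution: len(non_academic_outputs[institution]) / len(ids)
        for institution, ids in all_outputs.items()
    }


proportions = coauthorship_proportions(example_outputs)
```

In this sketch `out-1` counts once for Uni A and once for Uni B, even though it has two non-academic collaborators from different sectors.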

Can I review the underpinning co-authorship data for an institution?

Elsevier’s SciVal users will be able to generate much of the data for themselves using the parameters and methods described above. In addition, Elsevier has agreed to provide the underlying publication data that has been generated for the co-authorship metric to authorised individuals from each institution. The process to obtain the data is for the institutional KEF contact to contact the Research England team. Research England will pass on all legitimate requests to Elsevier, along with details of the relevant KEF contact. Elsevier will then liaise directly with the KEF contact to produce the required data. This will be provided to the KEF contact as an Excel spreadsheet. A copy of the data will also be sent to Research England along with any other details associated with the response.

Other data and calculations

Can I download the source data for the KEF?

No, it is not possible to directly download the source data used for the KEF calculations. It is possible to download a summary of the calculated decile data for all institutions displayed in the metric in CSV or Excel format. It is also possible to download images of the KEF dashboards as Image, PDF or PowerPoint files. If you require the information in a different format, please contact us.

Will the metrics change in the future?

In the first iteration of the KEF we chose what we considered to be the most suitable metrics currently available. We are actively looking to develop the metrics used in the KEF and will undertake a review of this first iteration to further consider the effectiveness of the metrics used and whether any changes are required.


Is the data by academic, financial or calendar year?

The vast majority of data is provided by academic year. The only exceptions are:

Decile calculations

Which HEIs are included in the sector decile and cluster benchmark calculations?

The metrics of all eligible HEIs were included in the sector decile and cluster benchmark calculations, irrespective of whether their individual metrics are displayed or not.

How have you calculated the average values for the KEF?

We are using two methods for calculating the three-year averages, and the method selected for a particular metric will depend on which is most appropriate for the underlying dataset as follows:

Average Method 1: will be used where the dataset has zero values in the denominator of one or more of the three years being averaged (which would otherwise result in a ‘divide by zero’ error when using method 2).

\[ \frac{a_1 + a_2 + a_3}{b_1 + b_2 + b_3} \]

Average Method 2: will be used for all other metrics

\[ \frac{\frac{a_1}{b_1} + \frac{a_2}{b_2} + \frac{a_3}{b_3}}{3} \]

Where average method 1 needs to be used for a single HEI to prevent a divide by zero error, it will be used for all HEIs within that metric.
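As a rough illustration, the two averaging methods and the rule for choosing between them could be sketched in Python as follows. The function names and the input layout (each HEI mapped to its three yearly numerators and denominators) are assumptions made for this example, with a_i read as the numerator and b_i as the denominator for year i.

```python
def average_method_1(numerators, denominators):
    """Method 1: sum of the numerators divided by sum of the denominators."""
    return sum(numerators) / sum(denominators)


def average_method_2(numerators, denominators):
    """Method 2: mean of the yearly ratios."""
    ratios = [a / b for a, b in zip(numerators, denominators)]
    return sum(ratios) / len(ratios)


def three_year_averages(metric):
    """metric maps an HEI name to (numerators, denominators) for the three years.

    If any HEI has a zero denominator in any year, method 1 is applied to
    every HEI in the metric; otherwise method 2 is used.
    """
    needs_method_1 = any(0 in denominators for _, denominators in metric.values())
    method = average_method_1 if needs_method_1 else average_method_2
    return {hei: method(a, b) for hei, (a, b) in metric.items()}


# Hypothetical figures: "HEI Y" has a zero denominator in year 2,
# so method 1 is used for both HEIs.
example = {
    "HEI X": ([10, 20, 30], [100, 100, 100]),
    "HEI Y": ([5, 0, 15], [50, 0, 50]),
}
averages = three_year_averages(example)
```

An HEI whose denominators are zero in all three years would still raise `ZeroDivisionError` in this sketch.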

Further details and example calculations are included in the January 2020 KEF decisions report.

Note that SOAS did not submit a HESA Finance return in 2018/19. The metrics that use finance data as a denominator are therefore calculated for SOAS by using the 2017/18 finance figures in place of the missing 2018/19 figures.

What happens when the denominator is zero for each of the three years?

In this scenario both averaging methods return a divide by zero error. In these instances, we will manually apply an average of zero for the metric.
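In code, this manual override amounts to a guard in front of the division. A minimal sketch, using a hypothetical function name and the summed-ratio form of method 1:

```python
def average_with_zero_rule(numerators, denominators):
    """Return 0.0 when every denominator is zero, otherwise the method 1 average."""
    if all(b == 0 for b in denominators):
        # Both averaging methods would divide by zero here, so the
        # average is set to zero by rule.
        return 0.0
    return sum(numerators) / sum(denominators)
```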

How do you use scaling?

Once the three-year average for each metric has been calculated, we will use feature scaling to normalise to a 0-1 scale, using the formula below, where x’ is the normalised value and x is the original value calculated in the previous step.

\[ x' = \frac{x - \min(x)}{\max(x) - \min(x)} \]
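A short Python sketch of this min-max scaling is given below. The function name is illustrative, and the handling of the case where every HEI has the same value is an assumption, since the FAQ does not describe it.

```python
def feature_scale(values):
    """Rescale a list of three-year averages to the 0-1 range."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: with no spread the formula is undefined, so every
        # value is mapped to 0.0 here.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


scaled = feature_scale([2, 4, 6])  # [0.0, 0.5, 1.0]
```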

How are perspective deciles calculated?

To calculate the perspective decile, first the mean average of all the normalised metrics in the perspective is calculated. This figure is then used to calculate the decile rank for each institution in that perspective.
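The two steps might look like the following Python sketch. The exact ranking rule behind the published deciles is not given in this FAQ, so the rule used here is an assumption: an institution's decile is the share of institutions scoring at or below it, rounded up to the next tenth, with decile 1 the lowest and decile 10 the highest. All names are illustrative.

```python
from bisect import bisect_right
from statistics import mean


def perspective_scores(normalised_metrics):
    """normalised_metrics maps an HEI to its normalised (0-1) metric values."""
    return {hei: mean(values) for hei, values in normalised_metrics.items()}


def decile_ranks(scores):
    """Assign each HEI a decile from 1 (lowest) to 10 (highest).

    Assumed rule: decile = ceil(10 * k / n), where k is the number of
    HEIs scoring at or below this HEI and n is the number of HEIs.
    """
    n = len(scores)
    ordered = sorted(scores.values())
    deciles = {}
    for hei, score in scores.items():
        at_or_below = bisect_right(ordered, score)
        deciles[hei] = (10 * at_or_below + n - 1) // n  # integer ceiling
    return deciles


# Ten hypothetical HEIs with distinct perspective scores fall one per decile.
example_scores = {f"HEI {i}": i / 10 for i in range(10)}
example_deciles = decile_ranks(example_scores)
```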

How are cluster averages calculated?

Cluster averages are calculated by taking the mean average of the deciles of institutions belonging to that cluster for each perspective.
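As an illustration only, the cluster average could be computed as in the Python sketch below. The cluster labels, HEI names and data layout are hypothetical; the perspective names are taken from this page.

```python
from collections import defaultdict
from statistics import mean


def cluster_averages(deciles, clusters):
    """Mean decile per cluster and perspective.

    deciles maps an HEI to {perspective: decile};
    clusters maps an HEI to its cluster label.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for hei, by_perspective in deciles.items():
        for perspective, decile in by_perspective.items():
            grouped[clusters[hei]][perspective].append(decile)
    return {
        cluster: {p: mean(values) for p, values in perspectives.items()}
        for cluster, perspectives in grouped.items()
    }


example_deciles = {
    "HEI X": {"Working with business": 4, "Public & Community Engagement": 7},
    "HEI Y": {"Working with business": 6, "Public & Community Engagement": 8},
    "HEI Z": {"Working with business": 9, "Public & Community Engagement": 3},
}
example_clusters = {"HEI X": "Cluster 1", "HEI Y": "Cluster 1", "HEI Z": "Cluster 2"}
example_averages = cluster_averages(example_deciles, example_clusters)
```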

Is the full 1-10 decile range used for every perspective?

In the following three perspectives, more than 10% of HEIs returned a zero value or, in the case of Public & Community Engagement, more than 10% of providers returned an identical score. Rather than assign all these providers to decile 1 (the bottom 10% of providers), we reduced the decile range. For example, where 30% of providers have a zero value they will all be assigned to decile 3 (which is displayed on the chart as bottom 30%). While this will provide a fairer representation of an individual institution’s performance, it will marginally affect the calculation of the cluster average. The following perspectives use a reduced decile range:

I think the data is wrong – who do I contact?

Contact us by email in the first instance. If your institution wishes to put forward amendments to HESA data (including HE-BCI, and the finance or student records), there is a formal data amendments process.

KEF Clusters

What is the purpose of the KEF Clusters?

The purpose of clustering is to group the KEF participants into KEF clusters that have similar capabilities and resources available to them to engage in knowledge exchange activities. In this way, the KEF provides a fairer comparison of performance between similar institutions.

Is one cluster better than another?

No - it is important to note that the KEF clusters are not a ranking in themselves. No one cluster is better or worse than another – they are simply a means to enable fair comparison across a very diverse higher education sector.

How did you decide which institutions were put into which cluster?

The clusters were determined through a statistical analysis undertaken by Tomas Coates Ulrichsen using the following data:

Scale & focus of knowledge activity by domain
Physical assets
Scale of knowledge generation by domain
Intensity of knowledge generation by domain