BFI Filmography project overview

BFI Head of Data Stephen McConnachie explains how the Filmography was created from historical data sources and the definitions of criteria for inclusion.

BFI Filmography

In 2022 the BFI Filmography data visualisation platform was taken offline. The Collections Information Database has an online version where you can search across the BFI National Archive’s collections data.

Find out more about the BFI Filmography.


In the BFI’s Film Forever five-year strategy, published in 2012, we committed to the creation of a national filmography to include every British film that meets a specific set of criteria. The first phase of that data project has been completed, with a public launch at BFI Southbank in September 2017 and an interactive web application made available to let users interrogate the filmography and view and share data visualisations of the results. Here we explain how the BFI Filmography was created from historical data sources, define the criteria for inclusion, and go on to explore some of the data analysis and research projects it is already informing in line with the new BFI2022 five-year strategy.

The idea of a national filmography is not new, and there are several high profile exemplars which offered inspiring models for the BFI when we scoped our own Filmography. Perhaps the most well-known of these is the American Film Institute’s Catalog of Feature Films, which started in print in 1971 and transitioned to database-only in 1997. Another example is the German Filmportal, launched by the Deutsches Filminstitut in 2005. In scoping the BFI Filmography, we referred to these and other groundbreaking exemplars in defining our inclusion criteria and in shaping the data project and its public manifestations.

Background and data sources

The BFI is well placed to define the UK’s national filmography, given its role as creator and publisher of a document of record for films released to British cinemas, since very close to the institute’s inception. As well as running the BFI National Archive, from 1934 the BFI began publishing the Monthly Film Bulletin, with a mandate to review every film released to cinemas in the UK. This document of record agenda persists today in the BFI through Sight and Sound, which absorbed Monthly Film Bulletin in 1991. The entire collection of MFB and S&S issues was digitised by the BFI in 2010, so instant access to PDF versions helped streamline our checking and collating processes. Historically the BFI captured full cast and crew, as credited on the film or in trustworthy secondary sources. In recent years Sight and Sound has reduced the granularity in its print output, but in the BFI’s Collections Information Database, where the data is captured, a completist documentation policy has persisted for films deemed British: we capture close-to-complete cast and credits.[1]

When we defined the Filmography project plan [2], we identified the period from 1895 (the start of British film production) to 1934 (the first publication of Monthly Film Bulletin) as a critical gap in the BFI’s ‘document of record’ dataset, and we scoped solutions to fill that gap in the database. Those familiar with British film historiography will surely predict the solution: Denis Gifford’s British Film Catalogue, the authoritative reference volume in this domain, which Gifford compiled over decades of painstaking research using trade journals in the BFI Library and interviewing filmmakers themselves (the interview collection is held in the BFI Reuben Library). The BFI’s Documentation department created database records [3] for every film Gifford catalogued in his Fiction volume, going back to 1895. In fact, we also added identifiers from Gifford to all of our film records, from 1895 to 1994 (when the scope of the Catalogue ends) [4]: in this way the Filmography has the DNA from the Catalogue wound into its spine, the two datasets forever enmeshed with matching identifiers.

The BFI Filmography is therefore built on the deep, solid foundations laid by Denis Gifford and by the writers and editors of Monthly Film Bulletin and Sight and Sound over eight decades: these historical streams of information have been wrangled into modern order by the cataloguers, data managers and coders of the BFI’s Filmographic-now-Documentation department.

Inclusion criteria

The Filmography has four key inclusion criteria: Britishness, ‘film’ as a work type, feature-length and release to cinemas.

Britishness

In the BFI there is a well-established methodology for defining a film as British: one or more of the credited production or funding companies is based in the UK, registered at Companies House. This approach has been deeply ingrained in the policies of Monthly Film Bulletin / Sight and Sound, and for that matter Gifford’s Catalogue. A film is considered British if just one small company, among many large companies that contributed to the production or the funding, is defined as British at Companies House. It is an inclusive approach that occasionally leads to confusion over specific films which have entered common consciousness as American, but which were partly made by British companies. The Terminator (1984, dir. James Cameron) is an example we often use to illustrate this point [5].

It’s worth noting that there is a second approach in the BFI to defining a British film, which comes from our formal mandate to certify films (on application) as British, to let them access the relevant film tax relief. This ‘Cultural Test for Film’, in operation since 2007, assesses a film against a matrix of categories including subject matter and nationality of key creative contributors, and issues a certificate to a film which accrues enough points. In order to apply for certification, a film must have a British production company, registered at Companies House, although often that production company is not credited on the film itself. In any case, a film which receives Cultural Test certification and cinema release is included in the Filmography.

‘Film’ as a work type

BFI documentation policy for moving image complies with the Work-Manifestation-Item data model described in the EN 15907 data standard, and in describing a work’s type, we aim to reflect the ‘original intention’ of the work, rather than the format of the material carrier on which it is fixed, or its mode of distribution. So if a work was created with the intention of projecting it in cinemas, we classify it as Film, even if it happened to have a simultaneous broadcast on Television. The aim of the Filmography is to document all works that were created for projection in cinemas – this aligns our Filmography with the AFI Catalog. Of course this historical model of cinema release as the primary intention for the distribution of a film work is becoming fluid, with video-on-demand online streaming and download-to-own weakening the boundaries between Film, TV and Internet as work types. No doubt the Filmography will have to adjust, to maintain a relevant definition of the film work, but for now we focus on cinema release.

Feature-length

Definitions of feature-length proliferate in the domains of film festivals and award ceremonies, and in filmography policies. The BFI Filmography’s definition is 40 minutes or greater, which aligns with the AFI Catalog and the Academy Awards. It is an inclusive definition, which lets in ‘B films’ and early cinema three-reelers (including the earliest film in the Filmography – more on that later).

Release to cinemas

The ‘document of record’ foundations on which the Filmography is built focussed on those films released to cinemas in the UK. In the main, this meant mainstream cinema distribution via members of the Film Distributors’ Association or predecessor (the FDA celebrated its centenary in 2015). So the concept of ‘cinema release’ is baked into the Filmography from its historical data sources. As Sight and Sound coverage has reduced slightly to exclude a few small releases, the completist agenda of the Filmography has been maintained by referring to the FDA’s online database. As with the ‘film as work type’ criterion, it is inevitable that this ‘release to cinemas’ criterion will have to adjust in response to contemporary developments (VoD, etc), if we are to maintain the Filmography as a document of films distributed to UK audiences using the dominant modes of distribution.

In addition, it is clear that by excluding films where release is negotiated directly between filmmaker and cinema, or films shown in clubs or less formal commercial contexts, we are excluding an important set of films from British feature film history, and a set that may in particular offer exhibition possibilities for women filmmakers potentially excluded from mainstream commercial supply chains. At this iteration of the Filmography we have limited our inclusion to the ‘document of record’ (Monthly Film Bulletin, Sight & Sound, FDA) set, in order to achieve a complete dataset and publish the Filmography with available resources. For ongoing development of the resource and our data capture methodology, we will engage filmmakers and organisations in discussions of how we might extend our data capture to include these important cases, with available resources.
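The four criteria above can be read as a simple filter over film records. The Python sketch below is illustrative only: the `FilmRecord` fields and the `meets_inclusion_criteria` function are hypothetical names invented for this example, not the schema of the Collections Information Database. Only the 40-minute threshold and the rule that a single UK-registered company is enough come from the text.

```python
from dataclasses import dataclass

FEATURE_LENGTH_MINUTES = 40  # threshold stated in the text


@dataclass
class FilmRecord:
    """Hypothetical, simplified record; not the real database schema."""
    title: str
    work_type: str                    # e.g. "Film" or "Television" (original intention)
    running_time_minutes: float
    has_uk_registered_company: bool   # at least one production or funding company at Companies House
    released_to_uk_cinemas: bool


def meets_inclusion_criteria(film: FilmRecord) -> bool:
    """Return True only if all four criteria hold for the record."""
    return (
        film.has_uk_registered_company                           # Britishness
        and film.work_type == "Film"                             # 'film' as a work type
        and film.running_time_minutes >= FEATURE_LENGTH_MINUTES  # feature-length
        and film.released_to_uk_cinemas                          # release to cinemas
    )


# Invented placeholder records, purely to exercise the filter.
feature = FilmRecord("Example Feature", "Film", 95, True, True)
short = FilmRecord("Example Short", "Film", 39, True, True)
print(meets_inclusion_criteria(feature), meets_inclusion_criteria(short))
```

In practice each flag would itself be derived from richer data (company registrations, release records), which this sketch deliberately leaves out.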

Now that we have defined the criteria for inclusion and the data sources, we will explore some of the further research, analysis and publication work already underway on the Filmography.

General insights from the Filmography: production, content, careers, collecting

Now that we have a complete national corpus of British feature films, stored in a database with modern query and data extraction functionality, we can ask small and large questions and add new levels to our understanding of that corpus using data analysis and visualisation techniques.

A basic but important insight is simply a new baseline understanding of the numbers of films produced across the timeline. We now know that Britain has produced, and released to cinemas, around 9500 feature films since 1911. The first three-reeler, and therefore the first film in the Filmography, is Rob Roy, directed by Arthur Vivian; it is the only film included from that year. 1912 adds four films, but in 1913 the industry ramped up feature production, with 53. We are now able to track the peaks and troughs of British feature film production with forensic accuracy, and to interpret those trends against historical and industrial factors: world wars, the rise of television and home video, domestic politics and policy interventions such as public funding. We know that the 1930s was the peak decade for production, with 1548 films, and that the 1980s saw the lowest output, with 468 films.

We can also track other developments across the timeline: the rise of international co-production, trends in average running times, and the popularity of genres and subjects. Genre and subject are far from straightforward things to catalogue consistently; nonetheless, we now have a ranking table for our major genres and a long tail of obscure genres. For example we know that the Sex Comedy, often seen as a staple of British feature films, accounts for 0.005% of films – it is a genre almost entirely bounded in the 1970s and 1980s. In the wake of Dunkirk by Christopher Nolan, it’s interesting to note that War as a genre peaked in the 1940s at 24% of films that decade, and has been in steady decline since – now accounting for only 1.5% of British features.

A simple tag cloud of the subjects covered in the Filmography tells a topical story, with Europe and Western Europe among the most featured: the British love/hate relationship with Europe plays out in our feature films.

We can also track our national obsessions by looking at the most used words in the titles of our feature films (excluding generic words like to, for, in, and). In particular the top 20 words are like a found poem about how we depict relations between the sexes in the feature film.

Word      Count
Man       219
Love      136
Night     118
Life      92
Girl      75
London    74
House     72
Woman     71
Lady      69
Men       67
Mr        65
Last      62
Death     59
Black     57
Little    56
World     55
Time      55
Secret    54
Old       51
Murder    51
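A count like the one in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the method the BFI used: the stop-word list, the tokenising rule and the sample titles are all assumptions made for the example (the text names only "to, for, in, and" as excluded words), and every occurrence of a word is counted.

```python
import re
from collections import Counter

# Assumed stop-word list; the text gives only "to, for, in, and" as examples.
STOP_WORDS = {"the", "a", "an", "of", "to", "for", "in", "and"}


def title_word_counts(titles, stop_words=STOP_WORDS):
    """Count words across titles, ignoring case and the given stop words."""
    counts = Counter()
    for title in titles:
        # Simple ASCII tokeniser: runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", title.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts


# Invented placeholder titles, not drawn from the Filmography.
sample_titles = ["The Man in the House", "Love and the Man", "Night of Love"]
counts = title_word_counts(sample_titles)
print(counts.most_common(5))
```

A fuller version would need decisions the text does not describe, such as whether "Man" and "Men" are merged and how non-English titles are tokenised.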

We know also that the British feature film was built on the national literary and dramatic traditions, exploiting well-established novels and plays for subject matter. But now we are able to track our obsession with those fictional creations alongside our obsession with historical characters, by looking at the most featured character names in the Filmography: Sherlock, the Royal Family, Bond, Music Hall, Harry Potter. Pushed out of the top 12 by the powerful franchises from royalty and literature: Robin Hood.

Table of most featured characters in the BFI Filmography

Character                  Films
Queen Victoria             25
James Bond                 25
Sherlock Holmes            24
M (Bond)                   23
Miss Moneypenny (Bond)     21
Q (Bond)                   19
Prince of Wales            13
Old Mother Riley           13
Queen Elizabeth            12
Felix Leiter (Bond)        10
Henry VIII                 8
Harry Potter               8

These pub-quiz-style insights are interesting, but the real power of the Filmography is in asking bigger and more serious questions with policy implications – for example questions about careers for filmmakers in the British film industry through time. We now know that the 10,000 films in the Filmography were directed by around 3,395 directors, giving a mean of 3.2 films per director. If this seems low in terms of a career as director of British films, consider how it looks when unpacked a little. Of those 3,395 people who got to direct British feature films, 62% directed one British feature only, 38% directed two or more, and only 6% directed 10 or more. Now of course the Filmography is not the only place directors can build a career, and it’s likely that many directors migrated into and out of television, advertising, even video games latterly, during their careers. So there is a sea of complex data that is hidden to the Filmography, where rich careers are described, offscreen from this graph, as it were.

However, we now know that the percentage of cast or crew who get to make more than ten feature films in any capacity is low, and although it is likely that many build careers in television and other moving image industries, we hope that the Filmography can support deeper analysis of this situation. We do know that the numbers of British feature films people make, and their average career durations, have been decreasing every decade since the 1940s. The Filmography provides a solid evidence base for detailed analysis and tracking of the realities for career building within the British feature film, and for data-driven interventions to improve the most important areas.
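The director figures above (mean films per director, share with one film only, share with ten or more) are the kind of summary that falls out of a list of directing credits. The sketch below assumes a hypothetical input of `(film_id, director_id)` pairs; the function name and data shape are invented for illustration and say nothing about how the BFI database stores credits.

```python
from collections import Counter


def director_career_summary(credits):
    """Summarise films per director from (film_id, director_id) pairs."""
    # De-duplicate so a repeated credit for the same film counts once.
    films_per_director = Counter(director for _film, director in set(credits))
    if not films_per_director:
        raise ValueError("no credits supplied")
    counts = list(films_per_director.values())
    n = len(counts)
    return {
        "directors": n,
        "mean_films": sum(counts) / n,
        "share_one_film": sum(1 for c in counts if c == 1) / n,
        "share_two_plus": sum(1 for c in counts if c >= 2) / n,
        "share_ten_plus": sum(1 for c in counts if c >= 10) / n,
    }


# Invented identifiers: d1 directs two films; f3 is co-directed by d2 and d3.
sample_credits = [("f1", "d1"), ("f2", "d1"), ("f3", "d2"), ("f3", "d3")]
summary = director_career_summary(sample_credits)
print(summary)
```

Under this counting a co-directed film contributes to each co-director's total, so the mean is directing credits per director rather than total films divided by total directors.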

Another major benefit is for the BFI National Archive: we can now track the percentage of our national filmography that is held in the collection, and facet that information on decade, genre, actor / director / production company. We can identify the critical gaps in the national collection along cultural or industrial lines, and take action using this evidence base to provide new momentum and force. We can use the data to form public engagement strategies (for example, new calls for action within BFI Most Wanted). We can quantify and contextualise the specific challenges in collecting our national feature film in the born digital era, and we can use this new data-driven approach to help shape lobbying initiatives for statutory deposit or de facto statutory deposit. The Filmography becomes a vital statistics monitor, which we can use to measure the fitness of the collection against our published collections policy, and indeed to help inform the revision of that collections policy.

Data analysis and research projects: gender

When we first described the Filmography project, we were asked a very direct question: it’s all very well, but can you use this data to tell me how many women directors have made British films? That provocation proved extremely valuable, as it kick-started a gender data research project that now means we can answer that question and many others about gender in our feature films.

We have almost 250,000 person records in our Filmography database, and using the methodology described below we assign an inferred gender of Male / Female to 91% of those persons.

  1. We used the UK Office for National Statistics gendered names datasets, which they publish every year, and which match names to gender (eg Stephen = male, Rebecca = female)
  2. We used the male and female sets to assess the first names in our person records, and assigned an inferred gender where there was an unambiguous match
  3. Where there was no match (for two reasons: the name is not in the ONS dataset, which happens most with non-English-origin names; or the name is not gender-specific – eg George, Kim, Leslie, Alex), we assigned an Unknown status to the person, rather than try to automate via logic
  4. We did two things to improve on the initial yield:

    a. For actors, we used the same logic on their character names. So where an actor with a gender-neutral name, eg Kim, played Lucy, Rebecca, Elisabeth, Margaret, we inferred Female. Where Kim played Robert, Stephen, Edward, we inferred Male

    b. For the most prolific Unknown persons (five or more credits), we manually researched their gender and assigned male or female
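The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the BFI's implementation: the two name sets are tiny stand-ins for the ONS lists, the function names are invented, and two behaviours are guesses the text does not confirm, namely that conflicting character names leave the person Unknown and that manual research is consulted only when the automated steps fail.

```python
# Tiny stand-ins for the ONS gendered name lists (assumed data for illustration).
MALE_NAMES = {"stephen", "robert", "edward", "george", "kim"}
FEMALE_NAMES = {"rebecca", "lucy", "elisabeth", "margaret", "george", "kim"}


def gender_from_name(first_name):
    """Steps 1-3: return a gender only for an unambiguous match, else Unknown."""
    name = first_name.strip().lower()
    in_male = name in MALE_NAMES
    in_female = name in FEMALE_NAMES
    if in_male and not in_female:
        return "Male"
    if in_female and not in_male:
        return "Female"
    return "Unknown"  # not in either list, or present in both


def infer_gender(first_name, character_first_names=(), manual_research=None):
    """Apply the name match, then character names (4a), then manual research (4b)."""
    inferred = gender_from_name(first_name)
    if inferred != "Unknown":
        return inferred
    # Step 4a: use character names, but only if they all point the same way (assumption).
    character_genders = {gender_from_name(c) for c in character_first_names}
    character_genders.discard("Unknown")
    if len(character_genders) == 1:
        return character_genders.pop()
    # Step 4b: fall back to a manually researched value, if one exists.
    return manual_research if manual_research is not None else "Unknown"


print(infer_gender("Stephen"))
print(infer_gender("Kim", ["Lucy", "Rebecca"]))
print(infer_gender("Kim"))
```

Keeping Unknown as an explicit outcome, rather than forcing a guess, mirrors the choice described in step 3.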

Some things we did not yet do:

  • Infer gender for non-gender-specific names based on statistical probability – eg ‘George is more likely to be male than female, statistically’
  • Find robust datasets for non-English origin names. There are lots of Middle Eastern and East Asian names in our dataset, among other ethnic and geographic sets
  • Develop a self-certification methodology or a data model for non-traditional gender categories such as transgender, transsexual, intersex, gender-neutral

This inferred gender for over 200,000 people (effectively, the majority of people working in the British feature film industry since 1911) lets us undertake very detailed analysis of gender parity across many facets over the timeline:

  • career longevity and output: do women’s careers last as long as men’s, do they make as many films as men, and how is that changing
  • departments: which areas in filmmaking support careers for women and which don’t – and how is that changing
  • senior roles: compare opportunities for women in key creative roles (director, writer, producer, editor, art director, composer) with those for men, over time
  • how do women in key decision-making roles influence gender balance in cast and crew?
  • genre and subject: which types of film, about which topics, give more parity to women

We have only scratched the surface of this area of data analysis, but the findings are already powerful and potentially transformative in terms of a data-driven approach to forming policy, lobbying, investment in skills, and so on. A few examples which illustrate the potential:

  • 9% of the film directors in the Filmography were women
  • On average, female directors direct fewer than half as many films as male directors (1.5 compared to 3.4)
  • Gender balance in crews has been increasing very slowly but steadily since the 1930s, but it is still only around 30% female – well below the UK workforce average of 47%
  • Of all the main filmmaking departments, only three have ever reached 51% female crew: Casting (70%), Make-up (61%) and Costumes (72%)
  • The departments with the weakest gender parity are Photography (5%) and Sound (10%) – across the timeline and still in 2017

How do the key creative and decision-making roles compare with these whole-of-department findings? Only 2% of Director of Photography credits have gone to women, and the same is true for Sound Recordist / Sound Mixer. Only 4% of Music Composer credits have gone to women. It’s important to emphasise that these are blunt-tool analyses of all credited roles since 1911, and the analysis does yield clear evidence of improvement in gender parity in some departments over the timeline. But in several departments – Photography, Sound, Writing, Special Effects – there is barely any sign of improvement even in the last decade. And overall, the pace of improvement is glacial: in 1946 15% of crew were women, and in the 70 years since that has increased by only 18 percentage points, to 33%. In the last ten years, the increase has been only 3%.

Thanks to a collaboration with data analysts from NESTA, we have tentative results suggesting that where a film has only women in the roles of Director and Writer, there is an observed correlation with an increase in the percentage of crew that is female. The shift may be as high as an 8 percentage point increase in female crew (28% to 36%), compared with films with only male Directors and Writers. The same is true with only female Producers: a 10 percentage point increase in comparison with only male Producers (25% to 35%). Similar correlations are seen in cast, where only female Directors, Writers, Producers are credited. Although we must be careful of making simple causal interpretations (there may be hidden factors), this correlation chimes with anecdotal claims (and indeed with David Oyelowo’s Black Star keynote at BFI Southbank in October 2016) that diversity of the key decision makers can structurally change the diversity of the filmmaking project as a whole.
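Because the figures above are changes between two percentages, it helps to distinguish a change in percentage points from a relative change. The short sketch below works through the three before/after pairs quoted in the text; the function names and the labels are invented for the example.

```python
def percentage_point_change(before_pct, after_pct):
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change_pct(before_pct, after_pct):
    """Change expressed as a percentage of the starting value."""
    return (after_pct - before_pct) / before_pct * 100


# Before/after pairs quoted in the text (labels are descriptive only).
figures = {
    "female crew, 1946 and 70 years on": (15, 33),
    "female crew, male-only vs female-only Director and Writer": (28, 36),
    "female crew, male-only vs female-only Producers": (25, 35),
}

for label, (before, after) in figures.items():
    points = percentage_point_change(before, after)
    relative = relative_change_pct(before, after)
    print(f"{label}: {points} points, {relative:.1f}% relative")
```

For the 1946 comparison, 15% to 33% is a rise of 18 percentage points, which is a 120% relative increase on the starting share; the two measures should not be used interchangeably.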

The Filmography provides an evidence base for these types of longitudinal gender analysis across the timeline and across other facets such as genre, and we hope it can lead to an increase in data-driven policy development, including outreach and engagement with film training and education providers, to identify departments and roles where women are under-represented, and to advocate for and encourage greater diversity.

Data analysis and research projects: ethnicity

The other key area for diversity research and analysis in the Filmography is ethnicity. During 2016’s Black Star series at BFI Southbank we published the first findings from a long-term project to analyse the ethnic background of actors in the Filmography, using quantitative and qualitative methods to understand what types of role in what types of film are offered to ethnically diverse actors. That first phase focussed on Black actors in named or lead roles in a ten-year sample, and we are now extending the scope initially to cover the 21st century, and to investigate all ethnic backgrounds – using the categories from the UK census in our data modelling. As well as undertaking this retrospective data capture for the ethnicity of actors, we are considering strategies for streamlining diversity data capture into the Filmography going forward, for all contributors to the film, and looking closely at UK TV sector initiative Project Diamond for inspiration.

Next steps for the Filmography

We hope to form partnerships and develop collaborative projects, as we did with NESTA, to deliver data analysis and data visualisation of some of the key research questions that the Filmography can answer, on gender, ethnicity, film history, industry, genre, and any other facet of this huge dataset. We would like to track career migrations into and out of the Filmography – to and from television, advertising, music videos, video games, interactive and immersive a/v, to and from Hollywood, Europe and the other centres where moving image careers are built – to form a clear picture of the ecosystem that surrounds the BFI Filmography, both feeding talent to the feature film and luring it away.

We also seek partnerships with pioneers in the Digital Humanities to interrogate the filmography as a corpus of digital video and audio, digitised from the BFI National Archive collection, to ask questions using the emerging powers of computer vision and computational analysis of audio, like speech-to-text and mood analysis. Questions such as: how long are shots in the BFI Filmography, by decade, genre, gender of creators? What are the editing patterns? What colour schemes are in the Filmography? What does the Filmography sound like? What words does it use in its scripts, and how do female actors get to speak them compared with male actors? What moods are in the Filmography? What regional accents?

These and the other myriad research projects we hope to develop are built on the principle that the Filmography is a dataset to be developed and mined, to help us refocus on the British feature film as an object of analysis using a completely new set of precision lenses, tooled by methods from the emerging field of data science.

Footnotes:

  1. We exclude a small subset of credited roles in auxiliary areas such as catering and transportation, and we do not document all credited roles in work sub-contracted to animation companies, for example.
  2. The Filmography project was specifically funded as a five-year project within the BFI’s Film Forever five-year strategy, although it was built using core BFI Collections and Information systems and of course benefitted from decades of work predating Film Forever.
  3. This was achieved by manual keying. We considered a project to scan Gifford’s book, OCR the text, and import to the database to auto-create records, but the BFI cataloguers achieved it before such a project could be scoped.
  4. The BFI through agreement with the Gifford estate has rights to make use of the Catalogue in this way.
  5. London-based Euro Film Funding was among the production companies.