What is the role of data in jobs in the United Kingdom, Canada, and the United States?
A natural language processing approach
This paper estimates the data intensity of occupations/sectors (i.e. the share of
job postings per occupation/sector related to the production of data) using natural
language processing (NLP) on job advertisements in the United Kingdom, Canada and
the United States. Online job advertisement data collected by Lightcast provide timely
and disaggregated insights into labour demand and skill requirements of different
professions. The paper makes three major contributions. First, indicators created
from the Lightcast data add to the understanding of digital skills in the labour market.
Second, the results may advance the measurement of data assets in national accounts
statistics. Third, the NLP methodology can handle up to 66 languages and can be adapted
to measure concepts beyond digital skills. Results provide a ranking of data intensity
across occupations, with data analytics activities contributing most to aggregate
data intensity shares in all three countries. At the sectoral level, the emerging
picture is more heterogeneous across countries. Differences in labour demand primarily
explain those variations, with low-data-intensity professions contributing most to
aggregate data intensity in the United Kingdom. Estimates of investment in data, using
a sum-of-costs approach and sectoral intensity shares, point to lower levels in the
United Kingdom and Canada than in the United States.
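
The data intensity definition used above (the share of job postings per occupation related to the production of data) can be illustrated with a minimal sketch. It assumes each posting has already been labelled as data-related or not; the NLP classification step itself is not reproduced here. The function name `data_intensity` and the toy postings are hypothetical and are not taken from the Lightcast data.

```python
from collections import defaultdict


def data_intensity(postings):
    """Return the share of data-related postings per occupation.

    `postings` is an iterable of (occupation, is_data_related) pairs.
    The boolean label is assumed to come from an upstream classifier.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for occupation, is_data_related in postings:
        totals[occupation] += 1
        if is_data_related:
            flagged[occupation] += 1
    # Share = data-related postings / all postings, per occupation.
    return {occ: flagged[occ] / totals[occ] for occ in totals}


# Hypothetical toy postings for illustration only.
sample = [
    ("data scientist", True),
    ("data scientist", True),
    ("data scientist", False),
    ("retail cashier", False),
    ("retail cashier", False),
    ("retail cashier", True),
    ("retail cashier", False),
]
shares = data_intensity(sample)
```

The same grouping could be applied with a sector key instead of an occupation key to obtain sectoral shares.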