Job-SDF
Abstract
In a rapidly evolving job market, skill demand forecasting is crucial as it enables
policymakers and businesses to anticipate and adapt to changes, ensuring that
workforce skills align with market needs, thereby enhancing productivity and
competitiveness. Additionally, by identifying emerging skill requirements, it
directs individuals towards relevant training and education opportunities, promoting
continuous self-learning and development. However, the absence of comprehensive
datasets presents a significant challenge, impeding research and the advancement
of this field. To bridge this gap, we present Job-SDF, a dataset designed to
train and benchmark job-skill demand forecasting models. Based on millions
of public job advertisements collected from online recruitment platforms, this
dataset encompasses monthly recruitment demand. Our dataset uniquely enables
evaluating skill demand forecasting models at various granularities, including
occupation, company, and regional levels. We benchmark a range of models on this
dataset, evaluating their performance in standard scenarios, in predictions focused
on lower value ranges, and in the presence of structural breaks, providing new
insights for further research. Our code and dataset are publicly accessible at
https://github.com/Job-SDF/benchmark.
1 Introduction
Job skills encompass a range of abilities and competencies essential for performing tasks effec-
tively in the workplace. These skills are broadly categorized into hard skills, such as technical
and analytical abilities, and soft skills, including communication, teamwork, and adaptability [1].
Accurate forecasting of skill demand helps businesses and policymakers anticipate and address skill
shortages and mismatches, and promotes skill development in high-demand areas, thereby supporting
economic growth and stability [2, 3]. By identifying emerging skill requirements, individuals are
directed towards relevant training and education opportunities, fostering continuous self-learning
and development to stay competitive in the labor market [4–10]. By tracking skill demand trends,
employers gain deeper insight into recruits’ priorities, enhancing person-job fit [11–21]. Moreover,
forecasting informs educational and training programs, ensuring that curricula align with the labor
market’s evolving needs [22–24].
Traditionally, skill demand analysis has relied on labor-intensive, survey-based methods limited to
specific companies or occupations [25–27]. However, over the past decade, the rapid evolution of the
internet has spurred the emergence of online recruitment platforms. These platforms have become the
primary channels for job advertisements for numerous enterprises and organizations, accumulating
vast amounts of job advertisement data. By leveraging this data, researchers have formulated
skill demand forecasting as a time series task, utilizing various machine learning models such as
autoregressive integrated moving average (ARIMA) [28], recurrent neural networks (RNNs) [29],
and dynamic graph autoencoders (DyGAEs) [30], to predict future skill needs.
A major challenge impeding progress in this field is the lack of comprehensive and publicly accessible
datasets. Existing studies do not provide open-source datasets, making it difficult for researchers
to replicate experimental results and identify bottlenecks in current research. Furthermore, these
datasets primarily focus on predicting skill demand variations across different occupations, with a
notable lack of modeling and prediction at other granularities, such as companies or regions. This
limitation hinders comprehensive comparisons between different models and impedes the exploration
of potential downstream applications, such as human capital strategy development and regional
policy formulation. Additionally, the significant variations in skill demand present further challenges.
Existing studies, which rely on metrics such as Mean Squared Error (MSE), struggle to evaluate the
performance of skill demand forecasting models for low-frequency skill terms. For instance, some
emerging skills, such as large language models (LLMs), may initially show low demand but are
crucial for the job market due to their potential to reshape existing occupations.
To this end, in this paper, we propose Job-SDF, a multi-granularity dataset designed for job skill
demand forecasting research. Specifically, we collected millions of public job advertisements from
online recruitment platforms. By extracting skill terms from job advertisement texts, we quantified
the monthly skill demand at various granularities, including occupations, companies, and regions, to
construct our dataset. This dataset encompasses 2,324 types of skills, 52 occupations, 521 companies,
and 7 regions. We then use the Job-SDF dataset to benchmark a wide range of models for job skill
demand forecasting tasks at various granularities. These models include statistical time series models
(e.g., ARIMA [31]), deep learning-based methods such as RNN-based models [32, 30], Transformer-
based models [33–37], MLP-based models [38, 39], as well as several state-of-the-art time-series
forecasters [40, 41]. Performance is evaluated using regression metrics such as Mean Absolute Error
(MAE) and Root Mean Squared Error (RMSE). Additionally, we use the Symmetric Mean Absolute
Percentage Error (SMAPE) [42] and Relative Root Mean Squared Error (RRMSE) [43] metrics to
account for the significantly varying nature of skill demand values, which is particularly useful for
evaluating model performance in predicting lower value ranges. Moreover, we further investigate
the impact of structural breaks in job skill demand time series data on model performance. The
Job-SDF dataset, along with data loaders, example code for different models, and the evaluation setup,
is publicly available in our GitHub repository: https://github.com/Job-SDF/benchmark.
2 Related Work
Skill demand forecasting can analyze how skills evolve over time, aiding experts in evaluating
technological advancements [44–46], assessing wage inequality [47–49], and generating employment
opportunities [50]. Furthermore, the skills required in the 21st-century workplace will differ signifi-
cantly from those in previous eras [51]. Predicting skill demands benefits personal career transitions
and corporate management strategies.
Recently, with the rapid accumulation of data and continuous advancements in technology, skill
demand forecasting has demonstrated significant vitality. Das et al. proposed a method for dynamic
task allocation to investigate the evolution of job task requirements over a decade of AI innovation
across different salary levels [28]. Given the effectiveness of RNNs in multi-step prediction, some re-
searchers have integrated skill demand forecasting with RNN algorithms, achieving promising results
[29, 32]. In addition, considering the supply-demand dynamics of the labor market concurrently,
CHGH designed a joint prediction model based on the encoder-decoder architecture to achieve trend
prediction for both skill supply and demand sides [30]. Moreover, to capture the dynamic information
of occupations, a pretraining-enhanced dynamic graph autoencoder has been developed to efficiently
forecast skill demand at the occupational granularity [52].
However, the predominance of closed-source datasets has significantly raised the barrier for
researchers and constrained the pace of methodological advancements. While open-source skill-related
datasets such as O*NET [53] and ESCO [54] provide skill taxonomies, they do not quantify skill
demand. Furthermore, the current research data focuses either on macro-market skill demand predic-
tions or analyses at a specific granularity, neglecting multi-level labor market analysis. This limitation
generally hampers the transferability of the modeling approaches.
3 Job-SDF Dataset
The Job-SDF dataset is built from job advertisements collected on online recruitment platforms, en-
compassing dynamic job skill demand time series data at various granularities, recorded monthly. The
dataset is CC BY-NC-SA 4.0 licensed, accessible via the URL https://github.com/Job-SDF/
benchmark. We summarize the dataset construction process, task description, and dataset analysis
below.
Job Advertisement Collection. We collected public job advertisements for 52 occupations from 521
companies on online recruitment platforms. We obtained unique records after removing identical
job advertisements posted simultaneously by different companies on various platforms. Each record
contains five types of information: (1) Job Requirement, which is a text segment that outlines the
specific skills required of candidates applying for the job; (2) Company, which identifies the company
that posted the job advertisement; (3) Occupation, which specifies the job advertisement’s category.
Our dataset encompasses 52 detailed occupations (L2-level), such as front-end development engineer
and financial investment analyst. Additionally, these 52 occupations are grouped into 14 broader
categories (L1-level); (4) Region, which indicates the primary geographic divisions in China where
the job postings are located. These regions are classified based on their geographical orientation; (5)
Posting Time, which records the date when the job was posted, including the year, month, and day.
Job Skill Extraction. After acquiring the job advertisement data, we utilized a Named Entity
Recognition (NER) model, as referenced in [55–58], to explicitly extract skill requirements from the
Job Requirement of each advertisement. Specifically, we first annotated a dataset for training the NER
model by identifying skill terms within the job requirement texts. To achieve this, we devised a set of
regular expressions tailored to the characteristics of skill descriptions and used these to match skill
words in job advertisements. Subsequently, we merged all matched skill words to formulate a raw
skill dictionary, including their corresponding frequencies across job advertisements. We then filtered
out low-frequency words and manually annotated the raw skill dictionary to create a refined skill
dictionary. Along this line, we excluded unreasonable skill words matched by the regular expressions
that did not appear in the refined skill dictionary, establishing an initial correspondence between the
Job Requirement and the skill requirements.
Based on this annotated data, we trained an NER model to extract required skills from the Job
Requirement section for all job advertisements. Experts then aggregated the skills extracted by the
NER model based on their meaning and content, grouping together those with similar meanings or
repeated expressions. This process resulted in a skill dictionary S of 2,324 standardized skill words,
mapping original skill word descriptions to standardized skill words. The skill dictionary was then
used to filter and map the skill words extracted by the NER model, ultimately obtaining standardized
skill requirements for each job requirement. These standardized requirements were added to the job
advertisement data as a new field, Skill Requirements.
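As an illustration of the dictionary-based matching and standardization steps, consider the following minimal sketch. The dictionary entries, regular expression, and example text are simplified placeholders; the trained NER model that replaces this matching in the final pipeline is omitted:

```python
import re

# Toy stand-ins for the refined skill dictionary: maps raw surface forms
# to standardized skill words (the real dictionary has 2,324 entries).
SKILL_DICT = {
    "python": "Python",
    "machine learning": "Machine Learning",
    "communication skills": "Communication Skill",
}

# Alternation over dictionary keys, longest first; the actual pipeline
# instead used hand-crafted patterns tuned to skill-description phrasing.
SKILL_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(SKILL_DICT, key=len, reverse=True))
)

def extract_skills(job_requirement: str) -> list[str]:
    """Match raw skill mentions and map them to standardized skill words."""
    matches = SKILL_PATTERN.findall(job_requirement.lower())
    return sorted({SKILL_DICT[m] for m in matches})

print(extract_skills("Requires Python, machine learning, and communication skills."))
# ['Communication Skill', 'Machine Learning', 'Python']
```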
Job Skill Demand Estimation. Generally, the demand for different skills in the job market can be
estimated by the volume of job advertisements listing these specific skills as requirements within
a given time period [30]. Formally, given job advertisement data $\mathcal{P} = \{P_1, \ldots, P_t, \ldots, P_T\}$, where
each $P_t$ represents the job advertisements posted at timestamp $t$, we use $D_{s,t} = \sum_{p \in P_t} \mathbb{1}(s \in p)$ to
estimate the demand for skill $s \in \mathcal{S}$ at time $t$, where $s \in p$ indicates that job advertisement $p$ requires skill $s$.
Along this line, we can calculate skill demand at various granularities, such as occupation and
company levels. We define the sets of L1-level occupations, L2-level occupations, companies, and
[Figure 1 graphics omitted. Panel (a), "The skill demands under two occupations": bar charts of demand for skills such as Logical Analysis, Consulting Services, and Cooperation under the Product Manager and Doctor occupations. Panel (b), "Applying the Chow test to two skill demand time series": actual vs. fitted monthly demand from 2021-01 to 2023-12, with Chow test breakpoints, for Product Manager-Understanding Ability and Salesperson-Communication Skill.]
Figure 1: Data analysis on Job-SDF. (a) illustrates the long-tail phenomenon of skill demands under
the product manager and doctor occupations. (b) illustrates the results under the Chow test for the
absence (left) and presence (right) of structural breaks.
regions as $A_{o_1}$, $A_{o_2}$, $A_c$, and $A_r$, respectively. The demand for skill $s$ at time $t$ under granularity
$i \in \{o_1, o_2, c, r\}$ is then defined as follows:
$$D^{i}_{s,t} = [D^{i}_{s,t,a_i}]_{a_i \in A_i}, \qquad D^{i}_{s,t,a_i} = \sum_{p \in P_t} \mathbb{1}(s \in p) \cdot \mathbb{1}(a_i \in p), \tag{1}$$
where $a_i \in p$ denotes that job advertisement $p$ contains the attribute $a_i$ under granularity $i$. Similarly,
we can further define skill demands $D^{i,j,\ldots,k}_{s,t}$ across multiple granularities $\{i, j, \ldots, k\}$ by calculating:
$$D^{i,j,\ldots,k}_{s,t,a} = \sum_{p \in P_t} \mathbb{1}(s \in p) \cdot \mathbb{1}(a_i \in p \wedge a_j \in p \wedge \ldots \wedge a_k \in p), \tag{2}$$
where $a = \{a_i, a_j, \ldots, a_k\}$ with $a_i \in A_i, a_j \in A_j, \ldots, a_k \in A_k$, and $D^{i,j,\ldots,k}_{s,t} \in \mathbb{R}^{|A_i| |A_j| \ldots |A_k|}$.
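Operationally, Equations (1) and (2) reduce to grouped counting over advertisements. A minimal pandas sketch, where `ads`, its column names, and the toy values are illustrative stand-ins for the actual data:

```python
import pandas as pd

# Toy advertisement table; in Job-SDF, `month` comes from Posting Time.
ads = pd.DataFrame({
    "month": ["2022-12", "2022-12", "2023-01"],
    "region": ["East", "East", "North"],
    "occupation_l2": ["product manager", "doctor", "product manager"],
    "skills": [["Logical Analysis", "Cooperation"], ["Cooperation"], ["Logical Analysis"]],
})

# One row per (advertisement, skill) pair: 1(s in p) for every listed skill s.
exploded = ads.explode("skills").rename(columns={"skills": "skill"})

# Eq. (1): demand per skill at the L2-occupation granularity.
d_occ = exploded.groupby(["month", "occupation_l2", "skill"]).size()

# Eq. (2): demand per skill at the joint Region & L2-occupation granularity.
d_region_occ = exploded.groupby(["month", "region", "occupation_l2", "skill"]).size()

print(d_region_occ)
```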
We study model performance through job skill demand forecasting tasks at different granularities,
including single and multiple levels. The primary goal of these tasks is to predict future job skill
demands based on historical time series data of various skills. Formally, we have:
Definition 1 (Job Skill Demand Forecasting) Given a granularity or a set of granularities $g$ and
the observed job skill demand series from the previous $K$ timestamps, i.e., $\{D^{g}_{:,t-K+1}, \ldots, D^{g}_{:,t}\}$,
the goal of job skill demand forecasting is to learn a forecasting model $\mathcal{M}$ to predict the demand
values for the next $H$ timestamps, denoted by $\{\hat{D}^{g}_{:,t+1}, \ldots, \hat{D}^{g}_{:,t+H}\}$.
Our dataset includes skill demand time series data for L1-level occupations, L2-level occupations,
companies, regions, and their combinations. We follow a standard protocol [59] that categorizes all
time-series data into training, validation, and test sets in chronological order with a ratio of 9:1:2.
In the main text, we demonstrate results with K set to 6 months and consider H as 3 months to
evaluate the performance of different forecasting models. More settings and results can be found
in Appendix B and our project repository. Based on the Job-SDF dataset, other researchers can
easily adjust the parameters to suit their research objectives.
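As a sketch, the chronological 9:1:2 split and the (K, H) window construction described above can be written as follows; the array shapes and synthetic demand data are illustrative:

```python
import numpy as np

def chronological_split(series: np.ndarray, ratios=(9, 1, 2)):
    """Split a (T, num_series) demand array into train/val/test along time."""
    T = series.shape[0]
    total = sum(ratios)
    t1 = T * ratios[0] // total
    t2 = T * (ratios[0] + ratios[1]) // total
    return series[:t1], series[t1:t2], series[t2:]

def make_windows(series: np.ndarray, K: int = 6, H: int = 3):
    """Slide over time: inputs of K past months, targets of H future months."""
    xs, ys = [], []
    for t in range(K, series.shape[0] - H + 1):
        xs.append(series[t - K:t])
        ys.append(series[t:t + H])
    return np.stack(xs), np.stack(ys)

demand = np.random.poisson(5, size=(36, 100))  # 36 months, 100 skill series
train, val, test = chronological_split(demand)
X, Y = make_windows(train, K=6, H=3)
print(X.shape, Y.shape)  # (19, 6, 100) (19, 3, 100)
```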
Varying Nature of Skill Demand. The values of skill demand exhibit significant differences and
generally follow a long-tail distribution. This indicates that, at a specific granularity, only a few skills
have high demand, while a wide range of skills are required by a limited number of jobs. For instance,
Figure 1a presents the skill demands under the product manager and doctor occupations in December
2022. The results clearly demonstrate the varying nature of skill demand values. This suggests that
relying solely on metrics like RMSE to evaluate forecasting models’ performance may overlook the
prediction accuracy for low-frequency skills.
Structural Break Phenomenon. As the labor market evolves, job skills that are not widely required
today may become crucial in the future, while those currently in high demand may be supplanted
by others. This dynamic can induce significant changes in the statistical properties of skill demand
time series at various points in time. These changes may be reflected in the mean, variance, trend, or
autocorrelation structure of the series. This phenomenon is known as structural breaks. A common
method for detecting structural breaks is the Chow test, which evaluates whether there are significant
differences in the regression coefficients across different periods [60]. Figure 1b illustrates the
application of the Chow test in detecting structural breaks in various skill demand time series. The
presence of structural breaks can impact the predictive accuracy of forecasting models. Further
discussion will be provided in the experimental section.
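A minimal sketch of the Chow test on a single demand series, assuming a simple linear-trend regression on each side of a candidate break point; the regression specification and the synthetic series are our assumptions, not the paper's exact detection setup:

```python
import numpy as np
from scipy import stats

def _ssr(t, y):
    """Sum of squared residuals from an OLS fit y ~ a + b*t."""
    X = np.column_stack([np.ones_like(t, dtype=float), t])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(resid @ resid)

def chow_test(y, break_idx, k=2):
    """Chow test F statistic and p-value for a break at `break_idx`."""
    t = np.arange(len(y), dtype=float)
    ssr_pooled = _ssr(t, y)
    ssr_split = _ssr(t[:break_idx], y[:break_idx]) + _ssr(t[break_idx:], y[break_idx:])
    f = ((ssr_pooled - ssr_split) / k) / (ssr_split / (len(y) - 2 * k))
    p = stats.f.sf(f, k, len(y) - 2 * k)
    return f, p

# Demand jumps upward halfway through: the test should reject "no break".
y = np.concatenate([50 + np.random.randn(18), 150 + np.random.randn(18)])
print(chow_test(y, break_idx=18))
```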
Inter-Series Correlation. Intuitively, the proposed job skill demand forecasting tasks can be
categorized as multivariate time-series forecasting tasks [61]. Figure 2 shows the absolute values
of the Pearson correlation coefficients of different skill demand series for the backend development
engineer, salesperson, and product manager occupations. We found that the time series data for some
skills exhibit significant correlation within the same occupation (e.g., product design and market
analysis in product manager), as well as for the same skills across different occupations (e.g., product
design in backend development engineer and product manager). This demonstrates the necessity of
considering all variables as inputs for job skill demand forecasting models, as it captures the
interrelationships among different demand series.
[Figure 2 graphic omitted: heatmap of absolute Pearson correlation coefficients between skill demand series. 'O' denotes occupation (Bac: backend development engineer, Sal: salesperson, Pro: product manager) and 'S' denotes skill (PD: product design, CS: communication skill, AI: artificial intelligence, TG: training and guiding, MA: market analysis).]
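The underlying correlation computation is straightforward; a sketch with pandas, using toy series named after the figure's O:/S: labels:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy monthly demand series for (occupation, skill) pairs over 36 months.
base = rng.poisson(100, size=36).astype(float)
series = pd.DataFrame({
    "O:Pro S:PD": base + rng.normal(0, 5, 36),             # product design
    "O:Pro S:MA": base + rng.normal(0, 5, 36),             # market analysis, co-moves
    "O:Bac S:PD": base * 0.5 + rng.normal(0, 5, 36),       # same skill, other occupation
    "O:Sal S:CS": rng.poisson(80, size=36).astype(float),  # unrelated series
})

# Absolute Pearson correlation matrix, as visualized in Figure 2.
corr = series.corr(method="pearson").abs()
print(corr.round(2))
```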
4 Benchmark
We evaluated several SOTA time-series learning models using our proposed Job-SDF dataset. These
models are categorized into six groups based on their underlying architectures: statistical time series
models, RNN-based models, Transformer-based models, MLP-based models, Graph-based models,
and Fourier-based models. The implementation details for each model are provided in Appendix B,
and the open-source model implementations are available on our GitHub repository.
Statistical Time Series Model. We first consider two statistical methodologies, namely ARIMA [63]
and Prophet [64], both of which have been widely used in various contexts. The ARIMA model,
which integrates differencing and moving averages within autoregression, has proven effective
in forecasting occupational task demands [28]. Prophet decomposes time series data into trend,
seasonality, and holiday components, allowing it to handle both linear and nonlinear trends with
changepoints. However, these models often struggle to capture complex nonlinear relationships and
exhibit suboptimal performance in large-scale data scenarios.
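As a sketch, a per-series ARIMA baseline can be fit with statsmodels; the order (1, 1, 1) and the synthetic series are illustrative choices, not the benchmark's tuned configuration:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# One skill demand series: 30 training months, forecast H = 3 months ahead.
y_train = np.random.poisson(50, size=30).astype(float)

model = ARIMA(y_train, order=(1, 1, 1))  # (p, d, q): AR, differencing, MA terms
fitted = model.fit()
forecast = fitted.forecast(steps=3)
print(forecast)
```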
RNN-based Model. RNN-based methods are effective in capturing temporal state transitions through
their recurrent structures, making them widely used in various time series forecasting tasks [65–
69]. Notably, LSTMs have demonstrated their effectiveness in predicting changes in skill shares
over time [29]. However, conventional RNNs often encounter performance degradation when
handling excessively long look-back windows and forecast horizons. To address this challenge,
SegRNN [32] introduces segment-wise iterations, which reduce the recurrence count within RNNs,
thereby significantly enhancing performance in time series forecasting tasks.
Transformer-based Model. Recently, Transformer-based models [70] have gained widespread
recognition in long-term time series forecasting due to their global modeling capabilities. Leveraging
the attention mechanism, Reformer [37] introduces locality-sensitive hashing to approximate attention
by grouping similar queries. Informer [33] uses a sparse self-attention mechanism to accelerate
computation. Autoformer [34] employs series decomposition blocks and an auto-correlation mechanism
to more effectively capture the intrinsic features of time series data. FEDformer [36]
utilizes DFT-based frequency-enhanced attention, obtaining attention weights from the spectra
of queries and keys and calculating the weighted sum in the frequency domain. To address the chal-
lenges of non-stationary time series, the Non-stationary Transformer (NStransformer) [35] introduces
a sequence stabilization module and proposes a de-stationary attention mechanism. Additionally,
PatchTST [71] is a channel-independent patch time series transformer model that features patching
and channel-independence as its key design elements.
MLP-based Model. The multi-layer perceptron (MLP) has been introduced in time series forecasting,
demonstrating superior performance compared to transformer-based models in both accuracy and
efficiency [38]. Specifically, DLinear [38] uses series decomposition as a pre-processing step before
linear regression. FreTS [72] explores a novel approach by applying MLPs in the frequency domain
for time series forecasting. TSMixer [39] employs MLPMixer blocks, segments input time series into
fixed windows, and applies gated MLP transformations and permutations to enhance accuracy.
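A minimal sketch of the DLinear idea: decompose the input window into a moving-average trend and a remainder, then project each component linearly from K past to H future steps (simplified from the original design; all hyperparameters here are illustrative):

```python
import torch
import torch.nn as nn

class DLinearSketch(nn.Module):
    """Decomposition + per-component linear maps from K past to H future steps."""
    def __init__(self, K: int, H: int, kernel: int = 3):
        super().__init__()
        self.avg = nn.AvgPool1d(kernel, stride=1, padding=kernel // 2,
                                count_include_pad=False)
        self.trend = nn.Linear(K, H)
        self.seasonal = nn.Linear(K, H)

    def forward(self, x):                 # x: (batch, num_series, K)
        trend = self.avg(x)               # smooth moving-average trend
        seasonal = x - trend              # remainder component
        return self.trend(trend) + self.seasonal(seasonal)  # (batch, num_series, H)

model = DLinearSketch(K=6, H=3)
out = model(torch.randn(8, 100, 6))       # 8 samples, 100 skill series
print(out.shape)                           # torch.Size([8, 100, 3])
```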
Graph-based Models. Graph Neural Networks (GNNs) can learn non-Euclidean relationships,
making them effective for identifying associations in structured data and generating joint represen-
tations from different perspectives [73–76]. CHGH [30] uses an adaptive graph enhanced by skill
co-occurrence relationships to link skill supply and demand sequences. This fusion of represen-
tations across views improves the performance of joint skill supply and demand prediction tasks.
Pre-DyGAE [52] targets skill demand prediction from an occupational perspective. It builds an
occupation-skill bipartite graph based on the skill demands of occupations and captures the dynamic
changes in these relationships. This method allows for predicting both potential occupational skills
and skill demands, leveraging a dynamic graph perspective.
Fourier-based Models. By utilizing Fourier projection, FiLM [40] not only captures long-term
time dependencies but also effectively reduces noise in forecasting. To address the challenge of
non-stationary time-series forecasting, Koopa [41] disentangles time-variant and time-invariant
components from complex non-stationary series using a Fourier Filter and designs the Koopman
Predictor to forecast dynamics.
To evaluate the performance of various benchmark models in job skill demand forecasting tasks,
we selected two commonly used regression metrics: MAE and RMSE. MAE is calculated over $H$
observations as $\frac{1}{H}\sum_{i=1}^{H} |y_i - \hat{y}_i|$, where $y_i$ represents the ground truth value
and $\hat{y}_i$ is the predicted value. RMSE is calculated as $\sqrt{\frac{1}{H}\sum_{i=1}^{H} (y_i - \hat{y}_i)^2}$. Both MAE and
RMSE are scale-dependent metrics, which makes them unsuitable for comparison across different
granularities. Additionally, these metrics are less sensitive to prediction errors at lower skill demand
values. Therefore, we additionally applied SMAPE [42] and RRMSE [77] to assess the performance
of various forecasting models. SMAPE normalizes each error by the combined magnitude of the
actual and predicted values, making it suitable for comparing forecasts across different scales.
RRMSE measures the root of the mean squared error normalized by the squared predictions:
$$\mathrm{SMAPE} = \frac{2}{H}\sum_{i=1}^{H}\frac{|y_i - \hat{y}_i|}{|y_i| + |\hat{y}_i|}, \qquad \mathrm{RRMSE} = \sqrt{\frac{\frac{1}{H}\sum_{i=1}^{H}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{H} \hat{y}_i^{2}}}. \tag{3}$$
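The four metrics transcribe directly into code; in the sketch below, the small epsilon guard against all-zero denominators is our addition:

```python
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def smape(y, y_hat, eps=1e-8):
    # Symmetric MAPE as in Eq. (3); eps avoids division by zero.
    return np.mean(2.0 * np.abs(y - y_hat) / (np.abs(y) + np.abs(y_hat) + eps))

def rrmse(y, y_hat, eps=1e-8):
    # Relative RMSE as in Eq. (3): mean squared error over squared predictions.
    return np.sqrt(np.mean((y - y_hat) ** 2) / (np.sum(y_hat ** 2) + eps))

y = np.array([10.0, 3.0, 0.0, 25.0])
y_hat = np.array([8.0, 4.0, 1.0, 20.0])
print(mae(y, y_hat), rmse(y, y_hat), smape(y, y_hat), rrmse(y, y_hat))
```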
Table 1: Performance comparison on MAE and RMSE.
Model           L1-Occupation     L2-Occupation     Region&L1-O       Region&L2-O       Company
                MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE
ARIMA 20.27 256.89 6.46 115.79 3.98 58.65 1.31 27.42 1.31 38.88
Prophet 29.15 356.67 8.95 161.01 5.08 72.21 1.62 33.02 1.55 41.19
LSTM 19.05 194.67 7.09 116.36 3.92 51.59 1.29 23.31 1.35 26.47
SegRNN 12.28 108.28 5.01 68.83 3.14 34.26 1.05 15.96 1.01 16.03
CHGH 22.09 261.49 7.09 116.58 3.91 51.46 1.28 23.24 1.34 26.52
Pre-DyGAE 22.98 187.90 7.04 82.97 4.24 38.62 1.37 17.39 1.24 18.24
Transformer 22.06 215.09 7.58 118.21 4.01 52.04 1.35 23.44 1.26 24.99
Autoformer 23.06 186.76 8.22 100.02 6.45 57.77 2.41 24.10 3.31 38.55
Informer 22.21 205.24 7.43 117.38 3.88 50.13 1.30 23.07 1.26 24.92
Reformer 22.11 204.35 7.46 116.60 3.91 50.95 1.25 22.81 1.54 27.37
FEDformer 22.87 181.93 7.46 88.97 4.63 43.21 1.98 21.73 2.43 26.92
NStransformer 17.36 149.46 5.75 86.24 3.45 37.09 1.15 17.45 2.13 34.83
PatchTST 14.91 141.06 5.15 78.86 3.10 35.38 1.04 16.57 1.01 19.09
DLinear 16.61 154.88 5.44 81.61 3.24 36.67 1.07 16.79 1.05 18.85
TSMixer 21.34 192.85 8.14 106.65 5.81 62.14 5.95 68.26 13.96 144.96
FreTS 16.47 167.61 6.52 106.39 3.65 47.81 1.22 21.92 1.26 25.39
FiLM 12.95 117.17 5.08 65.65 3.24 29.90 1.14 14.01 1.17 15.87
Koopa 19.91 179.30 6.05 91.87 3.53 40.73 1.15 18.71 1.08 20.18
Overall Performance. In Table 1, we present the performance of various models evaluated using two
metrics: MAE and RMSE. The following conclusions can be drawn: (1) The traditional statistical
method, Prophet, demonstrates relatively poor predictive performance. This may be due to seasonal
and holiday factors not being the primary influencers in skill demand prediction. (2) Most Transformer-
based models, including Transformer, Autoformer, Informer, and Reformer, exhibit subpar overall
predictive performance. This is likely because these models are designed to address long-range
temporal dependencies, which are not well-suited for the current shorter time series context. (3) In
contrast, PatchTST, unlike these Transformer-based models that perform point-wise modeling of
time series, segments the time series into patches and inputs them into the Transformer. This
allows the model to focus on more local information. A similar idea is also employed in the
SegRNN. This strategy significantly enhances the performance of these models in predicting job
skill demand. (4) The performance of different linear models on our dataset varies significantly. For
instance, DLinear outperforms most Transformer-based models, while TSMixer performs poorly. This
discrepancy may be due to the tendency of more complex MLP-based models to overfit our dataset. (5)
CHGH and Pre-DyGAE exhibit poor performance in the separate skill demand forecasting scenario,
likely due to a mismatch between their model design and the context of our dataset. Specifically,
CHGH relies on sequential data from the supply side of skills, which is lacking in our dataset.
Conversely, Pre-DyGAE focuses more on predicting whether a skill will be required by an occupation
in the future. (6) Finally, FiLM achieved the best performance in most cases, demonstrating the
robustness of the denoising-based model.
Low-Demand Skill Prediction Performance. Considering the varying nature of skill demand
values, we further employed SMAPE and RRMSE metrics to focus on the predictive performance
of different models for low-demand skills. As shown in Table 2, the experimental results indicate
the following: (1) PatchTST achieved the best SMAPE performance in most cases, validating its
ability to more accurately predict the trends of low-demand skills. (2) Based on scale-independent
metrics, we can compare the performance of models at different granularities. It can be observed
that RRMSE exhibits a significant trend of variation across different granularities; specifically, as
the granularity becomes finer, the RRMSE performance deteriorates. This indicates that predicting
skill demand at finer granularities is more challenging. Additionally, FiLM shows the least variation
across multiple granularities, further validating its ability to provide stable and reliable predictions
under varying granularities and demand value ranges. (3) Although Koopa performs averagely on
MAE and RMSE metrics, it excels in predicting low-demand skills, particularly in terms of SMAPE.
Similarly, NStransformer also performs well in scenarios focusing on low-demand skill predictions.
This success can be attributed to both methods being designed to handle non-stationary time series.
They effectively filter noise from historical sequences and restore intrinsic non-stationary information
Table 2: Performance comparison on SMAPE and RRMSE.
Model           L1-Occupation (%)   L2-Occupation (%)   Region&L1-O (%)   Region&L2-O (%)   Company (%)
                SMAPE   RRMSE       SMAPE   RRMSE       SMAPE   RRMSE     SMAPE   RRMSE     SMAPE   RRMSE
ARIMA 35.72 47.89 25.00 58.87 23.86 58.07 13.58 73.57 20.17 147.94
Prophet 41.22 67.78 28.35 88.47 26.75 71.60 15.07 93.04 22.31 167.77
LSTM 41.38 57.90 32.85 83.70 31.58 68.40 22.93 87.36 30.26 174.40
SegRNN 39.81 37.58 33.35 50.53 35.30 48.53 23.84 61.90 33.07 86.27
CHGH 40.27 66.05 29.60 84.10 28.11 68.42 17.42 87.45 26.72 176.70
PreDyGAE 49.87 83.67 60.54 83.60 59.32 66.56 72.67 98.09 26.21 145.73
Transformer 55.59 64.25 44.23 84.27 31.15 76.16 33.04 86.87 27.61 164.36
Autoformer 70.28 53.75 74.37 63.40 90.14 65.57 91.51 74.46 107.05 99.60
Informer 56.85 58.18 44.04 88.72 34.75 69.59 29.29 90.15 32.41 164.37
Reformer 56.58 61.35 40.58 83.70 32.21 72.87 20.86 90.85 45.25 169.87
FEDformer 69.30 54.03 69.29 60.00 73.17 52.69 81.73 70.06 94.19 97.97
NStransformer 38.11 47.19 26.30 60.73 24.98 48.89 14.55 63.29 24.20 100.78
PatchTST 34.70 51.17 24.52 58.80 25.15 44.96 13.50 67.48 19.89 115.34
DLinear 41.84 52.89 34.35 60.22 33.47 51.05 25.77 64.65 30.71 108.66
TSMixer 56.59 61.17 72.29 99.35 82.48 87.29 120.85 96.49 155.20 102.14
FreTS 39.76 54.42 30.18 80.44 28.58 66.11 17.62 85.04 27.24 174.56
FiLM 39.51 37.55 29.65 43.86 28.79 37.66 17.24 47.75 25.72 76.92
Koopa 37.84 58.30 25.72 65.34 24.41 57.81 13.98 74.00 20.43 123.96
Table 3: Performance comparison on data with structural breaks on MAE and RMSE.
Model           L1-Occupation     L2-Occupation     Region&L1-O       Region&L2-O       Company
                MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE      MAE     RMSE
LSTM 87.30 554.46 57.95 400.22 18.99 149.53 7.91 52.38 24.40 159.02
SegRNN 61.92 390.54 43.97 276.57 15.85 114.04 6.56 37.84 17.98 112.13
CHGH 94.30 629.32 58.06 401.45 19.00 149.75 7.90 52.50 24.37 159.44
PreDyGAE 78.35 493.83 48.69 336.15 17.49 136.66 7.31 38.88 19.76 164.43
Transformer 98.66 580.58 61.73 404.17 19.37 151.12 8.45 55.46 22.41 152.27
Autoformer 107.22 533.06 67.66 350.97 26.84 156.50 12.19 63.04 44.10 208.96
Informer 98.89 570.35 59.95 402.75 19.03 146.91 7.72 49.15 22.37 151.87
Reformer 98.14 569.83 60.71 401.21 19.25 149.91 7.52 49.10 25.65 160.69
FEDformer 105.43 532.24 62.10 325.10 20.49 128.45 10.37 55.47 34.09 155.28
NStransformer 82.43 462.24 49.30 318.44 16.59 119.91 6.85 37.56 40.05 196.03
PatchTST 77.44 474.86 45.02 303.76 14.88 111.01 6.56 38.60 18.03 127.72
DLinear 81.17 485.25 46.67 307.34 15.94 118.94 6.50 37.72 18.18 124.32
TSMixer 107.47 614.93 83.60 479.39 29.99 187.08 25.83 190.29 155.10 766.58
FreTS 82.45 537.12 56.54 393.38 18.55 148.33 7.88 52.87 24.21 160.01
FiLM 62.86 404.82 42.63 260.99 14.31 101.23 6.37 32.28 18.78 110.65
Koopa 91.26 516.75 50.44 324.15 17.43 128.39 7.07 41.29 19.04 133.26
into time-dependent relationships, making them more adept at handling the fluctuating nature of
low-demand skill time series data.
Performance on Skill Demand Series with Structural Breaks. As described in Section 3, in
the dynamically changing job market, skill demand time series data exhibit structural breaks. To
assess the impact of this phenomenon on different models in the skill demand forecasting task, we
used the Chow test to detect structural breaks in the skill demand time series. The corresponding
predictive performance of different models is presented in Tables 3 and 4. We observe the following
phenomena: (1) Compared to the predictive performance on the full dataset, the performance on time
series data with structural breaks is significantly worse. This finding underscores the complexity and
unpredictability of skill trends that experience structural breaks. (2) FiLM has achieved results close
to the overall skill demand prediction in terms of SMAPE and RRMSE metrics. This validates that
FiLM can effectively mitigate the disruptive impact of structural breaks on skill demand forecasting.
(3) Furthermore, while the overall predictive performance of skill demand forecasting at both the
Region&L2-O and Company granularity levels is similar, significant differences emerge when
forecasting skills experiencing structural breaks. This suggests that skills undergoing structural
breaks display more predictable patterns at the Region&L2-O granularity level compared to the
Company level, making them relatively easier to forecast.
Table 4: Performance comparison on data with structural breaks on SMAPE and RRMSE.
Model           L1-Occupation (%)   L2-Occupation (%)   Region&L1-O (%)   Region&L2-O (%)   Company (%)
                SMAPE   RRMSE       SMAPE   RRMSE       SMAPE   RRMSE     SMAPE   RRMSE     SMAPE   RRMSE
LSTM 43.78 58.05 48.93 84.46 46.64 78.31 42.03 58.48 68.38 187.30
SegRNN 39.22 37.80 43.09 51.14 45.17 54.31 39.41 41.40 57.45 89.65
CHGH 44.91 66.31 48.90 84.87 45.43 78.32 39.79 58.89 68.36 189.91
PreDyGAE 52.35 47.15 56.56 59.31 52.06 61.22 44.13 42.31 70.26 106.88
Transformer 50.01 64.47 53.10 84.95 46.50 86.56 47.67 61.23 64.92 177.43
Autoformer 63.46 54.08 68.62 64.14 87.93 68.97 88.95 63.85 115.00 100.60
Informer 51.11 58.40 51.89 89.70 47.81 80.86 44.90 57.55 65.11 177.16
Reformer 50.79 61.59 51.51 84.53 46.86 84.15 40.81 58.59 72.36 181.36
FEDformer 62.83 54.37 64.37 60.84 72.24 58.55 80.03 54.29 103.27 100.65
NStransformer 45.36 47.46 47.63 61.85 43.04 57.60 36.72 39.72 170.57 113.87
PatchTST 40.89 51.48 43.26 59.69 41.51 51.85 34.74 43.12 55.26 122.56
DLinear 43.14 53.20 45.25 61.13 45.26 58.80 41.15 41.71 57.65 115.24
TSMixer 54.31 61.31 76.08 99.84 85.12 95.81 117.39 93.66 160.55 102.23
FreTS 42.44 54.59 48.24 81.17 45.39 75.43 39.85 57.83 68.39 187.94
FiLM 38.96 37.82 44.23 44.52 44.95 43.06 40.05 30.80 56.37 80.77
Koopa 46.45 58.59 47.13 66.28 42.60 66.20 36.24 47.48 58.98 131.77
5 Conclusion
In this work, we introduced Job-SDF, a dataset designed for training and benchmarking job-skill
demand forecasting models. Compiled from millions of public job advertisements collected from
online recruitment platforms, this dataset includes monthly recruitment demand for 2,324 types
of skills across 52 occupations, 521 companies, and 7 regions. Using this dataset, we validated a
wide range of time-series forecasting approaches, including statistical models, RNN-based models,
Transformer-based models, MLP-based models, Graph-based models, and Fourier-based models.
Furthermore, we conducted extensive experiments to compare the performance of various methods
in predicting skill demand at different granularities. We hope that Job-SDF will facilitate further
research in this field.
Acknowledgements
This work was supported in part by the National Key R&D Program of China (Grant
No.2023YFF0725001), in part by the National Natural Science Foundation of China (Grant
No.92370204), in part by the Guangdong Basic and Applied Basic Research Foundation (Grant
No.2023B1515120057), in part by Guangzhou-HKUST (GZ) Joint Funding Program (Grant
No.2023A03J0008), Education Bureau of Guangzhou Municipality, in part by Nansha Postdoc-
toral Research Project, and in part by the National Natural Science Foundation of China (Grant
No.62176014), the Fundamental Research Funds for the Central Universities.
References
[1] David Autor et al. The polarization of job opportunities in the US labor market: Implications for
employment and earnings. Center for American Progress and The Hamilton Project, 6:11–19,
2010.
[2] James J Heckman, Jora Stixrud, and Sergio Urzua. The effects of cognitive and noncognitive
abilities on labor market outcomes and social behavior. Journal of Labor economics, 24(3):411–
482, 2006.
[3] Chuan Qin, Hengshu Zhu, Dazhong Shen, Ying Sun, Kaichun Yao, Peng Wang, and Hui Xiong.
Automatic skill-oriented question generation and recommendation for intelligent job interviews.
ACM Transactions on Information Systems, 42(1):1–32, 2023.
[4] Marios Kokkodis and Panagiotis G Ipeirotis. Demand-aware career path recommendations: A
reinforcement learning approach. Management science, 67(7):4362–4383, 2021.
[5] Rui Zha, Chuan Qin, Le Zhang, Dazhong Shen, Tong Xu, Hengshu Zhu, and Enhong Chen.
Career mobility analysis with uncertainty-aware graph autoencoders: A job title transition
perspective. IEEE Transactions on Computational Social Systems, 11(1):1205–1215, 2023.
[6] Rui Zha, Ying Sun, Chuan Qin, Le Zhang, Tong Xu, Hengshu Zhu, and Enhong Chen. Towards
unified representation learning for career mobility analysis with trajectory hypergraph. ACM
Transactions on Information Systems, 42(4):1–28, 2024.
[7] Xiaoshan Yu, Chuan Qin, Qi Zhang, Chen Zhu, Haiping Ma, Xingyi Zhang, and Hengshu
Zhu. Disco: A hierarchical disentangled cognitive diagnosis framework for interpretable job
recommendation. arXiv preprint arXiv:2410.07671, 2024.
[8] Ying Sun, Hengshu Zhu, Lu Wang, Le Zhang, and Hui Xiong. Large-scale online job search
behaviors reveal labor market shifts amid covid-19. Nature Cities, 1(2):150–163, 2024.
[9] Le Dai, Yu Yin, Chuan Qin, Tong Xu, Xiangnan He, Enhong Chen, and Hui Xiong. Enterprise
cooperation and competition analysis with a sign-oriented preference network. In Proceedings
of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining,
pages 774–782, 2020.
[10] Hengshu Zhu, Hui Xiong, Fangshuang Tang, Qi Liu, Yong Ge, Enhong Chen, and Yanjie Fu.
Days on market: Measuring liquidity in real estate markets. In Proceedings of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 393–402,
2016.
[11] Chuan Qin, Hengshu Zhu, Tong Xu, Chen Zhu, Liang Jiang, Enhong Chen, and Hui Xiong.
Enhancing person-job fit for talent recruitment: An ability-aware neural network approach.
In The 41st international ACM SIGIR conference on research & development in information
retrieval, pages 25–34, 2018.
[12] Chuan Qin, Hengshu Zhu, Chen Zhu, Tong Xu, Fuzhen Zhuang, Chao Ma, Jingshuai Zhang,
and Hui Xiong. Duerquiz: A personalized question recommender system for intelligent job
interview. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining, pages 2165–2173, 2019.
[13] Chuan Qin, Hengshu Zhu, Tong Xu, Chen Zhu, Chao Ma, Enhong Chen, and Hui Xiong. An
enhanced neural network approach to person-job fit in talent recruitment. ACM Transactions on
Information Systems (TOIS), 38(2):1–33, 2020.
[14] Kaichun Yao, Jingshuai Zhang, Chuan Qin, Peng Wang, Hengshu Zhu, and Hui Xiong. Knowl-
edge enhanced person-job fit for talent recruitment. In 2022 IEEE 38th International Conference
on Data Engineering (ICDE), pages 3467–3480. IEEE, 2022.
[15] Likang Wu, Zhaopeng Qiu, Zhi Zheng, Hengshu Zhu, and Enhong Chen. Exploring large
language model for graph data understanding in online job recommendations. In Proceedings
of the AAAI Conference on Artificial Intelligence, pages 9178–9186, 2024.
[16] Shuqing Bian, Xu Chen, Wayne Xin Zhao, Kun Zhou, Yupeng Hou, Yang Song, Tao Zhang,
and Ji-Rong Wen. Learning to match jobs with resumes from sparse interaction data using
multi-view co-teaching network. In Proceedings of the 29th ACM International Conference on
Information & Knowledge Management, pages 65–74, 2020.
[17] Yong Luo, Huaizheng Zhang, Yonggang Wen, and Xinwen Zhang. Resumegan: an optimized
deep representation learning framework for talent-job fit via adversarial learning. In Proceedings
of the 28th ACM international conference on information and knowledge management, pages
1101–1110, 2019.
[18] Yoosof Mashayekhi, Nan Li, Bo Kang, Jefrey Lijffijt, and Tijl De Bie. A challenge-based survey
of e-recruitment recommendation systems. ACM Computing Surveys, 56(10):1–33, 2024.
[19] Yong Luo, Huaizheng Zhang, Yongjie Wang, Yonggang Wen, and Xinwen Zhang. Resumenet: A
learning-based framework for automatic resume quality assessment. In 2018 IEEE International
Conference on Data Mining (ICDM), pages 307–316. IEEE, 2018.
[20] Dazhong Shen, Chuan Qin, Hengshu Zhu, Tong Xu, Enhong Chen, and Hui Xiong. Joint repre-
sentation learning with relation-enhanced topic models for intelligent job interview assessment.
ACM Transactions on Information Systems (TOIS), 40(1):1–36, 2021.
[21] Feihu Jiang, Chuan Qin, Kaichun Yao, Chuyu Fang, Fuzhen Zhuang, Hengshu Zhu, and Hui
Xiong. Enhancing question answering for enterprise knowledge bases using large language
models. In International Conference on Database Systems for Advanced Applications, pages
273–290. Springer, 2024.
[22] David J Deming. The growing importance of social skills in the labor market. The Quarterly
Journal of Economics, 132(4):1593–1640, 2017.
[23] Liyi Chen, Chuan Qin, Ying Sun, Xin Song, Tong Xu, Hengshu Zhu, and Hui Xiong.
Collaboration-aware hybrid learning for knowledge development prediction. In Proceedings of
the ACM on Web Conference 2024, pages 3976–3985, 2024.
[24] Yunfei Zhang, Chuan Qin, Dazhong Shen, Haiping Ma, Le Zhang, Xingyi Zhang, and Hengshu
Zhu. Relicd: a reliable cognitive diagnosis framework with confidence awareness. In 2023
IEEE International Conference on Data Mining (ICDM), pages 858–867. IEEE, 2023.
[25] William Rasdorf, Joseph E Hummer, and Stephanie C Vereen. Data collection opportunities and
challenges for skilled construction labor demand forecast modeling. Public Works Management
& Policy, 21(1):28–52, 2016.
[26] Joshua Healy, Kostas Mavromaras, and Peter J Sloane. Adjusting to skill shortages in australian
smes. Applied Economics, 47(24):2470–2487, 2015.
[27] Lutz Bellmann and Olaf Hübler. The skill shortage in german establishments before, during
and after the great recession. Jahrbücher für Nationalökonomie und Statistik, 234(6):800–828,
2014.
[28] Subhro Das, Sebastian Steffen, Wyatt Clarke, Prabhat Reddy, Erik Brynjolfsson, and Martin
Fleming. Learning occupational task-shares dynamics for the future of work. In Proceedings of
the AAAI/ACM Conference on AI, Ethics, and Society, pages 36–42, 2020.
[29] Maysa Malfiza Garcia de Macedo, Wyatt Clarke, Eli Lucherini, Tyler Baldwin, Dilermando
Queiroz Neto, Rogerio Abreu de Paula, and Subhro Das. Practical skills demand forecasting
via representation learning of temporal dynamics. In Proceedings of the 2022 AAAI/ACM
Conference on AI, Ethics, and Society, pages 285–294, 2022.
[30] Wenshuo Chao, Zhaopeng Qiu, Likang Wu, Zhuoning Guo, Zhi Zheng, Hengshu Zhu, and
Hao Liu. A cross-view hierarchical graph learning hypernetwork for skill demand-supply joint
prediction. In AAAI, 2024.
[31] Adebiyi A Ariyo, Adewumi O Adewumi, and Charles K Ayo. Stock price prediction using the
arima model. In 2014 UKSim-AMSS 16th international conference on computer modelling and
simulation, pages 106–112. IEEE, 2014.
[32] Shengsheng Lin, Weiwei Lin, Wentai Wu, Feiyu Zhao, Ruichao Mo, and Haotong Zhang.
Segrnn: Segment recurrent neural network for long-term time series forecasting. arXiv preprint
arXiv:2308.11200, 2023.
[33] Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai
Zhang. Informer: Beyond efficient transformer for long sequence time-series forecasting. In
Proceedings of the AAAI conference on artificial intelligence, pages 11106–11115, 2021.
[34] Haixu Wu, Jiehui Xu, Jianmin Wang, and Mingsheng Long. Autoformer: Decomposition trans-
formers with auto-correlation for long-term series forecasting. Advances in neural information
processing systems, 34:22419–22430, 2021.
[35] Yong Liu, Haixu Wu, Jianmin Wang, and Mingsheng Long. Non-stationary transformers:
Exploring the stationarity in time series forecasting. Advances in Neural Information Processing
Systems, 35:9881–9893, 2022.
[36] Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, and Rong Jin. Fedformer:
Frequency enhanced decomposed transformer for long-term series forecasting. In International
conference on machine learning, pages 27268–27286. PMLR, 2022.
[37] Nikita Kitaev, Łukasz Kaiser, and Anselm Levskaya. Reformer: The efficient transformer.
arXiv preprint arXiv:2001.04451, 2020.
[38] Ailing Zeng, Muxi Chen, Lei Zhang, and Qiang Xu. Are transformers effective for time
series forecasting? In Proceedings of the AAAI conference on artificial intelligence, pages
11121–11128, 2023.
[39] Shiyu Wang, Haixu Wu, Xiaoming Shi, Tengge Hu, Huakun Luo, Lintao Ma, James Y Zhang,
and Jun Zhou. Timemixer: Decomposable multiscale mixing for time series forecasting. arXiv
preprint arXiv:2405.14616, 2024.
[40] Tian Zhou, Ziqing Ma, Qingsong Wen, Liang Sun, Tao Yao, Wotao Yin, Rong Jin, et al. Film:
Frequency improved legendre memory model for long-term time series forecasting. Advances
in Neural Information Processing Systems, 35:12677–12690, 2022.
[41] Yong Liu, Chenyu Li, Jianmin Wang, and Mingsheng Long. Koopa: Learning non-stationary
time series dynamics with koopman predictors. Advances in Neural Information Processing
Systems, 36, 2024.
[42] Saba Sareminia. A support vector based hybrid forecasting model for chaotic time series: Spare
part consumption prediction. Neural Processing Letters, 55(3):2825–2841, 2023.
[43] Chao Chen, Jamie Twycross, and Jonathan M Garibaldi. A new accuracy measure based on
bounded relative error for time series forecasting. PloS one, 12(3):e0174202, 2017.
[44] Alan Manning. We can work it out: the impact of technological change on the demand for
low-skill workers. Scottish Journal of Political Economy, 51(5):581–608, 2004.
[45] John M Abowd, John C Haltiwanger, Julia Lane, Kevin L McKinney, and Kristin Sandusky.
Technology and the demand for skill: an analysis of within and between firm differences, 2007.
[46] Chuan Qin, Le Zhang, Yihang Cheng, Rui Zha, Dazhong Shen, Qi Zhang, Xi Chen, Ying Sun,
Chen Zhu, Hengshu Zhu, et al. A comprehensive survey of artificial intelligence techniques for
talent analytics. arXiv preprint arXiv:2307.03195, 2023.
[47] Chinhui Juhn. Wage inequality and demand for skill: evidence from five decades. ILR Review,
52(3):424–443, 1999.
[48] Jong-Wha Lee and Dainn Wie. Technological change, skill demand, and wage inequality:
Evidence from indonesia. World Development, 67:238–250, 2015.
[49] Ying Sun, Fuzhen Zhuang, Hengshu Zhu, Qi Zhang, Qing He, and Hui Xiong. Market-oriented
job skill valuation with cooperative composition neural network. Nature communications,
12(1):1992, 2021.
[50] George S Benson and Edward E Lawler III. Raising skill demand: Generating good jobs. Trans-
forming the US Workforce Development System. Urbana: Labor and Employment Relations
Association, 2011.
[51] Margaret Hilton. Research on future skill demands: A workshop summary. National Academies
Press, 2008.
[52] Xi Chen, Chuan Qin, Zhigaoyuan Wang, Yihang Cheng, Chao Wang, Hengshu Zhu, and Hui
Xiong. Pre-dygae: Pre-training enhanced dynamic graph autoencoder for occupational skill
demand forecasting. In Proceedings of the 33th International Joint Conference on Artificial
Intelligence, 2024.
[53] Manuel Cifuentes, Jon Boyer, David A Lombardi, and Laura Punnett. Use of O*NET as a job
exposure matrix: a literature review. American journal of industrial medicine, 53(9):898–914,
2010.
[54] Filippo Chiarello, Gualtiero Fantoni, Terence Hogarth, Vito Giordano, Liga Baltina, and Irene
Spada. Towards esco 4.0–is the european classification of skills in line with industry 4.0? a text
mining approach. Technological Forecasting and Social Change, 173:121177, 2021.
[55] Chuan Qin, Kaichun Yao, Hengshu Zhu, Tong Xu, Dazhong Shen, Enhong Chen, and Hui
Xiong. Towards automatic job description generation with capability-aware neural networks.
IEEE Transactions on Knowledge and Data Engineering, 35(5):5341–5355, 2022.
[56] Chuyu Fang, Chuan Qin, Qi Zhang, Kaichun Yao, Jingshuai Zhang, Hengshu Zhu, Fuzhen
Zhuang, and Hui Xiong. Recruitpro: A pretrained language model with skill-aware prompt
learning for intelligent recruitment. In Proceedings of the 29th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining, pages 3991–4002, 2023.
[57] Kaichun Yao, Chuan Qin, Hengshu Zhu, Chao Ma, Jingshuai Zhang, Yi Du, and Hui Xiong. An
interactive neural network approach to keyphrase extraction in talent recruitment. In Proceedings
of the 30th ACM International Conference on Information & Knowledge Management, pages
2383–2393, 2021.
[58] Feihu Jiang, Chuan Qin, Jingshuai Zhang, Kaichun Yao, Xi Chen, Dazhong Shen, Chen Zhu,
Hengshu Zhu, and Hui Xiong. Towards efficient resume understanding: A multi-granularity
multi-modal pre-training approach. arXiv preprint arXiv:2404.13067, 2024.
[59] Zhiyuan Wang, Xovee Xu, Weifeng Zhang, Goce Trajcevski, Ting Zhong, and Fan Zhou.
Learning latent seasonal-trend representations for time series forecasting. Advances in Neural
Information Processing Systems, 35:38775–38787, 2022.
[60] Gregory C Chow. Tests of equality between sets of coefficients in two linear regressions.
Econometrica: Journal of the Econometric Society, pages 591–605, 1960.
[61] Defu Cao, Yujing Wang, Juanyong Duan, Ce Zhang, Xia Zhu, Congrui Huang, Yunhai Tong,
Bixiong Xu, Jing Bai, Jie Tong, et al. Spectral temporal graph neural network for multivariate
time-series forecasting. Advances in neural information processing systems, 33:17766–17778,
2020.
[62] Ting Guo, Feng Hou, Yan Pang, Xiaoyun Jia, Zhongwei Wang, and Ruili Wang. Learning
and integration of adaptive hybrid graph structures for multivariate time series forecasting.
Information Sciences, 648:119560, 2023.
[63] George EP Box, Gwilym M Jenkins, Gregory C Reinsel, and Greta M Ljung. Time series
analysis: forecasting and control. John Wiley & Sons, 2015.
[64] Sean J Taylor and Benjamin Letham. Forecasting at scale. The American Statistician, 72(1):37–
45, 2018.
[65] Chen Zhu, Hengshu Zhu, Hui Xiong, Pengliang Ding, and Fang Xie. Recruitment market trend
analysis with sequential latent variable models. In Proceedings of the 22nd ACM SIGKDD
international conference on knowledge discovery and data mining, pages 383–392, 2016.
[66] Qi Zhang, Tong Xu, Hengshu Zhu, Lifu Zhang, Hui Xiong, Enhong Chen, and Qi Liu. After-
shock detection with multi-scale description based neural network. In 2019 IEEE International
Conference on Data Mining (ICDM), pages 886–895. IEEE, 2019.
[67] Qi Zhang, Hengshu Zhu, Ying Sun, Hao Liu, Fuzhen Zhuang, and Hui Xiong. Talent demand
forecasting with attentive neural sequential model. In Proceedings of the 27th ACM SIGKDD
Conference on Knowledge Discovery & Data Mining, pages 3906–3916, 2021.
[68] Qi Zhang, Hengshu Zhu, Qi Liu, Enhong Chen, and Hui Xiong. Exploiting real-time search en-
gine queries for earthquake detection: A summary of results. ACM Transactions on Information
Systems (TOIS), 39(3):1–32, 2021.
[69] Dazhong Shen, Qi Zhang, Tong Xu, Hengshu Zhu, Wenjia Zhao, Zikai Yin, Peilun Zhou, Lihua
Fang, Enhong Chen, and Hui Xiong. A machine learning-enhanced robust p-phase picker for
real-time seismic monitoring. arXiv preprint arXiv:1911.09275, 2019.
[70] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez,
Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information
processing systems, 30, 2017.
[71] Yuqi Nie, Nam H Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam. A time series is
worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730,
2022.
[72] Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Ning An, Defu Lian,
Longbing Cao, and Zhendong Niu. Frequency-domain mlps are more effective learners in time
series forecasting. Advances in Neural Information Processing Systems, 36, 2024.
[73] Liyi Chen, Zhi Li, Tong Xu, Han Wu, Zhefeng Wang, Nicholas Jing Yuan, and Enhong Chen.
Multi-modal siamese network for entity alignment. In Proceedings of the 28th ACM SIGKDD
conference on knowledge discovery and data mining, pages 118–126, 2022.
[74] Liyi Chen, Zhi Li, Weidong He, Gong Cheng, Tong Xu, Nicholas Jing Yuan, and Enhong
Chen. Entity summarization via exploiting description complementarity and salience. IEEE
Transactions on Neural Networks and Learning Systems, 34(11):8297–8309, 2023.
[75] Dazhong Shen, Chuan Qin, Chao Wang, Hengshu Zhu, Enhong Chen, and Hui Xiong. Reg-
ularizing variational autoencoder with diversity and uncertainty awareness. arXiv preprint
arXiv:2110.12381, 2021.
[76] Miao Chen, Chao Wang, Chuan Qin, Tong Xu, Jianhui Ma, Enhong Chen, and Hui Xiong.
A trend-aware investment target recommendation system with heterogeneous graph. In 2021
International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2021.
[77] Keerti Rawal and Aijaz Ahmad. Mining latent patterns with multi-scale decomposition for elec-
tricity demand and price forecasting using modified deep graph convolutional neural networks.
Sustainable Energy, Grids and Networks, page 101436, 2024.
[78] Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kaneza-
shi, Tim Kaler, Tao Schardl, and Charles Leiserson. Evolvegcn: Evolving graph convolutional
networks for dynamic graphs. In Proceedings of the AAAI conference on artificial intelligence,
pages 5363–5370, 2020.
[79] Youngjoo Seo, Michaël Defferrard, Pierre Vandergheynst, and Xavier Bresson. Structured
sequence modeling with graph convolutional recurrent networks. In Neural Information Pro-
cessing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13-16,
2018, Proceedings, Part I 25, pages 362–373. Springer, 2018.
[80] Ling Zhao, Yujiao Song, Chao Zhang, Yu Liu, Pu Wang, Tao Lin, Min Deng, and Haifeng Li.
T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Transactions
on Intelligent Transportation Systems, pages 3848–3858, 2020.
[81] Jinyin Chen, Xueke Wang, and Xuanheng Xu. Gc-lstm: Graph convolution embedded lstm for
dynamic network link prediction. Applied Intelligence, pages 1–16, 2022.
[82] Aynaz Taheri, Kevin Gimpel, and Tanya Berger-Wolf. Learning to represent the evolution of
dynamic graphs with recurrent models. In Companion Proceedings of The 2019 World Wide Web
Conference, WWW ’19, New York, NY, USA, 2019. Association for Computing Machinery.
A Computational Resource
Due to inherent design and size constraints of the models combined with varying data sizes at different
granularities, the deployment environments for each model are distinct. The CHGH model, which
requires over 80GB of memory, is exclusively deployed on CPU platforms to accommodate its sub-
stantial resource demands. In contrast, the PreDyGAE model operates solely on GPU infrastructure,
leveraging the computational efficiencies of the NVIDIA A800 GPUs. For other models, deployment
strategies are tailored according to the granularity of the data. Experiments at the labor market,
regions, L1 occupations, L2 occupations, and Region & L1 occupations granularities are conducted
on GPUs, capitalizing on the enhanced processing capabilities of these units for handling moderate
data volumes. However, at the granularities of Region & L2 and company, where data volumes are
significantly larger, deployment shifts to CPUs. Overall, the training times of the different models
are shown in Table 5.
Table 5: Training time (minutes) of different models for job skill demand forecasting.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
LSTM 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 37.7 39.0
SegRNN 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 342.8 458.2
CHGH 17.7 132.8 170.2 258.3 1300.3 490.6 6604.2
PreDyGAE 1-10 16.5 30.0 48.1 48.1 88.2 126.2
Transformer 0-0.5 0-0.5 0-0.5 0.5-1 0.5-1 128.2 166.5
Autoformer 0-0.5 0-0.5 0-0.5 0-0.5 0.5-1 304.3 325.0
Informer 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 133.8 171.7
Reformer 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 36.0 52.5
FEDformer 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 193.8 198.5
NStransformer 0-0.5 0.5-1 0.5-1 0-0.5 0.5-1 128.5 195.7
PatchTST 0-0.5 0-0.5 0-0.5 0-0.5 0.5-1 1202.8 2558.0
DLinear 0-0.5 0-0.5 0.5-1 0-0.5 0-0.5 20.0 39.1
TSMixer 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 24.0 97.0
FreTS 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 85.0 200.0
FiLM 0-0.5 0.5-1 0.5-1 1-10 1-10 598.0 1464.7
Koopa 0-0.5 0-0.5 0-0.5 0-0.5 0-0.5 38.3 68.5
To demonstrate the robustness and reliability of our experimental results, we first repeated the
experiments multiple times as described in the main text. Additionally, we extended our analysis to
include experiments across the entire labor market and at various regional granularities.
Data from the first 24 months were used for pre-training, and the model was then fine-tuned on the next 6 months to capture trend changes. Finally, the model was used to infer skill demands for the last 6 months. All other hyperparameters were kept consistent with those in the original paper. To ensure the reliability of our findings, we repeated these experiments four times, with random seeds 0, 1, 2, and 3.
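For transparency, the sketch below shows how such seeded repetitions can be aggregated into the mean±std values reported in the tables; the `train_and_evaluate` argument is a hypothetical placeholder for the actual pre-train/fine-tune/inference pipeline, not a function from our codebase.

```python
import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    """Fix all relevant RNGs so that a single run is reproducible."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

def repeated_runs(train_and_evaluate, seeds=(0, 1, 2, 3)):
    """Run the pipeline once per seed and report mean/std of the metric.

    `train_and_evaluate` is a caller-supplied function wrapping the
    24-month pre-training, 6-month fine-tuning, and 6-month inference
    protocol described above; it should return a scalar test metric.
    """
    scores = []
    for seed in seeds:
        set_seed(seed)
        scores.append(train_and_evaluate(seed))
    return float(np.mean(scores)), float(np.std(scores))
```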
Overall Performance Table 6 reports the mean and standard deviation over repeated runs of the benchmark models on the job skill demand forecasting task presented in the main text. We additionally supplement results at the overall labor market and regional granularities, where the RMSE averages over 1000. At these coarser granularities, the larger base demand values lead to correspondingly large prediction deviations.
Performance on Skill Demand Series with Structural Breaks Table 7 presents the results of repeated experiments on forecasting skill demand series that have undergone structural breaks. The overall errors are pronounced, underscoring the difficulty of accurately predicting these skills. Moreover, FiLM performs well on most metrics, further verifying its robustness.
In multigranular skill demand series, a significant number of skills remain inactive or appear only at low frequency over extended periods. These skills may continue to have low demand in the future (indicating low importance), or they may suddenly attract interest from certain occupations or companies, leading to rapid growth. In this study, we define low-frequency skills as those that appear fewer than twice across the time slices of the training set. Predicting the demand for these skills is challenging because their data points are predominantly zero during training, leaving little effective observational data. We therefore specifically report the demand prediction results of the existing benchmark models on these low-frequency skills.
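Under this definition, the low-frequency subset can be selected directly from the training demand matrix, as in the following sketch (the array name and layout are illustrative, not the dataset's on-disk format):

```python
import numpy as np

def low_frequency_skills(demand_train: np.ndarray) -> np.ndarray:
    """Return indices of low-frequency skills.

    demand_train: illustrative array of shape (num_skills, num_months),
    where demand_train[s, t] is the demand count of skill s in month t
    of the training set. A skill is low-frequency if it has non-zero
    demand in fewer than two training time slices.
    """
    active_slices = (demand_train > 0).sum(axis=1)
    return np.where(active_slices < 2)[0]
```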
Results. We tested demand prediction for low-frequency skills using the benchmark models described in the main text; the results are shown in Table 8. Two conclusions can be drawn. First, the error on the RRMSE metric increases significantly, indicating that low-demand skills are difficult to predict accurately. Second, Koopa achieves the best predictive performance in this scenario. We also find that the performance of SegRNN degrades markedly, suggesting that SegRNN's segment-based learning approach is unsuitable for low-frequency skill demand: with so little effective observational data, the learned segments carry no signal.
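For reference, the sketch below shows one common formulation of RRMSE, normalizing RMSE by the mean of the ground truth; the exact normalization used in our benchmark follows the main text, so treat this variant, including the `eps` guard for all-zero series, as an illustrative assumption.

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rrmse(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-8) -> float:
    """Relative RMSE: RMSE scaled by the mean of the ground truth.

    The eps term guards against division by zero for the all-zero
    series that dominate the low-frequency scenario, where this
    relative error blows up and makes the metric especially harsh.
    """
    return rmse(y_true, y_pred) / (float(np.mean(y_true)) + eps)
```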
In the task of job skill demand forecasting, fully leveraging the inter-relationships among different skills is beneficial for downstream tasks. Therefore, we construct a prior graph with co-occurrence frequencies from the training data and include it as a dataset component. Given a set of granularities $i, j, \ldots, k$, we construct the skill co-occurrence graph $\mathcal{G}^{i,j,\ldots,k} = (\mathcal{V}^{i,j,\ldots,k}, \mathcal{E}^{i,j,\ldots,k})$, where $\mathcal{V}^{i,j,\ldots,k}$ is the extended skill set under the multiple granularities. The edge weight $e_{v,v'} \in \mathcal{E}^{i,j,\ldots,k}$ between nodes $v$ and $v'$ is determined by the co-occurrence frequency of the node pair $(v, v')$ in the job advertisement data for training, $\mathcal{P}^{\mathrm{train}}$. Specifically, given $v = (a^i, a^j, \ldots, a^k, s)$ and $v' = (a^{i'}, a^{j'}, \ldots, a^{k'}, s')$, $e_{v,v'}$ is calculated as:
$$e_{v,v'} = \sum_{p \in \mathcal{P}^{\mathrm{train}}} \prod_{x \in \{a^i, a^j, \ldots, a^k, a^{i'}, a^{j'}, \ldots, a^{k'}, s, s'\}} \mathbb{1}(x \in p). \quad (4)$$
This information will serve as prior knowledge, reflecting global inter-skill dependency patterns.
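A minimal sketch of this construction, assuming each training advertisement is represented as a Python set of its attribute values and skills (names are illustrative), is given below; for millions of postings an inverted index would be preferable to this quadratic scan.

```python
from itertools import combinations

def build_cooccurrence_edges(postings, nodes):
    """Edge weights e_{v,v'} of the skill co-occurrence graph, per Eq. (4).

    postings: iterable of sets; each set holds the attribute values and
              skills appearing in one training job advertisement.
    nodes:    iterable of tuples (a_i, ..., a_k, skill) forming the
              extended skill set V under the chosen granularities.
    """
    postings = list(postings)
    edges = {}
    for v, v_prime in combinations(nodes, 2):
        required = set(v) | set(v_prime)  # {a_i,...,a_k, a_i',...,a_k', s, s'}
        # The product of indicators in Eq. (4) is 1 exactly when a posting
        # contains every element of the pair, so we count such postings.
        weight = sum(required <= p for p in postings)
        if weight > 0:
            edges[(v, v_prime)] = weight
    return edges
```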
Benchmark Models To fully utilize the prior information from the co-occurrence graph, we introduce several GNN-based methods for multivariate time series prediction. These methods use GNNs to extract the influences between different variables, effectively capturing the relationships among the various time series. The specific models are as follows:
Table 6: Overall performance comparisons (MAE) on repeated experiments.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
LSTM 314.54±0.57 49.92±0.0 24.43±4.91 8.04±0.87 4.44±0.47 1.45±0.15 1.49±0.13
SegRNN 190.05±0.37 35.92±0.47 16.37±3.73 5.81±0.73 3.68±0.5 1.32±0.25 1.23±0.2
CHGH 315.47±0.04 50.03±0.02 25.62±3.23 8.04±0.87 4.43±0.48 1.51±0.7 1.54±0.73
PreDyGAE 189.95±0.01 35.84±0.09 21.72±1.15 6.81±0.21 4.08±0.15 1.85±0.43 1.85±0.56
Transformer 340.72±0.79 54.98±0.13 27.43±4.9 8.93±1.24 4.92±0.83 1.75±0.36 1.69±0.39
Autoformer 465.86±2.78 60.9±0.64 31.97±8.13 9.97±1.6 6.68±0.21 2.6±0.17 2.85±0.42
Informer 340.76±1.6 55.08±0.35 27.54±4.87 8.81±1.26 4.87±0.91 1.73±0.39 1.69±0.39
• GConvGRU [79]: This model integrates convolutional neural networks (CNNs) on graphs to identify spatial structures and recurrent neural networks (RNNs) to detect dynamic patterns. Two architectures, GConvGRU and GConvLSTM, are explored for the Graph Convolutional Recurrent Network (GCRN).
Table 7: Performance comparisons (MAE) on skill demand series with structural breaks.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
LSTM 423.99±0.56 109.63±0.04 101.68±13.13 63.39±4.97 21.02±1.85 8.55±0.59 26.23±1.67
SegRNN 256.67±0.54 76.09±1.01 68.08±5.62 44.93±0.88 16.12±0.24 7.07±0.47 18.91±0.85
CHGH 425.17±0.04 109.82±0.06 104.41±9.23 63.54±5.0 18.11±1.93 15.16±4.33 19.75±13.35
PreDyGAE 296.27±0.01 85.05±5.06 84.11±8.87 56.28±7.2 16.57±0.84 9.51±0.19 22.9±0.13
Transformer 460.45±2.31 120.62±0.4 113.11±13.19 69.45±7.05 22.4±2.76 9.42±0.88 26.93±4.12
Autoformer 632.29±2.76 131.92±1.52 131.01±21.72 75.4±7.07 25.82±0.93 11.76±0.39 36.65±6.8
Informer 462.88±1.99 120.89±0.68 113.36±13.21 68.13±7.46 22.22±2.91 9.1±1.26 26.82±4.06
• TGCN [80]: The temporal graph convolutional network (T-GCN) model combines a graph convolutional network (GCN) with a gated recurrent unit (GRU). The GCN learns complex topological structures to capture spatial dependence, while the GRU learns dynamic changes in the input series (traffic data in the original work) to capture temporal dependence.
• GCLSTM [81]: GCLSTM is an end-to-end model integrating a Graph Convolution Network (GCN) embedded in a Long Short-Term Memory network (LSTM) for dynamic network link prediction. The GCN captures local structural properties, while the LSTM learns temporal features across snapshots of a dynamic network.
Table 8: Performance comparisons (MAE) on low-frequency skill demand series.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
LSTM 33.69±0.44 26.88±0.05 16.49±0.01 12.11±0.04 12.28±0.02 8.92±0.1 12.6±0.02
SegRNN 51.5±0.57 46.12±0.28 25.65±0.39 18.03±0.03 22.08±1.92 19.1±3.22 19.37±0.88
CHGH 32.37±0.0 24.99±0.02 14.2±0.04 9.32±0.01 9.45±0.0 67.89±10.03 77.01±5.58
PreDyGAE 113.67±1.73 157.17±18.68 343.32±0.15 64.46±13.49 24.16±0.13 60.41±55.15 68.89±62.89
Transformer 54.84±1.99 52.95±0.09 46.1±0.13 45.47±0.07 45.64±0.02 45.1±0.01 45.43±0.0
Autoformer 132.98±11.42 106.13±1.65 110.06±0.19 97.66±1.23 98.28±0.46 95.26±1.1 90.23±0.54
Informer 54.42±1.83 52.76±0.13 46.1±0.0 45.52±0.01 45.69±0.02 45.07±0.01 45.47±0.0
Table 9: Performance comparisons on skill demand series with GNN-based methods.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
MAE
EvolveGCNH 1053.18±804.38 30.07±22.97 16.51±12.61 5.43±4.15 2.92±2.23 1.11±0.85 0.96±0.73
EvolveGCNO 151.13±115.43 27.01±20.63 15.4±11.76 5.04±3.85 2.95±2.25 1.21±0.92 0.89±0.68
GConvGRU 570.31±435.58 92.96±71.0 46.43±35.46 11.45±8.75 5.47±4.18 1.4±1.07 1.04±0.79
TGCN 741.05±565.99 96.43±73.65 55.41±42.32 13.21±10.09 6.28±4.8 1.63±1.24 1.11±0.85
GCLSTM 729.48±557.15 57.13±43.63 25.84±19.74 11.49±8.78 5.67±4.33 1.61±1.23 1.07±0.82
GConvLSTM 741.45±566.29 93.61±71.5 46.57±35.57 11.19±8.55 5.57±4.25 1.49±1.14 1.09±0.83
DyGrEncoder 732.53±559.48 92.21±70.43 46.96±35.87 12.26±9.36 5.57±4.26 1.47±1.13 1.07±0.82
RMSE
EvolveGCNH 4998.7±3817.82 166.34±127.04 178.55±136.37 76.14±58.15 35.12±26.82 16.97±12.96 19.33±14.77
EvolveGCNO 709.19±541.65 143.72±109.77 164.16±125.38 71.57±54.67 33.21±25.37 15.04±11.49 17.93±13.7
GConvGRU 2968.75±2267.42 579.66±442.72 442.53±337.99 170.09±129.91 81.95±62.59 31.75±24.25 30.65±23.41
Table 10: Performance comparisons (MAE) on skill demand series with structural breaks, using GNN-based methods.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
EvolveGCNH 1385.2±1057.96 61.79±47.19 58.05±44.33 36.04±27.53 10.69±8.17 5.56±4.25 13.25±10.12
EvolveGCNO 195.87±149.6 53.86±41.14 52.46±40.07 32.76±25.02 10.85±8.28 5.21±3.98 12.22±9.33
GConvGRU 749.83±572.69 204.83±156.44 189.48±144.72 92.29±70.49 26.94±20.58 9.23±7.05 17.26±13.18
We implement these benchmark models using the PyTorch Geometric Temporal library (https://github.com/benedekrozemberczki/pytorch_geometric_temporal) and demonstrate the effectiveness of these GNN-based methods in skill demand forecasting.
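As an illustration of how these models are wired, the sketch below defines a one-step-ahead forecaster with PyTorch Geometric Temporal's GConvGRU layer; the hidden size, Chebyshev filter order, and linear readout head are illustrative choices, not the benchmark's tuned settings.

```python
import torch
from torch_geometric_temporal.nn.recurrent import GConvGRU

class SkillDemandGConvGRU(torch.nn.Module):
    """One-step-ahead skill demand forecaster on the co-occurrence graph."""

    def __init__(self, node_features: int, hidden: int = 32, K: int = 2):
        super().__init__()
        # K is the order of the Chebyshev graph convolution filter.
        self.recurrent = GConvGRU(node_features, hidden, K)
        self.head = torch.nn.Linear(hidden, 1)

    def forward(self, x, edge_index, edge_weight, h=None):
        # x: (num_nodes, node_features) demand features at one time step;
        # edge_index / edge_weight encode the co-occurrence graph.
        h = self.recurrent(x, edge_index, edge_weight, h)
        return self.head(torch.relu(h)), h  # per-node prediction + state
```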
Results We implemented a series of graph-based multivariate time series forecasting methods on top of the co-occurrence graph and verified their performance under the three scenarios discussed above. First, Table 9 presents the overall performance of the co-occurrence-graph-based methods for skill demand forecasting. Prediction accuracy declines significantly at the overall labor market level. However, as the forecast granularity becomes finer, model performance improves, and at the finest granularities the EvolveGCN methods considerably outperform the state-of-the-art (SOTA) methods mentioned in the main text.
Table 11: Performance comparisons on low-frequency skill demand series with GNN-based methods.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
MAE
EvolveGCNH 35.53±27.13 2.28±1.74 1.47±1.13 0.55±0.42 0.31±0.23 0.14±0.11 0.19±0.15
EvolveGCNO 27.87±21.29 3.35±2.56 1.69±1.29 0.57±0.43 0.31±0.24 0.27±0.21 0.17±0.13
GConvGRU 63.05±48.16 3.06±2.34 0.25±0.19 0.18±0.14 0.11±0.09 0.05±0.04 0.11±0.09
TGCN 7.37±5.63 2.16±1.65 0.13±0.1 0.61±0.46 0.26±0.2 0.15±0.12 0.15±0.11
GCLSTM 0.62±0.47 0.41±0.32 1.37±1.04 0.15±0.11 0.12±0.09 0.09±0.07 0.12±0.09
GConvLSTM 0.79±0.6 0.43±0.33 0.2±0.15 0.13±0.1 0.1±0.08 0.06±0.05 0.13±0.1
DyGrEncoder 1.04±0.79 0.47±0.36 0.24±0.18 0.13±0.1 0.15±0.11 0.09±0.07 0.14±0.11
RMSE
EvolveGCNH 144.53±110.39 11.08±8.46 11.65±8.9 5.52±4.22 3.48±2.66 1.7±1.3 1.81±1.38
EvolveGCNO 83.5±63.77 10.36±7.91 12.82±9.79 5.74±4.38 3.8±2.9 1.95±1.49 1.65±1.26
GConvGRU 63.06±48.16 3.65±2.79 1.19±0.91 0.87±0.66 1.59±1.22 0.81±0.62 1.72±1.31
We attribute this to the fact that the finer the granularity, the more accurately the co-occurrence graph reflects the associations between skills, whereas coarser granularity may introduce excessive noise and degrade model performance. The fine-grained co-occurrence graph accurately reflects the interrelationships between skills at different granularities, which helps improve the model's prediction accuracy. Second, we find significant differences among these methods on the RRMSE metric, with EvolveGCN performing best: it learns the evolution of the GCN parameter weights over time and thus captures the evolving dependencies among edges. Based on the provided co-occurrence graph, it can therefore effectively learn the evolution of skill relationships, which benefits dynamic prediction of skill demand.
For skill demand forecasting in scenarios involving structural breaks, as shown in Table 10, the improvements from the co-occurrence-graph-based methods are greater than in the overall skill demand forecasting task. This suggests that skills experiencing structural breaks are strongly interconnected, and the co-occurrence graph helps the model identify the patterns of skill demand series that are likely to undergo structural breaks, further enhancing prediction effectiveness for this category of skills.
In the task of predicting low-frequency skills, as shown in Table 11, methods like GConvLSTM significantly outperform EvolveGCN. This is because the observable data for these skills are sparse, which leads to sparse connectivity in the co-occurrence graph.
In the main text, we discussed skill demand prediction. However, consider a scenario in which the number of postings mentioning a skill for a particular occupation is very low, so the demand for that occupation's skill appears low. Nevertheless, the skill might constitute a significant portion of the occupation's core competencies. Skill demand alone may therefore not adequately measure the importance of a skill within an occupation. To address this, we introduce an extended dataset that includes the skill demand proportion. We define the skill demand proportion as:
$$R^i_{s,t} = [R^i_{s,t,a^i}]_{a^i \in \mathcal{A}^i}, \qquad R^i_{s,t,a^i} = \frac{\sum_{p \in \mathcal{P}_t} \mathbb{1}(s \in p) \cdot \mathbb{1}(a^i \in p)}{\sum_{p \in \mathcal{P}_t} \mathbb{1}(a^i \in p)}, \quad (5)$$
Table 12: Performance comparisons (MAE, %) on skill demand proportion forecasting.
Model Market Region L1-O L2-O R&L1-O R&L2-O Company
LSTM 0.14±0.0 0.49±0.01 1.21±0.03 2.26±0.05 2.43±0.03 3.51±0.06 2.69±0.03
SegRNN 0.15±0.02 1.25±0.02 2.63±0.11 4.2±0.04 6.6±1.12 10.16±2.44 5.52±0.53
CHGH 0.1±0.01 0.19±0.0 0.35±0.0 0.64±0.01 0.7±0.0 0.91±0.07 1.03±0.01
PreDyGAE 0.08±0.01 0.09±0.0 0.12±0.02 0.23±0.02 0.18±0.0 0.27±0.15 0.29±0.16
Transformer 0.62±0.02 3.89±0.02 10.87±0.05 20.58±0.01 22.02±0.01 31.26±0.02 23.47±0.01
Autoformer 1.29±0.14 8.84±0.15 24.5±0.22 43.99±0.94 47.53±0.67 66.32±1.02 50.02±0.66
Informer 0.55±0.02 3.83±0.03 10.86±0.03 20.6±0.01 22.04±0.01 31.23±0.01 23.48±0.0
where $a^i \in p$ denotes that job advertisement $p$ contains attribute $a^i$ under granularity $i$. Similarly, we can further define skill demand proportions $R^{i,j,\ldots,k}_{s,t}$ across multiple granularities $\{i, j, \ldots, k\}$ by calculating:
$$R^{i,j,\ldots,k}_{s,t,a} = \frac{\sum_{p \in \mathcal{P}_t} \mathbb{1}(s \in p) \cdot \mathbb{1}(a^i \in p \wedge a^j \in p \wedge \ldots \wedge a^k \in p)}{\sum_{p \in \mathcal{P}_t} \mathbb{1}(a^i \in p \wedge a^j \in p \wedge \ldots \wedge a^k \in p)}, \quad (6)$$
where $a = \{a^i, a^j, \ldots, a^k\}$, $a^i \in \mathcal{A}^i$, $a^j \in \mathcal{A}^j$, $\ldots$, $a^k \in \mathcal{A}^k$, and $R^{i,j,\ldots,k}_{s,t} \in \mathbb{R}^{|\mathcal{A}^i||\mathcal{A}^j|\ldots|\mathcal{A}^k|}$.
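The sketch below computes Eq. (6) directly from raw postings; the set-based posting representation and function name are illustrative rather than the dataset's released format.

```python
def demand_proportion(postings_t, skill, attrs):
    """Skill demand proportion R_{s,t,a} for one granularity cell, per Eq. (6).

    postings_t: iterable of sets, each holding the attributes and skills
                of one job advertisement posted in time slice t.
    skill:      the skill s.
    attrs:      set {a_i, a_j, ..., a_k} fixing the granularity cell a.
    """
    matching = [p for p in postings_t if attrs <= p]  # 1(a^i ∈ p ∧ ... ∧ a^k ∈ p)
    if not matching:
        return 0.0  # no postings in this cell; proportion is undefined, report 0
    return sum(skill in p for p in matching) / len(matching)
```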
Results We continue to use the benchmark models described in the main text for this task; the results in Table 12 lead to the following conclusions. First, the best-performing model on skill demand proportion forecasting is Koopa. By integrating time series decomposition and Fourier transformations, it effectively captures distribution changes in demand proportions. Second, performance varies considerably across models on this task. For example, models like DLinear perform poorly here despite being reasonably effective at skill demand forecasting. We attribute this to the fact that predicting proportions differs from forecasting raw skill demand: a skill's proportion also depends on the demand for the other skills at the same granularity, so simple linear models struggle to capture the complex interrelations and influences among multiple series.