Data Security Crisis in Universities: Identification of Key Factors Affecting Data Breach Incidents
https://doi.org/10.1057/s41599-023-01757-0
The extremely complex and dynamic digital environments of universities make them highly vulnerable to the risk of data breaches. This study empirically investigated the factors influencing data breach risks in the context of higher education, according to crime opportunity theory and routine activity theory. The data consisted of university samples from China and were collected mainly from the Chinese Education Industry Vulnerability Reporting Platform. After applying Poisson regression for the estimation, increased public disclosure of vulnerabilities was found to escalate the frequency of data breaches, whereas cross-border data flow decreased the number of data breaches. Furthermore, the mechanism by which academic strength affects data breaches was examined through the two mediators of cross-border data flow and vulnerability disclosure. In addition, cloud adoption reduced data breaches, and public clouds were determined to be relatively more secure than private clouds. Cloud adoption also acted as a moderator between the negative impact of vulnerabilities and the positive impact of cross-border data flow on data breaches. The estimation and robustness findings revealed the underlying mechanisms that impacted university data security, clarifying the understanding of data breaches and suggesting practical implications for universities and other institutes to improve information security. The findings of this study provide insights and directions for future research.
1 School of Management, Xi’an Jiaotong University, Xi’an, China. 2 National Computer Network Emergency Response Technical Team/Coordination Center of
Introduction
Given advancements in the digital economy worldwide and the rapid development of related technologies, such as 5G and artificial intelligence, data have become an important resource globally. However, numerous potential risks of data breaches accompany such developments in information technology (IT). It has been reported that more than 100 million Android users’ sensitive personal data were exposed in May 2021 because of several misconfigurations. In the same year, a database containing the personal information of 533 million Facebook account users across 106 countries was exposed, potentially leading to further social engineering attacks or hacking attempts (Henriquez, 2021). The frequent incidents reported in the media reflect the severity of these data breaches and merely represent the “tip of the iceberg.” Despite the related laws and data breach notification requirements enacted by governments worldwide, such as the General Data Protection Regulation (GDPR) of the European Union, National Security and Personal Data Protection Act of 2019 (NSPDPA) of the United States, and Data Security Law and Personal Information Protection Law of China, data breach incidents continue to occur. Statista reported that the annual number of data compromises has increased from 2005 to 2022¹. According to statistics from the Privacy Rights Clearinghouse, the occurrence of data breaches has been high since 2010². As shown in Fig. 1(a), from 2005 to 2018, the number of reported data breach incidents increased by 4.2 times, presenting a significant upward trend. Figure 1(b) indicates that the number of data breaches achieved a record high in 2021 (Verizon, 2022).
The cost of a data breach has also increased significantly. The average total cost of a global data breach was $4.35 million in 2022, which was the highest in the history of the report, increasing by 2.6% from 2021 and 12.7% from 2020. Given the COVID-19 outbreak enforcing remote work and digital transformations in recent years, data breach costs increased by $1.07 million in 2021 and $0.97 million in 2022 (IBM, 2021, 2022). Moreover, Meng et al. (2022) suggested that the spreading online of public opinions can have severe consequences. Information breaches once disclosed may damage the image of the related organization, industry, or even the supervisors.
Data breach risks remain prevalent in universities. The number of reported data breach incidents in higher education institutions is increasing (Bongiovanni, 2019). From the perspective of a university concerning multiple stakeholders (Borgman, 2018), individuals exhibit diverse activities in both physical and cyber spaces (Li T, Li Y, Hoque MA, 2022) and interact through the internet, thereby leaving digital footprints (Qin et al., 2022). Students, faculty, staff, and visitors frequently access a university’s information technology infrastructure and generate data in various ways, such as via personal mobile devices, laboratory sensors, and swipe card access systems. These large-scale data interactions and flows among organizations and users inadvertently and continuously expand the digital footprints of universities, potentially leading to information security concerns by increasing the risk of data breaches. Moreover, insufficient security awareness and a lack of attention to data security place universities in a dangerous position. According to a survey by the Joint Information Systems Committee (JISC), only 39% of students indicated that they were informed of how universities store and use their personal data. Only 15% of the staff scored their organizations as eight or more out of ten in terms of data protection (JISC, 2018). Notably, JISC had a 100% track record of gaining access to the most valuable data in universities and research centers using spear phishing (Chapman, 2019). Data breaches may also be caused by human errors, such as sloppy data handling and negligent security procedures, due to insufficient awareness of data security (Ulven and Wangen, 2021). For example, almost 44,000 student records were obtained from the storage of secure information at Arden University in 2022 because of human errors³. Moreover, according to Verizon (2022), the education sector has been facing additional challenges because the pandemic made it mandatory to hold classes online, providing opportunities for malicious hackers and increasing the risk of data breaches.
Universities with plentiful personal and research data, intellectual property, and insufficient awareness of data security are enticing from a hacker’s perspective, making higher educational institutions primary targets (Hina and Dominic, 2020). It has been observed that the number of information security breach incidents reported by higher education institutions worldwide is increasing rapidly (Borgman, 2018). For example, the University of California announced a malicious cyberattack in 2021, and the stolen personal information (e.g., social security numbers, email addresses, phone numbers, and home addresses) was found on the dark web (Ying, 2021).
The same holds true for data breach risks in universities in China. Figure 2 presents the monthly statistics regarding data breach incidents in universities in China as reported by the Education Industry Vulnerability Reporting Platform, a resource-sharing platform for collection and notification of system vulnerabilities in the country’s education industry⁴. The number of reported data breaches was relatively high, with a significant upward trend on a monthly basis, reflecting that Chinese universities are also at severe risk of data breaches, which should not be underestimated.
Despite the increasing trend in information breach incidents, previous studies have rarely focused on such incidents in universities (Okibo and Ochiche, 2014). As Hina and Dominic (2020) have reported, only a few studies have focused on the security risks of sensitive information from higher educational institutions. Information security management in universities is a poorly investigated topic (Bongiovanni, 2019).
Fig. 1 Yearly number of data breaches. A description of the annual number of data breach incidents. Panel a presents statistics using data from the Privacy Rights Clearinghouse (PRC) and panel b describes data breaches in recent years using data from Verizon.
Fig. 2 Monthly statistics of data breaches in universities. A description of data breach incidents in universities in China using data from the Education
Industry Vulnerability Reporting Platform.
Hence, in this study, which focuses on universities’ data breach incidents, we aim to investigate the determinants of data breach risks to better understand the underlying impact mechanisms. The research framework is at the university level, and the samples used in empirical analyses were obtained from China. The aim of this study is to answer the following research questions: (i) What factors impact data breaches in universities and how do these factors interact? (ii) By what mechanism does academic strength impact data breaches? (iii) What is the influence of emerging information technologies, such as cloud storage, on the impact mechanism?
Based on crime opportunity theory and routine activity theory, we examine how public vulnerability disclosures, cross-border data flow, academic strength, and the adoption of cloud storage affect the possibility of data breaches, thereby analyzing the interactions between these variables. It is observed that an increase in the number of public disclosures of vulnerabilities increases the frequency of data breaches. In addition, cross-border data flow decreases the number of data breaches. Subsequently, using two mediators, the mechanisms through which academic strength affects data breaches are identified. Universities with higher academic achievements have relatively higher cross-border data flow and vulnerabilities. Furthermore, cloud storage is better than local storage when considering data breaches, and a public cloud has better performance than a private cloud in data security protection. Furthermore, our study shows that cloud adoption negatively moderates the impacts of vulnerabilities and positively moderates cross-border data flows.
This study contributes to the literature in several ways. First, the factors influencing data breach incidents related to universities are empirically examined. Prior studies focusing on data breach risks have primarily considered the medical industry and enterprises. The higher education industry, particularly universities, is presently subject to severe data breach risks, but has received relatively limited attention. Second, as risk management has become a research focus in the context of cross-border data flow, we investigate the effects of cross-border data flow on data breaches and provide a new perspective for understanding the value of such data transfers. Third, the impact of the cloud on data breaches is identified, distinguishing between the effects of different types of cloud adoption. Finally, we contribute to the literature on data breaches and theories on data security, indicating several managerial implications for the control of data security risks in universities to further optimize data protection strategies.
The remainder of this paper is organized as follows. First, the relevant literature is reviewed. Second, related theories are outlined prior to proposing a research framework with hypotheses. Third, the data and variable measurements are described, followed by empirical analyses and main results. Subsequently, several robustness checks are performed. Finally, the results are discussed, and conclusions are drawn.
Literature review
Prior research has analyzed the motives behind cybersecurity and the influencing factors of data breaches. Factors such as organizational attributes, economic indicators, and information technologies have been empirically explored. In this section, we first review the literature related to information breaches in universities and then summarize the literature according to types of influencing factors.
Information breaches in universities. As demonstrated by Bongiovanni (2019), regarding security management, information in universities is the least secure. Data breaches in higher education are becoming increasingly common (Chapman, 2019). One of the most urgent threats faced by higher education is from cybercriminals or hackers seeking to profit from the theft of the sensitive personal and financial information of the students, faculty, and staff (FireEye, 2016). Verizon (2022) noted that monetary gain was the primary motive for approximately 95% of data breach incidents observed in higher education in 2021. In general, the intention of cybercriminals is to steal data that can be quickly monetized.
The open and collaborative environment in a university and the typical access to many portable devices make it easier to gain access to unauthorized sensitive information (Coleman and Purcell, 2015). Web users are highly mobile and accustomed to accessing the web from any device, at any time, and from anywhere. This open-design architecture commonly used by universities undoubtedly facilitates the exchange of information (Okibo and Ochiche, 2014); however, the existence of numerous connected devices across organizations, the coexistence of different security cultures, and the tendency to outsource security controls make universities more vulnerable to information security risks (Borgman, 2018). Additionally, the academic
culture of openness and the unencumbered access make it particularly difficult for universities to maintain security. The lax security that facilitates open access and the sharing of cutting-edge academic research and content on the network makes higher education an attractive target for attackers (Roman, 2014). In conclusion, universities that hold sensitive personal data and intellectual property of many researchers are ideal targets (Chabrow, 2015).
One of the factors affecting information security in universities is the increasing difficulty of security management. Noghondar et al. (2012) pointed out that high turnover rates and general complacency toward information security also increase the exposure of university information. Magura et al. (2021) highlighted issues affecting database security that could lead to data breaches and data theft, including human factors, work environments, and the technologies used. Liu et al. (2020) studied how centralized IT decision-making affects the likelihood of cybersecurity breaches in higher education, especially in institutions with a more heterogeneous IT infrastructure. Iriqat et al. (2019) explored the compliance of staff with information security policies at the Palestine University. Other studies have concluded that a lack of security awareness is directly related to how the faculty value the information system assets of their universities (Nyblom et al., 2020). To address these concerns, artificial neural network techniques have been utilized to improve cybersecurity in higher education (Saad AL-Malaise AL-Ghamdi et al., 2022).
Data breach influencing factors. There are three typical types of research on data breaches: (1) analysis of the consequences of data breaches, such as that of Foerderer and Schuetz (2022), who studied the influence on stock market reactions, Ali et al. (2022), who focused on the long-term effects on equity risk, and Bachura et al. (2022), who investigated the emotional response after a data breach and identified breach concepts most relevant to each emotion; (2) research on response strategies, such as user compensation (Goode et al., 2017; Hoehle et al., 2022) and corrective action (Nikkhah and Grover, 2022); and (3) analysis of the causes of data breaches, which we focus on primarily in this paper. The most relevant existing studies on the influencing factors of data breaches from different sectors listed in Table 1 provide a comparative analysis primarily from an industry perspective. From the listed studies, it can be concluded that when organizations at risk of data breaches have more commercial attributes, the interests involved can be more complex; thus, social perception can significantly affect information security, especially the likelihood of cyberattacks. However, when an organization has fewer commercial attributes, the defining attributes of the organization and IT management are dominant factors that influence data security. In the case of companies, due to their special nature as business organizations, researchers are more concerned about the impact of a company’s performance and image, which is likely to cause dissatisfaction among stakeholders (D’Arcy et al., 2020). In addition, the management practices of employees and the personal characteristics of top managers are important factors related to information security (Ifinedo, 2016; Haislip et al., 2021; Burns et al., 2022). Studies related to the health care industry have largely focused on organizational features. Scholars have paid more attention to the impact of IT management systems and organizational characteristics on data security (Angst et al., 2017; Dolezel and McLeod, 2019; Kim and Kwon, 2019). The same holds true for higher education, especially for Chinese universities as they are generally public universities with fewer commercial features. Therefore, following the spirit of prior research, this study focused on organizational features and IT measures.
The risks of data breaches can differ based on the main industry, geographic location, and types of breaches occurring in the past (Sen and Borle, 2015). Lee and Hess (2022) found that demographic variables (gender, age, race, ethnicity, income, and location) and political ideology are associated with data security. Schlackl et al. (2022) summarized the antecedents of data breaches identified in prior research, including technology measures, information disclosure, organization attributes, etc. In an enterprise, corporate social performance (measured by participation in socially responsible or irresponsible activities) has been proven to affect the likelihood of computer attacks leading to data breaches (D’Arcy et al., 2020). Corporate reputations were found to be important assets in protecting corporate value after a data breach (Gwebu et al., 2020). Wang and Ngai (2022) explored the negative association between firm diversity and data breach risks, delineating the boundary conditions. Ifinedo (2016) discussed how top management support, the severity of sanctions, and cost-benefit analyses have significantly impacted employee compliance with information systems security policies. Burns et al. (2022) studied personal motives and controls for insider computer abuse, which could lead to costly and severe data breaches. Regarding the medical industry, Wasserman and Wasserman (2022) focused on cybersecurity risks in hospitals. Dolezel and McLeod (2019) studied employee behavior, safety culture, training, supplier selection and handling of personal health information, and strong risk management procedures as data breach factors. Another study found that data breach risks differ according to type and scale of a hospital (Gabriel et al., 2018). Regarding the banking
industry, Ali et al. (2020) investigated the effects of socio-factors on the banking sector’s systematic risks.
Given the emerging developments of new information technologies, such as artificial intelligence and intelligent robots (Ban et al., 2022; Lu et al., 2023), IT factors are attracting more attention in related research streams. IT investments have been found to be effective in reducing the risk of data breaches (Sen and Borle, 2015); however, this does not necessarily translate into fewer data breaches. Institutional factors create conditions under which IT security investments can perform more effectively. When considering the impact of information security investments on data breaches, companies must consider the impact of institutional factors and balance them. Li et al. (2021) found that IT security investments have different effects on security breaches in organizations with different approaches to making digitalized progress. Li W, Leung ACM, Yue WT (2022) stated that there is a dynamic interrelationship between IT investments and data breaches. Haislip et al. (2021) found that executives’ IT expertise could be an effective factor influencing reported data security breaches. Additionally, the increase in vulnerabilities adds to the risk of data breaches but is mitigated by an increase in expired vulnerabilities (Sen and Borle, 2015). Regarding new ITs, Fried (1994) discussed both new threats and potential new defenses for information systems security brought about by new products and information technologies. For example, Kim and Kwon (2019) found that electronic medical records and medical management department plans increase the risk of accidental and malicious data breaches, especially in larger hospitals. For emerging cloud services, although people generally believe that cloud services are more vulnerable to security breaches, cloud services in fact reduce the average expected losses of consumers relative to internal software in a high-security loss environment during an attack (Zhang et al., 2020). Moreover, cloud storage is a type of centralized storage (Bandara et al., 2021; Ouf and Nasr, 2015; Wu et al., 2014) and may be safer when considering the emergence of end-user computing. The task of ensuring information security becomes more complex as information systems become increasingly distributed (Fried, 1993), and the integration of security and IT-related processes can reduce data breaches (Angst et al., 2017). Pang and Tanriverdi (2022) found that cloud migration of legacy IT systems significantly reduces cybersecurity risks for public clouds through the internal and external guardianship provided by the cloud service, which has more resources for establishing effective information protection.
As the digital economy develops, additional discussions on the security and development of cross-border data flow have emerged. The benefits of cross-border data have both economic and social repercussions. Ten percent of the average profit growth of various industries is attributed to cross-border data (China Academy of Information and Communications Technology, 2021). Bauer et al. (2013) found that limiting the free flow of data leads to a reduction in gross domestic product (GDP). In terms of social benefits and public welfare value, the Organization for Economic Co-operation and Development (OECD) (2019) insists that it is necessary for data to flow domestically and internationally, as this can provide significant developmental benefits. The “public good” nature of data beyond national borders has been emphasized and calls for international data sharing. For example, the COVID-19 pandemic clearly demonstrated the importance of the global sharing of health data for research purposes (United Nations Conference on Trade and Development, 2021). However, cross-border data flow and international storage are associated with perceived risks, such as those concerning surveillance and unwarranted data mining (Meltzer, 2015). To assess risk, Li et al. (2022) developed a risk index system for cross-border data flow and applied it to biomedical organizations. There is evidence that localized data are unlikely to provide better results in terms of data breaches, and the domestic storage of data poses risks to many poorly managed and costly data centers (Chander and Lê, 2014). Indeed, data localization does not contribute to data security but makes it more vulnerable to destruction, especially by hackers (Chander and Lê, 2015).
In summary, the current literature has shown that great progress has been made in research on the factors that influence data breaches, thereby drawing a basic outline of the problem and providing a thorough comprehension of data breaches. Based on this, we focus on identifying the influencing factors related to universities.
Theories and hypotheses
Relevant theories. The routine activity theory proposes three factors leading to crimes (in this case, cybersecurity crimes): (i) potential attackers or malicious insiders with crime motives; (ii) suitable, accessible, and valuable targets; and (iii) a lack of competent guardianship (Cohen and Felson, 1979). In this context, offenders can be predominantly potential attackers, malicious insiders, or insiders who disclose sensitive information unintentionally (Pang and Tanriverdi, 2022). The motive is mainly financial (Verizon, 2022). The target could be accessible IT systems that manage universities’ critical information. Universities can strengthen their guardianship by investing in security protection technology (Liao et al., 2017; Luo et al., 2020; Wang et al., 2015) or by seeking external governance from vendors (Pang and Tanriverdi, 2022).
The central assumption of crime opportunity theory is that criminal behavior is driven by human rationality and that the conditions for committing a crime require a vulnerable victim in addition to motive and the lack of restraint (Hannon, 2002). Thus, criminals are more likely to take opportunistic actions and choose victims who are more vulnerable. In criminal cases that lead to data breaches, vulnerabilities in information systems, software, and firmware present opportunities for potential intruders, that is, the more system vulnerabilities there are, the greater the chances of attracting intruders will be, resulting in a higher risk of data breaches.
Hypotheses development. Based on relevant theories and the related literature, we propose the research framework shown in Fig. 3 and the following research hypotheses.
Coordinated vulnerability disclosure (CVD) is an efficient approach to finding and fixing flaws in IT systems. Through this approach, after finding a vulnerability in an IT system, a white-hat hacker (an ethical hacker who uses his or her ability to discover security vulnerabilities and helps protect organizations) reports it to the platform to warn the system manager. Details such as the titles of the vulnerabilities and their brief descriptions, ratings, and comments are visible to all registered white hats. However, the vulnerability details are only visible to relevant organizational administrators and vulnerability submitters. According to crime opportunity theory, criminals are more likely to engage in speculative behavior and choose victims who are more vulnerable (Hannon, 2002). In data breach incidents, “vulnerable” represents the public disclosure of computer security vulnerabilities in information systems, software, and firmware (Sen and Borle, 2015), which enhances the accessibility of sensitive information, thereby increasing the data breach risk according to routine activity theory. It has also been noted in the literature that public disclosures of relevant vulnerabilities increase the frequency of attacks (Browne et al., 2001). The more vulnerabilities there are, the more vulnerable the information
Fig. 3 Research framework. It shows the relationships among variables and demonstrates relevant theories applied.
system is to malicious attackers. Therefore, we propose the following hypothesis.
H1: Public disclosure of vulnerabilities increases data breaches.
Given the development of globalization, cross-border data flow has become an essential part of the global digital economy. The necessity for cross-border data flow has been emphasized considering its significant economic and social benefits (Bauer et al., 2013; OECD, 2019), especially in the context of academic research on international collaborations and data exchanges. It is also evident that localizing data storage is unlikely to provide better results in terms of data breaches (Chander and Lê, 2014) and does not contribute to data security; instead, it makes the data more vulnerable to destruction, especially by hackers (Chander and Lê, 2015). Rather than reducing data security risks, suppressing cross-border data flow places universities at a disadvantage. Thus, universities with greater cross-border data flow may have fewer data breach incidents. Therefore, we propose the following hypothesis.
H2: There is a negative relationship between the frequency of cross-border data flow and occurrences of data breaches.
According to the Data Breach Investigation Report by Verizon (2022), more than 75% of breach incidents in the education industry are by external attackers. Financial motives account for 95% of attacker motives, meaning that hackers mostly attack for money (e.g., by selling personal information and through blackmail). Academics are the heart of a university, and the performance of the faculty affects the quality of student learning and the strength of the university, which in turn impacts the contributions of academic institutions to society (Shrand and Ronnie, 2019). Many indicators of research success are significantly associated with a university’s reputation (Linton et al., 2011). In higher-ranked universities, the volume of research is larger. According to routine activity theory, offenders tend to choose more valuable targets. Therefore, hackers who hack for money are more likely to aim for academically stronger universities, as they are more famous and perform better in industry. Similar concerns have been raised in previous research. Liu et al. (2020) considered the impact of research grants on cybersecurity attacks since the valuable intellectual properties generated in research and development activities are at risk of being stolen and misappropriated, which makes universities particularly attractive targets for cybersecurity attacks. In other words, academically stronger universities are more likely to be attacked, leading to additional data breaches. Hence, we propose the following hypothesis.
H3: There is a positive relationship between academic strength and the number of data breaches.
According to Weulen Kranenbarg et al. (2018), one motive for white-hat hackers’ CVD reporting is to gain status in the hacker community, as they expect recognition and acknowledgment. The other motive is cash bounties, which account for 15% of motives. However, considering that no such bounty programs exist on the Education Industry Vulnerability Reporting Platform and that only gifts can be redeemed, we assume that the main motivations for CVDs by hackers are to gain status regarding and acknowledgment of their skills and actions. Undoubtedly, CVDs are aimed at more famous and influential universities, in contrast to “normal” universities. Therefore, the vulnerabilities of universities with higher academic achievements and greater social impact are more likely to be reported or disclosed to attract more social attention. Based on this, we propose that such vulnerabilities mediate the relationship between academic strength and data breaches.
H4(a): The number of vulnerabilities has a mediating effect on the relationship between research strength and the number of breaches.
Considering that universities with stronger academic strength have broader worldwide influence and more academic communication with foreign institutions and individuals, they may commit to larger-scale, global data flow around the world. Therefore, we propose that the scale of cross-border data flow mediates the relationship between academic strength and data breaches.
H4(b): The amount of cross-border data flow has a mediating effect on the relationship between research strength and the number of breaches.
Millions of companies and institutions use the cloud to store data remotely and run applications and services, thereby reducing costs and accelerating operations (Rawding and Sacks, 2020). According to the Cloud Usage and Digital Economy Development Report (2018) of the Tencent Research Institute, the degree
security training conducted at universities. It has also been empirically shown that institutions’ scales are positively related to the risk of data breaches (Gabriel et al., 2018; Kim and Kwon, 2019). Therefore, the scale of the universities was controlled, as demonstrated by the number of undergraduate majors. In addition, economic indicators were found to be positively correlated with the risk of data breaches (Sen and Borle, 2015); thus, we also controlled for the GDP of the city where each university was located. In addition, other control variables were added for the number of national key disciplines, master’s programs, doctoral programs, time of establishment, type of university, attributes of university, and number of universities in the same city. The detailed definitions of the variables are provided in Table 2.

Descriptive statistics. Table 3 describes the statistics calculated for the main variables. Although the data security risks of universities appeared uneven, they generally faced a severe risk of data breaches, with a mean value of 8.5 breaches in 2020. Among the 110 universities, in terms of attributes (only the highest title of the university was taken), 20% were universities in “Project 985” and 35% were universities in “Project 211” (see Note 6). In addition, 48% of the universities were comprehensive universities, whereas 52% were noncomprehensive universities (such as those limited to medicine, finance and economics, normal education, or science and engineering). Regarding the urban distribution of universities, 23% were in the most developed first-tier cities; 40% were located in new first-tier cities (see Note 7); and the rest were from less-developed areas.

Table 4 presents the correlation matrix. Considering that some correlations were high, and that multicollinearity may have existed among the variables, we conducted a variance inflation factor (VIF) test. Except for the largest VIF value of 3.39 (Num_Research_Project), the remaining VIF values were no higher than 3, indicating no significant multicollinearity issues.
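As a rough illustration of the VIF check described above, the following sketch computes each predictor's VIF as the corresponding diagonal entry of the inverse of the predictors' correlation matrix, a standard identity. It is not the study's code: the function names and the toy data are hypothetical, and only the Python standard library is used.

```python
from math import sqrt


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF_j equals the j-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical toy predictors (not the study's data); their correlation is 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vifs = variance_inflation_factors([x1, x2])
```

With two predictors, the VIF reduces to 1 / (1 − r²), so a correlation of 0.8 gives a VIF of about 2.78. Which cut-off counts as problematic is a matter of convention rather than something this sketch decides.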
*p < 0.05; **p < 0.01; ***p < 0.001. Pseudo R2 is McFadden’s pseudo R2, which can be interpreted like R2 for generalized linear models but generally takes smaller values; a value of 0.2–0.4 indicates an excellent model fit (Hensher and Stopher, 1979).
model displays the best goodness-of-fit among all models, with the smallest Akaike information criterion (AIC) and Bayesian information criterion (BIC) values in Column (2).

The results in Column (2) of Table 5 show that the public disclosure of a vulnerability has a positive and significant effect on data breaches (Num_Vulnerability: β = 0.005, s.e. = 0.001, p < 0.001), indicating that the more disclosed vulnerabilities there are, the more breach incidents occur and the greater the risks of such data breaches are. Thus, H1 is supported. The effect of the cross-border data flow on the breach is negative and significant (Num_Data_Flow: β = −0.021, s.e. = 0.008, p = 0.002), which supports H2. This shows that the higher the frequency of data flow is, the fewer reported breach incidents there are. First, data flow reflects the fluidity and mobility of data to a certain extent. In universities with strong data fluidity, data security management generally receives greater attention and thus provides a higher level of data protection. Moreover, universities with strong data flows have more open data systems. Their data security protection and high openness reduce the motivation for potential attackers. These findings provide insights for possible future research directions. For new IT utilization, the effect of cloud adoption is statistically significant (Ind_Cloud_Storage: β = −0.335, s.e. = 0.088, p < 0.001) and shows that universities adopting cloud storage are less likely to have breach incidents. Thus, H5 is supported. Notably, the direct effect of academic strength (proxied by Num_Research_Project) on data breaches is not significant, as shown in Column (2). Additional analyses and explanations are presented in the next section.

Although not the main focus of our study, the coefficients of the other control variables also merit consideration. The scale of the university, as measured by the number of undergraduate majors, increases the risk of data breaches, similar to the results of previous research (Gabriel et al., 2018). Undoubtedly, relevant training helps reduce data breaches. Noncomprehensive universities generally face more severe risks than comprehensive universities. Universities in “Project 985”, as first-tier universities in China, face a higher risk of data breaches. Interestingly, the GDP of a city has a positive effect, whereas the number of universities in the city has a negative effect. This indicates a higher risk of data breaches in developed cities and a lower risk in cities where higher education is well developed.

Mediating effect. Contrary to our expectations, the relationship between academic strength and data breaches is not significant. In this section, we investigated the possible mediating effects of these results. First, the public disclosure of vulnerabilities was considered as a mediator. Universities with higher academic achievement and greater social impact are more likely to be reported and exposed negatively because they attract more social attention. Thus, we addressed the mediating effect of the public disclosure of vulnerabilities. The models were constructed as follows:

$$
\text{Num\_Vulnerability}_i = \alpha + \beta_1 \text{Num\_Research\_Project}_i + \gamma\,\text{Controls} + \varepsilon \tag{2}
$$

$$
\begin{aligned}
\log E(\text{Num\_Breach}_i \mid X_i + \text{Controls}) ={}& \alpha + \beta_1 \text{Num\_Research\_Project}_i + \beta_2 \text{Num\_Data\_Flow}_i \\
&+ \beta_3 \text{Ind\_Cloud\_Storage}_i + \gamma\,\text{Controls}
\end{aligned}
\tag{3}
$$

Equation (2) verified the relationship between the number of research projects and the disclosed vulnerabilities. Equation (3) was employed to address the existence of a mediating effect based on Eq. (2). The estimation results are presented in Table 6. The first two columns show the results obtained through Eq. (2): Column (1) is for Num_Research_Project only, and Column (2) incorporates the related controls. Unsurprisingly, academic strength has a positive effect on the disclosure of vulnerabilities. Column (3) presents the results without Num_Research_Project and Num_Vulnerability. In Column (4), the number of research projects is positively related to the number of data breaches, without interference from the mediating variable. Column (5) replicates the main result of Column (2) in Table 5, where the effect of Num_Research_Project is insignificant. Therefore, we concluded that the number of research projects indirectly affects the increase in data breach incidents through the corresponding vulnerabilities, thus supporting H4(a).
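Because the models are Poisson regressions with a log link, the coefficients quoted above from Column (2) of Table 5 act on the log of the expected breach count. The sketch below shows one common way to read such coefficients, as incident rate ratios exp(β). The helper names are hypothetical and not part of the study, and the interpretation assumes all other covariates are held fixed.

```python
from math import exp

# Coefficients quoted in the text for Column (2) of Table 5.
coefficients = {
    "Num_Vulnerability": 0.005,
    "Num_Data_Flow": -0.021,
    "Ind_Cloud_Storage": -0.335,
}


def rate_ratio(beta):
    """Multiplicative change in the expected count for a one-unit increase."""
    return exp(beta)


def percent_change(beta):
    """Percentage change in the expected count for a one-unit increase."""
    return 100.0 * (exp(beta) - 1.0)


for name, beta in coefficients.items():
    print(f"{name}: rate ratio {rate_ratio(beta):.3f}, "
          f"change {percent_change(beta):+.1f}%")
```

Read this way, the cloud-storage coefficient of −0.335 corresponds to a rate ratio of about 0.72, that is, roughly 28% fewer expected breaches for adopters, taking the fitted model at face value.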
Similarly, we specified a model for investigating the mediating effect of the number of cross-border data flows as follows:

$$
\text{Num\_Data\_Flow}_i = \alpha + \beta_1 \text{Num\_Research\_Project}_i + \gamma\,\text{Controls} + \varepsilon \tag{4}
$$

$$
\begin{aligned}
\log E(\text{Num\_Breach}_i \mid X_i + \text{Controls}) ={}& \alpha + \beta_1 \text{Num\_Research\_Project}_i + \beta_2 \text{Num\_Vulnerability}_i \\
&+ \beta_3 \text{Ind\_Cloud\_Storage}_i + \gamma\,\text{Controls}
\end{aligned}
\tag{5}
$$

Table 7 shows the estimation results. As expected, cross-border data flow increased with the number of research projects. Column (3) presents the results without Num_Research_Project and Num_Data_Flow. The number of research projects is positively correlated with the number of data breaches, without interference from the mediating variable, as shown in Column (4). We concluded that the number of research projects indirectly affects the increase in data breach incidents through cross-border data flow. Universities with higher academic achievements tend to communicate more with academics worldwide. Thus, H4(b) is supported.

Moderating effect. To further investigate how new IT utilization (i.e., cloud storage) influences data breaches, we specified an econometric model with cloud storage adoption as a moderating variable. First, we addressed the moderating role of cloud storage in the relationship between vulnerabilities and data breaches. The analytical model was constructed as follows.

$$
\begin{aligned}
\log E(\text{Num\_Breach}_i \mid X_i + \text{Controls}) ={}& \alpha + \beta_1 \text{Num\_Vulnerability}_i + \beta_2 \text{Num\_Research\_Project}_i \\
&+ \beta_3 \text{Num\_Data\_Flow}_i + \beta_4 \text{Ind\_Cloud\_Storage}_i \\
&+ \beta_5 \text{Num\_Vulnerability}_i \times \text{Ind\_Cloud\_Storage}_i + \gamma\,\text{Controls}
\end{aligned}
\tag{6}
$$

The empirical results are presented in Table 8. Column (1) shows all controls without Num_Vulnerability, Ind_Cloud_Storage, and the interaction term. Column (2) is from Column (2) in Table 5 to allow for an easy comparison. Column (3) shows the estimates considering the moderating effect, where the interaction term is negatively related to data breaches (β = −0.011, s.e. = 0.001, p < 0.001). Thus, cloud storage mitigates the positive relationship between vulnerabilities and data breaches as a moderating variable. Cloud storage enables a more integrated consolidation of distributed data stored in different systems, thus making it easier to maintain and manage. Therefore, adopting cloud storage could reduce the possibility of breaches caused by vulnerabilities. Accordingly, H6(a) is supported.

Below, we addressed the moderating role of cloud storage in the relationship between cross-border data flow and data breaches.

$$
\begin{aligned}
\log E(\text{Num\_Breach}_i \mid X_i + \text{Controls}) ={}& \alpha + \beta_1 \text{Num\_Vulnerability}_i + \beta_2 \text{Num\_Research\_Project}_i \\
&+ \beta_3 \text{Num\_Data\_Flow}_i + \beta_4 \text{Ind\_Cloud\_Storage}_i \\
&+ \beta_5 \text{Num\_Data\_Flow}_i \times \text{Ind\_Cloud\_Storage}_i + \gamma\,\text{Controls}
\end{aligned}
\tag{7}
$$

Table 9 shows the estimation results. Column (1) represents the controls only, and Columns (2) and (3) report the results without and with a moderating effect, respectively. The interaction term in Column (3) is negatively related to data breaches (β = −0.055, s.e. = 0.014, p < 0.001), indicating that cloud storage strengthens the negative relationship between cross-border data flow and data breaches, as cloud storage makes it easier to transfer data worldwide. Thus, H6(b) is supported.

Robustness checks
To ensure the robustness of the conclusions, this section discusses several robustness checks from four perspectives. First, we tested the significance of the mediating effects. Second, we expanded the time window for several variable measurements to mitigate the impact of COVID-19. Third, we explored whether the effects of specific cloud adoptions differ by redefining the cloud services and classifying them into private and public cloud storage. Finally, because data breach incidents have different risk levels, we considered the effects of various factors at different levels of risk.
Significance test for mediating effect. Three other methods (Aroian, 1947; Goodman, 1960; Sobel, 1982) were used to test the significance of the mediation effect(s). As shown in Table 10, Row (1) is the test result obtained using Column (1) in Tables 8 and 9 from investigating the relationship between the independent and intermediary variables. Row (2) is for Column (2) in Tables 8 and 9. All p values are less than 0.01, except for the Aroian Test of Num_Data_Flow in Row (2), where p = 0.01006, indicating that the mediation effects are highly significant.

We then investigated the proportions of the mediation effects and direct effects, as shown in Table 11. Using Num_Data_Flow as the mediator, there is a significantly negative mediation effect between the independent and dependent variables. The average direct effects are insignificant. For Num_Vulnerability as the mediator, there is a significantly positive mediation effect, and the direct effects are insignificant. This indicates that the effects of vulnerabilities on data breaches going through the mediator account for almost the entire total effects. The mediation and direct effects have different signs, explaining why the proportion of the effects going through the mediator exceeds one.
Mediator       Num_Data_Flow                                          Num_Vulnerability
Row  Method    Test statistic  Std. error  p-value                    Test statistic  Std. error  p-value
(1)  Sobel     −2.58571        0.00012     0.00972 (p < 0.01)         3.85735         0.00013     0.00012 (p < 0.001)
     Aroian    −2.58015        0.00012     0.00988 (p < 0.01)         3.82650         0.00013     0.00013 (p < 0.001)
     Goodman   −2.59130        0.00012     0.00956 (p < 0.01)         3.88897         0.00013     0.00010 (p < 0.001)
(2)  Sobel     −2.58004        0.00011     0.00988 (p < 0.01)         4.14024         0.00012     0.00004 (p < 0.001)
     Aroian    −2.57371        0.00011     0.01006 (p < 0.05)         4.11445         0.00012     0.00004 (p < 0.001)
     Goodman   −2.58642        0.00011     0.00970 (p < 0.01)         4.16652         0.00012     0.00003 (p < 0.001)
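The three tests in Table 10 share the same numerator, the product of the two path coefficients, and differ only in the variance term of the denominator. The sketch below illustrates this under stated assumptions: `a` and `b` denote the two path coefficients with standard errors `se_a` and `se_b`, the numeric inputs are hypothetical, and the function names are not from the study.

```python
from math import erfc, sqrt


def product_tests(a, se_a, b, se_b):
    """z statistics for the indirect effect a*b under three variance formulas.

    The Goodman variance subtracts the cross term, so it can turn negative
    for very noisy estimates; this sketch does not guard against that case.
    """
    base = b * b * se_a * se_a + a * a * se_b * se_b
    cross = se_a * se_a * se_b * se_b
    indirect = a * b
    return {
        "Sobel": indirect / sqrt(base),
        "Aroian": indirect / sqrt(base + cross),
        "Goodman": indirect / sqrt(base - cross),
    }


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2.0))


# Hypothetical path estimates, chosen only to illustrate the ordering of the tests.
z_values = product_tests(a=0.5, se_a=0.1, b=0.3, se_b=0.1)

# Converting a statistic reported in Table 10 back to its p-value.
p_sobel_row1 = two_sided_p(-2.58571)
```

Because the Aroian variance is the largest and the Goodman variance the smallest, the Aroian statistic is always the most conservative of the three, which matches the pattern in Table 10.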
Mediator              Num_Data_Flow                              Num_Vulnerability
Row  Effects          Estimate    95% CI Lower  95% CI Upper     Estimate    95% CI Lower  95% CI Upper
(1)  ACME             −0.00341*   −0.00660      0                0.00348***  0.00231       0
     ADE              −0.00043    −0.00511      0                −0.00035    −0.00309      0
     Total Effect     −0.00384**  −0.00790      0                0.00313**   0.00098       0
     Prop. Mediated   0.92462*    0.18189       3.58             1.08239**   0.60717       3.64
(2)  ACME             −0.00355**  −0.00685      0                0.00358***  0.00236       0.01
     ADE              −0.00055    −0.00474      0                −0.00030    −0.00278      0
     Total Effect     −0.00410*   −0.00870      0                0.00328**   0.00090       0.01
     Prop. Mediated   0.91217*    0.16885       2.85             1.07296**   0.56443       3.57
The number of simulations was 1000. ACME stands for average causal mediation effects; ADE stands for average direct effects; Total Effect stands for the total effect (direct + indirect) of the independent variable on the dependent variable; Prop. Mediated describes the proportion of the effect of the independent variable on the dependent variable that goes through the mediator. *p < 0.05; **p < 0.01; ***p < 0.001.
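The arithmetic behind a mediated proportion above one can be sketched with the Row (1) point estimates from Table 11. This is only an illustration with a hypothetical helper name. The ratio of point estimates computed here does not exactly reproduce the reported Prop. Mediated values, presumably because those are obtained from the simulation draws rather than from the point estimates.

```python
def mediated_proportion(acme, ade):
    """Return the total effect (ACME + ADE) and the share ACME / total."""
    total = acme + ade
    return total, acme / total


# Row (1) point estimates from Table 11.
# Num_Vulnerability: ACME and ADE have opposite signs.
total_vuln, share_vuln = mediated_proportion(acme=0.00348, ade=-0.00035)

# Num_Data_Flow: ACME and ADE have the same sign.
total_flow, share_flow = mediated_proportion(acme=-0.00341, ade=-0.00043)
```

When the direct effect has the opposite sign to the mediation effect, the total effect is smaller in magnitude than the mediation effect alone, so the share exceeds one. When both have the same sign, the share stays below one.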
Num_Vulnerability: 2020, Num_Vulnerability’: 2017–2019; Num_Breach: 2020, Num_Breach’: 2017–2019; Num_Data_Flow: 2020, Num_Data_Flow’: 2019; *p < 0.05; **p < 0.01; ***p < 0.001.
Varied length of time window
Expansion of the time window length of data breach incidents. For the main analysis, we collected data on breach incidents in 2020. Regarding the global outbreak of COVID-19, the incidents in 2020 may have been affected by fluctuations in the epidemic, making them unrepresentative of typical data security issues in universities. Therefore, the data breach incidents reported during 2017–2019 were used to measure the level of universities’ data security protection.

Column (1) in Table 12 presents the estimated results, where Num_Breach’ denotes the number of data breach incidents in universities reported during 2017–2019, and Num_Vulnerability’ measures the number of publicly disclosed vulnerabilities in universities during 2017–2019. Column (3) replicates the original
Expansion of the time window length of cross-border data flow. In addition to the COVID-19 outbreak, another breakout in 2020 was related to global medical data sharing, particularly regarding coronavirus epidemic-related data. This may have caused abnormal fluctuations in cross-border data flow at universities. To alleviate this concern, we used the cross-border data flow collected in 2019. Table 12 shows the results in Column (2), where Num_Data_Flow’ measures the number of cross-border data flows in universities during 2019. The results remain consistent and the significance of Num_Data_Flow’ is even higher, providing further evidence of the robustness.

Different cloud service types. As discussed in the empirical results section, adopting cloud storage can result in fewer vulnerabilities and improve data fluidity. However, considering that different types of cloud storage may have different effects, we further classified cloud storage into two types, namely, private and public clouds, as defined by cloud providers in the market. According to Alibaba Cloud, a private cloud provides a corporation or organization with a dedicated cloud environment that can be operated internally by the IT team to better control its computing resources (Li and Li, 2017). A private cloud can be physically located in the organization’s data center or hosted by a service provider. A public cloud is a cloud infrastructure provided by service suppliers for users, individuals, or enterprises. Users can access these servers by purchasing public cloud services and data storage. On a public cloud, all users share the same hardware, storage, and network equipment (see Note 8).

The effects of three variables, Ind_Cloud_Storage, Ind_Cloud_Private, and Ind_Cloud_Public, were investigated. Table 13 shows the results, where Ind_Cloud_Storage is coded as “1” if the university adopted any type of cloud storage and “0” otherwise; Ind_Cloud_Private is coded as “1” if the university adopted a private cloud and “0” otherwise; and Ind_Cloud_Public is coded as “1” if the university adopted a public cloud and “0” otherwise. Notably, public clouds have the most significant negative effects. According to routine activity theory, guardianship is essential to cybersecurity, and universities can enhance their guardianship by seeking external governance from external vendors (Pang and Tanriverdi, 2022). Public clouds enable external guardianship provided by cloud service vendors who are more capable of effective information protection (Pang and Tanriverdi, 2022). In addition, outsourcing vendors can achieve economies of scale and scope when offering IT services to clients, making it more economically feasible for vendors with professional security teams to protect their information systems (Levina and Ross, 2003). Therefore, in terms of cloud adoption, public clouds may be a better choice for cybersecurity.

Different risk levels of data breach. The factors related to the risk of breaches were tested for repercussions beyond the mere occurrence of such breaches. The Education Industry Vulnerability Reporting Platform scores the risks of all data breach incidents on a scale of 0–10. This scale is further categorized as low (0–4), medium (4–7), high (7–9), and severe (9–10) risks. Table 14 presents the descriptive statistics of breaches with different risk levels.

Risk Level   Obs.   Mean   S.D.   Max   Min
Low          110    6.83   8.11   63    0
Medium       110    1.21   2.16   14    0
High         110    0.29   1.14   11    0
Severe       110    0.05   0.25   2     0

Instead of the total number of breaches, the number of breaches with different risk levels was counted for universities in 2020 and regressed onto independent variables. As there were relatively few instances of severe incidents, only three risk levels were considered. The results are presented in Table 15. For low- and medium-risk breach incidents, the cross-border data flow still has a significant negative effect, and the number of publicly disclosed vulnerabilities still has a significant positive effect. The effects of the main variables are highly consistent with the previous research results. For high-risk breach events, a good fit is not achieved because of the small number of observations; however, the coefficient signs of the main variables are consistent. The results show that cross-border data flow only affects the occurrence of medium-risk breaches and that vulnerabilities tend to mostly increase the occurrence of high-risk breaches, as they have the highest significance in the regression results. The adoption of cloud storage may only influence the occurrence of low-risk breaches. Despite the few observations, this finding provides insights into possible future research directions.

Discussion and conclusion
In this study, we identified and analyzed the key elements of data security incidents in the context of higher education from an empirical perspective. Based on crime opportunity theory and routine activity theory, we constructed a conceptual model and proposed hypotheses to investigate the underlying mechanisms that impact data breaches. The key findings were obtained through a series of empirical analyses and robustness checks.

First, it was determined that the public disclosure of vulnerabilities increased data breaches, which complements the conclusion of Sen and Borle (2015) in the context of universities. Second, when incorporating the cross-border data flow effect and measuring the data fluidity and mobility, we found that it
negatively affected data breaches, leading to fewer breaches. Third, academic strength influenced the occurrence of data breaches in different ways. Academically stronger universities tended to have more data flow and publicly reported vulnerabilities, which played a mediating role in the relationship between academic strength and data breaches. Fourth, new information technologies such as cloud storage could help reduce data breaches and have moderating effects on vulnerabilities and data flow. In addition, public clouds were found to be relatively safer than private clouds in terms of data breach issues, which complements the research focusing on cloud services and data security.

Theoretical contributions and implications. This study makes theoretical contributions to the literature. First, we contribute to the data security literature by exploring a new context. According to the available literature, this study is among the first to examine the factors influencing data breach risks in the context of universities. Prior studies on data breach risk have focused on several other industries, such as the medical industry and companies. Relatively few studies have focused on universities even though they are at great risk of data breaches. In this study, the increased risk associated with the number of public disclosures of vulnerabilities is highlighted. The underlying mechanisms explaining how academic strength affects the risk of data breaches are investigated.

Second, we contribute to research on data breaches by discussing the effects of cross-border data flow, which is valued and regulated by numerous countries and regions for its contributions to the digital economy and its potential security risks. Prior research has barely considered cross-border data flow in the context of data breaches and has mostly focused on developing and managing relevant policies to prevent potential risks incurred by cross-border data flow. We investigate the effects of cross-border data flow on data breaches and provide another perspective for understanding the value of cross-border data flow.

Third, we contribute to the information security literature by identifying the impacts of clouds on data breaches, distinguishing between the effects of different types of cloud adoption on the risk of data breaches. IBM (2022) reported that the cost of a data breach incident in organizations with public, private, or hybrid clouds can differ significantly. Our findings further reveal and strengthen the difference in terms of the impact on data breaches, which has implications for studies focusing on data security in cloud environments. Future research could break down the types of clouds and explore their effects in different contexts to support decision-making relevant to clouds.

Fourth, this study has implications for research on data breaches. Although we focus on a specific industry, and some of the identified factors and key findings are industry specific, they nonetheless provide an impetus for analyzing the causes of data breaches in other contexts, thereby enriching the literature by identifying the factors influencing data security incidents and, in particular, data breaches in the context of universities.

Practical implications. Our study provides a basis for improving the data security of universities and other scientific research institutions in the higher education industry, which has practical implications for universities aiming to shape their data security strategies to mitigate data security risks. First, regular system maintenance and timely discovery and repair of technical vulnerabilities can reduce opportunities for attackers and create a secure and stable information environment. Second, strengthening data fluidity and openness is conducive to creating more valuable data. Third, when embracing new information technologies, such as cloud storage, universities may consider the possibility of data breaches resulting from different service types, thereby weighing the advantages and disadvantages. Fourth, strengthening the intensity of data security training and improving the data security awareness of relevant personnel can help prevent problems and information breaches caused by human errors before they occur.

Limitations and future research. Certain limitations and future research directions are summarized as follows. First, the data were confined to universities in China and had a relatively short time series. A dynamic panel integrating data analysis along the time dimension for institutions of higher education in different countries could be empirically created and analyzed in future research. Second, the scale of a data breach was not incorporated in this analysis for an assessment of the risk impact, as the
measurements for the numbers and types of leaked data were not accessible from the Education Industry Vulnerability Reporting Platform. Providing risk quantification of data breach incidents could be an important future research direction. Third, higher education institutions invest heavily in IT (Nash, 2007), which plays a key role in data-security management. Thus, the effects of information security investments and new IT utilization, such as biometric identification technologies, could be quantitatively valued in future research.

Data availability
The datasets of data breach incidents and disclosed vulnerabilities analyzed during the current study are from the Education Industry Vulnerability Reporting Platform, available at https://src.sjtu.edu.cn/. The data analyzed during this study are included in the supplementary information files. The remainder of the datasets are available from the corresponding author upon reasonable request.

Received: 11 October 2022; Accepted: 17 May 2023;

Notes
1 Statista provides the annual number of data compromises and individuals impacted in the United States from 2005 to 2022. See https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed (Accessed 10 Feb 2023)
2 Privacy Rights Clearinghouse provides a chronology of data breaches. See https://privacyrights.org/data-breaches (Accessed 29 March 2023)
3 The details of the Arden University data breach are provided by the Group Action Lawyers. See https://www.groupactionlawyers.co.uk/blog/arden-university-data-breach-group-compensation-action (Accessed 7 Feb 2023)
4 Information about the Education Industry Vulnerability Reporting Platform can be found on https://src.sjtu.edu.cn/ (Accessed 7 Feb 2023)
5 The details of the context-aware security defined by Gartner can be found on http://www.Gartner.com/IT-glossary/context-aware-security (Accessed 7 Feb 2023)
6 The “Project 985” is a construction project to build a number of world-class universities and a number of internationally renowned high-level research universities. Normally, they are regarded as the top universities in China. The “Project 211” focuses on the construction of approximately 100 higher education institutions and several key disciplines. Note that the 985 project universities are included in the list of the 211 project universities.
7 First-tier cities include Beijing, Shanghai, Guangzhou and Shenzhen. New first-tier cities include Chengdu, Chongqing, Hangzhou, Wuhan, Xi’an, Tianjin, Suzhou, Nanjing, Zhengzhou, Changsha, Dongguan, Shenyang, Qingdao, Hefei and Foshan, according to the list of new first-tier cities selected by YiMagazine in 2020.
8 Alibaba Cloud provides the definitions of private and public clouds. See https://www.alibabacloud.com/zh/knowledge/what-is-private-cloud and https://www.alibabacloud.com/zh/knowledge/what-is-public-cloud (Accessed 7 Feb 2023)

References
Ali SEA, Lai F-W, Aman A et al. (2022) Do information security breach and its factors have a long-run competitive effect on breached firms’ equity risk? J Competitiveness 14(1):23–42. https://doi.org/10.7441/joc.2022.01.02
Ali SEA, Lai F-W, Hassan R (2020) Socio-economic factors on sector-wide systematic risk of information security breaches: conceptual framework. In: Proceedings of the international economics and business management conference, Melaka, Malaysia, 2019. https://doi.org/10.15405/epsbs.2020.12.05.54
Angst C, Block E, D’Arcy J et al. (2017) When do IT security investments matter? Accounting for the influence of institutional factors in the context of healthcare data breaches. MIS Q 41(3):893–916. https://doi.org/10.25300/MISQ/2017/41.3.10
Aroian LA (1947) The probability function of the product of two normally distributed variables. Ann Math Stat 18(2):265–271. https://doi.org/10.1214/aoms/1177730442
Bachura E, Valecha R, Chen R et al. (2022) The OPM data breach: an investigation of shared emotional reactions on Twitter. MIS Q 46(2):881–910. https://doi.org/10.25300/MISQ/2022/15596
Ban Y, Liu M, Wu P et al. (2022) Depth estimation method for monocular camera defocus images in microscopic scenes. Electronics 11(13):2012. https://doi.org/10.3390/electronics11132012
Bandara E, Liang X, Foytik P et al. (2021) A blockchain and self-sovereign identity empowered digital identity platform. In: Proceedings of the 2021 international conference on computer communications and networks, Athens, Greece, 2021. https://doi.org/10.1109/ICCCN52240.2021.9522184
Bauer M, Erixon F, Krol M et al. (2013) The economic importance of getting data protection right: protecting privacy, transmitting data, moving commerce. European Centre for International Political Economy, Brussels. https://www.uschamber.com/sites/default/files/documents/files/020508_EconomicImportance_Final_Revised_lr.pdf. Accessed 25 May 2023
Bloom N, Propper C, Seiler S et al. (2015) The impact of competition on management quality: evidence from public hospitals. Rev Econ Stud 82:457–489. https://doi.org/10.1093/restud/rdu045
Bongiovanni I (2019) The least secure places in the universe? A systematic literature review on information security management in higher education. Comput Secur 86:350–357. https://doi.org/10.1016/j.cose.2019.07.003
Borgman CL (2018) Open data, grey data, and stewardship: universities at the privacy frontier. Berkeley Technol Law J 33(2):365–412. https://doi.org/10.15779/Z38B56D489
Browne HK, Arbaugh W, McHugh J et al. (2001) A trend analysis of exploitations. In: Proceedings 2001 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 2001. https://doi.org/10.1109/SECPRI.2001.924300
Burns AJ, Roberts TL, Posey C et al. (2022) Going beyond deterrence: a middle-range theory of motives and controls for insider computer abuse. Inf Syst Res. https://doi.org/10.1287/isre.2022.1133
Chabrow E (2015) China blamed for Penn State breach. Data Breach Today. http://www.databreachtoday.com/china-blamed-for-penn-state-breach-a-8230. Accessed 27 Jul 2022
Chander A, Lê UP (2014) Breaking the web: data localization vs. the global internet. SSRN Electronic Journal. https://doi.org/10.2139/SSRN.2407858
Chander A, Lê UP (2015) Data nationalism. Emory Law J 64(3):677–739
Chapman J (2019) How safe is your data? Cyber-security in higher education. HEPI Policy Note. https://www.hepi.ac.uk/2019/04/04/how-safe-is-your-data-cyber-security-in-higher-education/. Accessed 20 Jun 2022
China Academy of Information and Communications Technology (CAICT) (2021) White paper on global digital governance. CAICT, Beijing
Cohen LE, Felson M (1979) Social change and crime rate trends: a routine activity approach. Am Sociol Rev 44(4):588–608. https://doi.org/10.2307/2094589
Coleman L, Purcell BM (2015) Data breaches in higher education. J Bus Cases Appl 15:1–7. https://www.aabri.com/manuscripts/162377.pdf
Coyle D, Nguyen D (2019) Cloud computing, cross-border data flows and new challenges for measurement in economics. Natl Inst Econ Rev 249:30–38. https://doi.org/10.1177/002795011924900112
D’Arcy J, Adjerid I, Angst CM et al. (2020) Too good to be true: firm social performance and the risk of data breach. Inf Syst Res 31(4):1200–1223. https://doi.org/10.1287/isre.2020.0939
Dolezel D, McLeod A (2019) Managing security risk: modeling the root causes of data breaches. Health Care Manag 38(4):322–330. https://doi.org/10.1097/HCM.0000000000000282
FireEye (2016) Cyber threats to the education industry. FireEye, California. https://www.fireeye.com/content/dam/fireeye-www/current-threats/pdfs/ib-education.pdf. Accessed 25 Jul 2022
Foerderer J, Schuetz SW (2022) Data breach announcements and stock market reactions: a matter of timing. Manage Sci 68(10):7298–7322. https://doi.org/10.1287/mnsc.2021.4264
Fried L (1993) Distributed information security: responsibility assignments and costs. Inf Syst Manag 10(3):56–65. https://doi.org/10.1080/10580539308906944
Fried L (1994) Information security and new technology potential threats and solutions. Inf Syst Manag 11(3):57–63. https://doi.org/10.1080/07399019408964654
Gabriel MH, Noblin A, Rutherford A et al. (2018) Data breach locations, types, and associated characteristics among US hospitals. Am J Manag Care 24(2):78–84
Goode S, Hoehle H, Venkatesh V et al. (2017) User compensation as a data breach recovery action: an investigation of the Sony PlayStation network breach. MIS Q 41(3):703–727. https://doi.org/10.25300/MISQ/2017/41.3.03
Goodman LA (1960) On the exact variance of products. J Am Stat Assoc 55(292):708–713. https://doi.org/10.1080/01621459.1960.10483369
Gwebu KL, Wang J, Hu MY (2020) Information security policy noncompliance: an integrative social influence model. Info Systems J 30(2):220–269. https://doi.org/10.1111/isj.12257
Haislip J, Lim JH, Pinsker R (2021) The impact of executives’ IT expertise on reported data security breaches. Inf Syst Res 32(2):318–334. https://doi.org/10.1287/isre.2020.0986
Hannon L (2002) Criminal opportunity theory and the relationship between poverty and property crime. Sociological Spectrum 22(3):363–381. https://doi.org/10.1080/02732170290062676
Henriquez M (2021) The top data breaches of 2021. Security Magazine. https://www.securitymagazine.com/articles/96667-the-top-data-breaches-of-2021. Accessed 15 May 2022
Hensher DA, Stopher PR (1979) Behavioural travel modelling. Routledge. https://doi.org/10.4324/9781003156055
Hina S, Dominic PDD (2020) Information security policies’ compliance: a perspective for higher education institutions. J Comput Inform Syst 60(3):201–211. https://doi.org/10.1080/08874417.2018.1432996
Hoehle H, Venkatesh V, Brown SA et al. (2022) Impact of customer compensation strategies on outcomes and the mediating role of justice perceptions: a longitudinal study of target’s data breach. MIS Q 46(1):299–340. https://doi.org/10.25300/MISQ/2022/14740
IBM (2021) Cost of a data breach report 2021. IBM, Armonk
IBM (2022) Cost of a data breach report 2022. IBM, Armonk
Ifinedo P (2016) Critical times for organizations: what should be done to curb workers’ noncompliance with IS security policy guidelines? Inf Syst Manag 33(1):30–41. https://doi.org/10.1080/10580530.2015.1117868
Iriqat YM, Ahlan AR, Abdul Molok NN et al. (2019) Exploring staff perception of infosec policy compliance: Palestine Universities empirical study. In: Proceedings of 2019 first international conference of intelligent computing and engineering, Hadhramout, Yemen, 2019. https://doi.org/10.1109/ICOICE48418.2019.9035133
Joint Information Systems Committee (2018) Digital experience insights survey 2018: findings from students in UK further and higher education. Joint Information Systems Committee, UK. https://digitalinsights.jisc.ac.uk/news/digital-experience-insights-survey-2018-findings-students-uk-further-and-higher-education/. Accessed 6 Feb 2023
Kim SH, Kwon J (2019) How do EHRs and a meaningful use initiative affect breaches of patient information? Inf Syst Res 30(4):1184–1202. https://doi.org/10.1287/isre.2019.0858
Lee D, Hess DJ (2022) Public concerns and connected and automated vehicles: safety, privacy, and data security. Humanit Soc Sci Commun 9:90. https://doi.org/10.1057/s41599-022-01110-x
Levina N, Ross J (2003) From the vendor’s perspective: exploring the value proposition in information technology outsourcing. MIS Q 27(3):331–364. https://doi.org/10.2307/30036537
Li C, Li LY (2017) Optimal scheduling across public and private clouds in complex hybrid cloud environment. Inf Syst Front 19(1):1–12. https://doi.org/10.1007/s10796-015-9581-2
Li H, Yoo S, Kettinger WJ (2021) The roles of IT strategies and security investments in reducing organizational security breaches. J Manag Inf Syst 38(1):222–245. https://doi.org/10.1080/07421222.2021.1870390
Li J, Dong W, Zhang C et al. (2022) Development of a risk index for cross-border data movement. Data Sci Manag 5(3):97–104. https://doi.org/10.1016/j.dsm.2022.05.003
Li T, Li Y, Hoque MA et al. (2022) To what extent we repeat ourselves? Discovering daily activity patterns across mobile app usage. IEEE Trans Mobile Comput 21(4):1492–1507. https://doi.org/10.1109/TMC.2020.3021987
Li W, Leung ACM, Yue WT (2022) Where is IT in information security? The interrelationship among IT investment, security awareness, and data breaches. MIS Q 47(1):317–342. https://doi.org/10.25300/MISQ/2022/15713
Liao R, Balasinorwala S, Raghav Rao H (2017) Computer assisted frauds: an examination of offender and offense characteristics in relation to arrests. Inf Syst Front 19(3):443–455. https://doi.org/10.1007/s10796-017-9752-4
Linton JD, Tierney R, Walsh ST (2011) Publish or perish: how are research and reputation related? Serials Rev 37(4):244–257. https://doi.org/10.1016/j.serrev.2011.09.001
Liu CW, Huang P, Lucas HC (2020) Centralized IT decision making and cybersecurity breaches: evidence from U.S. higher education institutions. J Manag Inf Syst 37(3):758–787. https://doi.org/10.1080/07421222.2020.1790190
Lu S, Liu S, Hou P et al. (2023) Soft tissue feature tracking based on deep matching network. Comput Model Eng Sci 136(1):363–379. https://doi.org/10.32604/cmes.2023.025217
Luo R, Li H, Hu Q et al. (2020) Why individual employees commit malicious computer abuses? A routine activity theory approach. J Assoc Inf Syst 21(6). https://doi.org/10.17705/1jais.00646
Magura Z, Zhou TG, Musungwini S (2021) A guiding framework for enhancing database security in state-owned universities in Zimbabwe. Afr J Sci Technol Innov Dev 14(7):1761–1775. https://doi.org/10.1080/20421338.2021.1984010
Meltzer JP (2015) The internet, cross-border data flows and international trade. Asia Pac Policy Stud 2(1):90–102. https://doi.org/10.1002/app5.60
Meng F, Xiao X, Wang J (2022) Rating the crisis of online public opinion using a multi-level index system. Int Arab J Inf Technol 19(4):597–608. https://doi.org/10.34028/iajit/19/4/4
Nash KS (2007) Information technology budgets: which industry spends the most? CIO Digital Magazine. https://www.cio.com/article/274441/budget-information-technology-budgets-which-industry-spends-the-most.html. Accessed 7 Feb 2023
Nikkhah HR, Grover V (2022) An empirical investigation of company response to data breaches. MIS Q 46(4):2163–2196. https://doi.org/10.25300/MISQ/2022/16609
Noghondar ER, Marfurt K, Haemmerli B (2012) The human aspect in data leakage prevention in academia. In: Reimer H, Pohlmann N, Schneider W (eds) ISSE 2012 securing electronic business processes: highlights of the information security solutions Europe 2012 conference, Wiesbaden, 2012. https://doi.org/10.1007/978-3-658-00333-3_14
Nyblom P, Wangen G, Kianpour M et al. (2020) The root causes of compromised accounts at the university. In: Proceedings of the 6th international conference on information systems security and privacy, Valletta, Malta, 2020. https://doi.org/10.5220/0008972305400551
Okibo BW, Ochiche OB (2014) Challenges facing information systems security management in higher learning institutions: a case study of the Catholic University of Eastern Africa-Kenya. Int J Manag Excell 3(1):336–349
Organisation for Economic Co-Operation and Development (OECD) (2019) Enhancing access to and sharing of data: reconciling risks and benefits for data re-use across societies. OECD, Paris. https://doi.org/10.1787/276aaca8-en
Ouf S, Nasr M (2015) Cloud computing: the future of big data management. Int J Cloud Appl Com 5(2):53–61. https://doi.org/10.4018/IJCAC.2015040104
Pang MS, Tanriverdi H (2022) Strategic roles of IT modernization and cloud migration in reducing cybersecurity risks of organizations: the case of U.S. federal government. J Strategic Inf Syst 31(1). https://doi.org/10.1016/j.jsis.2022.101707
Qin X, Liu Z, Liu Y et al. (2022) User OCEAN personality model construction method using a BP neural network. Electronics 11(19):3022. https://doi.org/10.3390/electronics11193022
Rawding M, Sacks S (2020) The balkanization of the cloud is bad for everyone. MIT Technology Review. https://www.technologyreview.com/2020/12/17/1014967/balkanization-cloud-computing-bad-everyone/. Accessed 7 Jun 2022
Roman J (2014) Add Butler University to breach list. Data Breach Today. https://www.databreachtoday.com/add-butler-university-to-breach-list-a-7007. Accessed 27 Jul 2022
Saad AL-Malaise AL-Ghamdi A, Ragab M, Farouk S, Sabir M et al. (2022) Optimized artificial neural network techniques to improve cybersecurity of higher education institution. Comput Mater Contin 72(2):3385–3399. https://doi.org/10.32604/cmc.2022.026477
Say GD, Vasudeva G (2020) Learning from digital failures? The effectiveness of firms’ divestiture and management turnover responses to data breaches. Strategy Sci 5(2):117–142. https://doi.org/10.1287/stsc.2020.0106
Schlackl F, Link N, Hoehle H (2022) Antecedents and consequences of data breaches: A systematic review. Inf Manag 59(4):103638. https://doi.org/10.1016/j.im.2022.103638
Sen R, Borle S (2015) Estimating the contextual risk of data breach: an empirical approach. J Manag Inf Syst 32(2):314–341. https://doi.org/10.1080/07421222.2015.1063315
Shrand B, Ronnie L (2019) Commitment and identification in the Ivory Tower: Academics’ perceptions of organisational support and reputation. Stud High Educ 46:1–15. https://doi.org/10.1080/03075079.2019.1630810
Sobel M (1982) Asymptotic confidence intervals for indirect effects in structural equation models. Sociol Methodol 13:290–312. https://doi.org/10.2307/270723
Tencent (2018) The cloud usage and digital economy development report. Tencent, Shenzhen
Ulven JB, Wangen G (2021) A systematic review of cybersecurity risks in higher education. Future Internet 13(2):39. https://doi.org/10.3390/fi13020039
United Nations Conference on Trade and Development (2021) Digital economy report: cross-border data flows and development: from whom the data flow. United Nations, New York
Verizon (2022) 2022 Verizon data breach investigations report. Verizon, New York
Wang J, Gupta M, Rao R (2015) Insider threats in a financial institution: analysis of attack-proneness of information systems applications. MIS Q 39(1):91–112. https://doi.org/10.25300/MISQ/2015/39.1.05
Wang L (2022) Newbie or experienced: an empirical study on faculty recruitment preferences at top national HEIs in China. Stud High Educ 47(4):783–798. https://doi.org/10.1080/03075079.2020.1804849
Wang Q, Ngai EWT (2022) Firm diversity and data breach risk: a longitudinal study. J Strategic Inf Syst 31(4):101743. https://doi.org/10.1016/j.jsis.2022.101743
Wasserman L, Wasserman Y (2022) Hospital cybersecurity risks and gaps: Review (for the non-cyber professional). Front Digit Health 4:862221. https://doi.org/10.3389/fdgth.2022.862221
Weulen Kranenbarg M, Holt TJ, van der Ham J (2018) Don’t shoot the messenger! A criminological and computer science perspective on coordinated vulnerability disclosure. Crime Science 7(1):1–9. https://doi.org/10.1186/s40163-018-0090-8
Wu TY, Pan JS, Lin CF (2014) Improving accessing efficiency of cloud storage using de-duplication and feedback schemes. IEEE Syst J 8(1):208–218. https://doi.org/10.1109/JSYST.2013.2256715
Ying C (2021) UC data breach leaks students’ personal information to dark web. The Daily Californian. https://www.dailycal.org/2021/04/27/uc-data-breach-leaks-students-personal-information-to-dark-web. Accessed 7 Feb 2023
Zhang Z, Nan G, Tan Y (2020) Cloud services vs. on-premises software: competition under security risk and product customization. Inf Syst Res 31(3):848–864. https://doi.org/10.1287/isre.2019.0919
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1057/s41599-023-01757-0.
Correspondence and requests for materials should be addressed to Chong Zhang.
© The Author(s) 2023