
Letter https://doi.org/10.1038/s41586-018-0632-y

Vowel recognition with four coupled spin-torque nano-oscillators

Miguel Romera1,5, Philippe Talatchian1,5, Sumito Tsunegi2, Flavio Abreu Araujo1,4, Vincent Cros1, Paolo Bortolotti1, Juan Trastoy1, Kay Yakushiji2, Akio Fukushima2, Hitoshi Kubota2, Shinji Yuasa2, Maxence Ernoult1,3, Damir Vodenicarevic3, Tifenn Hirtzlin3, Nicolas Locatelli3, Damien Querlioz3* & Julie Grollier1*

In recent years, artificial neural networks have become the flagship algorithm of artificial intelligence1. In these systems, neuron activation functions are static, and computing is achieved through standard arithmetic operations. By contrast, a prominent branch of neuroinspired computing embraces the dynamical nature of the brain and proposes to endow each component of a neural network with dynamical functionality, such as oscillations, and to rely on emergent physical phenomena, such as synchronization2–6, for solving complex problems with small networks7–11. This approach is especially interesting for hardware implementations, because emerging nanoelectronic devices can provide compact and energy-efficient nonlinear auto-oscillators that mimic the periodic spiking activity of biological neurons12–16. The dynamical couplings between oscillators can then be used to mediate the synaptic communication between the artificial neurons. One challenge for using nanodevices in this way is to achieve learning, which requires fine control and tuning of their coupled oscillations17; the dynamical features of nanodevices can be difficult to control and prone to noise and variability18. Here we show that the outstanding tunability of spintronic nano-oscillators (that is, the possibility of accurately controlling their frequency across a wide range, through electrical current and magnetic field) can be used to address this challenge. We successfully train a hardware network of four spin-torque nano-oscillators to recognize spoken vowels by tuning their frequencies according to an automatic real-time learning rule. We show that the high experimental recognition rates stem from the ability of these oscillators to synchronize. Our results demonstrate that non-trivial pattern classification tasks can be achieved with small hardware neural networks by endowing them with nonlinear dynamical features such as oscillations and synchronization.

Spin-torque nano-oscillators are natural candidates for building hardware neural networks made of coupled nanoscale oscillators8–10,13,15,18,19. These nanoscale magnetic tunnel junctions emit microwave voltages when they are driven by direct-current injection in a regime of sustained magnetization precession through the effect of spin torque. In addition, they have exceptional capacities to synchronize their rhythms to periodic electric and magnetic input signals and to other spin-torque nano-oscillators20–24. This property originates from the high tunability of their frequency, in other words, the large frequency changes induced by applied d.c. currents and magnetic fields. Single spin-torque nano-oscillators can achieve impressive cognitive computations25. However, it has not been shown experimentally that a coupled network of spin-torque nano-oscillators can learn to perform computational tasks through synchronization. Here, we use the ability of spin-torque nano-oscillators to modify their frequency in response to injected direct currents to train in real-time a network of coupled oscillators to categorize different input patterns into different synchronization configurations2,17,18.

We transpose to hardware the neural network illustrated in Fig. 1a17 with the set-up illustrated in Fig. 1b. The four neurons in Fig. 1a are experimentally implemented with four spin-torque nano-oscillators (Fig. 1b), in our case circular magnetic tunnel junctions with 375 nm diameter and an FeB free layer with a vortex as ground state (see Methods)26. The double arrow connections between neurons (blue in Fig. 1a) indicate that the output of neuron i influences the behaviour of neuron j, and vice versa. We implement these symmetric neural interconnections by connecting electrically the four oscillators using millimetre-long wires as schematized in Fig. 1b: in this configuration, the microwave current generated by each oscillator propagates in the electrical microwave loop and in turn influences the dynamics, and in particular the frequency, of the other oscillators through the microwave spin-torques it creates24. The sum of all microwave emissions is detected by a spectrum analyser. Importantly, we can control the frequency of each oscillator by adjusting the direct current flowing through each (see Methods and Extended Data Fig. 1). Here, for computing, we choose direct currents leading to close but not identical frequencies. The light blue curve in Fig. 1c shows a four-peak spectrum typical of this regime of moderate coupling where the dynamics of the oscillators are correlated but do not lead to mutual synchronization.

The inputs to the neural network are encoded in the frequencies fA and fB of two fixed-amplitude microwave signals. Injected in a strip line fabricated above the active magnetic layers, they modify the dynamics of the oscillators through the radiofrequency magnetic fields they generate. Figure 1d shows that when the frequency of one of the microwave sources is swept, each oscillator synchronizes to the source in turn. Indeed, when the frequency of the source gets close to the frequency of one of the oscillators, the strong signal of the source pulls the adaptable frequency of the oscillator towards its own. In the locking range, the frequency of the oscillator becomes equal to the frequency of the source27. The dark blue curve in Fig. 1c shows an example of spectrum measured when the two microwave inputs are injected simultaneously. Two peaks (in red) appear at frequencies fA and fB owing to capacitive coupling with the strip line. In comparison to the spectrum without inputs (light blue curve), the emission peaks of oscillators 1 and 2 are pulled towards fA, whereas oscillator 4 is phase-locked to input B (its emission peak merges with the one of input B at fB). We label this synchronization configuration as (4B).

The possible outputs of the neural network, represented in different colours in Fig. 1e, are the different synchronization configurations that appear for different frequencies of the two input signals, keeping the direct currents through the oscillators fixed. Depending on the frequencies of inputs, zero (grey regions), one, or two oscillators are phase-locked. For example, in the petrol-blue region labelled (2A), oscillator 2 is synchronized to input A. In the white region labelled (1A,3B), oscillators 1 and 3 are synchronized to inputs A and B, respectively.
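
As a rough illustration of how a pair of input frequencies maps onto a label such as (4B) or (1A,3B), the sketch below uses a deliberately simplified threshold rule: an oscillator is taken to be locked when an input lies within a fixed half-width of its free-running frequency. The oscillator frequencies, the half-width `half_range` and the treatment of the "Other" label are all assumptions made for illustration; the sketch ignores mutual coupling and frequency pulling and is not the analysis used in the experiment.

```python
def locked_oscillator(f_in, osc_freqs, half_range):
    """Return the 1-based index of the oscillator locked to f_in, or None.

    Toy rule (an assumption): an oscillator locks when the input frequency
    lies within half_range MHz of its free-running frequency.
    """
    nearest = min(range(len(osc_freqs)), key=lambda i: abs(osc_freqs[i] - f_in))
    if abs(osc_freqs[nearest] - f_in) <= half_range:
        return nearest + 1
    return None


def synchronization_configuration(f_a, f_b, osc_freqs, half_range=2.0):
    """Label the configuration in the notation of Fig. 1e, e.g. '(1A,3B)'."""
    locked_a = locked_oscillator(f_a, osc_freqs, half_range)
    locked_b = locked_oscillator(f_b, osc_freqs, half_range)
    if locked_a is not None and locked_a == locked_b:
        # Assumption: both inputs competing for one oscillator is filed as 'Other'.
        return "Other"
    parts = []
    if locked_a is not None:
        parts.append(f"{locked_a}A")
    if locked_b is not None:
        parts.append(f"{locked_b}B")
    return "(" + ",".join(parts) + ")" if parts else "None"


# Hypothetical free-running frequencies (MHz), chosen only for illustration.
OSC_FREQS = [335.0, 345.0, 355.0, 365.0]
print(synchronization_configuration(334.0, 356.0, OSC_FREQS))  # (1A,3B)
print(synchronization_configuration(340.0, 364.0, OSC_FREQS))  # (4B)
print(synchronization_configuration(340.0, 350.0, OSC_FREQS))  # None
```

Sweeping `f_a` and `f_b` over a grid with this function produces a grid-like map of rectangular regions, which is the qualitative geometry of the experimental map in Fig. 1e.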
1Unité Mixte de Physique, CNRS, Thales, Université Paris-Sud, Université Paris-Saclay, Palaiseau, France. 2National Institute of Advanced Industrial Science and Technology (AIST), Spintronics Research Center, Tsukuba, Ibaraki, Japan. 3Centre de Nanosciences et de Nanotechnologies, CNRS, Université Paris-Sud, Université Paris-Saclay, Orsay, France. 4Present address: Institute of Condensed Matter and Nanosciences, UC Louvain, Louvain-la-Neuve, Belgium. 5These authors contributed equally: Miguel Romera, Philippe Talatchian. *e-mail: [email protected]; [email protected]

230 | NATURE | VOL 563 | 8 NOVEMBER 2018
© 2018 Springer Nature Limited. All rights reserved.

[Figure 1: panels a–f. a, neural network with inputs fA and fB, four neurons and synchronization configurations as outputs; b, four spin-torque nano-oscillators (scale bar 500 nm) with direct currents I1 to I4 and microwave inputs fA and fB; c, spectral power density (μW MHz–1) versus frequency (MHz); d, oscillator frequency (MHz) versus input frequency (MHz); e, external frequency B (MHz) versus external frequency A (MHz), coloured by synchronization configuration, from None and (1A) to (4A,3B) and Other; f, input frequency B (MHz) versus input frequency A (MHz), coloured by vowel (ae, ah, aw, er, ih, iy, uw).]

Fig. 1 | Approach for pattern classification with coupled spin-torque nano-oscillators. a, Schematic of the emulated neural network. b, Schematic of the experimental set-up with four spin-torque nano-oscillators electrically connected in series and coupled through their own emitted microwave currents. Two microwave signals encoding information in their frequencies fA and fB are applied as inputs to the system through a strip line, which translates into two microwave fields. The total microwave output of the oscillator network is recorded with a spectrum analyser. c, Microwave output emitted by the network of four oscillators without (light blue) and with (dark blue) the two microwave signals applied to the system. The two curves have been shifted vertically for clarity. The four peaks in the light blue curve correspond to the emissions of the four oscillators. The two narrow red peaks in the dark blue curve correspond to the external microwave signals with frequencies fA and fB. d, Evolution of the four oscillator frequencies when the frequency of external source A is swept. One after the other, the oscillators phase-lock to the external input when the frequency of the source approaches their natural frequency. In the locking range, the oscillator frequency is equal to the input frequency. e, Experimental synchronization map as a function of the frequencies of the external signals fA and fB. Each colour corresponds to a different synchronization state. f, Inputs applied to the system, represented in the (fA, fB) plane. Each colour corresponds to a different spoken vowel, and each data point corresponds to a different speaker.

We now describe how this neural network can recognize patterns by classifying spoken vowels, which are naturally characterized by frequencies called formants28. We use as input data a subset of the Hillenbrand database (available at https://homepages.wmich.edu/~hillenbr/voweldata.html; see Supplementary Information) comprising seven vowels pronounced by 37 different female speakers, where each vowel is characterized by 12 different frequencies. Formant frequencies are typically in the range between 500 Hz and 3,500 Hz, so a transformation is needed to obtain input frequencies (fA, fB) in the range of operation of our oscillators, between 325 MHz and 380 MHz. As detailed in Methods, we obtain fA and fB through two different linear combinations of the 12 formant frequencies that fit the grid-like geometry of the oscillator synchronization maps. In the resulting map shown in Fig. 1f, each point corresponds to one speaker. The spread in frequency for each vowel indicates that each speaker has a different pronunciation. Our goal is to recognize the vowel presented as input to the oscillator network independently of the speaker. For this purpose, the scattered points corresponding to each vowel pronounced by different speakers should all be contained inside a different region of the oscillator synchronization map in Fig. 1e.

As can be seen from Fig. 2a, in which the input vowel map and the oscillator synchronization map are superposed, initially they do not coincide: the initial oscillator frequencies have been set randomly and are not adequate to solve the problem. The oscillatory neural network must learn to perform the classification properly. During this training stage, the internal parameters of the network need to be finely tuned until each synchronization region encompasses the cloud of points corresponding to the vowel that it has been assigned. For this purpose, we take advantage of the highly tunable nature of spin-torque nano-oscillators to modify the synchronization map by tuning the direct current through each oscillator, adapting a training algorithm first proposed in ref. 17. We have developed an automatic real-time learning procedure involving a feedback loop between the experimental setup and the computer that controls it (see Methods). At each training step, we consecutively apply seven inputs (fA, fB) to the oscillators, one for each vowel, randomly picked between the different speakers. The oscillator emissions corresponding to each of the seven input microwave signals are recorded with a spectrum analyser. A computer identifies the corresponding synchronization states (see Methods). If all the seven vowels have been correctly classified in their assigned synchronization regions of the map (fA, fB), the direct currents are not changed. If one or several vowels have not been correctly classified, direct currents in the oscillators are modified to bring the assigned synchronization regions closer to the corresponding input frequency pairs (fA, fB) and thus reduce the classification error (see Methods). In the next learning step, another set of seven vowels is applied, and so on.

Figure 2 shows synchronization maps obtained at different stages of the training process (Fig. 2a–d), together with the evolution of the direct currents applied to the oscillators (Fig. 2e), their frequencies (Fig. 2f) and the average recognition rates for the seven vowels (Fig. 2g) (for a short video (20 s), see Supplementary Information or https://youtu.be/bbRqqcxc-po; for a longer video (3 min 30 s), see https://youtu.be/IHYnh0oJgOA). After 48 training steps, an optimum is found, direct currents and frequencies stop evolving, and the recognition rates stop increasing, signifying that the training process can be stopped. During training, we do not use all the vowels in the database. We always retain 20% of the vowels to test the ability of the system to recognize unknown data. The final recognition rates on the training and testing datasets reach values up to 89% and 88%, respectively (Fig. 2g).

We now interpret these experimental recognition rates by comparing them to the performances that can be achieved with ideal oscillators trained on the same task with the same learning process. For this purpose, we model the oscillator dynamics with coupled van der Pol equations accounting for their collective magnetization coordinates (see Supplementary Information)20. The simulated oscillators are noiseless and differ only by a 2% mismatch in their natural frequencies, analogous to the one observed experimentally. We first vary their ability to synchronize by modifying their frequency tunability (see Supplementary Information). Black circles in Fig. 3a show the recognition rate of the ideal simulated network as a function of the


[Figure 2: panels a–g. a–d, synchronization maps at training steps 0, 7, 15 and 86, input frequency B (MHz) versus input frequency A (MHz), with vowels (ae, ah, aw, er, ih, iy, uw) and synchronization configurations colour coded; e, direct current (mA) through oscillators 1 to 4 versus number of training steps; f, oscillator frequency (MHz) versus number of training steps; g, recognition rate (%) on the training and testing sets versus number of training steps.]

Fig. 2 | Learning to classify patterns by tuning the frequencies of oscillators. a–d, Experimental synchronization map as a function of the frequencies of the external signals, at different steps of the training procedure: a, step 0; b, step 7; c, step 15; and d, step 86. The coloured dots represent the inputs applied to the oscillatory network: vowels pronounced by different speakers. Different vowels are shown in different colours. A video is provided as Supplementary Information. e, Direct current applied through each oscillator as a function of the number of training steps. f, Frequency of each oscillator as a function of the number of training steps. g, Recognition rates obtained with the sets of data points used for training and for testing, as a function of the number of training steps.
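
The feedback loop illustrated in Fig. 2 can be caricatured in a few lines. The sketch below is a minimal stand-in rather than the learning rule of the experiment, which is specified in Methods: it acts directly on hypothetical oscillator frequencies instead of direct currents, decides locking with a fixed threshold `half_range`, and, whenever an input pair is misclassified, moves the two target oscillators a fraction `rate` of the way towards the inputs they should lock to. For brevity it also reuses one batch at every step, whereas the experiment draws a new set of seven vowels each time. All names and numerical values are illustrative assumptions.

```python
def locked_index(f_in, osc_freqs, half_range):
    """0-based index of the oscillator locked to f_in, or None (toy threshold rule)."""
    nearest = min(range(len(osc_freqs)), key=lambda i: abs(osc_freqs[i] - f_in))
    return nearest if abs(osc_freqs[nearest] - f_in) <= half_range else None


def training_step(osc_freqs, batch, half_range=2.0, rate=0.2):
    """Apply one learning step and return (new_freqs, n_errors).

    batch: list of ((f_a, f_b), (target_a, target_b)); the targets are the
    oscillators that should lock to inputs A and B for that vowel.
    """
    new_freqs = list(osc_freqs)
    n_errors = 0
    for (f_a, f_b), (target_a, target_b) in batch:
        correct = (locked_index(f_a, osc_freqs, half_range) == target_a
                   and locked_index(f_b, osc_freqs, half_range) == target_b)
        if correct:
            continue  # nothing is changed for a correctly classified input
        n_errors += 1
        # Move the assigned synchronization region towards the input pair.
        new_freqs[target_a] += rate * (f_a - osc_freqs[target_a])
        new_freqs[target_b] += rate * (f_b - osc_freqs[target_b])
    return new_freqs, n_errors


def train(osc_freqs, batch, max_steps=100):
    """Repeat training steps until the whole batch is classified correctly."""
    for step in range(max_steps):
        osc_freqs, n_errors = training_step(osc_freqs, batch)
        if n_errors == 0:
            return osc_freqs, step
    return osc_freqs, max_steps


# One hypothetical input pair that should end up in region (1A,3B).
freqs, steps = train([335.0, 345.0, 355.0, 365.0], [((340.0, 360.0), (0, 2))])
print(steps, [round(f, 2) for f in freqs])
```

In this toy version the frequencies stop changing as soon as no error remains, which mirrors the plateau of currents and frequencies described for Fig. 2e, f.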

average locking range of the oscillators normalized by their frequency difference. The recognition rate increases linearly with the oscillator locking ranges (see dotted blue linear fit in Fig. 3a). Indeed, as shown in the simulated maps of Fig. 3b, when the oscillator locking ranges increase, the regions of synchronization grow, thus encompassing and classifying an increasing number of points in each of the different vowel clouds. As shown in Fig. 3c, d, the mutual coupling between oscillators also enhances their locking ranges27, leading to increased recognition rates when the mutual interactions increase. The red star in Fig. 3a pinpoints where the experimental result features in this graph. The experimental vowel recognition rate of 89% is close to the maximum recognition rate of 94% that can be achieved with the same neural network composed of ideal, noiseless oscillators. This high performance is due to the large experimental locking ranges resulting from the high tunability, coupling and low noise of the hardware spin-torque nano-oscillators.

We then compare the dynamical oscillator-based neural network studied in this paper to more conventional forms of neural networks. For this purpose, we first extract a reference value for the experimental recognition rate by repeating the training procedure experimentally several times with different combinations of training and testing sets (see Methods). This cross-validation technique yields an average value of 84.3% for the experimental recognition rate on the testing set that we can compare to other neural networks performances. First, we

[Figure 3: panels a–d. a, recognition rate (%) versus LR/FD (simulations, fit and experiments); b, simulated synchronization maps for LR/FD ≈ 0.41, 0.76 and 0.94; c, recognition rate (%) versus normalized coupling ε (simulations and fit); d, simulated synchronization maps for ε ≈ 0, 0.68 and 2.08. The maps show input frequency B (MHz) versus input frequency A (MHz).]

Fig. 3 | Comparing the recognition rates of experimental and ideal oscillators. Simulations of vowel recognition with a network of four identical oscillators trained with the same procedure as in the experiments are illustrated, in the absence of noise. The simulated oscillators differ only by a 2% mismatch in their natural frequencies. a, Recognition rate on the training set (black circles) as a function of the average oscillator locking range normalized by the frequency difference between oscillators (LR/FD). The locking range is varied by modifying the tunability of the oscillator frequency. The blue dotted line is a linear fit to the simulation results. The red star indicates where experimental oscillators feature in this graph. b, Synchronization maps simulated with the network of oscillators used in a, for three different values of the normalized locking range. c, Recognition rate on the training set (black circles) as a function of the mutual coupling between oscillators normalized by their coupling to the microwave inputs. The blue dotted line is a linear fit to the simulation results. d, Synchronization maps simulated with the network of oscillators used in c, for three different values of the normalized coupling ε.
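
To make the notion of a locking range concrete, the sketch below integrates the Adler phase equation for a single oscillator driven by one periodic input. This is a swapped-in, much simpler model than the coupled van der Pol equations used for the simulations of Fig. 3, and it is offered only as an illustration: the function names, the forward-Euler scheme, the units (MHz and microseconds) and the numerical values are assumptions. In this model the locking range is twice `half_range`; inside it the time-averaged oscillator frequency equals the input frequency, and just outside it the oscillator frequency is pulled towards the input.

```python
import math


def mean_frequency(f_free, f_in, half_range, t_end=20.0, dt=5e-4):
    """Time-averaged oscillator frequency (MHz) under a periodic input.

    Integrates the Adler phase equation with forward Euler:
        d(phi)/dt = 2*pi*(f_free - f_in) - 2*pi*half_range*sin(phi)
    where phi is the oscillator phase relative to the input. Frequencies are
    in MHz and times in microseconds. The first half of the run is discarded
    as a transient.
    """
    n_steps = int(round(t_end / dt))
    phi = 0.0
    phi_mid = 0.0
    for step in range(n_steps):
        if step == n_steps // 2:
            phi_mid = phi  # phase at the start of the averaging window
        phi += dt * 2.0 * math.pi * ((f_free - f_in) - half_range * math.sin(phi))
    elapsed = (n_steps - n_steps // 2) * dt
    return f_in + (phi - phi_mid) / (2.0 * math.pi * elapsed)


def sweep(f_free, half_range, f_inputs):
    """Oscillator frequency for each input frequency, as in a source sweep."""
    return [(f_in, mean_frequency(f_free, f_in, half_range)) for f_in in f_inputs]


# Hypothetical oscillator at 350 MHz with a 4 MHz locking range (half_range = 2 MHz).
for f_in, f_osc in sweep(350.0, 2.0, [344.0, 349.0, 350.0, 351.0, 356.0]):
    print(f"input {f_in:.1f} MHz -> oscillator {f_osc:.2f} MHz")
```

With several such oscillators, dividing the locking range by the spacing between neighbouring free-running frequencies gives a quantity analogous to the LR/FD axis of Fig. 3a.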


[Figure 4: panels a–c. a, multilayer perceptron (12 inputs + 1 bias, N hidden neurons, 7 output neurons, softmax); b, recognition rate after cross-validation (%) versus number of trained parameters for the experimental oscillator network, the ideal oscillator network and the multilayer perceptron; c, oscillator network (12 inputs + 1 bias, inputs fA and fB, 4 oscillators with frequencies f1 to f4, output: synchronization configuration).]

Fig. 4 | Benchmarking performances with classical neural networks. a, Flow chart of the simulated multilayer perceptron. The trained parameters are indicated in red. b, Recognition rate obtained through cross-validation versus the total number of trained parameters for the neural network in a, in which the number of hidden neurons is varied. The red star corresponds to the experimental results with the network of spin-torque nano-oscillators. Exp., experimental. c, Flow chart of the experimental oscillatory neural network. The trained parameters are indicated in red.
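
The parameter counts compared in Fig. 4b follow from simple bookkeeping, sketched below. The oscillator-network count (two linear combinations of 12 formants plus a bias, and four direct currents) is taken from the main text. For the multilayer perceptron, the sketch assumes that each of the seven output neurons also has a bias; this assumption is not stated explicitly here, but it reproduces the 27 trained parameters quoted in the main text for a single hidden neuron. The function names are hypothetical.

```python
def mlp_parameters(n_hidden, n_inputs=12, n_outputs=7):
    """Trained parameters of a one-hidden-layer perceptron.

    Each hidden neuron has n_inputs weights plus a bias; each output neuron
    is assumed to have n_hidden weights plus a bias.
    """
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def oscillator_network_parameters(n_inputs=12, n_oscillators=4):
    """Trained parameters of the oscillator network of Fig. 4c.

    Two linear combinations (for fA and fB) of n_inputs formants plus a bias,
    and one direct current per oscillator.
    """
    return 2 * (n_inputs + 1) + n_oscillators


print(mlp_parameters(1))                # 27
print(oscillator_network_parameters())  # 30
print([mlp_parameters(n) for n in range(1, 6)])  # [27, 47, 67, 87, 107]
```

Under this assumption the perceptron gains 20 parameters per additional hidden neuron, so its smallest configuration has roughly as many trained parameters as the whole oscillator network.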

consider a conventional, static, multi-layer neural network. This kind of network can achieve better-than-human recognition rates at complex tasks, such as image classification. This performance, however, comes at the expense of the large number of parameters that need to be trained, a major hurdle for hardware implementation. Figure 4b shows the recognition rate of a multilayer perceptron, trained in software through backpropagation on the same database as the experimental neural network, with 30,000 vowel presentations (see Methods). As illustrated in Fig. 4a, this network, composed of static neurons, takes as inputs the 12 formant frequencies characterizing each pronounced vowel. The hidden layer neurons receive a weighted sum of these inputs (plus a bias term). The output layer, with softmax activation functions, has seven neurons, one for each vowel class (see Methods). As can be seen in Fig. 4b, the recognition rate is excellent, reaching 97% when the number of trained parameters is large (synaptic weights illustrated in red in Fig. 4a). However, the performance rapidly degrades for small numbers of trained parameters, diving below 65% for 27 trained parameters. This result is quite general: as can be seen from Extended Data Fig. 2, state-of-the-art networks with feedback such as standard recurrent neural networks or long short-term memory networks have limited performance when the number of trained parameters is small. In contrast, the recognition rate of our experimental oscillatory neural network is over 84% for only 30 trained parameters: as illustrated in red in Fig. 4c, the 26 weights converting formants to inputs, and the currents through the oscillators. For an ideal, noiseless, oscillatory network, the success rate reaches 89% after cross-validation. The network also learns rapidly (350 vowel presentations are used). This high performance with a small number of trained parameters comes from the combination of two phenomena: as shown in Fig. 3c, the oscillatory network can do better than the sum of its individual components, owing to its complex, coupled, dynamical features, and in addition, the oscillators collectively contribute to pattern recognition by synchronizing to the inputs. This result shows that the performance of hardware neural networks can be boosted by enhancing neuron functionalities beyond simple nonlinear activation functions, through oscillations and synchronization.

In the future, such dynamical neural networks will have to be scaled up to solve challenging classification problems on software-benchmarked databases. Spin-torque nano-oscillators offer numerous advantages towards this goal. Their energy consumption is comparable to or lower than complementary metal–oxide–semiconductor (CMOS) oscillators, and contrary to the latter, their lateral dimensions can be scaled down to a few nanometres in diameter (a detailed comparison is presented in Extended Data Table 2). Their quality factor can exceed several thousands26, and their natural frequency can be controlled by the aspect ratio of the magnetic dot from hundreds of megahertz to several gigahertz in small pillars, opening the path to nano-oscillators assemblies with a wide range of natural frequencies19. In addition, their simple structure is similar to spin-torque magnetic random access memory cells, which means that they can be produced by billions on top of CMOS. Finally, their synchronization can be detected with CMOS circuits that count the number of oscillations29 or measure the additional d.c. voltages produced by the oscillators when they phase-lock (see Methods and Extended Data Fig. 3)30. Therefore, the wide variety of possible magnetic and electric couplings offered by spintronics21–24, and the different ways of driving and controlling magnetization dynamics (spin torques, spin–orbit torques, electric fields), could be exploited in the future to implement large-scale hardware neural networks15.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, statements of data availability and associated accession codes are available at https://doi.org/10.1038/s41586-018-0632-y.

Received: 24 November 2017; Accepted: 31 July 2018; Published online 29 October 2018.

1. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).
2. Borisyuk, R., Denham, M., Hoppensteadt, F., Kazanovich, Y. & Vinogradova, O. An oscillatory neural network model of sparse distributed memory and novelty detection. Biosystems 58, 265–272 (2000).
3. Jaeger, H. & Haas, H. Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304, 78–80 (2004).
4. Rabinovich, M., Huerta, R. & Laurent, G. Transient dynamics for neural processing. Science 321, 48–50 (2008).
5. Sussillo, D. Neural circuits as computational dynamical systems. Curr. Opin. Neurobiol. 25, 156–163 (2014).
6. Pikovsky, A. & Rosenblum, M. Dynamics of globally coupled oscillators: progress and perspectives. Chaos 25, 097616 (2015).
7. Kumar, S., Strachan, J. P. & Williams, R. S. Chaotic dynamics in nanoscale NbO2 Mott memristors for analogue computing. Nature 548, 318–321 (2017).
8. Csaba, G. & Porod, W. Computational study of spin-torque oscillator interactions for non-Boolean computing applications. IEEE Trans. Magn. 49, 4447–4451 (2013).
9. Yogendra, K., Fan, D., Jung, B. & Roy, K. Magnetic pattern recognition using injection-locked spin-torque nano-oscillators. IEEE Trans. Electron Dev. 63, 1674–1680 (2016).
10. Macià, F., Kent, A. D. & Hoppensteadt, F. C. Spin-wave interference patterns created by spin-torque nano-oscillators for memory and computation. Nanotechnology 22, 095301 (2011).
11. Fang, Y., Yashin, V. V., Levitan, S. P. & Balazs, A. C. Pattern recognition with “materials that compute”. Sci. Adv. 2, e1601114 (2016).
12. Pickett, M. D., Medeiros-Ribeiro, G. & Williams, R. S. A scalable neuristor built with Mott memristors. Nat. Mater. 12, 114–117 (2013).
13. Pufall, M. R. et al. Physical implementation of coherently coupled oscillator networks. IEEE J. Explor. Solid-State Comput. Devices Circuits 1, 76–84 (2015).
14. Sharma, A. A., Bain, J. A. & Weldon, J. A. Phase coupling and control of oxide-based oscillators for neuromorphic computing. IEEE J. Explor. Solid-State Comput. Devices Circuits 1, 58–66 (2015).
15. Grollier, J., Querlioz, D. & Stiles, M. D. Spintronic nanodevices for bioinspired computing. Proc. IEEE 104, 2024–2039 (2016).
16. Parihar, A., Shukla, N., Jerry, M., Datta, S. & Raychowdhury, A. Computational paradigms using oscillatory networks based on state-transition devices. In 2017 International Joint Conference on Neural Networks (IJCNN) 3415–3422 (IEEE, 2017).
17. Vassilieva, E., Pinto, G., de Barros, J. A. & Suppes, P. Learning pattern recognition through quasi-synchronization of phase oscillators. IEEE Trans. Neural Netw. 22, 84–95 (2011).


18. Vodenicarevic, D., Locatelli, N., Araujo, F. A., Grollier, J. & Querlioz, D. A nanotechnology-ready computing scheme based on a weakly coupled oscillator network. Sci. Rep. 7, 44772 (2017).
19. Locatelli, N., Cros, V. & Grollier, J. Spin-torque building blocks. Nat. Mater. 13, 11–20 (2014).
20. Slavin, A. & Tiberkevich, V. Nonlinear auto-oscillator theory of microwave generation by spin-polarized current. IEEE Trans. Magn. 45, 1875–1918 (2009).
21. Kaka, S. et al. Mutual phase-locking of microwave spin torque nano-oscillators. Nature 437, 389–392 (2005).
22. Mancoff, F. B., Rizzo, N. D., Engel, B. N. & Tehrani, S. Phase-locking in double-point-contact spin-transfer devices. Nature 437, 393–395 (2005).
23. Houshang, A. et al. Spin-wave-beam driven synchronization of nanocontact spin-torque oscillators. Nat. Nanotech. 11, 280–286 (2016).
24. Lebrun, R. et al. Mutual synchronization of spin torque nano-oscillators through a long-range and tunable electrical coupling scheme. Nat. Commun. 8, 15825 (2017).
25. Torrejon, J. et al. Neuromorphic computing with nanoscale spintronic oscillators. Nature 547, 428–431 (2017).
26. Tsunegi, S. et al. High emission power and Q factor in spin torque vortex oscillator consisting of FeB free layer. Appl. Phys. Express 7, 063009 (2014).
27. Romera, M. et al. Enhancing the injection locking range of spin torque oscillators through mutual coupling. Appl. Phys. Lett. 109, 252404 (2016).
28. Hillenbrand, J., Getty, L. A., Wheeler, K. & Clark, M. J. Acoustic characteristics of American English vowels. J. Acoust. Soc. Am. 97, 3099–3111 (1994).
29. Vodenicarevic, D., Locatelli, N., Grollier, J. & Querlioz, D. Synchronization detection in networks of coupled oscillators for pattern recognition. In 2016 International Joint Conference on Neural Networks (IJCNN) 2015–2022 (IEEE, 2016).
30. Fang, B. et al. Giant spin-torque diode sensitivity in the absence of bias magnetic field. Nat. Commun. 7, 11259 (2016).

Acknowledgements This work was supported by the European Research Council ERC under grant bioSPINspired 682955, the French National Research Agency (ANR) under grant MEMOS ANR-14-CE26-0021, and a public grant overseen by the ANR as part of the ‘Investissements d’Avenir’ programme (Labex NanoSaclay, reference ANR-10-LABX-0035).

Reviewer information Nature thanks F. Hoppensteadt, A. Kent and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author contributions The study was designed by J.G. and D.Q. Samples were optimized and fabricated by S.T. and K.Y. The main experiments were performed by M.R. and P.T. Spin diode experiments were performed by P.T. and J.T. Numerical simulations were realized by P.T., M.E., M.R., T.H. and D.V. All authors contributed to analysing the results and writing the paper.

Competing interests The authors declare no competing interests.

Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41586-018-0632-y.
Supplementary information is available for this paper at https://doi.org/10.1038/s41586-018-0632-y.
Reprints and permissions information is available at http://www.nature.com/reprints.
Correspondence and requests for materials should be addressed to D.Q. or J.G.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

234 | NATURE | VOL 563 | 8 NOVEMBER 2018
© 2018 Springer Nature Limited. All rights reserved.
Letter RESEARCH

Methods

Samples. Magnetic tunnel junction (MTJ) films with a stacking structure of buffer/PtMn(15)/Co₇₁Fe₂₉(2.5)/Ru(0.9)/Co₆₀Fe₂₀B₂₀(1.6)/Co₇₀Fe₃₀(0.8)/MgO(1)/Fe₈₀B₂₀(6)/MgO(1)/Ta(8)/Ru(7) (thicknesses in nm) were prepared by ultrahigh-vacuum (UHV) magnetron sputtering. After annealing at 360 °C for 1 h, the resistance–area product was RA ≈ 3.6 Ω μm². Circular-shaped MTJs with a diameter of about 375 nm were patterned using Ar ion etching and e-beam lithography. The resistance of the samples is close to 40 Ω, and the magneto-resistance ratio is about 100% at room temperature. The FeB layer presents a structure with a single magnetic vortex as the ground state for the dimensions used here. In a small region called the vortex core (of about 12 nm diameter at remanence for our materials), the magnetization spirals out of plane. Under direct current injection and the action of the spin transfer torques, the core of the vortex steadily gyrates around the centre of the dot with a frequency in the range of 150 MHz to 450 MHz for the oscillators we used here.

Database and inputs. In this study, we classify seven spoken vowels with the oscillatory network. Spoken vowels are characterized by a set of frequencies called formants, which we obtain from a subset of the Hillenbrand database (https://homepages.wmich.edu/~hillenbr/voweldata.html) given in Supplementary Information. We use the first three formants ($F_1$, $F_2$ and $F_3$) sampled at four different times of the duration of the spoken vowel: at the steady state and at 20%, 50% and 80% of the vowel duration (that is, 12 parameters in total). When one of these 12 parameters could not be measured, or when irresolvable formant mergers occurred, Hillenbrand et al.²⁸ put a zero in this parameter in the database. For our study, we have removed the vowel utterances whose corresponding set of formants is not complete. Moreover, we use the same number of speakers for each vowel. The resulting formant database comprising 37 female speakers that we used is provided as Supplementary Data.

We perform two linear combinations of these formants to obtain two characteristic frequencies ($f_A$ and $f_B$) in the range of operation of the spin-torque nano-oscillators (between 325 MHz and 380 MHz for the applied field value that we are using):

$$
\begin{aligned}
f_A ={}& A_1 F_1^{\mathrm{steady\,state}} + B_1 F_2^{\mathrm{steady\,state}} + C_1 F_3^{\mathrm{steady\,state}} + D_1 F_1^{20\%} + E_1 F_2^{20\%} + G_1 F_3^{20\%} \\
& + H_1 F_1^{50\%} + I_1 F_2^{50\%} + J_1 F_3^{50\%} + K_1 F_1^{80\%} + L_1 F_2^{80\%} + M_1 F_3^{80\%} + N_1
\end{aligned}
$$

$$
\begin{aligned}
f_B ={}& A_2 F_1^{\mathrm{steady\,state}} + B_2 F_2^{\mathrm{steady\,state}} + C_2 F_3^{\mathrm{steady\,state}} + D_2 F_1^{20\%} + E_2 F_2^{20\%} + G_2 F_3^{20\%} \\
& + H_2 F_1^{50\%} + I_2 F_2^{50\%} + J_2 F_3^{50\%} + K_2 F_1^{80\%} + L_2 F_2^{80\%} + M_2 F_3^{80\%} + N_2
\end{aligned}
$$

To choose the coefficients of the two linear combinations, we first record an experimental synchronization map that is used as a calibration of the network. The calibration map allows us to assign a synchronization pattern to each vowel. Then, the linear transformation of the formants that best matches the data points of each vowel with its associated synchronization pattern is determined through fitting by least-squares regression. The coefficients used in the two linear combinations and the two frequencies $f_A$ and $f_B$ corresponding to each vowel are provided as Supplementary Data.

Once this calibration is done and the coefficients and characteristic frequencies are calculated, the direct currents are reset to random values to begin the learning experiment. Two fixed-amplitude microwave signals with frequencies $f_A$ and $f_B$ are used as inputs to the experimental network of coupled nano-oscillators.

Experimental set-up. Extended Data Fig. 1 shows a schematic of the experimental set-up with the four coupled vortex nano-oscillators. A magnetic field of $\mu_0 H = 530$ mT is applied perpendicularly to the oscillator layers to get an efficient spin transfer torque acting on the oscillator vortex core. A direct current is injected into each oscillator to induce vortex dynamics, which leads to periodic oscillations of the magnetoresistance, giving rise to an oscillating voltage at the same frequency as the vortex core dynamics. The four oscillators are electrically connected in series by millimetre-long wires. They are therefore coupled through the microwave currents they emit, and too far away to be coupled through the magnetic dipolar fields that they radiate. Four direct currents ($I_{\mathrm{DC}1}$, $I_{\mathrm{DC}2}$, $I_{\mathrm{DC}3}$, $I_{\mathrm{DC}4}$) are supplied to the circuit by four different sources, allowing an independent control of the current flowing through each oscillator. The actual current flowing through each spin-torque oscillator is given by $I_{\mathrm{STO}1} = I_{\mathrm{DC}1}$, $I_{\mathrm{STO}2} = I_{\mathrm{DC}2} + I_{\mathrm{DC}1}$, $I_{\mathrm{STO}3} = I_{\mathrm{DC}3} + I_{\mathrm{DC}2} + I_{\mathrm{DC}1}$ and $I_{\mathrm{STO}4} = I_{\mathrm{DC}4} + I_{\mathrm{DC}3} + I_{\mathrm{DC}2} + I_{\mathrm{DC}1}$, respectively, where $I_{\mathrm{STO}i}$ corresponds to the current flowing through the ith oscillator. Two microwave sources are used to inject two external microwave signals with frequencies $f_A$ and $f_B$ and power P = −9 dBm through a strip line, creating two microwave fields as inputs to the oscillator network. The amplitude of the generated magnetic field, set by Ampère's law, depends only on the cross-section of the antenna (in addition to the distance between the strip line and the active magnetic layer of the oscillators). Therefore, the length of the antenna is only set by the number of oscillators it should cover. In our case, the strip line has a width of 2.5 µm and is fabricated 370 nm above the pillar (separated by an insulating layer). The resulting input microwave fields have an amplitude of 0.1 mT. They strongly affect the magnetization dynamics of the four oscillators, and thus the total microwave output emitted by the network. The microwave emissions are recorded with a spectrum analyser. As can be seen in Fig. 1d, the input signals from the antenna can be detected in addition to the oscillator emissions due to capacitive coupling between the strip line antenna and the metallic electrodes connecting the oscillator. The analysis of the output, which depends on the frequencies of the microwave inputs, can therefore easily be used to classify the spoken vowels.

Each spectrum recorded with the spectrum analyser is sent to the computer, where it is analysed by a program in real time. The information we use as input to this program is: (1) the value of the two frequencies of the external microwave signals ($f_A$, $f_B$) and (2) the oscillator frequencies at each direct current value in the absence of external microwave signals ($f_1^0$, $f_2^0$, $f_3^0$, $f_4^0$). The output data that we extract from each spectrum analysis are the four values of the oscillator frequencies in the presence of microwave inputs. Then, another program takes these oscillator frequencies to calculate the synchronization states and check whether the applied vowel was properly recognized, as follows. If one of the detected frequencies coincides with the frequency of one of the external signals (±0.5 MHz), we consider that the oscillator is synchronized to it. From this analysis, the synchronization pattern that corresponds to the input vowel is calculated. This is compared to the synchronization pattern initially assigned to that specific vowel to check whether it was successfully classified. If we are in the training procedure and the vowel is not properly classified, the online learning algorithm calculates how the four direct currents should be modified to reduce the recognition error, as described in ‘Real-time learning algorithm’ below. This information is then sent back to the experimental set-up, where the currents are automatically modified.

Real-time learning algorithm. In this section, we present the supervised learning procedure that was applied to our spin-torque nano-oscillator network to learn to recognize different classes of input stimuli. Here these classes correspond to seven different spoken English vowels: ae, ah, aw, er, ih, iy and uw (see ref. 28 for details; the sounds can be heard at https://homepages.wmich.edu/~hillenbr/voweldata.html). Initially, we assign a synchronization pattern to each class of vowel (column 2 in Extended Data Table 1). For a perfect recognition of one class of vowel, all data points in the frequency input map that corresponds to this vowel (Fig. 1f) must be contained in their assigned synchronization pattern in the experimental map (Fig. 1e). If this is not the case, for each association spoken vowel-synchronization pattern we define a frequency difference vector with four components (one for each oscillator; see third column in Extended Data Table 1) that will be used in the learning procedure. Starting from a random map configuration (Fig. 1e), the automatic learning rule that we developed allows us to converge to a configuration where most data points for each vowel class are contained in their respective assigned synchronization pattern.

The learning rule works in the following way.

(1) We present to the network a randomly chosen input data point i belonging to one vowel class, by sending two microwave inputs with frequencies $f_A^i$ and $f_B^i$.

(2) From the resulting spectra, we extract the frequencies of the four spin-torque oscillators ($f_1$, $f_2$, $f_3$, $f_4$) in the presence of the microwave inputs.

(3) We determine the resulting synchronization configurations by comparing the oscillator frequencies to the input frequencies $f_A^i$ and $f_B^i$. Then, we compare the obtained synchronization configuration with the one assigned to this vowel.

(4) For each vowel presented to the network, we define an associated frequency difference vector, which describes the frequency distance between the applied input and the assigned synchronization region. For instance, if the presented data point belongs to the vowel class ‘ae’, we compute $d_{\mathrm{ae}} = [(f_A^i - f_1), 0, (f_B^i - f_3), 0]^T$. If one of the two synchronization events assigned to ‘ae’ has occurred, we only compute the frequency difference that corresponds to the other event. For instance, if oscillator 1 is correctly synchronized to external source $f_A^i$, then we compute only $d_{\mathrm{ae}} = [0, 0, (f_B^i - f_3), 0]^T$.

(5) We repeat steps (1) to (4) for all seven vowel classes.

(6) We compute the sign of the vector sum of all seven associated frequency difference vectors, D: $D = \mathrm{sgn}(d_{\mathrm{ae}} + d_{\mathrm{ah}} + d_{\mathrm{aw}} + d_{\mathrm{er}} + d_{\mathrm{ih}} + d_{\mathrm{iy}} + d_{\mathrm{uw}}) = (D_1, D_2, D_3, D_4)^T$.

(7) We then compute the new direct current set $(I_1', I_2', I_3', I_4')^T$, which will be applied to the four oscillators:

$$
\begin{pmatrix} I_1' \\ I_2' \\ I_3' \\ I_4' \end{pmatrix}
=
\begin{pmatrix} I_1 \\ I_2 \\ I_3 \\ I_4 \end{pmatrix}
+ \mu
\begin{pmatrix}
D_1\,\mathrm{sgn}\!\left[\left(\dfrac{\partial \omega_1}{\partial I}\right)_{I=I_1}\right] \\
D_2\,\mathrm{sgn}\!\left[\left(\dfrac{\partial \omega_2}{\partial I}\right)_{I=I_2}\right] \\
D_3\,\mathrm{sgn}\!\left[\left(\dfrac{\partial \omega_3}{\partial I}\right)_{I=I_3}\right] \\
D_4\,\mathrm{sgn}\!\left[\left(\dfrac{\partial \omega_4}{\partial I}\right)_{I=I_4}\right]
\end{pmatrix}
$$

In the update equation of step (7), μ = 0.1 mA is the learning rate of our algorithm. At each step, the applied direct current through each oscillator can be modified only by ±μ. Here $\mathrm{sgn}[(\partial f_k/\partial I)_{I=I_k}]$ represents the sign of the frequency evolution versus injected direct current of the kth oscillator at the value of current $I_k$ (the sign is the same whether it is written with $f_k$ or with the angular frequency $\omega_k = 2\pi f_k$). For this, the frequency–current dependence of each independent oscillator has been previously characterized.

Upon modifying the direct currents following this learning procedure, the oscillator frequencies change. This translates into a displacement of the synchronization patterns in the experimental synchronization map (Fig. 2a–d).

(8) We repeat all previous steps (steps (1) to (7)) N times, where N is the total number of training steps. At each iteration, the synchronization map evolves towards an optimal configuration where the global frequency difference vector $d_{\mathrm{tot}} = d_{\mathrm{ae}} + d_{\mathrm{ah}} + d_{\mathrm{aw}} + d_{\mathrm{er}} + d_{\mathrm{ih}} + d_{\mathrm{iy}} + d_{\mathrm{uw}}$ is minimized. On increasing the number of training steps, we observe an increase of the recognition rate until it saturates after step 48, reaching a value of 89% (Fig. 2f). In our training experiment, we set the maximum number of training steps to N = 87, which corresponds to applying three times each of the 29 data points of the training database.

Cross-validation procedure. Training was realized using 80% of the total number of vowels in the database. The testing procedure was done using the remaining 20% of the data points. The cross-validation technique allows the recognition performance of the network to be estimated accurately by repeating the training/testing procedure five times over distinct data point samples. Each time, the selected data points used for testing are different: in the first (respectively second, third, fourth and fifth) cross-validation period, we use the first (respectively second, third, fourth and fifth) quintile (20%) of the data points for testing. The final recognition rate was obtained by averaging the testing recognition rates of the five cross-validation experiments. The same cross-validation procedure is used for all the neural networks (experimental and simulated).

Comparison of spin-torque nano-oscillators to CMOS oscillators. Extended Data Table 2 compares features of CMOS and spin-torque nano-oscillators. ‘Vortex spin-torque oscillators’ refers to the magnetic tunnel junctions used in this study; ‘10 nm spin-torque oscillators’ refers to state-of-the-art magnetic tunnel junctions currently used as memory cells.

Comparison with a multilayer perceptron. To benchmark the results of the experimental oscillatory network, we first ran a standard multi-layer perceptron, schematized in Fig. 4a, on the same vowel database.

The network takes as inputs the 12 formants of a given vowel in a database and has seven outputs, one for each vowel class. We have varied the number of hidden neurons between 1 and 20 to evaluate the recognition rate as a function of the number of trained parameters. More precisely, each formant has been rescaled between −1 and 1 before being fed into the first layer of neurons. The neuron activation functions are tanh functions at the hidden layer, and softmax at the output layer: the outputs $z_i$ (i = 1 to 7) are defined as $z_i = e^{y_i} / \sum_{j=1}^{7} e^{y_j}$, where $y_j$ is the input to the output neuron j. The output with the largest $z_i$ is taken as the vowel class corresponding to the input. We also tried ReLU activation functions, but they performed worse than tanh on this task.

For training the network we performed backpropagation, that is, gradient descent over the negative log-likelihood (or cross entropy).

As in the experimental conditions, the samples are picked and presented randomly to the network. One learning iteration corresponds to one forward pass of a given sample through the network, its subsequent gradient evaluation and weight update. The learning rate has been tuned to obtain the best result. Weights and biases before learning were randomly sampled from a Gaussian of mean 0 and variance 0.01.

For each trial, we ran training over 100,000 iterations to ensure convergence with a learning rate of 0.05. In practice, optimization techniques such as root-mean-square propagation or adaptive moment estimation could be used to accelerate training. All results are reported in Fig. 4b, where we show the recognition rate after cross validation as a function of the number of trained parameters.

Comparison with RNNs. In addition to the multilayer perceptron (Extended Data Fig. 2b), we also ran, on the same vowel database, a perceptron (Extended Data Fig. 2c), as well as a recurrent neural network (RNN; Extended Data Fig. 2d) and a long short-term memory network (LSTM) recurrent neural network (Extended Data Fig. 2e) with four hidden units. The procedure is similar to the multilayer perceptron. Formants are presented sequentially to the network, which outputs a vowel once all of them have been swept through. Softmax activation functions were used at the output layer and tanh elsewhere. Outputs are encoded in a ‘one-hot’ fashion: for example, the ae vowel (out of the seven in total) is encoded by (1,0,0,0,0,0,0). We take the maximum activation value as the classification result. As in the experimental conditions, the samples are picked and presented randomly to the network. One learning iteration corresponds to one forward pass of a given sample through the network, its subsequent gradient evaluation and weight update. For each architecture, the choice of the learning rate has been tuned to obtain the best result. Weights and biases before learning were randomly sampled from a Gaussian of mean 0 and variance 0.01. No gradient inertia or learning rate adaptation technique was used. For the LSTM and the RNN, we ran training over 500,000 and over 1,000,000 iterations to ensure convergence with a learning rate of 0.01 and 0.0005, respectively. If needed, optimization techniques such as root-mean-square propagation or adaptive moment estimation could be used to accelerate training. Owing to the mini-batch size, gradient descent is highly stochastic, and we average the test and training rates over the last 5,000 iterations to obtain reliable training and error rates for a given trial. All results are reported in Extended Data Fig. 2a, where we show the cross-validation success as a function of the number of parameters learnt.

Synchronization detection through oscillator rectified voltages. In the present work, synchronization of the oscillators is detected using a spectrum analyser, allowing a comprehensive understanding of the systems and of the physics of the oscillators. In a final integrated system, simpler techniques could be used to detect synchronization of oscillators. A possibility is given in ref. 29. Another method, involving less energy overhead, consists in exploiting the spin diode effect³¹, which causes synchronized oscillators to generate a supplementary direct voltage³². Extended Data Fig. 3a and b illustrate this effect in one of our oscillators. The appearance of a rectified voltage measured between the oscillator electrodes (Extended Data Fig. 3a) coincides with the locking range (Extended Data Fig. 3b). The generated rectified voltage is proportional to the fraction of the external microwave current $I_{\mathrm{ext}}$ flowing through the oscillator³⁰,³². In our experiments, $I_{\mathrm{ext}}$ is small: the input microwave signals are sent through a strip line isolated from the oscillators, in a geometry minimizing by design the capacitive coupling between oscillator and strip line ($I_{\mathrm{ext}} = 7.5 \times 10^{-3} I_{\mathrm{stripline}}$). As a result, the measured rectified voltages are small (approximately 0.5 mV). In the future, these values can be increased up to several tens of millivolts by optimizing the coupling between oscillator and strip line. Indeed, as demonstrated experimentally, rectification effects due to oscillator phase locking can be large, with sensitivities reaching 75.4 mV for the generated d.c. voltage per microwatt of injected microwave power³⁰.

We now present how synchronization detection through the resulting rectified voltages may be implemented in a final integrated circuit, using a differential method. We propose to use four reference resistors with the same resistance as the mean resistance of the nano-oscillators and polarized in the same manner. Comparing the voltage across a nano-oscillator and the corresponding reference resistance then allows detection of whether the oscillator is experiencing synchronization (Extended Data Fig. 3c). We designed a simple two-stage CMOS circuit to perform this comparison (Extended Data Fig. 3d,e). The first stage is composed of two differential amplifiers (voltage to current) in parallel. It is followed by a gain stage (current to voltage amplifier). The mismatch between the two amplifiers, a standard design technique, allows high gain. The output of the circuit is therefore a binary voltage, high if the oscillator is synchronized to the input signal, low otherwise. This voltage can be used directly by standard CMOS digital circuits to obtain the class of the input. In the circuit, bias voltages ($V_{\mathrm{bias}1}$ and $V_{\mathrm{bias}2}$) can be adjusted to vary the speed and power consumption of the circuit.

We simulated this circuit in transient operation using the Cadence Spectre SPICE simulator, a standard tool in commercial integrated circuit design, with the design kit of a 28-nanometre commercial CMOS technology, and optimized the bias voltages for minimal energy consumption, while retaining a response time of the circuit below 600 ns. Extended Data Fig. 3f shows the energy consumed by the detection circuit as a function of the rectified direct voltage due to synchronization, taking into account the whole transient of the detection. This energy can be low: it is below 200 fJ for rectified direct voltages above 50 mV, which can be achieved in structures optimized for spin diode effect³⁰. For a full system, this detection must be performed twice (we send two input signals), for the four oscillators, leading to a detection energy of 2 × 4 × 200 fJ = 1.6 pJ.

Using our current oscillators, this energy would be smaller than the energy dissipated by the oscillators and the reference resistors. By contrast, with scaled nano-oscillators (see Extended Data Table 2), this 1.6 pJ detection energy would become dominant.

It is interesting to compare this quantity with the energy consumption of a purely CMOS neural network, implementing the multilayer perceptron of Fig. 4a. Optimized CMOS neural networks compute in reduced precision, usually 8-bit integers, which allows low energy consumption³³. Taking into account the arithmetic operations (sum and multiplications), in the same commercial 28-nanometre technology as the detection circuit that we implemented, we calculated that an 8-bit integer neural network implementing the second layer of the neural network of Fig. 4a consumes 2.2 pJ. We only took into account the second layer of the neural network, as it is the part implemented by the nano-oscillators. To obtain the energy estimation, we synthesized a Verilog description of a multiply and accumulate block and computed its energy consumption with the Cadence encounter tools using appropriate value change dump files generated by the Cadence ncsim simulator.
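The energy comparison made in this section reduces to simple arithmetic, which the following minimal Python sketch reproduces for illustration. It uses only the figures quoted in the text (at most 200 fJ per detection event, two input signals, four oscillators, and 2.2 pJ for the 8-bit CMOS layer); the function and variable names are hypothetical and are not part of the original work.

```python
# Illustrative arithmetic only. Names are hypothetical; values are those quoted in the text.
FJ = 1e-15  # joules per femtojoule
PJ = 1e-12  # joules per picojoule


def detection_energy(n_inputs: int, n_oscillators: int, energy_per_event: float) -> float:
    """Energy for one classification: one detection per (input signal, oscillator) pair."""
    return n_inputs * n_oscillators * energy_per_event


# 200 fJ is the upper bound quoted for rectified voltages above 50 mV.
e_detect = detection_energy(n_inputs=2, n_oscillators=4, energy_per_event=200 * FJ)
e_cmos_layer = 2.2 * PJ  # quoted estimate for the 8-bit CMOS second layer

print(f"detection: {e_detect / PJ:.1f} pJ, CMOS layer: {e_cmos_layer / PJ:.1f} pJ")
```

This sketch covers the detection circuits only. As the text notes, the energy dissipated by the oscillators and the reference resistors has to be added for a full comparison.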

These energy considerations show that on our tiny control system, a nano-oscillator-based solution would provide an energy consumption slightly smaller than an optimized CMOS-based solution. We expect that the full benefit of the oscillator system will appear in deep networks composed of many layers of spin-torque nano-oscillators. Indeed, cascading the synchronization states from one layer to the next can be achieved directly through oscillatory interlayer coupling and does not require synchronization detection. Only at the last layer will detection circuits be required to communicate their state to other circuits. Therefore, we expect that in a deep network of oscillators, the energy consumption will be largely dominated by the oscillator energy consumption, which can be low for a scaled-down oscillator, as can be seen from Extended Data Table 2.

Data availability
The datasets generated and analysed during this study are available from the corresponding authors on reasonable request.

31. Tulapurkar, A. A. et al. Spin-torque diode effect in magnetic tunnel junctions. Nature 438, 339–342 (2005).
32. Louis, S. et al. Low power microwave signal detection with a spin-torque nano-oscillator in the active self-oscillating regime. IEEE Trans. Magn. 53, 1–4 (2017).
33. Jouppi, N. P. et al. Datacenter performance analysis of a tensor processing unit. In Proc. 44th Annual International Symposium on Computer Architecture 1–12 (ACM, 2017).
34. Livi, P. & Indiveri, G. A current-mode conductance-based silicon neuron for address-event neuromorphic systems. In 2009 IEEE International Symposium on Circuits and Systems 2898–2901 (IEEE, 2009).
35. Qiao, N. & Indiveri, G. Scaling mixed-signal neuromorphic processors to 28 nm FD-SOI technologies. In 2016 IEEE Biomedical Circuits and Systems Conference (BioCAS) 552–555 (IEEE, 2016).
36. Wijekoon, J. H. B. & Dudek, P. Compact silicon neuron circuit with spiking and bursting behaviour. Neural Netw. 21, 524–534 (2008).
37. Tran, D. X. & Dang, T. T. An ultra-low power consumption and very compact 1.49 GHz CMOS voltage controlled ring oscillator. In 2014 International Conference on Advanced Technologies for Communications (ATC 2014) 239–244 (IEEE, 2014).
38. Tomita, Y. et al. An 8-to-16GHz 28nm CMOS clock distribution circuit based on mutual-injection-locked ring oscillators. In 2013 Symposium on VLSI Circuits C238–C239 (IEEE, 2013).
39. Gajek, M. et al. Spin torque switching of 20 nm magnetic tunnel junctions with perpendicular anisotropy. Appl. Phys. Lett. 100, 132408 (2012).

Extended Data Fig. 1 | Schematic of the experimental set-up. The four coupled vortex nano-oscillators are shown. $I_{\mathrm{RFA}}$ and $I_{\mathrm{RFB}}$ are the microwave currents injected in the strip line by the two microwave sources. $H_{\mathrm{RF}}$ is the resulting microwave field. $I_{\mathrm{DC}1}$ to $I_{\mathrm{DC}4}$ are the applied direct currents.
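Because the four oscillators are connected in series and fed by four separate sources, the Methods (Experimental set-up) give the current through the ith oscillator as the sum of the first i source currents. A minimal Python sketch of that relation follows; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
from itertools import accumulate


def oscillator_currents(i_dc):
    """Return [I_STO1, ..., I_STOn] as running sums of the source currents I_DC1..I_DCn."""
    return list(accumulate(i_dc))


# Hypothetical source currents in mA (not values from the experiment).
i_sto = oscillator_currents([2.0, 0.5, -0.25, 1.0])
print(i_sto)
```

One consequence of this wiring is that changing a single source current shifts the current through that oscillator and through every oscillator after it in the chain.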

Extended Data Fig. 2 | Recognition rates obtained by different neural networks on the formant database. a, Recognition rates of different neural networks on the formant database as a function of the number of trained parameters. b–e, Schematics of the simulated neural networks: b, multi-layer perceptron; c, perceptron; d, RNN; and e, LSTM.

Extended Data Fig. 3 | Synchronization detection by the spin diode effect. a, Rectified direct voltage measured between oscillator electrodes when the external microwave signal is injected in the strip line above the oscillator and its frequency is swept. Here, the direct current through the oscillator is 5 mA, the magnetic field is 585 mT and the injected microwave power is +1 dBm. b, Oscillator spectrum emission measured during the same frequency sweep as a. c, Proposed differential measurement configuration for CMOS-based detection of synchronization-induced rectified voltages. d, Two-stage CMOS circuit. e, The first stage, composed of two differential amplifiers (green), is followed by a gain stage (blue). VDD, supply voltage; GND, ground. f, Energy consumption of the CMOS circuit for one synchronization detection event, as a function of the amplitude of the generated rectified direct voltages.

Extended Data Table 1 | Learning rule

Column 1, spoken vowel class; column 2, synchronization pattern assigned to each vowel; column 3, frequency difference vector between the spoken vowels and their associated patterns.
The index i refers to the ith data point of a vowel class (ith speaker).
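The learning rule summarized in this table (steps (3) to (7) of the Methods) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names, the dictionary encoding of a synchronization pattern and all numerical values are hypothetical, and sgn(0) is taken to be 0 so that a current stays unchanged when the summed difference is zero. Only the ±0.5 MHz tolerance, the learning rate μ = 0.1 mA and the ‘ae’ pattern (oscillator 1 with input A, oscillator 3 with input B) come from the text.

```python
TOL_MHZ = 0.5  # synchronization tolerance quoted in the Methods
MU_MA = 0.1    # learning rate quoted in the Methods


def sign(x):
    """Sign function with sign(0) == 0 (an assumption of this sketch)."""
    return (x > 0) - (x < 0)


def difference_vector(pattern, f_in, f_osc, tol=TOL_MHZ):
    """Step (4): frequency difference vector for one data point.

    pattern maps an oscillator index to the input it should lock to ("A" or "B"),
    f_in maps input names to frequencies (MHz), f_osc lists the measured
    oscillator frequencies (MHz). Components stay at zero for unassigned
    oscillators and for assigned synchronization events that already occurred.
    """
    d = [0.0] * len(f_osc)
    for k, source in pattern.items():
        delta = f_in[source] - f_osc[k]
        if abs(delta) > tol:  # not synchronized within the tolerance
            d[k] = delta
    return d


def update_currents(currents, diff_vectors, slope_signs, mu=MU_MA):
    """Steps (6)-(7): D = sgn(sum of d), then I_k' = I_k + mu * D_k * sgn(df_k/dI)."""
    total = [sum(column) for column in zip(*diff_vectors)]
    return [i + mu * sign(t) * s for i, t, s in zip(currents, total, slope_signs)]


# Hypothetical data point of class 'ae': oscillator 1 (index 0) should lock to
# input A and oscillator 3 (index 2) to input B.
d_ae = difference_vector({0: "A", 2: "B"},
                         {"A": 340.0, "B": 365.0},
                         [339.8, 350.0, 362.0, 371.0])
new_currents = update_currents([4.0, 4.0, 4.0, 4.0], [d_ae], slope_signs=[1, 1, -1, 1])
print(d_ae, new_currents)
```

In this example oscillator 1 is already within 0.5 MHz of input A, so only the third component of the difference vector is non-zero, and only the third current moves, by one step of μ in the direction set by the assumed sign of its frequency–current slope. In the full procedure the list passed to `update_currents` would hold one difference vector per vowel class.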

Extended Data Table 2 | Comparison of CMOS and spin-torque nano-oscillators for neuromorphic computing

Data from refs 24,34–39.
