K.K. LIKHAREV
Department of Physics
State University of New York
Stony Brook, NY 11794-3800
U.S.A.
ABSTRACT. Background, basic ideas, and recent development of a new family of ultrafast superconductor digital devices are reviewed. Possible applications of this new digital technology are discussed.
As an additional factor, technologies of fabrication of the Josephson-junction integrated circuits may be considerably simpler than those of the present-day semiconductor (both silicon and gallium-arsenide) transistors with similar design rules, while providing quite acceptable r.m.s. deviations of the main parameters.
These factors were responsible for much work in this field during the last two decades, exemplified primarily by the American IBM project (1969-83) - see Anacker (1980) - and the Japanese MITI project (1981-91) - see e.g. Kroger (1986). Unfortunately, these big projects did not result in a practical superconductor digital technology. I believe that the main reason for that failure was an unfortunate choice of circuitry, which concentrated on various types of "latching" (or "voltage-state") logic.
Figure 1. (a) The buffer stage in Josephson-junction digital circuits and schemes of its operation in (b) latching logic and (c, d) RSFQ logic. After Likharev and Semenov (1991).
Figure 1a shows a simplified version of the simplest logic
gate, the buffer stage, used in latching logic. It employs a tunnel
Josephson junction which is naturally underdamped and thus exhibits a
hysteretic dc I-V curve (Fig. 1b). The junction is biased by a dc
current Ib slightly lower than the critical current Ic. Initially the
junction is in its superconducting state (V = 0, point 0 in
Fig. 1b). An arriving signal current Iin drives the total junction
current beyond Ic, and triggers its switching to its resistive state 1
with V = 2Δ(T)/e, so that a considerable part of the current (Iout) is
steered into the load R (typically, through a microstrip line); the
latter current serves as an output signal. In most latching logics -
see, e.g., Hayakawa (1983), Gheewala (1982), and Hasuo (1993) - the
current Iin is also used to suppress Ic simultaneously, but this fact
changes nothing essential in our discussion. This 0 -> 1 switching
process can be very fast, down to almost one picosecond - see Kotani
et al. (1988).
However, the reset (the 1 -> 0 switching) cannot be achieved by merely turning the signal Iin off: the circuit remains in its state 1. The only practical way to reset the gate to its state 0 is to switch off the bias current Ib. In the latching logics, this periodic reset of all gates is achieved by using an rf rather than a dc current supply for all the gates; this waveform performs also a global synchronization of the whole device. Unfortunately, this operation mode has severe drawbacks:
As a result of these drawbacks, no prospects have been found
to increase clock frequencies of latching-logic circuits beyond a few
GHz. This speed is higher than, but comparable to, those of the
fastest semiconductor digital circuits which do not require helium
refrigeration. This marginal advantage is probably insufficient to
warrant commercial introduction of this digital technology. This is
why recently much attention was turned to alternative "flux-state" (or
"Single-Flux-Quantum", or "SFQ") logic which uses coding of the binary
information not by the dc voltage, but by single quanta of magnetic
flux (Φ0 = h/2e ≈ 2.07×10^-15 Wb). SFQ devices can be divided into two
big groups, defined by the method used to pass the information between
logic circuits.
In static SFQ circuits, suggested and analyzed by Anderson et al. (1971), Fulton and Dunkleberger (1973), Likharev (1976), Hurrell and Silver (1978), Likharev (1982), Loe and Lee (1985), Likharev et al. (1985 b), Loe et al. (1988) and others, the information is passed in the form of dc flux (or supercurrent). These devices are of a high fundamental interest because of their capability to implement the reversible processing of digital information - for details, see Likharev (1982). However, in static circuits the intergate distance is severely limited (practically, to the nearest neighbors) by the inductance of the interconnects. A second disadvantage of this approach is the necessity of an rf supply/clock, with resulting limitations on speed, similar to those for a voltage-state logic. Finally, a detailed analysis shows that parameter tolerances in static SFQ circuits are forbiddingly low.
In dynamic SFQ circuits, first discussed by Nakajima et al. (1976), Nakajima and Onodera (1978), Hurrell et al. (1980), Hamilton and Lloyd (1982), Silver et al. (1985), Nakajima et al. (1983), Oya et al. (1985), and some other authors, information between logic devices is passed ballistically, along either passive microstrip lines or active Josephson transmission lines, in the form of very short (picosecond) "quantized" voltage pulses V(t) with the fixed area
∫V(t)dt = Φ0.
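The fixed pulse area can be made concrete with a short calculation. The sketch below evaluates Φ0 = h/2e from the exact SI values of h and e, converts it to mV·ps, and estimates the mean voltage of a pulse of that area. The 5 ps duration and the function names are illustrative assumptions, not values taken from the cited papers.

```python
# Exact SI values of the constants (2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34            # J*s
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def flux_quantum_wb():
    """Magnetic flux quantum Phi0 = h / 2e, in webers."""
    return PLANCK_H / (2.0 * ELEMENTARY_CHARGE)


def mean_pulse_voltage_mv(duration_ps):
    """Mean voltage (mV) of a pulse whose area equals Phi0.

    duration_ps is an assumed pulse duration in picoseconds.
    """
    phi0_mv_ps = flux_quantum_wb() * 1e15  # 1 Wb = 1 V*s = 1e15 mV*ps
    return phi0_mv_ps / duration_ps


print(f"Phi0 = {flux_quantum_wb():.4e} Wb")
print(f"mean voltage of a 5 ps pulse: {mean_pulse_voltage_mv(5.0):.3f} mV")
```

For picosecond pulses the mean voltage therefore comes out at a fraction of a millivolt.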
The essence of this idea is that these "SFQ" pulses can be
quite naturally generated, reproduced, amplified, memorized, and
processed by elementary circuits comprised of overdamped Josephson
junctions. This unique ability, fully realized in some analog devices
based on the Josephson effect, was virtually neglected in latching
logic; moreover, in the latching logic circuits the SFQ pulse
generation is an inherent reason for the punchthrough effect which
limits operation speed.
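That an overdamped junction produces quantized pulses can be checked numerically. The sketch below assumes the standard resistively shunted junction (RSJ) model with negligible capacitance, in normalized units: currents in units of Ic, time in units of ħ/(2e·Ic·R). The function names, the bias of 0.8 Ic, and the pulse parameters are illustrative assumptions. It integrates dφ/dτ = i(τ) - sin φ and reports the pulse area in units of Φ0, using ∫V dt = Φ0·Δφ/2π.

```python
import math


def phase_advance(bias, pulse_amp, pulse_len, t_end=60.0, dt=0.01):
    """Net phase change of an overdamped junction after a current pulse.

    Integrates d(phi)/d(tau) = i(tau) - sin(phi) with forward Euler.
    Currents are in units of Ic; tau is in units of hbar / (2e * Ic * R).
    """
    phi_start = math.asin(bias)  # static state for a dc bias below Ic
    phi = phi_start
    for step in range(int(round(t_end / dt))):
        tau = step * dt
        # Rectangular input pulse added on top of the dc bias.
        current = bias + (pulse_amp if tau < pulse_len else 0.0)
        phi += dt * (current - math.sin(phi))
    return phi - phi_start


def pulse_area_in_flux_quanta(bias, pulse_amp, pulse_len):
    """Voltage-pulse area in units of Phi0: delta_phi / (2*pi)."""
    return phase_advance(bias, pulse_amp, pulse_len) / (2.0 * math.pi)


strong = pulse_area_in_flux_quanta(bias=0.8, pulse_amp=0.6, pulse_len=4.0)
weak = pulse_area_in_flux_quanta(bias=0.8, pulse_amp=0.1, pulse_len=4.0)
print(f"strong input: {strong:.3f} Phi0, weak input: {weak:.3f} Phi0")
```

In this model the stronger input drives the total current above Ic long enough for exactly one 2π phase slip, after which the junction resets itself. The weaker input never exceeds Ic and leaves no net pulse.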
Over two decades, from the mid-1960s to the mid-1980s, several
suggestions on how to use SFQ pulses for processing of digital
information and analog-to-digital (A/D) conversion were put forward by
the authors cited above, as well as by Clark and Baldwin (1967),
Anacker and Zappe (1972), Likharev (1974), Lum et al. (1977), Zappe
(1974 and 1975). It was only in 1985-86, however, that a complete
family of dynamic SFQ circuits, with the nickname RSFQ (standing for
Rapid Single-Flux-Quantum devices) was suggested by a Moscow State
University group - see Likharev et al. (1985 a) and Mukhanov et
al. (1987). During the period 1985-91 this group, in collaboration
with a group at the Institute of Radio Engineering and Electronics
(IRE), then of the Soviet Academy of Sciences, designed, fabricated,
and tested a few circuits containing several of the simplest basic
components of the RSFQ family. These components were demonstrated to
work at clock frequencies in excess of 100 GHz with quite decent
parameter margins (see Kaplunenko et al. (1989 a and b), and
Filippenko et al. (1991)), despite a rudimentary 5-um technology used
for their fabrication. Simultaneously, numerical simulations had shown
that transfer to a 1-um technology could increase the speed beyond the
300-GHz level - see Mukhanov et al. (1987, 1989, and 1991), Rylov
(1991), Polonsky (1991), Kirichenko et al. (1991), and
Kidiyarova-Shevchenko et al. (1991). Since 1991, the RSFQ idea has
been adopted by several groups in the United States and other
countries, and its development is moving rapidly.
An alternate family of dynamic SFQ devices was suggested,
under the name "Phase Mode Josephson System", by Professor K. Nakajima
and his collaborators at Tohoku University in Sendai, Japan - see Oya
et al. (1985) and Nakajima et al. (1989 and 1991). This system used a
single basic cell, the "ICF gate", and seemed much less flexible than
the RSFQ family. I am not aware of any recent attempt to continue this
approach.
This paper is intended to give a brief review of RSFQ
digital circuits. In contrast with the detailed review of the initial
work by Likharev and Semenov (1991), I will emphasize more recent
results (obtained by mid-1992).
Φ0 ≈ 2.07 mV·ps.
2. Basic Components of RSFQ Circuits
2.1. REPRODUCING, AMPLIFYING, AND TRANSFERRING SFQ PULSES
The most elementary RSFQ circuit coincides with that shown in Fig. 1a, except that now the Josephson junction is overdamped, e.g., the tunnel junction shunted externally by a metallic resistor low enough to reduce its McCumber-Stewart parameter βc below 1. Figure 1c shows the dc I-V curve of the junction; in contrast to the underdamped case, the curve is single-valued. This implies that after a current pulse Iin(t) is over, the junction is automatically self-reset to its original superconducting state (V = 0). If we look at this picture more attentively, using equations of the Josephson dynamics - see, for example, Likharev (1986) - we will see something quite interesting. There exists a broad range of amplitude and length of the pulse Iin(t), within which it triggers a quantized jump of the Josephson phase φ of the junction by Δφ = 2π - see Fig. 1d. This fact can be readily understood starting from the well-known analogy between the Josephson junction and the pendulum. In this analogy, biasing of the junction with the dc current Ib < Ic corresponds to applying a nearly critical torque to the pendulum, driving it to a position close to the critical angle φc = π/2. The short input-current pulse is equivalent to a kick which drives the pendulum beyond φc. If the pendulum is overdamped, the kick results in just one 2π-rotation, with its automatic reset to the sub-critical static state. According to the fundamental Josephson phase-to-voltage relation
dφ/dt = (2e/ħ)V(t),
such a "2π-jump" of φ corresponds to generation of the SFQ voltage pulse across the junction. For typical present-day fabrication technologies, duration of the pulse is a few picoseconds, while the pulse amplitude is a few hundred microvolts.
If the dc bias current Ib is not too far from the critical value Ic, this SFQ pulse can be triggered by an incoming short pulse, with either the nominal or a somewhat different amplitude. It means that the circuit shown in Fig. 1a can reproduce SFQ pulses, bringing their area ∫V(t)dt to the nominal value Φ0, i.e., providing a moderate voltage gain if necessary. On the other hand, if the input pulse is too weak (e.g., presents digital "noise" due to parasitic crosstalk between the signal transfer lines) it is not reproduced by the circuit, so that the circuit also serves as a noise discriminator.
Figure 2a shows another key RSFQ circuit, the Josephson transmission line (JTL) comprising several Josephson junctions connected in parallel by superconducting strips of a relatively low inductance L < Φ0/Ic, and dc-current biased to their sub-critical state (Ib < Ic). After the 2π-jump of the Josephson phase is triggered in the left junction J1 by the input signal, the resulting SFQ pulse developed across J1 will trigger the 2π-jump in J2, and this process will continue until the pulse is reproduced at the right edge of the JTL. JTLs can also be used to amplify SFQ pulses (or, more exactly, to provide their current/power gain while conserving their voltage area). For that, critical currents of the junctions and the corresponding dc bias currents should grow in the direction of the pulse propagation, with the proportional decrease of the inductances.
If amplification of SFQ pulses is not needed, they can be passed along passive superconducting microstrip lines. In order to match these lines to other RSFQ components, one can use short segments of the JTLs together with matching capacitors (Fig. 2b). Recently, such circuits were used to demonstrate transfer of 5-ps SFQ pulses over distances up to 1 cm without noticeable attenuation - see Polonsky et al. (1993 b).
An evident generalization of the JTL (Fig. 2c) can be used to provide splitting of the SFQ pulse, i.e., reproduction of the input pulse A at each of its two outputs B and C, without noticeable decrease of the pulse voltage amplitude.
All these simplest circuits are reciprocal, and cannot be used for isolation, so one needs a buffer stage (Fig. 2d). In this circuit, critical current of the junction J2 is somewhat smaller than that of the junction J1. Now, if the initial pulse arrives from the circuit input A, it is applied to J1 alone, and triggers the 2π-jump of the Josephson phase in J1, leaving the phase across J2 virtually undisturbed. As a consequence, the SFQ pulse is reproduced and passed to the output terminal B. On the contrary, if a pulse arrives from the latter terminal, it triggers a current pulse in both J1 and J2. As Ic2 < Ic1, the junction J2 reaches its critical state earlier, and performs the 2π-jump. Hence, voltage across J1 remains close to zero, which means that the SFQ pulse does not reach the input terminal A. Figure 2e shows a useful recent synthesis of the JTL and the buffer stage, having larger parameter tolerances - see Polonsky et al. (1993 a).
Figure 2. The simplest components of RSFQ circuits: (a) Josephson transmission line, (b) driver and receiver for transfer of SFQ pulses along a passive superconducting microstrip line, (c) SFQ pulse splitter, (d) buffer stage, and (e) one-directional JTL. After Mukhanov et al. (1987 and 1989).
2.2. GENERATING SFQ PULSES
In principle, the SFQ pulse can be generated by the circuit shown in Fig. 1a, by feeding the Josephson junction by a short non-quantized current pulse Iin arriving from, say, a semiconductor electronic device. A disadvantage of this circuit is that the pulse should be very short (e.g., a few picoseconds), and its duration should be within certain limits. A much less demanding way is to use a Josephson junction in parallel with the superconducting inductor L (i.e., the usual single-junction superconducting quantum interferometer), with the basic dimensionless parameter 2πIcL/Φ0 somewhere between 2 and 6 - see, e.g., Likharev (1974). In order to generate a single SFQ pulse, the interferometer may be fed by a usual dc current pulse, with only amplitude (but not length) within certain limits.
Figure 3 shows a more advanced version of such a DC/SFQ converter. If its input current I is increased beyond a certain threshold value Iup, the critical state of the junction J3 is achieved, and the SFQ pulse is generated across it. Simultaneously, the three-junction interferometer (J1-J3, L1-L3) is switched into another flux state. In order to reset the interferometer into its initial state, the current should then be decreased below a value Idown at which the 2π-jumps are triggered sequentially in the junctions J1 and J2. The reset is accompanied by generation of SFQ pulses across these junctions, which do not penetrate into the output JTL. Numerical simulations and experiments by Polonsky et al. (1993 a) have shown that such a converter may have extremely wide parameter margins (up to ±60%).
Figure 3. A possible structure for a DC/SFQ converter. After Polonsky et al. (1993 a).
2.3. CAPTURING, STORING, AND RELEASING SINGLE FLUX QUANTA
Figure 4a shows another key component of virtually all RSFQ circuits. It is essentially the two-junction superconducting quantum interferometer ("dc SQUID"). If the inductance L of the interferometer is chosen so that its basic parameter 2πIcL/Φ0 is close to 10, and the dc bias current Ib is close to 0.8 Ic, the circuit has two symmetric stable stationary states which differ by the direction of the persistent current Ip = Φ0/2L circulating in the loop. In other words, one of these states corresponds to an additional single flux quantum trapped in the superconducting loop of the interferometer.
Let us suppose that the persistent current is circulating counterclockwise (binary 0), so that it sums with I in J3: I3 = Ib/2 + Ip < Ic. If now the SFQ pulse arrives at the input S, it triggers the 2π-jump in J3, but not in J4 which carried a lower dc current I4 = Ib/2 - Ip. As a result of the jump, the cell is switched to its opposite state 1 with the clockwise circulation of the persistent current. It is evident that now the reset (the 1 -> 0 switching) can be triggered by the SFQ pulse arriving at the R terminal. Simultaneously, an SFQ pulse V(t) is developed across J2, which can serve as an output signal F. The auxiliary junctions J1 and J2 defend the SFQ pulse sources from the back reaction of the interferometer in the case of a "wrong" signal, for example, the S (set) pulse arriving during the state 1. In this case, junction J2 (rather than J3) switches; in other words, the incoming single flux quantum "falls out" of the circuit through J2 if the interferometer loop is unable to accept it. One can see that for SFQ pulses the circuit works exactly as a standard RS flip-flop (latch): SFQ pulses can be trapped by this circuit, so that the information about their arrival can be conveniently stored there in the form of static magnetic flux, and released when necessary in SFQ pulse form.
Figure 4b shows an SFQ analog of the T flip-flop (i.e., a single-bit stage of a binary counter). This circuit is fed by the input pulses from its single input. Each pulse is split and injected to both arms of the interferometer, so that it always triggers the circuit switching to the opposite state. Numerical simulations by Polonsky et al. (1993 a) have shown that for an optimized version of this circuit the dc-bias-current margins can be as wide as ±37%; the experimental result by Kaplunenko et al. (1989 a) for a slightly different version was found to be ±30%. Note that a conceptually similar device was proposed by Hurrell and Silver (1978) and implemented by Hamilton and Lloyd (1982) long before the development of the RSFQ logic family. However, due to the use of a resistor (instead of Josephson junctions J3 and J4) for injection of SFQ pulses, that device had extremely small parameter margins (about ±5%) and could not be used in large-scale-integration circuits.
Figure 4. The simplest RSFQ latches: (a) RS flip-flop, (b) T flip-flop, and (c) T0 cell. After Mukhanov et al. (1987), Kaplunenko et al. (1989 a), and Likharev and Semenov (1991).
Figure 4c shows a modification of the T flip-flop (the "T0 cell"), which allows one to read out its contents (in the complementary code) by the additional SFQ pulse T. The quantizing loop now contains three junctions (J3, J4, and J6), but due to the corresponding choice of its parameters, it still has two stable states. If the loop is in its state 0, with counterclockwise circulation of the persistent current, then junction J6 is biased subcritically, pulse T triggers the 2π-jump of its phase, and the SFQ pulse is developed at the additional output S. (Note that this operation switches the loop from state 0 to state 1.) If, on the other hand, the loop is in its state 1 (with clockwise circulation of the persistent current), the bias in J6 is small, and pulse T leaves it in the superconducting state (inducing the 2π-jump in the junction J5 instead).
2.4. MONITORING SFQ PULSES
Figure 5a shows a possible structure for an "SFQ pulse monitor" which is a combination of the T flip-flop and SFQ/DC converter, producing a dc voltage at its output. Its heart is the interferometer (J1, J3, L) connected to an additional pair of the Josephson junctions (J5, J6) forming another dc SQUID. If the basic interferometer is in its state 0, the Josephson phase drop across junctions J5 and J6 is small, and this additional SQUID is in its superconducting state (zero dc voltage across J5). Switching of the basic interferometer to its state 1 leads to reduction of the critical current of the additional SQUID, and to its transfer to the resistive state (accompanied by continuous Josephson oscillations of junctions J5 and J6), i.e., to the appearance of a non-vanishing dc voltage V at the converter output. A similar converter can be nested on the RS flip-flop as well (Fig. 5b).
Such SFQ/DC converters have been repeatedly tested experimentally to operate with more than ±30% parameter margins, in good agreement with simulation results - see, e.g., Kaplunenko et al. (1989 a and b), Filippenko et al. (1991), and Polonsky et al. (1993 a).
Figure 5. SFQ/DC converters combined with (a) a T flip-flop and (b) an RS flip-flop. After Kaplunenko et al. (1989 a) and Likharev and Semenov (1991).
2.5. INFORMATION IN RSFQ CIRCUITS
In order to proceed to digital operations with signals as unusual as picosecond SFQ pulses, one needs an explicit definition of what digital information is in RSFQ circuits. Such a definition was probably the most important conceptual step made by Likharev et al. (1985 a).
According to this concept, any RSFQ circuit may be considered as consisting of "elementary cells" (or "timed gates") operating as Fig. 6 shows. Each cell has two or more stable flux states. The cell is fed by SFQ pulses which can arrive from signal lines S1, ..., Sn, and a clock (timing) line T. Each clock pulse marks a boundary between two adjacent clock periods by setting the cell into its initial state 1. During the new period, an SFQ pulse can arrive (or not arrive) at each of the cell inputs Si (Fig. 6b). Arrival of the SFQ pulse at a terminal Si during the current clock period defines the logic value 1 of the signal Si, while absence of the pulse during this period defines the logic value 0 of this signal. (Note that this convention does not require the exact coincidence of SFQ pulses in time; nor is a specified time sequence of the various input signals needed.) Each pulse can either change or not change the internal state of the cell, but it cannot produce any immediate reaction at its output terminal(s) Sout. Only the clock pulse T is able to fire out the pulse(s) Sout corresponding to the internal state of the cell, predetermined by the input signal pulses which have arrived during this period. The same clock pulse terminates the clock period by resetting the cell into its initial state. Thus, an elementary cell of the RSFQ family is equivalent to a usual asynchronous logic circuit coupled with a latch (flip-flop) storing its output bit(s) until the end of the clock period.
Figure 6. RSFQ logic: (a) elementary cell and (b) signal sequence. After Likharev and Semenov (1991).
3. Implementation of Single-Bit Functions: A Few Examples
3.1. YES (D CELL)
In order to understand how the above convention works, let us come back to the RS flip-flop (Fig. 4a) and suppose that the input S is fed from a signal line, while the clock pulses are fed into input R. The clock cycle starts from the clock pulse, resetting the system back into its state 0. If no signal pulse S arrived during the current clock cycle, then the new clock pulse R would trigger the SFQ pulse across J1 rather than J4, and no output signal appears at the output F. However, if this concluding clock pulse was preceded by the signal pulse S (switching the interferometer to state 1), then the clock pulse would trigger the SFQ pulse across J4, which will serve as the output signal (simultaneously with the new resetting of the flip-flop into state 0).
We see that the RS flip-flop in this case works in accordance with the above definition of an RSFQ elementary cell, performing the function YES, i.e. reproducing the input signal S, with its time delay until arrival of the clock pulse. In other words, this circuit works as a 1-bit stage of a shift register. Such RSFQ registers have been tested successfully by Mukhanov (1993) at frequencies up to 60 GHz with parameter margins up to ±30%.
3.2. NOT (INVERTER)
Figure 7 shows a possible implementation of an RSFQ inverter. This
circuit is also built around one interferometer (J2, L1, J3), but also
includes an additional Josephson junction J1. In the initial state 0
the higher current is flowing through J2, while J3 carries virtually
no current. This is why, in the absence of the input pulse W1, the
next clock pulse would trigger the SFQ pulse in J1 rather than in J3,
and this pulse would appear at the circuit output J1 (so that input 0
provides output 1). If the signal pulse W1 arrived, it would switch
the interferometer into the state 1, with a higher current in J3. In
this case, the next clock pulse would trigger the SFQ pulse across J3
rather than J1, the circuit would be reset, and no output pulse
developed (i.e., input 1 yields output 0).
Figure 7. A possible implementation of RSFQ inverter. After Kidiyarova-Shevchenko et al. (1991).
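At the logic level, the inverter behaves as a clocked cell with one stored bit. The following is a minimal behavioral sketch of that convention only; it does not model junction dynamics, and the class and method names are hypothetical, chosen for illustration.

```python
class InverterCell:
    """Behavioral model of a clocked inverter with one internal flux state.

    state 0: no input pulse has arrived in the current clock period.
    state 1: an input pulse has switched the storage loop.
    """

    def __init__(self):
        self.state = 0

    def signal(self):
        """Input pulse W1: switch the loop to state 1, no immediate output."""
        self.state = 1

    def clock(self):
        """Clock pulse: emit an output pulse only if no input arrived.

        Returns True when an output pulse is produced. The cell always
        ends the clock period in state 0.
        """
        fired = self.state == 0
        self.state = 0
        return fired


cell = InverterCell()
first = cell.clock()    # no input during this period -> output pulse
cell.signal()
second = cell.clock()   # input arrived -> no output pulse
third = cell.clock()    # cell was reset, no input -> output pulse again
print(first, second, third)
```

The output appears only at the clock pulse, never at the moment the input arrives.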
This circuit was suggested by Kidiyarova-Shevchenko et al. (1991) and recently tested experimentally by Polonsky et al. (1993 a) and Goldobin et al. (1993). The tests have shown that the inverter can have a critical-current margin approaching ±25%, and dc supply margin approaching ±30%, in good agreement with results of numerical simulations.
This (very typical) RSFQ cell also was used recently in two similar experiments by Goldobin et al. (1993) and Polonsky et al. (1993 a) for estimating the probability of rare errors. For that case, the inverter output was connected to its clock input through a JTL line. When an SFQ pulse was injected into this closed loop (through some additional circuitry), it circulated permanently with a frequency about 15 GHz, thus providing dc voltage V1 ~8 mV across the junctions of the JTL. At the margins of the parameter window, rare errors were clearly visible as random jumps of the voltage up (to a dc voltage of about 2V1 corresponding to two SFQ pulses circulating in the loop) and down (to zero voltage corresponding to no SFQ pulses). Preliminary estimates show that these errors were due to external interference rather than fundamental thermal fluctuations in the Josephson-junction shunts, because of unshielded environment with only rudimentary filtering. Even under these conditions, however, not a single jump has been observed by Polonsky et al. (1993 a) in the middle of the parameter window during 6 hours of observation, thus indicating that the error probability was less than ~3×10^-15/bit at T = 4.2 K.
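The quoted error bound follows from simple counting, which the sketch below reproduces. It assumes that each circulation of the pulse counts as one bit operation and that "no error in N operations" is read as a probability below 1/N; this is a rough bound, not a statistical confidence interval. The function name is illustrative.

```python
def error_probability_bound(clock_hz, hours_without_error):
    """Upper estimate of the per-bit error probability, 1 / (operations)."""
    operations = clock_hz * hours_without_error * 3600.0
    return 1.0 / operations


# About 15 GHz circulation frequency, 6 hours without a single jump.
bound = error_probability_bound(15e9, 6.0)
print(f"error probability below about {bound:.1e} per bit")
```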
This encouraging result should be compared with the much larger probability of 2×10^-7/bit measured by Durand et al. (1992) at the same temperature in the resistively-coupled SFQ T flip-flop. This discrepancy can be readily explained by the fact that the resistively-coupled flip-flop had very narrow parameter margins, so that no operating point inside this small parameter window was really stable.
Figure 8. Equivalent circuit of an RSFQ OR cell. After Polonsky et al. (1993 a).
Figure 8 shows a circuit suggested by Polonsky et al. (1993 a), which
implements the 2-input OR function. It consists of two quantizing
loops (J1, L1, J2) and (J5, L2, J6), each with two stable
states. Input signals IN1 and IN2 can switch these interferometers
from their initial state 0 (with larger currents flowing through
junctions J1 and J5, respectively) to state 1 (with larger currents
flowing through J2 and J6, respectively). If no inputs have come
during the current clock period, the next clock pulse triggers SFQ
pulses across Josephson junctions J4 and J8, which do not penetrate to
the circuit output. If the input pulse IN1 alone has arrived during
the clock period, and thus junction J2 carries high current, the clock
pulse CLK triggers the 2π-jump of the phase in this junction. The
generated SFQ pulse easily passes through the small inductance L3,
triggers the SFQ pulse across J9, and passes to the circuit output
through the superconducting junction J10 and small inductance
L5. Simultaneously, the SFQ pulse is developed across J12. If the
input pulse IN2 alone has arrived, a similar process will take place in
the lower part of this symmetric circuit. Finally, if both IN1 and IN2
have arrived, both junctions J2 and J6 develop SFQ pulses
simultaneously. These pulses trigger simultaneous SFQ pulses across J9
and J11, but only one SFQ pulse appears at the circuit output,
because these junctions are connected in parallel, as seen from the
output (in which case, junctions J10 and J12 remain superconducting).
When fabricated and tested by Kwong et al. (1993), this circuit has shown very large parameter margins for each of the dc bias currents I1-I4; estimated margins for the dc supply as a whole were close to ±30%, in reasonable agreement with results of numerical simulations by Polonsky et al. (1993 a).
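The logical behavior of the OR cell can be summarized as a function over a stream of pulse events. This is a sketch of the timing convention only, assuming two independent storage loops that are read out and reset by each clock pulse. The event labels reuse the port names IN1, IN2, and CLK from the text; the function name is hypothetical.

```python
def or_cell(events):
    """Return one output bit per CLK event.

    events is a sequence of the strings "IN1", "IN2", and "CLK".
    Input pulses may arrive in any order within a clock period.
    """
    loop1 = 0  # state of the loop set by IN1
    loop2 = 0  # state of the loop set by IN2
    outputs = []
    for event in events:
        if event == "IN1":
            loop1 = 1
        elif event == "IN2":
            loop2 = 1
        elif event == "CLK":
            # A single output pulse appears if either loop was set.
            outputs.append(1 if (loop1 or loop2) else 0)
            loop1 = loop2 = 0  # the clock pulse resets both loops
        else:
            raise ValueError(f"unknown event: {event!r}")
    return outputs


stream = ["CLK", "IN1", "CLK", "IN2", "CLK", "IN2", "IN1", "CLK"]
result = or_cell(stream)
print(result)
```

The last clock period in the example has both inputs present, in reversed order, and still yields a single output pulse.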
We have seen that the RS flip-flop (Fig. 4a) itself presents a
single-bit "D" cell of a shift register with destructive readout
(DRO). Figure 9 shows the equivalent circuit of a cell which allows a
non-destructive readout (NDRO) of its contents. Its quantizing loop
(J1, J4, L1, J2) can be switched between its two stable states by
write-in pulses W0 and W1. Setting pulse W1 triggers sequential
2π-jumps in junctions J2, J7, and J6, and triggers counterclockwise
circulation of the persistent current in the loop. Reset pulse W0
triggers a 2π-jump in junction J4 alone, and thus restores clockwise
circulation. Note that in neither case does an SFQ pulse penetrate to
the NDRO output port OUT, though one can easily pick up the usual DRO
output pulse from J2 or J7.
Figure 9. RSFQ NDRO memory cell. After Polonsky et al. (1993 a).
If the NDRO-initiating pulse arrives (from the input port
denoted as CLK in Fig. 9) when the cell is in its state 0 with
clockwise circulation of the persistent current, it triggers an SFQ
pulse across J3, but leaves junction J1 (with small net current) in
its superconducting state, thus producing no effect on the state of
the cell. If, however, this pulse arrives when the cell is in the
state 1 (counterclockwise circulation of the persistent current making
the bias of J1 subcritical), it triggers 2π-jumps sequentially in J1
and J5, thus providing the output SFQ pulse to the NDRO output
port. Note that for a moment the cell has changed its state from 1 to
0. However, the SFQ pulse across J5 passes through superconducting
junction J6, small inductances L2 and L4, is reproduced across J7, and
(passing through a small inductance L6) is fed back into the
quantizing loop. It triggers a 2π-jump in J2 and thus returns the cell
to its state 1 after a small (few-picosecond) delay.
This "delayed feedback" principle was successfully applied by Polonsky et al. (1993 a) to design several other circuits of the RSFQ family, including an RS flip-flop with two complementary outputs and demultiplexer. Numerical simulations and experimental testing of these circuits have shown that despite their relatively complex dynamics they can have quite decent parameter margins (about ±30%).
Note that the NDRO memory cell can be considered as an asynchronous single-bit multiplier of signals W1 and CLK, provided that the former of these pulses arrives before the latter one.
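The net logical effect of the NDRO cell, including its use as a single-bit multiplier, can be sketched as follows. The model is purely behavioral: the brief 1 -> 0 -> 1 excursion during readout is not represented, only its net result. The class and method names are hypothetical.

```python
class NdroCell:
    """Behavioral model of a memory cell with non-destructive readout."""

    def __init__(self):
        self.state = 0

    def write1(self):
        """Setting pulse W1: store a 1."""
        self.state = 1

    def write0(self):
        """Reset pulse W0: store a 0."""
        self.state = 0

    def read(self):
        """Readout pulse CLK: output pulse only if a 1 is stored.

        The stored bit is unchanged after the readout.
        """
        return self.state == 1


cell = NdroCell()
empty_read = cell.read()                # nothing stored -> no output
cell.write1()
reads = [cell.read(), cell.read()]      # repeated reads keep the content
cell.write0()
after_reset = cell.read()
print(empty_read, reads, after_reset)
```

Seen as a multiplier, the output pulse appears only when W1 has arrived and is then followed by CLK, which is the ordering condition stated above.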
A few other elementary cells including XOR (see Mukhanov et al. (1989 a)), AND (see Kirichenko et al. (1991)), and OR-AND (see Mukhanov et al. (1991)) cells, full and serial single-bit adders (see Mukhanov et al. (1987), Kidiyarova-Shevchenko et al. (1991), and Hamilton and Gilbert (1991)), and various multiplexers and demultiplexers (see Mukhanov et al. (1991) and Polonsky et al. (1993 a)), have been suggested, and some of them have been tested experimentally - see Polonsky et al. (1993 a) and Benz et al. (1993). General principles of their design are similar to those illustrated above. Now I shall discuss parameters of these cells.
τ0, where τ0 is a natural unit of time:
At this unparalleled speed, timing becomes an issue of primary importance. In particular, external (global) synchronization of VLSI circuits at frequencies beyond 100 GHz is relatively impractical. Fortunately, clock pulses for RSFQ cells are physically identical to the signal pulses, and hence can be generated inside RSFQ circuits. Thus, the external global synchronization can be complemented (and sometimes replaced) by very convenient local self-timing.
Figure 10. The simplest clock distribution systems: (a) "counterflow" and (b) "concurrent flow" of the clock (single lines) and data (double lines). After Mukhanov et al. (1989 a).
For relatively simple fragments of RSFQ circuits, self-timing schemes
may be quite straightforward. Let us consider, for example, a
shift-register-type structure (Fig. 10). (Here and below the data
transfer will be denoted by double lines, and the clock by single
lines). The first task for this structure is a single-step shift of
all the data. It is easily achieved (Fig. 10a) by sending the clock
pulse in the direction opposite to the desired signal propagation
direction, with an appropriate time delay τc per gate:
For more complex circuits, however, another ("hand-shaking")
approach may be preferable. In this approach (borrowed by Mukhanov et
al. (1989 b) from high-speed semiconductor electronics), each fragment
(cell or block) of the system is complemented by a special circuit
which generates the clock pulses for its correspondents. Figure 11a
shows such a circuit for a shift-register type structure, while
Fig. 11b shows the "coincidence junction" circuit employed for the
timing. The latter circuit generates the output SFQ pulse only when
each of its two inputs has been fed by an SFQ pulse.
Figure 11. RSFQ hand-shaking: (a) general scheme and (b) equivalent
circuit of the coincidence junction C. After Mukhanov et al. (1989 b).
Let the register be filled by a string of data and all
coincidence junctions have received their acknowledgment (ACK)
pulses. The cell is waiting for the arrival of the SEND pulse
signaling that the receiver cell is reset and hence ready to accept
new data. This pulse triggers the coincidence junction which first
produces the clock pulse T for its native cell and thus forces it to
send its output signal to the receiver. Simultaneously, this pulse is
duplicated as the ACK signal necessary to set up the coincidence
junction of the receiver, and (after an appropriate delay) as the
SEND signal for the next (sending) cell. As a consequence, the entire
data string will eventually be shifted by one step to the right. With
a different initial setting, the same clock distribution system can
operate in the "load" mode similar to that of the circuit shown in
Fig. 10b.
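The hand-shaking sequence described above can be mimicked at a purely behavioral level. The following Python sketch is only an illustration under simplifying assumptions: SFQ pulses are treated as ideal events processed in order, all physical delays are ignored, and the names `CoincidenceJunction` and `shift_once` are hypothetical rather than taken from the original circuits.

```python
from collections import deque


class CoincidenceJunction:
    """Behavioral model: fires only after both inputs have received a pulse."""

    def __init__(self, ack=False, send=False):
        self.ack = ack
        self.send = send

    def pulse(self, name):
        """Register a pulse on input 'ack' or 'send'; return True if the junction fires."""
        setattr(self, name, True)
        if self.ack and self.send:
            self.ack = self.send = False  # firing resets the junction
            return True
        return False


def shift_once(data):
    """Apply one external SEND pulse to a full register; return (new_cells, output_bit)."""
    n = len(data)
    cells = list(data)
    # Full register: every coincidence junction has already received its ACK pulse.
    junctions = [CoincidenceJunction(ack=True) for _ in range(n)]
    output = None
    events = deque([(n - 1, "send")])  # the external receiver signals that it is ready
    while events:
        i, name = events.popleft()
        if not junctions[i].pulse(name):
            continue
        # Clock pulse T: cell i releases its bit to the receiver.
        bit, cells[i] = cells[i], None
        if i + 1 < n:
            cells[i + 1] = bit
            events.append((i + 1, "ack"))   # the receiver now holds data
        else:
            output = bit
        if i > 0:
            events.append((i - 1, "send"))  # tell the sender that cell i is empty
    return cells, output


print(shift_once([1, 0, 1, 1]))
```

In this model a single SEND pulse applied to the last cell propagates backward through the register, and every bit moves one step to the right, with the last bit leaving the register.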
The hand-shaking approach allows one to design self-timing
circuits (providing appropriate time delays) locally, being sure that
they can be later united to an arbitrary LSI circuit without any
problem for correct operation of the system as a whole. It also allows
one to replace the clock generators by clock controllers which would
automatically adjust their pace to that of the slowest fragment of the
controlled circuit - for a detailed discussion, see Likharev and
Semenov (1991).
In complex VLSI circuits it probably will be beneficial to
combine high-speed local timing (i.e., asynchronous operation) of the
circuit fragments with the slower global timing of the circuit as a
whole, thus ensuring its compatibility with the usual synchronous
operation of its semiconductor environment.
Figure 12 shows a possible structure of an RSFQ circuit performing
serial multiplication of two multi-bit numbers A and B. The block uses
three types of single-bit elementary cells: DRO cells D (e.g., simple
RS flip-flops, Fig. 4a), NDRO cells N (Fig. 9), and full adders FA
(operating in the carry-save mode, i.e., as the serial
adders). Operation of the block is controlled by two sequences of the
clock pulses, TA and TB.
Figure 12. A possible structure for a serial RSFQ multiplier. After
Mukhanov et al. (1989 a).
Figure 13. Parallel RSFQ multiplier: (a) general structure and (b)
internal structure of the unit M (units D and H can be similar in
structure, but perform only some of the functions of the unit
M). After Mukhanov et al. (1989 a).
The operation is started by rapid loading of all n bits of
number B into the cells - such loading can be provided, e.g., by the
concurrent-flow timing scheme shown in Fig. 10b. Then the clock train
TA starts, inducing gradual (one bit per clock cycle) loading of the
number A into the shift register formed by the D cells. The latter
train triggers also a simultaneous backward motion of the bits along
the string of the full adders FA.
One may be readily convinced that at the end of each operation
cycle (taking 2n clock periods) the circuit really does produce the
correct 2n-bit product AxB at its output terminal P. (In fact, all the
partial single-bit multiplications are performed by the NDRO cells,
while the serial adders FA merely sum up all the partial products.)
Note that the circuit is occupied by the operands and the
product during all 2n clock cycles, so that its throughput (number of
operations per second) is relatively low - as an example, see Table 1
below.
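The principle of serial multiplication (single-bit partial products summed by carry-save adders over 2n clock periods) can be illustrated by a short behavioral model. The Python sketch below implements a generic textbook serial-parallel carry-save multiplier; it is not claimed to reproduce the exact wiring of Fig. 12, and the function name `serial_multiply` is hypothetical.

```python
def serial_multiply(a, b, n):
    """Multiply two n-bit integers bit-serially; returns the 2n-bit product."""
    b_bits = [(b >> j) & 1 for j in range(n)]   # B is stored in place before the run
    sums = [0] * n      # sum bit latched by each adder stage
    carries = [0] * n   # carry bit saved by each adder stage
    product = 0
    for t in range(2 * n):                      # 2n clock periods per operation
        a_t = (a >> t) & 1 if t < n else 0      # A enters serially, LSB first
        new_sums, new_carries = [0] * n, [0] * n
        for j in range(n):
            partial = a_t & b_bits[j]           # single-bit multiplication (AND)
            from_next = sums[j + 1] if j + 1 < n else 0
            total = partial + carries[j] + from_next
            new_sums[j], new_carries[j] = total & 1, total >> 1
        sums, carries = new_sums, new_carries
        product |= sums[0] << t                 # one product bit leaves per clock
    return product


print(serial_multiply(13, 11, 4))
```

Each stage handles one binary weight relative to the current output bit, so the saved carry and the sum arriving from the neighboring stage always have the same weight as the new partial product.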
The same operation can be carried out with much higher throughput
using parallel-pipeline single-bit units. Figure 13 shows an example
of such a device, a multiplier of two n-bit numbers A and B - in this
example, n = 4. Note that the unit M is a copy of a single column of
the serial multiplier (Fig. 12); the only difference is the method of
connecting the units and their timing.
c,
>
,
is the clock period, and
is the maximum logic delay of the
cell. Another simple timing method is shown in Fig. 10b. This scheme
can work correctly only if the equation is satisfied,
and the whole
register is initially empty. In this case, a single clock pulse will
trigger a motion of each single input bit along the whole
register. Similar timing schemes can be used also for two-dimensional
RSFQ structures - see Mukhanov et al. (1989 a).
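The difference between the two clock distribution schemes of Fig. 10 can be seen in a toy model. The Python sketch below is an idealization: a clocked cell hands its bit to the right-hand neighbor instantly, so the only thing that matters is the order in which the clock pulse visits the cells. The function names are hypothetical.

```python
def clock_pulse(cells, order):
    """Send one clock pulse through the register, visiting cells in the given order.

    A clocked cell passes its bit (if any) to its right-hand neighbor; the
    last cell passes its bit to the output.
    """
    cells = list(cells)
    output = []
    for i in order:
        bit, cells[i] = cells[i], None
        if bit is None:
            continue
        if i + 1 < len(cells):
            cells[i + 1] = bit
        else:
            output.append(bit)
    return cells, output


def counterflow(cells):
    # Clock travels against the data: the rightmost cell is clocked first.
    return clock_pulse(cells, range(len(cells) - 1, -1, -1))


def concurrent_flow(cells):
    # Clock travels with the data: the leftmost cell is clocked first.
    return clock_pulse(cells, range(len(cells)))


print(counterflow([1, 0, 1, 1]))
print(concurrent_flow([1, None, None, None]))
```

With counterflow timing every bit advances by exactly one cell per clock pulse. With concurrent flow a single clock pulse carries a bit through the whole register, which is why the register must be empty beforehand in this mode.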
5. RSFQ Arithmetic: Two Examples
5.1. SERIAL MULTIPLICATION
5.2. PARALLEL MULTIPLICATION
| Circuit type; fabrication technology | Design rules (μm) | Integration scale (10^3 Josephson or p-n junctions) | Throughput (10^9 operations per second) | Latency (ns) |
|---|---|---|---|---|
| Parallel-pipelined; Si-MOS | 1.0 | 200.0 | 0.2 | 150 |
| Parallel; JJ latching | 2.5 | 70.0 | 0.5 | 2 |
| Serial; JJ RSFQ | 2.5 | 1.5 | 0.5 | 2 |
| Parallel-pipelined; JJ RSFQ | 2.5 | 40.0 | 30.0 | 2 |
If clock pulses are fed into the vertical columns of the structure sequentially, from right to left (for example, using the counterflow scheme shown in Fig. 10a, with proper precautions to avoid clock skew), the signal bit front moves from left to right, the data being processed simultaneously. If fed in parallel by all 2n bits of new operands (A, B) each clock period, this "pipeline" multiplier produces all 2n bits of the product P = AxB simultaneously, during one period. Thus, the throughput of this circuit may be extremely high (one product per clock cycle), although full processing of each specific pair of operands takes 2n+1 clock periods. The price for this high throughput is additional hardware (compare the integration scales in Table 1). An example of performance of these two RSFQ multipliers is given in Table 1.
One can see that serial RSFQ devices can combine operation at reasonably high speed with exceptional simplicity. At the same time, parallel RSFQ devices can provide unprecedented throughput. A flexible trade-off between these two performance factors is also possible. For example, one can increase the speed of the serial multiplier by using parallel m-bit multipliers (m < n) instead of single-bit multipliers. Such a device would require a factor of m more junctions, but would have an m-times larger throughput.
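The throughput and latency relations quoted above (2n clock periods per product for the serial device, one product per clock and 2n+1 periods of latency for the pipelined one) are easy to evaluate numerically. In the Python sketch below the operand width and the clock frequency are my assumptions, chosen only for illustration; neither value is stated in the text.

```python
def serial_multiplier(n_bits, clock_hz):
    """Serial multiplier: busy for 2n clock periods per product."""
    periods = 2 * n_bits
    return clock_hz / periods, periods / clock_hz   # (throughput, latency)


def pipelined_multiplier(n_bits, clock_hz):
    """Parallel-pipeline multiplier: one product per clock, 2n+1 periods of latency."""
    periods = 2 * n_bits + 1
    return clock_hz, periods / clock_hz             # (throughput, latency)


N_BITS = 32        # assumed operand width (not stated in the text)
CLOCK_HZ = 30e9    # assumed clock frequency (not stated in the text)

for name, model in [("serial", serial_multiplier), ("pipelined", pipelined_multiplier)]:
    throughput, latency = model(N_BITS, CLOCK_HZ)
    print(f"{name}: {throughput / 1e9:.2f} x 10^9 ops/s, latency {latency * 1e9:.2f} ns")
```

With these assumed values both devices show a latency of about 2 ns, while their throughputs differ by the factor 2n, which is the trade-off discussed above.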
The most important feature of Table 1 is that RSFQ circuits can provide a considerable improvement (at least two orders of magnitude) in performance in comparison to present-day semiconductor digital technologies.
The simplest and hence the most immediate application of RSFQ
technology is analog-to-digital conversion. The reasons for this are
the extremely high switching speed of the Josephson junctions (and
hence the very short aperture time τa of the converters) and the natural
2π-quantization of the Josephson phase (i.e., the Φ0-quantization of
the magnetic flux).
Two types of Josephson-junction A/D converters have been developed during the last decade: parallel and serial ones - for a recent review, see Lee and Peterson (1989). The former converters are usually believed to be able to provide the largest signal bandwidths. However, they need simultaneous delivery of ultrafast sampling waveforms to each of their n samplers, and high-precision analog input signal dividers. Both factors limit the effective aperture time, so that the best experimentally demonstrated performance of the converters - see the recent work by Bradley (1993) - is comparable to that of their commercially available semiconductor counterparts. I believe that much better prospects exist for serial converters. A possible basis of these latter devices, the so-called ripple counter, was proposed more than a decade ago by Hurrell and Silver (1978). Its practical application was, however, delayed until a method could be found to carry out ultrafast read-out and preliminary processing of the digital output of the device. RSFQ circuits provide such a method.
Figure 14 shows a possible structure for an RSFQ A/D converter
- see Rylov (1991) and Likharev and Semenov (1991). Its first part (an
SFQ comparator) is essentially the two-junction interferometer (J1,
J2, L) formed by the lower junctions of two Josephson-junction pairs
which are fed by SFQ pulse trains of very high frequency (say, 100
GHz). If the flux Φe = MI applied to the interferometer is increasing
and exceeds a certain value Φup, the next SFQ clock pulse will trigger
a 2π-jump of the phase in J2, and an SFQ pulse will enter (through a
small inductance L2) the digital part of the device. If the flux Φe is
decreased below a (typically different) value Φdown, a similar pulse
will be generated by the lower part of the circuit. The digital part
of the device starts with the corrector, which cancels the pulses in
both channels if they are generated during one SFQ clock cycle. (This
outcome leads to the correct result eventually, because the numbers of
pulses coming from the two channels will be subtracted in the following
circuit.)
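The counting principle of the comparator and corrector can be sketched behaviorally. In the Python model below the flux is measured in units of the flux quantum, the two thresholds are illustrative values of my choosing, and junction dynamics and noise are ignored altogether; the function name is hypothetical.

```python
import math


def comparator_corrector(flux, phi_up=0.6, phi_down=-0.6):
    """Convert a sampled flux waveform (in units of Phi0) into up/down pulse counts.

    Returns the running count (ups minus downs) after each clock period.
    The thresholds are illustrative values, not taken from the source.
    """
    stored = 0            # net number of flux quanta already counted
    counts = []
    for phi in flux:      # one sample per SFQ clock period
        up = phi - stored > phi_up
        down = phi - stored < phi_down
        if up and down:   # corrector: simultaneous pulses cancel
            up = down = False   # (cannot occur in this noiseless model)
        stored += int(up) - int(down)
        counts.append(stored)
    return counts


# Slowly varying test signal: 5 Phi0 amplitude, 1000 clock periods per cycle.
signal = [5 * math.sin(2 * math.pi * k / 1000) for k in range(2000)]
counts = comparator_corrector(signal)
worst = max(abs(s - c) for s, c in zip(signal, counts))
print(f"worst-case tracking error: {worst:.2f} Phi0")
```

As long as the input changes by less than one flux quantum per clock period, the difference of the two pulse counts follows the signal to within the threshold value.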
Figure 14. Serial (counting) RSFQ A/D converter: (a) comparator and corrector, and (b) decimation filter. After Rylov (1991) and Likharev and Semenov (1991).
Figure 14b shows the digital part of the converter, the decimation filter. The circuit is timed by a continuous train of SFQ pulses. Two consecutive counts from each channel of the comparator-corrector unit are averaged by the delay of each count by one clock period in a simple register cell D and by addition of its contents to the next count. These averaged signals arrive at inputs of a 4-bit reversible counter using T0 cells (Fig. 4c), so that the register contents (in the complementary code) can be destructively read out. After every two clock periods the contents of the counter are read out (destructively) into the similar, but six-bit, binary counter ("accumulator"), again with signal averaging by additional D cells. After every four clock periods, a similar operation is fulfilled with the contents of the latter accumulator, etc., until the reloading frequency is reduced to the Nyquist frequency 2Fmax of the signal. (Note that T flip-flops in the clock distribution line provide division of the clock pulse frequency by two at each stage.)
One can readily be convinced that each accumulator provides the correct differential digital code of the input signal (except that all odd stages yield this information in the complementary code). The farther to the right a counter is, the larger is the number of its output bits (by two bits per stage) and the lower is its output frequency (by a factor of two per stage). An analysis of the device operation - see Rylov (1991) - shows that the 25% least significant bits of the output represent the quantization noise and should be neglected, while the remaining bits are correct. (It is interesting and important that this result is not affected by moderate thermal noise.) Thus the overall accuracy of the converter is given by the formula
Note that the formula coincides with that for semiconductor
single-stage sigma-delta modulators - see, e.g., Candy and Temes
(1992). In Josephson-junction technology a direct implementation of
the sigma-delta modulators is, however, possible - see Przybysz et
al. (1993) and Xiao and Van Duzer (1993) - only with a substantial
loss of signal-to-noise ratio, because of the absence of convenient
analog integrators.
The effective digital filtering described above can also be used for
the development of digital SQUIDs which would combine the high sensitivity
of analog versions with a much higher slew rate and a virtually
unlimited dynamic range - see Rylov (1991) and Likharev and Semenov
(1991). Consider the simple example shown in Fig. 15. The usual
analog dc SQUID (J1, J2, L1) senses the current flowing in the input
coil L2 of its dc transformer, and produces proportional changes of
its output dc voltage V. After low-pass filtering, this signal
controls the SFQ-clock-driven comparator formed by junctions J3-J5. If
the dc signal I is above a certain threshold, the clock pulse T
triggers the 2π-jump in junction J4, which is then applied to the
clocked inverter and the register cell D. As a result, an SFQ pulse
appears at the direct digital output, and is also injected into the
positive (bottom) arm of the feedback loop. After passing through the
Josephson transmission line (J7, J9, J11), the single flux quantum is
injected into the pick-up coil of the SQUID. The polarity of the
injection ensures reduction of the input flux applied to the dc SQUID
by
Figure 15. A simple RSFQ digital SQUID. After Rylov (1991) and
Likharev and Semenov (1991).
Performance of this on-chip negative feedback is generally
similar to that implemented by Fujimaki et al. (1988), with the
important exception that in the RSFQ version the clock frequency f can
be much larger (beyond 100 GHz instead of 0.5 MHz). As a result, the
slew rate s =
One can also use RSFQ circuits to implement low-frequency A/D
converters with the same property, but with active (and virtually
infinite) impedance. Such devices, suggested by Semenov and Voronova
(1989), could be used as digital voltmeters with a precision in excess
of 10 decimal digits.
The next most suitable application of this new technology is digital
signal processing, because many algorithms in this field can be
implemented with relatively small on-chip memory. A typical example of
such processing is digital filtering - e.g., see Kung et al. (1985) -
where a linear filter with virtually any transfer function can be
constructed of standard second-order sections. Such sections can be
readily implemented using RSFQ circuits, and can allow superfast
digital signal processing speeds. According to the estimate by
Likharev and Semenov (1991), a second-order section handling a
continuous flow of 32-bit numbers would require some 12,000 Josephson
junctions and could have a throughput up to 5x10^8 numbers per second,
even if implemented in standard present-day 2.5-μm technology.
Thus, RSFQ circuits could extend the well-known complex
methods of digital signal processing from "acoustic" frequencies
(tens of kHz) to "radio" frequencies (tens and hundreds of MHz)
typical for bandwidths of radar, TV, and communication systems.
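For readers less familiar with digital filtering, the second-order section mentioned above can be written in a few lines. The Python sketch below uses the standard direct-form difference equation and floating-point arithmetic for brevity (a hardware section would operate on fixed-width numbers); the function names and coefficient values are hypothetical illustrations.

```python
def second_order_section(x, b, a):
    """Filter sequence x with one second-order (biquad) section.

    y[k] = b0*x[k] + b1*x[k-1] + b2*x[k-2] - a1*y[k-1] - a2*y[k-2]
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0   # delayed input and output samples
    y = []
    for xk in x:
        yk = b0 * xk + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xk
        y2, y1 = y1, yk
        y.append(yk)
    return y


def cascade(x, sections):
    """Build a higher-order filter by chaining second-order sections."""
    for b, a in sections:
        x = second_order_section(x, b, a)
    return x


impulse = [1.0, 0.0, 0.0, 0.0]
print(second_order_section(impulse, (1.0, 0.0, 0.0), (-0.5, 0.0)))
```

Each output sample needs five multiplications and four additions, so the throughput of such a section is set directly by the speed of the multipliers and adders discussed in Sec. 5.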
In contrast with digital signal processing, a universal von
Neumann-type computer is probably the most difficult system for
improving performance with RSFQ (or any other superfast)
technology. The reason is that such a system relies on frequent data
exchange between the processor and memory, with the exchange rate
being limited by at least the speed of light (~100 ps per 1-cm
distance), and by present-day chip packaging technologies (~1 ns for
data transfer to/from the chip). Direct implementation of such a
computer using the RSFQ logic and an existing superconductor memory -
e.g., see Kurosawa et al. (1989), Nagasawa et al. (1989), and Tahara
et al. (1991) - would give a system some 10 times faster than those
achievable with other existing technologies - see estimates by
Likharev and Semenov (1991). This advantage may not be large enough to
justify transfer to superconductor technology, with its necessity for
helium refrigeration.
Nevertheless, there are several factors which encourage me to
estimate the situation as not completely gloomy:
Josephson-junction RSFQ circuits can perform logic and arithmetic
functions at extremely high (sub-terahertz) clock frequencies, just a
few times lower than the maximum internal speed τ0 = h/Δ(T) of the
superconductors employed. These circuits seem to represent the fastest
digital technology currently available. A list of other advantages of
this technology includes:
These impressive advantages are not meant to imply that RSFQ
circuits are free of problems, but we should distinguish fictional
problems (advertised in some recent popular publications) from real
ones. One of the claims was that one could not avoid the effects of
the parasitic flux trapping in superconducting thin films, especially
in the ground plane, of an integrated RSFQ circuit. However, the
experience of teams that worked on the IBM and MITI projects shows that
this problem can be solved by fairly simple magnetic field shielding.
Another myth is that the low amplitude (a few hundred
microvolts) and short duration (a few picoseconds) of SFQ pulses would
make fast communications between RSFQ circuits and a semiconductor
electronic environment impossible. In fact, the existing SFQ/DC
converters can deliver several hundred microvolts of dc voltage at
their outputs in just a few tens of picoseconds. There seems to be no
conceptual problem in the development of special superconductor
drivers which would raise the output voltage to ~10 millivolts in a
few hundred picoseconds - e.g., see Suzuki et al. (1990) - and
semiconductor amplifiers with similar speed capable of reliable
readout of these signals.
Finally, a popular myth stated that RSFQ circuits were
"unreliable." This claim was never exactly formulated, and is thus
very hard to cope with, and I can only hope that recent measurements
of the rare error rate (see Sec. 3.2 above) would put an end to this
myth.
Dealing with real problems, the first one to consider is the
necessity for liquid helium cooling of low-Tc Josephson-junction
circuits. Despite recent progress in closed-cycle cryocooler
technology - see, e.g., Kotani et al. (1991) - refrigeration may
create serious problems for some potential users, and one needs to
have a sizable circuit performance advantage in order to justify the
related pain. I believe that properly designed RSFQ circuits and
systems will be able to demonstrate such a performance edge in just a
few years.
Another major problem is the absence of large
Josephson-junction memories; demonstrated RAM chips had shown a decent
access time of the order of 500 ps, but only of a few-kbit size. I do
not see anything inherently wrong with implementation of high-density
Josephson-junction RAMs retaining nearly the same speed, but of course
their development would require a large investment of effort and
money. Right now these resources are not in sight; probably they could
be obtained only when RSFQ technology proves its practical value on
the level of unique small-scale (say, single-chip) devices.
Hopefully, this new digital technology will be able to survive
its painful evolution and perform a real breakthrough in superfast
processing of information.
Multiple discussions of the problems addressed in this paper with many
colleagues are gratefully acknowledged. The author is grateful to
Dr. H. Weinstock for careful editing of the manuscript and many useful
suggestions. This work was partly supported by DoD within the
framework of the University Research Initiative (AFOSR Grant #
F49620-92-J-0508) and by DARPA through the Consortium for
Superconducting Electronics (Prime Contract #MDA 972-90C-0021).
$$\frac{1}{2^n} = (F_{\max}\tau_a)^{3/2},$$
where τa is the aperture time of the counter (Fig. 14a), of the order
of 5τ0. In other words, due to oversampling and averaging of the
sequential counts, the A/D converter accuracy improves by 1.5 bits for
every octave of reduction in the signal bandwidth Fmax. It means, for
example, that for 2.5-μm design rules, n~16 correct bits can be
obtained for a signal bandwidth of 100 MHz. To the best of our
knowledge, this unique performance cannot be challenged by
semiconductor devices, either current or projected.
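The scaling of 1.5 bits per octave can be turned into a quick estimate for other bandwidths. The Python sketch below uses only the two facts quoted in the text (the scaling law and the reference point of about 16 bits at 100 MHz); the function name and the list of bandwidths are my own choices for illustration.

```python
import math


def extra_bits(f_ref_hz, f_max_hz):
    """Bits gained (or lost) relative to a reference bandwidth: 1.5 bits per octave."""
    return 1.5 * math.log2(f_ref_hz / f_max_hz)


N_REF, F_REF = 16, 100e6   # quoted in the text: n ~ 16 bits at 100 MHz

for f_max in (1e9, 100e6, 10e6, 1e6):
    n = N_REF + extra_bits(F_REF, f_max)
    print(f"Fmax = {f_max / 1e6:7.1f} MHz -> n ~ {n:.1f} bits")
```

Each decade of bandwidth reduction is about 3.3 octaves and hence adds roughly 5 bits, provided the other parameters of the converter stay the same.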
6.2. DIGITAL SQUIDS

a small fraction of Φ0 set by the feedback factor (<< 1). This injection of
single flux quanta would continue each clock period until the analog
output V of the dc SQUID reduces the input current I of the comparator
below its threshold. As a result, the junction J5 rather than J4 would
be switched each clock period, leading to a supply of SFQ pulses to
the reverse digital output and insertion of one flux quantum per
clock period into the negative (top) arm of the feedback
loop. Hence, the proper operation point (right at the threshold) is
approached from either side.
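The operation of this digital feedback loop can be illustrated by a simple behavioral model. In the Python sketch below all fluxes are in units of the flux quantum, the comparator threshold is taken at zero residual flux, and the analog SQUID and its low-pass filter are idealized as an instantaneous subtraction; these simplifications and the function name are my assumptions.

```python
def digital_squid(input_flux, feedback=0.1):
    """Track an input flux (in units of Phi0) with clocked SFQ feedback.

    Each clock period exactly one output fires: +1 (direct output) if the
    residual flux is above threshold, -1 (reverse output) otherwise.
    """
    net_pulses = 0
    pulses = []
    for phi in input_flux:                       # one sample per clock period
        residual = phi - net_pulses * feedback   # flux still seen by the analog SQUID
        pulse = 1 if residual > 0 else -1        # threshold taken as zero residual
        net_pulses += pulse
        pulses.append(pulse)
    return pulses


pulses = digital_squid([3.27] * 200)
estimate = 0.1 * sum(pulses)
print(f"estimate after 200 clocks: {estimate:.2f} Phi0")
```

After an initial run of pulses at the direct output, the loop settles into alternating direct and reverse pulses around the threshold, and the net pulse count times the feedback factor gives the input flux to within one feedback step.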

(feedback factor) x Φ0 f can be as large as ~10^10 Φ0/s even in this simplest
version of the digital SQUID. (We have used a realistic feedback factor
of 10^-1, which would make the additional quantization flux noise,
(feedback factor) x Φ0/f^(1/2) = 3x10^-7 Φ0/Hz^(1/2), comparable with
the intrinsic noise of the best practical dc SQUIDs.) More complex
versions of the digital feedback using SFQ pulse frequency
multiplication (see Semenov (1993)) can presumably increase the slew
rate further without an increase of the SQUID noise. The SFQ digital
output of the device presents the time derivative of the signal, just
as in the A/D converters considered above, so that its
counting/filtering can be fulfilled by the device shown in
Fig. 14b. Note that digital SQUIDs like that shown in Fig. 15 have a
virtually unlimited dynamic range, because the static input inductance
of the device is infinite, and current in the pick-up coil is never
large. (This fact also allows one to use large and/or remote pick-up
coils.)
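The slew rate and quantization noise figures quoted for the digital SQUID can be checked by direct arithmetic. The Python sketch below assumes a clock frequency of exactly 100 GHz (the text says only "beyond 100 GHz") and expresses all results in units of the flux quantum.

```python
import math

FEEDBACK = 0.1      # feedback factor used in the text
CLOCK_HZ = 100e9    # assumed clock frequency (the text says "beyond 100 GHz")

slew_rate = FEEDBACK * CLOCK_HZ              # in Phi0 per second
flux_noise = FEEDBACK / math.sqrt(CLOCK_HZ)  # in Phi0 per sqrt(Hz)

print(f"slew rate  ~ {slew_rate:.1e} Phi0/s")
print(f"flux noise ~ {flux_noise:.1e} Phi0/Hz^0.5")
```

A larger feedback factor would raise the slew rate but also the quantization noise, in direct proportion, which is why the frequency multiplication mentioned above is attractive.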
6.3. DIGITAL SIGNAL PROCESSING
6.4. COMPUTING
8. Conclusion. RSFQ: Advantages and Problems
Acknowledgments
References