Image processing, the application of signal processing techniques to two-dimensional images such as photographs or video, is well known in the electronics world [1]. It is key to object recognition and classification, which are critical for automated decision making using machine learning in emerging technologies including robotic vision for self-driving cars [2], remote drones [3], automated in-vitro cell-growth tracking for virus and cancer analysis [4], optical neural networks [5], ultrahigh speed imaging [6, 7] and many others. Many of these require real-time responses to massive volumes of real-world information, which places extremely high demands on processing bandwidths. Digital processing of images using algorithms on digital computers, a branch of digital signal processing (DSP) [8], is well established but will be inadequate to meet these extreme demands due to limitations in processing speed (i.e., the electronic bandwidth) and the well-known von Neumann bottleneck [9].
Photonic RF techniques [10-15] have attracted significant interest over the past two decades due to their ability to provide ultra-high bandwidths, low transmission loss, and strong immunity to electromagnetic interference. They can perform signal processing functions in the optical domain, thus alleviating the bandwidth limitations imposed by analog-to-digital converters [14] and digital electronics for DSP. Photonics has enabled significant emerging technologies, such as LIDAR for autonomous vehicles, which provides greatly enhanced performance under adverse environmental conditions compared with simple camera-based imaging [2].
Here, we demonstrate a photonic analog video image processor based on a soliton crystal Kerr micro-comb source in an integrated micro-ring resonator (MRR) [5, 16-18]. We employ a reconfigurable photonic transversal structure, using up to 75 taps, or wavelengths, to achieve a range of image processing functions including edge enhancement, edge detection and motion blur, implemented through variable fractional order Hilbert transforms and differentiation, as well as integration and bandpass filtering. The processing speed reaches 54 GigaBaud (pixels/s), capable of processing 10,000 video signals (1,200 high definition signals) in parallel. The experimental results agree well with theory, verifying our photonic processor as a new and competitive approach to analog image and video processing, with a broad operation bandwidth, high scalability and reconfigurability, and potentially reduced cost and footprint.
Soliton crystal micro-combs
Integrated Kerr optical frequency combs based on micro-cavity resonators, or micro-combs [16-19], have achieved significant breakthroughs in key emerging applications, highlighted by range finding (LIDAR) [2], as well as many other areas including spectroscopy [19, 20], communications [17, 21], optical neural networks [5], frequency synthesis [22], optical ranging [2, 23, 24], quantum sources [25, 26], metrology [27], and microwave photonics [15, 28-30].
Recently, a powerful category of micro-combs, soliton crystals, has attracted interest due to their crystal-like profile in the angular domain of micro-ring resonators [17, 31-33]. They have underpinned breakthroughs in microwave and RF photonics [15, 30], ultrahigh bandwidth communications [17] and optical neuromorphic processing [5]. Their robustness is central to providing stable micro-combs without the need for complex feedback systems. They can be generated simply and deterministically, driven by a mode crossing-induced background wave interacting via the Kerr nonlinearity at high intra-cavity powers. Because the intra-cavity energy of the soliton crystal state is similar to that of the chaotic state from which it originates, there is no significant change in intra-cavity energy upon generation. Hence, there is very little of the self-induced thermal detuning shift, known as the characteristic ‘soliton step’, that requires complex tuning methods [17, 21] to compensate for in the case of single solitons. This has the important result that soliton crystals can be generated through manual adiabatic pump wavelength sweeping, a simple and reliable initiation process that also results in a much higher energy efficiency (the ratio of optical power in the comb lines to the pump power) [17].
The MRR used to generate the soliton crystal micro-combs was fabricated in a CMOS compatible doped silica glass platform [16, 17], with a Q factor of ~1.5 million, a radius of ~592 μm, and an FSR of ~0.393 nm (48.9 GHz). This is a very small FSR for an integrated micro-comb source and is critical to this work, since it results in a large number of wavelengths over the C-band. The chip was coupled to a fibre array with integrated mode converters, featuring a fibre-chip coupling loss of only 0.5 dB/facet. The cross-section of the waveguide was 3 μm × 2 μm, yielding anomalous dispersion in the C band as well as a mode crossing at ~1552 nm.
To generate the micro-combs, a CW pump laser was amplified to 30.5 dBm and its wavelength manually swept from blue to red. When the detuning between the pump wavelength and the MRR’s cold resonance became small enough, the intra-cavity power (Fig. 1 (b)) reached a threshold and modulation instability (MI) driven oscillation resulted. Primary combs (Fig. 1 (ii), (iii)) were generated with a spacing determined by the MI gain peak, a function of the intra-cavity power and dispersion. As the detuning was changed further, a second jump in the intra-cavity power was observed, where the distinctive ‘fingerprint’ optical spectra (Fig. 1 (iv)) of soliton crystals appeared [17, 31-33]. Their spectral shape arises from spectral interference between the tightly packed solitons circulating in the ring cavity. We present theoretical results that support the generation of the soliton crystal micro-comb (Supplementary Movie S1). The power fluctuations of the micro-comb were measured over 140 hours (5 days), with the optical spectrum captured every 15 minutes (Fig. 1 (c)), indicating that the micro-comb is a stable multi-wavelength source for the analog video processor.
Analog image processing
Signal processing is critical for image and video analysis [34-40], performing functions such as object identification. Key operations include integral and fractional differentiators for edge detection [35-37], fractional Hilbert transformers for edge enhancement [38], and integrators and bandpass filters for motion blur [39]. Motion blur is the apparent streaking of moving objects in a photograph or a sequence of frames; it arises when the image being recorded changes during a single exposure, due to rapid movement or long exposure [39]. Many of these functions are also used in RF applications such as radar systems, single sideband modulators, measurement systems, signal sampling, and communications [14, 35]. They will be critical for emerging applications such as lidar for autonomous vehicles [2].
Figure 2 illustrates the conceptual diagram of the photonic analog image and video processor. First, the input frame was flattened into a vector x and encoded as the intensity of temporal symbols in a serial electrical waveform at a sampling rate of 54 GBaud, with a nominal resolution of 8 bits (see Supplementary for a discussion of the effective number of bits (ENOB)). The impulse response of the image processor is represented by N (= 75) tap weights h that encode the optical power of the micro-comb lines via spectral shaping with a WaveShaper.
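As a minimal illustration of this encoding step, the sketch below flattens a grayscale frame into the serial symbol vector x. The function name and normalization convention are our own, assuming standard 8-bit pixel values:

```python
import numpy as np

def flatten_frame(frame, bits=8):
    """Flatten a 2-D grayscale frame into the 1-D serial symbol
    stream, quantized to the nominal bit depth (8 bits here)."""
    levels = 2 ** bits - 1
    x = frame.astype(float) / 255.0          # normalize pixel range to [0, 1]
    symbols = np.round(x * levels) / levels  # quantize to 2^bits levels
    return symbols.ravel()                   # row-major serialization

# Toy 2x2 frame; in the experiment this would be an HD frame
frame = np.array([[0, 128], [192, 255]], dtype=np.uint8)
x = flatten_frame(frame)
```

In the experiment these symbols are generated at 54 GBaud by an arbitrary waveform generator rather than in software; the sketch only shows the data layout.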
The input waveform x was multi-cast onto the N shaped comb lines via electro-optical modulation, yielding N replicas weighted by the tap weights h. The waveform was then transmitted through a 3.96 km length of fibre to generate a relative delay between wavelengths. Finally, the replicas were summed by photodetection, giving an overall transfer function

H(\omega) = \sum_{n=0}^{N-1} h(n)\, e^{-j\omega nT}
where ω is the RF angular frequency, T is the time delay between adjacent taps, and h(n) is the coefficient of the nth tap, i.e., the discrete impulse response of the transfer function H(ω) of the signal processor. The discrete impulse response h(n) can be calculated by taking the inverse Fourier transform of the desired transfer function H(ω) [15, 30]. The output waveform y was then captured and reshaped to reconstruct the processed image.
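The weighted, delayed and summed replicas realise a discrete transversal (FIR) filter. A conceptual software model of that summation is sketched below; the two-tap example is our own choice, and negative tap weights, where needed, are assumed to be realised in the optical hardware rather than shown here:

```python
import numpy as np

def transversal_filter(x, h):
    """Sum of weighted, delayed replicas of the input x:
    y[k] = sum_n h[n] * x[k - n], where h is the discrete
    impulse response of the transfer function H(w)."""
    y = np.zeros(len(x) + len(h) - 1)
    for n, hn in enumerate(h):
        y[n:n + len(x)] += hn * x   # replica on tap n, delayed by n*T
    return y

# 2-tap first-order difference: a minimal edge detector
x = np.array([0., 0., 1., 1., 1., 0., 0.])
y = transversal_filter(x, np.array([1., -1.]))
# y peaks at the rising and falling edges of the input step
```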
For a multi-wavelength optical carrier transmitted over a dispersive medium, the relative time delay between adjacent wavelengths is

T = D \cdot L \cdot \Delta\lambda
where D denotes the dispersion coefficient, L the length of the dispersive medium, and Δλ the wavelength spacing of the soliton crystal micro-comb, as shown in Fig. 1 (a). Figure 3 illustrates the experimental set-up, which consists of two parts: the comb generation and flattening module, and the transversal structure. The soliton crystal micro-comb spectrum was pre-flattened from its initial scallop-shaped profile by the first WaveShaper (Finisar 4000S). The flattened comb lines were then modulated by the serial electrical waveform, effectively multicasting the electrical signal onto all wavelengths. The modulated signal was then transmitted through 3.96 km of standard single mode fibre with a dispersion of ~17 ps/nm/km to yield the progressive delay taps, with a relative inter-tap time delay between adjacent wavelengths of T = 27.08 ps. The second WaveShaper then equalized and weighted the power of the comb lines according to the designed tap weights. Finally, the weighted and delayed taps were combined and converted back into the electronic domain via high speed photodetection (Finisar BPDV2150R). By tailoring the comb line powers according to the tap coefficients, arbitrary phase shifts for the Hilbert transformer and fractional orders of the differentiator could be achieved.
Fig. 1 (a) shows the relationship between the wavelength spacing of the comb, the total delay of the fibre, and the resulting RF free spectral range (FSR_RF), which essentially defines the Nyquist zone. The RF operation bandwidth of the analog image processor is half the RF free spectral range, FSR_RF = 1/T, yielding BW_RF ~ 18 GHz. Note that although the use of fibre introduced significant signal latency, this did not affect the device throughput. Further, the latency can be virtually eliminated by using a compact dispersive component such as a fibre Bragg grating (FBG) [41] or a tunable dispersion compensator [42].
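These figures can be cross-checked from the nominal values quoted above. Since the fibre dispersion is only approximate (~17 ps/nm/km), the computed delay differs slightly from the measured T = 27.08 ps:

```python
# Back-of-envelope check of the inter-tap delay and RF bandwidth,
# using the nominal values quoted in the text.
D = 17.0      # ps/nm/km, nominal dispersion of standard SMF
L = 3.96      # km of fibre
dlam = 0.393  # nm comb spacing (one FSR of the MRR)

T = D * L * dlam    # inter-tap delay in ps (~26.5 ps vs measured 27.08 ps)
FSR_RF = 1e3 / T    # RF free spectral range in GHz
BW_RF = FSR_RF / 2  # usable RF bandwidth in GHz (~18-19 GHz)
```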
Figure 4 illustrates the simulated and experimental results for the shaped comb spectra, including the temporal impulse response, frequency response and processed image, for a differentiator with fractional orders of 0.5, 0.75, and 1, an integrator with 15, 45, and 75 taps, and a Hilbert transformer with operation bandwidths of 12, 18, and 38 GHz. Fractional differentiation performs edge detection, while the Hilbert transform performs edge enhancement, or ‘sharpening’. Both apply to static images as well as to frames of video signals. Integration and bandpass filtering address motion blur (Fig. 4 (i)). The transmission response (Fig. 4 (ii)) was characterized with a vector network analyser (Agilent MS4644B). By varying the comb spacing as well as the fibre length, the operation bandwidth of the Hilbert transformer with a 90° phase shift could be adjusted from 12 to 38 GHz. Fig. 4 (iii) shows the simulated and measured processed images for the different functions. The original high definition (HD) image was captured by a Nikon D5600 camera with a resolution of 1080 × 1620 pixels. The processed images after 0.5-, 0.75-, and first-order differentiation are shown in Fig. 4 (a-iii), (b-iii) and (c-iii), respectively, indicating that the edges of the image were successfully detected. Fig. 4 (d-iii), (e-iii) and (f-iii) show the processed images after integration with 15, 45, and 75 taps, respectively, where the blur intensity increases with the number of taps. The processed images after the Hilbert transformation for operation bandwidths of 12-38 GHz are shown in Fig. 4 (g-iii), (h-iii) and (i-iii), respectively.
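The tap weights for, e.g., the fractional differentiator follow the inverse-Fourier-transform recipe described earlier, applied to the target response H(ω) = (jω)^α. The sketch below uses our own sampling choices and omits any windowing or apodization that a practical design would add:

```python
import numpy as np

def fractional_diff_taps(alpha, N):
    """Tap weights for a fractional differentiator H(w) = (j*w)^alpha,
    obtained by inverse DFT of the target response sampled over one
    RF free spectral range."""
    w = 2 * np.pi * np.fft.fftfreq(N)  # sample frequencies over one period
    H = (1j * w) ** alpha              # target fractional-order response
    h = np.fft.ifft(H)                 # discrete impulse response h(n)
    return np.fft.fftshift(h.real)     # real tap weights, centred

# 15-tap design for fractional order 0.5 (cf. Fig. 4 (a))
h05 = fractional_diff_taps(0.5, 15)
```

Because H(−ω) is the complex conjugate of H(ω), the impulse response is real up to numerical precision, so taking the real part is safe here.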
Real-time analog video processing
To process videos in real time we used a combination of a fractional differentiator (order = 0.5), an integrator with 75 taps, and a Hilbert transformer with a bandwidth of 18 GHz. Fig. 5 (a) shows the generated waveform together with 5 frames of the original video at a frame rate of 30 frames per second. The video had a resolution of 568 × 320 pixels and was captured by a drone quadcopter UAV with an optical zoom camera (DJI Mavic Air 2 Zoom). The video after differentiation and Hilbert transformation is shown in Fig. 5 (b), while Fig. 5 (c) and (d) show the input and processed HD videos together with the corresponding waveforms.
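Conceptually, each video frame is serialized, filtered and reshaped in turn. The toy end-to-end sketch below uses our own illustrative names and a simple 2-tap differencing filter standing in for the fractional differentiator:

```python
import numpy as np

def process_frame(frame, h):
    """Flatten a frame, filter the serial waveform with taps h,
    and reshape the detected intensity back into an image."""
    rows, cols = frame.shape
    x = frame.astype(float).ravel()       # serialize the frame
    y = np.convolve(x, h)[: rows * cols]  # weighted-and-delayed sum
    return np.abs(y).reshape(rows, cols)  # intensity after detection

# A bright vertical bar: edges appear at its left and right boundaries
frame = np.zeros((4, 6))
frame[:, 2:4] = 1.0
edges = process_frame(frame, np.array([1.0, -1.0]))
```

Per-frame looping over a video stream would simply repeat this on each frame; in the experiment the entire stream is processed continuously in the optical domain.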
To demonstrate a reconfigurable operation bandwidth, we focus on the Hilbert transformer, showing a variable range of 12 - 38 GHz with a phase shift of 90°, achieved by varying the length of fibre (1.838 km vs 3.96 km) as well as the comb spacing (2-FSR vs 3-FSR), as shown in Fig. 4 (g-ii), (h-ii) and (i-ii). Note that tunable dispersion compensators [42] can be employed to vary the bandwidth without changing the hardware.
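For reference, the textbook 90° Hilbert-transformer impulse response that the shaped comb-line weights approximate can be sketched as follows (an odd tap count is assumed so the taps are centred, and no apodization is applied):

```python
import numpy as np

def hilbert_taps(N):
    """Truncated ideal Hilbert-transformer impulse response
    (90-degree phase shift); N should be odd."""
    n = np.arange(N) - (N - 1) // 2    # centred integer tap index
    safe = np.where(n == 0, 1, n)      # avoid divide-by-zero at centre
    h = (1 - np.cos(np.pi * n)) / (np.pi * safe)
    return np.where(n == 0, 0.0, h)    # centre tap is zero

h = hilbert_taps(15)  # antisymmetric taps: 2/(pi*n) for odd n, else 0
```

The antisymmetry of the taps is what produces the constant 90° phase shift, while the operation bandwidth is set in hardware by the inter-tap delay T, as discussed above.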