AI using Optics - Photons replacing electrons once again

In the era of the AI revolution, and the simultaneous fight to reduce carbon emissions, optics is showing a path forward with full potential.

Let’s look back a little into the history of telecommunication. Intercontinental wired communication began with undersea copper cables in the mid-nineteenth century. Over the following century, wired communication advanced only from telegraph to telephone. It was in the late twentieth century that optical technology entered wired telecommunication through optical fibres, without which broadband communication and the World Wide Web would not have been possible. Electrons were replaced by photons for faster communication over a wider range.

A similar scenario is unfolding in the modern technology revolution with AI. Nvidia keeps releasing newer generations of GPUs. Tech giants are competing to build more efficient generative AI models. New AI startups appear every day. There is an ever-growing need for faster systems, capable of handling larger amounts of data and heavier computation. Alongside all this, there is the concern of environmental impact: each answer from ChatGPT consumes more and more energy. The reason? Every calculation requires electrons to pass through the processor’s circuits, heating up the devices.

Here, light comes into play again – to make life simpler and smoother, to replace electrons with photons.

Number one, photons do not heat up the path they travel.

Number two, photons travel far faster than electrons.

This replacement can be achieved by building the entire neural network of a deep learning model with optics – researchers are calling these Optical Neural Networks (ONNs).

How does a neural network work? Suppose you want to build an app that can detect characters in an uploaded image of handwritten text and convert them to digital text. You need a system trained to recognise handwritten characters. A neural network can handle both this training and the practical application after the training.

The network acts as a mathematical function – a complex nonlinear function with plenty of coefficients that weight the input pixel values from the image – and at the output it gives the recognised digital characters as the result.

This complicated function consists of multiple layers. Each layer may contain hundreds (or more) of neurons. Just as biological neurons, or nerve cells, transmit signals from one to the next through axons and dendrites, the neurons of a deep learning model pass data from the previous layer on to the next layer. The first layer takes the input values from the image and passes them on to the next layer, and so on. Finally, the last layer delivers the resulting signal as output. This passage of signal depends on the selective activation of the neurons and on their weights and biases. So there is a network of hundreds or thousands of neurons between the input data and the output result. And the result, be it a classification or a regression problem, depends on the training of the weights and biases of all the neurons in this network.
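This layered passage of signals can be sketched in a few lines of NumPy. The layer sizes below are toy values chosen only for illustration (784 inputs as in a flattened 28x28 image, 10 outputs as in digit classes), not from any particular model.

```python
import numpy as np

def relu(x):
    # A common activation function: lets positive signals through, blocks negative ones
    return np.maximum(0.0, x)

def forward(x, layers):
    """Pass an input vector through a stack of (weights, bias) layers."""
    for W, b in layers:
        x = relu(W @ x + b)  # weighted sum of the previous layer, then activation
    return x

rng = np.random.default_rng(0)
# Toy network: 784 input pixels (28x28 image) -> 128 hidden neurons -> 10 classes
layers = [
    (rng.normal(0, 0.05, (128, 784)), np.zeros(128)),
    (rng.normal(0, 0.05, (10, 128)), np.zeros(10)),
]
image = rng.random(784)          # a flattened stand-in for an input image
scores = forward(image, layers)  # one score per output class
```

Training is then the process of adjusting every entry of the `W` and `b` arrays until the scores match the desired labels.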

An average-sized network, e.g. a model for recognising handwritten text or digits, may contain millions of parameters associated with the neurons of the entire network. The size grows further with more complicated applications. Training a deep learning model, and applying the trained model, consumes a huge amount of energy. For those large networks, training also takes a long time, from hours to days.

Now, what if this entire network could be built, trained and used with optical components and devices? The model would work at the speed of light. And a photonic processor would require far less energy than an electronic one.

And the exciting news is that researchers and engineers are already more than halfway towards this shift.

The core computation of a neural network can be represented as a matrix multiplication, where the matrix elements are unknown values that must be assigned through the training process. After training, the values of the matrix elements are properly assigned.

Now, imagine a light source at the position of each matrix element. If the properties of the light source (wavelength, amplitude, phase, polarisation and so on) can be tuned like the assigned values of the matrix elements, the arrangement can constitute an optical neural network.
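Numerically, the idea looks like this: each weight becomes a complex transmission coefficient, an amplitude together with a phase, and the matrix-vector product is the coherent sum of the weighted light fields. The sizes and random values below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each matrix element as a tunable light source: an amplitude and a phase
amplitude = rng.random((4, 4))              # attenuation per element (0..1)
phase = rng.uniform(0, 2 * np.pi, (4, 4))   # phase shift per element
W = amplitude * np.exp(1j * phase)          # complex "optical" weight matrix

x = rng.random(4)                # input signal encoded on 4 optical channels
y_field = W @ x                  # coherent summation of the weighted fields
y_power = np.abs(y_field) ** 2   # what a photodetector actually measures
```

The detector step matters: photodetectors read optical power, not the field itself, which is one reason mapping arbitrary trained weights onto optics takes careful engineering.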

The scientists of Aydogan Ozcan’s research group at UCLA came up with a breakthrough, demonstrating a system of diffractive optical elements that can work as a trained neural network model. They call it the Diffractive Deep Neural Network, or D2NN. They simulated multiple layers of diffractive optical elements according to a trained neural network. Each diffraction centre of a layer acts as a secondary light source and represents a weighted neuron. The diffraction pattern of light passing through these layers determines how information is transmitted from the input layer to the output. After simulation, the diffractive layers can be physically manufactured by 3D printing or lithography. The group has already built a functional physical D2NN able to identify patterns from the MNIST handwritten-digit and Fashion-MNIST datasets.
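The principle can be sketched numerically: light passes through a phase mask (one "diffractive layer"), then propagates through free space to the next mask, and the intensity at the final plane is the readout. Below is a minimal simulation using the standard angular spectrum method; the grid size, wavelength, layer spacing and random masks are illustrative assumptions, not the group's actual design (their masks are the result of training).

```python
import numpy as np

def propagate(field, wavelength, dx, z):
    """Free-space propagation of a complex field via the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)            # spatial frequencies of the grid
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))  # evanescent terms dropped
    H = np.exp(1j * kz * z)                 # transfer function over distance z
    return np.fft.ifft2(np.fft.fft2(field) * H)

rng = np.random.default_rng(2)
n, wavelength, dx, z = 64, 750e-9, 1e-6, 40e-6   # illustrative parameters
field = np.ones((n, n), dtype=complex)            # plane wave carrying the input
for _ in range(3):                                # three diffractive layers
    # In a real D2NN these phase masks encode the trained weights
    mask = np.exp(1j * rng.uniform(0, 2 * np.pi, (n, n)))
    field = propagate(field * mask, wavelength, dx, z)
intensity = np.abs(field) ** 2                    # detector-plane readout
```

In the trained device, different regions of the detector plane light up for different input classes, so classification is just finding the brightest region.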

But the D2NN designs proposed by Ozcan’s group are, firstly, pre-trained on electronic computers, so the training stage is not replaced by optical hardware; secondly, errors can accumulate during fabrication of the diffractive layers, significantly reducing the precision of the results.

The MZI (Mach–Zehnder interferometer) mesh offers a solution for on-chip optical neural networks. An MZI mesh can implement a neural network’s weight matrix using on-chip beam splitters, phase shifters and optical attenuators. Any matrix can be decomposed into one diagonal matrix and two unitary matrices using singular value decomposition. An optical attenuator can then perform the diagonal matrix’s function, while beam splitters and phase shifters implement the unitary matrices. Scientists have used such a pretrained MZI mesh to achieve 77% accuracy in blind testing on a vowel-recognition dataset.
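The decomposition itself is a one-liner with NumPy, and checking it shows why the optical mapping works: the two unitary factors (realisable with lossless beam splitters and phase shifters) sandwich a diagonal factor (realisable with attenuators). The 4x4 size here is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))          # a trained weight matrix to implement

# Singular value decomposition: M = U @ diag(s) @ Vh
U, s, Vh = np.linalg.svd(M)

# On chip: U and Vh map to meshes of beam splitters + phase shifters (unitary),
# diag(s) maps to a bank of optical attenuators
M_optical = U @ np.diag(s) @ Vh      # reconstructs M exactly
```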

Going one step further, the on-chip optical hardware can be trained directly, which solves the problem of error accumulation during chip fabrication. It also enables in-situ online training.

A further step forward was achieved when integrated diffractive elements were introduced alongside the MZI-mesh ONN. The diffractive elements can perform Fourier transforms and inverse transforms, thereby improving the ONN’s performance in classification experiments, as already shown for the Iris dataset and MNIST handwritten digits.

The micro-ring resonator (MRR) offers another degree of freedom: wavelength. Thanks to their resonating structure, MRRs can filter and control the optical power of different wavelengths. MRRs can therefore serve as ‘weight banks’ in an optical neural network, performing the matrix multiplication. WDM (wavelength division multiplexing) and tuned filtering do the rest. Multiple data channels are encoded at different wavelengths that travel simultaneously through a single waveguide. By adjusting the resonance of each MRR, the amount of light coupled out at each wavelength can be controlled. This acts as multiplying weights onto the neurons of the network.
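A toy model makes the weighting mechanism concrete. A ring’s drop-port transmission is roughly a Lorentzian function of how far its resonance is detuned from a channel’s wavelength, so a heater that shifts the resonance sets the weight, and a photodetector at the end sums all the weighted channels. Everything below (channel count, linewidth, detuning range) is an illustrative assumption.

```python
import numpy as np

def ring_weight(detuning, linewidth=1.0):
    """Drop-port transmission of a micro-ring: a Lorentzian in detuning.
    Tuning the ring's resonance (the detuning) sets the weight."""
    return 1.0 / (1.0 + (detuning / linewidth) ** 2)

rng = np.random.default_rng(4)
x = rng.random(8)                    # 8 WDM channels on one waveguide
detunings = rng.uniform(0, 3, 8)     # heater-controlled resonance offsets
w = ring_weight(detunings)           # weights in (0, 1], one per wavelength
y = np.sum(w * x)                    # photodetector sums the dropped light
```

A full weight bank repeats this per output neuron: one row of rings per output waveguide yields a complete matrix-vector product in a single pass of light.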

Phase-change materials (PCMs) integrated with the MRR weight banks on chip add nonlinearity to the network. In addition, a spiking all-optical synaptic system based on this combination avoids the physical separation of memory and processor found in conventional computing systems.

Diffractive Metasurfaces - on chip: 

Metasurfaces containing 2D arrays of subwavelength nanostructures can be engineered to control the amplitude, phase and polarisation of light at each point precisely. By arranging these units, you can make the metasurfaces act like trainable layers, each nanostructure unit acting as a neuron and its optical response encoding the learned weight. Thus these metasurfaces can directly implement matrix multiplications to construct and act as an on-chip diffractive optical neural network (DONN).

Now, nonlinearity has to be incorporated into the activation of the neurons. A hybrid optical-electronic network can implement nonlinearity for diffractive metasurfaces, but the target is to make it all-optical. To achieve this, researchers are working on introducing nonlinearity to diffractive metasurfaces through the optical Kerr effect or by adding phase-change materials.

These on-chip Diffractive ONNs or MZI mesh ONNs have already outperformed some Nvidia GPUs and Google TPUs in terms of computational capacity and energy consumption.

Then, what are we waiting for? Is it already close to market? Technically, there is still a long way to go, with challenges to overcome before all-optical trainable ONNs are achieved. MZI- and MRR-based ONNs can overcome the issue of fabrication error, but they face difficulties with scalability: they require additional energy supply, and synchronous modulation of large-scale, high-speed on-chip ONNs is hard. On the other hand, on-chip diffractive ONNs with sub-wavelength structural units show high scalability, but still lack a nonlinearity function.

So, how long will it take to replace the electronic AI hardware?

It is already there… of course not fully, but at least in small steps.

Q.ANT, a new Stuttgart-based company, has launched its photonic chip, named the Native Processing Unit (NPU). It uses lithium niobate to introduce nonlinearity into its operations, and the company claims 30x energy efficiency and 50x faster processing. And what about error and precision? Q.ANT claims its NPU can perform calculations with 16-bit floating-point precision!

Another start-up, the MIT spin-off Lightmatter, has launched its photonic AI accelerator chip ‘Envise’. It is a photonic-electronic hybrid chip that acts as an alternative to a digital computer, capable of running large language models as accurately as a 32-bit digital computer.

And what about the tech giants?

At GTC 2025, NVIDIA CEO Jensen Huang unveiled the NVIDIA Spectrum-X and Quantum-X photonic networking switches, which integrate co-packaged optics (CPO) to connect millions of GPUs for parallel operation. These switches fuse optical circuits based on micro-ring modulators with electronic circuits, replacing electrical transceivers and significantly reducing power consumption.

It is not difficult to forecast that both the new companies and the big players will, either individually or collaboratively, come up with new products containing more optics and less electronics. And maybe we are going to see a major paradigm shift in AI computing within another decade!