{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Application of Quantum Neural Network in NLP\n", "\n", "[![Download Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_notebook_en.svg)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/master/mindquantum/en/case_library/mindspore_qnn_for_nlp.ipynb) \n", "[![Download Code](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_download_code_en.svg)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/master/mindquantum/en/case_library/mindspore_qnn_for_nlp.py) \n", "[![View source on Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindquantum/docs/source_en/case_library/qnn_for_nlp.ipynb)\n", "\n", "## Overview\n", "\n", "Word embedding plays a key role in natural language processing. It maps a high-dimensional word vector into a lower-dimensional space. As more information is fed into a neural network, the training task becomes more difficult. By exploiting characteristics of quantum mechanics (e.g., state superposition and entanglement), a quantum neural network can process such classical information during training, with the aim of improving convergence accuracy. In the following, we build a simple hybrid quantum-classical neural network to complete a word embedding task.\n", "\n", "First, import the dependencies used in this tutorial." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import time\n", "import mindspore as ms\n", "import mindspore.ops as ops\n", "import mindspore.dataset as ds\n", "from mindspore import nn\n", "from mindquantum.framework import MQLayer\n", "from mindquantum.core.gates import RX, RY, X, H\n", "from mindquantum.core.circuit import Circuit, UN\n", "from mindquantum.core.operators import Hamiltonian, QubitOperator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial implements a [CBOW model](https://blog.csdn.net/u010665216/article/details/78724856), which predicts a word from its surrounding context. For example, the sentence \"I love natural language processing\" can be split into five words: \\[\"I\", \"love\", \"natural\", \"language\", \"processing\"\\]. When the window size is 2, the task is to predict the word \"natural\" given \\[\"I\", \"love\", \"language\", \"processing\"\\]. In the following, we build a quantum neural network for word embedding to handle this task.\n", "\n", "![quantum word embedding](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/docs/mindquantum/docs/source_en/images/qcbow.png)\n", "\n", "Here, the words \"I\", \"love\", \"language\", and \"processing\" are encoded into the quantum circuit. The quantum circuit to be trained consists of four Ansatz circuits. Finally, we measure the qubits in the $\\text{Z}$ basis at the end of the quantum circuit. The number of measured qubits is determined by the embedding dimension.\n", "\n", "## Data Pre-processing\n", "\n", "We first build a dictionary for the sentence to be processed and generate the samples according to the window size." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'I': 0, 'language': 1, 'love': 2, 'natural': 3, 'processing': 4}\n", "word dict size: 5\n", "samples: [[['I', 'love', 'language', 'processing'], 'natural']]\n", "number of samples: 1\n" ] } ], "source": [ "def GenerateWordDictAndSample(corpus, window=2):\n", "    all_words = corpus.split()\n", "    word_set = list(set(all_words))\n", "    word_set.sort()\n", "    word_dict = {w: i for i, w in enumerate(word_set)}\n", "    sampling = []\n", "    for index, _ in enumerate(all_words[window:-window]):\n", "        around = []\n", "        for i in range(index, index + 2*window + 1):\n", "            if i != index + window:\n", "                around.append(all_words[i])\n", "        sampling.append([around, all_words[index + window]])\n", "    return word_dict, sampling\n", "\n", "word_dict, sample = GenerateWordDictAndSample(\"I love natural language processing\")\n", "print(word_dict)\n", "print('word dict size: ', len(word_dict))\n", "print('samples: ', sample)\n", "print('number of samples: ', len(sample))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As shown above, the dictionary contains 5 words and, with a window size of 2, the sentence yields a single sample.\n", "\n", "## Encoding Circuit\n", "\n", "For simplicity, we use the $\\text{RX}$ rotation gate to construct the encoding circuit. The structure is as follows.\n", "\n", "![encoder circuit](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/docs/mindquantum/docs/source_en/images/encoder.png)\n", "\n", "We apply an $\\text{RX}$ rotation gate to each qubit." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "q0: q1: q2: RX e_0 RX e_1 RX e_2 " ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def GenerateEncoderCircuit(n_qubits, prefix=''):\n", "    if prefix and prefix[-1] != '_':\n", "        prefix += '_'\n", "    circ = Circuit()\n", "    for i in range(n_qubits):\n", "        circ += RX(prefix + str(i)).on(i)\n", "    return circ.as_encoder()\n", "\n", "GenerateEncoderCircuit(3, prefix='e').svg()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$\\left|0\\right>$ and $\\left|1\\right>$ denote the two basis states of a two-level qubit. By the superposition principle, a qubit can also be in a superposition of these two states:\n", "\n", "$$\\left|\\psi\\right>=\\alpha\\left|0\\right>+\\beta\\left|1\\right>$$\n", "\n", "The quantum state of $n$ qubits lives in a $2^n$-dimensional Hilbert space. For the dictionary of 5 words above, we therefore need only $\\lceil \\log_2 5 \\rceil=3$ qubits to complete the encoding task, which illustrates how compactly a quantum register can index a vocabulary.\n", "\n", "For example, given the word \"love\" in the above dictionary, its label is 2, which is `010` in binary. We only need to set `e_0`, `e_1`, and `e_2` to $0$, $\\pi$, and $0$ respectively." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Label is: 2\n", "Binary label is: 010\n", "Parameters of encoder is: \n", " [0. 3.14159 0. ]\n", "Encoder circuit is: \n", "\n", "      ┏━━━━━━━━━┓ \n", "q0: ──┨ RX(e_0) ┠───\n", "      ┗━━━━━━━━━┛ \n", "      ┏━━━━━━━━━┓ \n", "q1: ──┨ RX(e_1) ┠───\n", "      ┗━━━━━━━━━┛ \n", "      ┏━━━━━━━━━┓ \n", "q2: ──┨ RX(e_2) ┠───\n", "      ┗━━━━━━━━━┛ \n", "Encoder parameter names are: \n", " ['e_0', 'e_1', 'e_2']\n", "Amplitude of quantum state is: \n", " [0. 0. 1. 0. 0. 0. 0. 
0.]\n", "Label in quantum state is: 2\n" ] } ], "source": [ "from mindquantum.simulator import Simulator\n", "\n", "n_qubits = 3  # number of qubits in this quantum circuit\n", "label = 2  # label to encode\n", "label_bin = bin(label)[-1:1:-1].ljust(n_qubits, '0')  # binary form of label, least significant bit first\n", "label_array = np.array([int(i) * np.pi for i in label_bin]).astype(np.float32)  # parameter values of encoder\n", "encoder = GenerateEncoderCircuit(n_qubits, prefix='e')  # encoder circuit\n", "encoder_params_names = encoder.params_name  # parameter names of encoder\n", "\n", "print(\"Label is: \", label)\n", "print(\"Binary label is: \", label_bin)\n", "print(\"Parameters of encoder is: \\n\", np.round(label_array, 5))\n", "print(\"Encoder circuit is: \\n\")\n", "print(encoder)\n", "print(\"Encoder parameter names are: \\n\", encoder_params_names)\n", "\n", "state = encoder.get_qs(pr=dict(zip(encoder_params_names, label_array)))\n", "amp = np.round(np.abs(state) ** 2, 3)  # squared magnitudes, i.e. measurement probabilities\n", "\n", "print(\"Amplitude of quantum state is: \\n\", amp)\n", "print(\"Label in quantum state is: \", np.argmax(amp))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The verification above shows that, for the data with label 2, the largest probability of the resulting quantum state is also at index 2. The resulting quantum state therefore encodes the input label. We wrap the process of generating encoder parameter values from the data into the following function." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([[0. , 0. , 0. , 0. , 3.1415927, 0. ,\n", " 3.1415927, 0. , 0. , 0. , 0. 
, 3.1415927]],\n", " dtype=float32),\n", " array([3], dtype=int32))" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def GenerateTrainData(sample, word_dict):\n", "    n_qubits = int(np.ceil(np.log2(1 + max(word_dict.values()))))\n", "    data_x = []\n", "    data_y = []\n", "    for around, center in sample:\n", "        data_x.append([])\n", "        for word in around:\n", "            label = word_dict[word]\n", "            label_bin = bin(label)[-1: 1: -1].ljust(n_qubits, '0')\n", "            label_array = [int(i)*np.pi for i in label_bin]\n", "            data_x[-1].extend(label_array)\n", "        data_y.append(word_dict[center])\n", "    return np.array(data_x).astype(np.float32), np.array(data_y).astype(np.int32)\n", "\n", "GenerateTrainData(sample, word_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As shown above, the encodings of the 4 input words are concatenated into one longer vector for later use by the neural network.\n", "\n", "## Ansatz Circuit\n", "\n", "Many choices are available for the Ansatz circuit. We use the quantum circuit below. A single unit of the Ansatz circuit consists of [RY](https://www.mindspore.cn/mindquantum/docs/en/master/core/gates/mindquantum.core.gates.RY.html) gates and [CNOT](https://www.mindspore.cn/mindquantum/docs/en/master/core/gates/mindquantum.core.gates.CNOTGate.html) gates. The full Ansatz circuit is obtained by repeating this unit $p$ times.\n", "\n", "![ansatz circuit](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/docs/mindquantum/docs/source_en/images/ansatz.png)\n", "\n", "The following function constructs the Ansatz circuit." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "q0: q1: q2: q3: q4: RY a_0_0 RY a_0_1 RY a_0_2 RY a_0_3 RY a_0_4 RY a_1_0 RY a_1_1 RY a_1_2 RY a_1_3 RY a_1_4 " ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def GenerateAnsatzCircuit(n_qubits, layers, prefix=''):\n", "    if prefix and prefix[-1] != '_':\n", "        prefix += '_'\n", "    circ = Circuit()\n", "    for l in range(layers):\n", "        for i in range(n_qubits):\n", "            circ += RY(prefix + str(l) + '_' + str(i)).on(i)\n", "        for i in range(l % 2, n_qubits, 2):\n", "            if i < n_qubits and i + 1 < n_qubits:\n", "                circ += X.on(i + 1, i)\n", "    return circ.as_ansatz()\n", "\n", "GenerateAnsatzCircuit(5, 2, 'a').svg()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Measurement\n", "\n", "We treat the measurement results on different qubits as the dimension-reduced data. The process mirrors the label encoding above. For example, to reduce the word vector to 5 dimensions, the data in the 3rd dimension is obtained as follows:\n", "\n", "- 3 in binary is 00011.\n", "- Measure the expectation value of the Hamiltonian $Z_0Z_1$ at the end of the quantum circuit.\n", "\n", "The function below generates the Hamiltonians (`hams`) for all dimensions, where `n_qubits` is the number of qubits and `dims` is the dimension of the word embedding." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1 [Z0], 1 [Z1], 1 [Z0 Z1], 1 [Z2], 1 [Z0 Z2]]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def GenerateEmbeddingHamiltonian(dims, n_qubits):\n", "    hams = []\n", "    for i in range(dims):\n", "        s = ''\n", "        for j, k in enumerate(bin(i + 1)[-1:1:-1]):\n", "            if k == '1':\n", "                s = s + 'Z' + str(j) + ' '\n", "        hams.append(Hamiltonian(QubitOperator(s)))\n", "    return hams\n", "\n", "GenerateEmbeddingHamiltonian(5, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quantum Word Embedding Layer\n", "\n", "The quantum word embedding layer combines the encoding circuit, the trainable Ansatz circuit, and the Hamiltonian measurements described above. It embeds `num_embedding` words into word vectors of dimension `embedding_dim`. Here, a Hadamard gate is applied to every qubit at the beginning of the quantum circuit, so the initial state is the uniform superposition state, which improves the representation ability of the quantum neural network.\n", "\n", "In the following, we define the quantum embedding layer, which returns a quantum circuit simulation operator." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def QEmbedding(num_embedding, embedding_dim, window, layers, n_threads):\n", "    n_qubits = int(np.ceil(np.log2(num_embedding)))\n", "    hams = GenerateEmbeddingHamiltonian(embedding_dim, n_qubits)\n", "    circ = Circuit()\n", "    circ = UN(H, n_qubits)\n", "    encoder_param_name = []\n", "    ansatz_param_name = []\n", "    for w in range(2 * window):\n", "        encoder = GenerateEncoderCircuit(n_qubits, 'Encoder_' + str(w))\n", "        ansatz = GenerateAnsatzCircuit(n_qubits, layers, 'Ansatz_' + str(w))\n", "        encoder.no_grad()\n", "        circ += encoder\n", "        circ += ansatz\n", "        encoder_param_name.extend(encoder.params_name)\n", "        ansatz_param_name.extend(ansatz.params_name)\n", "    grad_ops = Simulator('mqvector', circ.n_qubits).get_expectation_with_grad(hams,\n", "                                                                              circ,\n", "                                                                              parallel_worker=n_threads)\n", "    return MQLayer(grad_ops)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The training model is similar to a classical network: it is composed of an embedding layer followed by two fully-connected layers. Here, however, the embedding layer is a quantum neural network. The following defines the quantum CBOW network." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "class CBOW(nn.Cell):\n", "    def __init__(self, num_embedding, embedding_dim, window, layers, n_threads,\n", "                 hidden_dim):\n", "        super(CBOW, self).__init__()\n", "        self.embedding = QEmbedding(num_embedding, embedding_dim, window,\n", "                                    layers, n_threads)\n", "        self.dense1 = nn.Dense(embedding_dim, hidden_dim)\n", "        self.dense2 = nn.Dense(hidden_dim, num_embedding)\n", "        self.relu = ops.ReLU()\n", "\n", "    def construct(self, x):\n", "        embed = self.embedding(x)\n", "        out = self.dense1(embed)\n", "        out = self.relu(out)\n", "        out = self.dense2(out)\n", "        return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following, we use a longer sentence for training. First, we define `LossMonitorWithCollection` to monitor convergence and record the loss." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "class LossMonitorWithCollection(ms.train.callback.LossMonitor):\n", "    def __init__(self, per_print_times=1):\n", "        super(LossMonitorWithCollection, self).__init__(per_print_times)\n", "        self.loss = []\n", "\n", "    def on_train_begin(self, run_context):\n", "        self.begin_time = time.time()\n", "\n", "    def on_train_end(self, run_context):\n", "        self.end_time = time.time()\n", "        print('Total time used: {}'.format(self.end_time - self.begin_time))\n", "\n", "    def on_train_epoch_begin(self, run_context):\n", "        self.epoch_begin_time = time.time()\n", "\n", "    def on_train_epoch_end(self, run_context):\n", "        cb_params = run_context.original_args()\n", "        self.epoch_end_time = time.time()\n", "        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:\n", "            print('')\n", "\n", "    def on_train_step_end(self, run_context):\n", "        cb_params = run_context.original_args()\n", "        loss = cb_params.net_outputs\n", "\n", "        if isinstance(loss, (tuple, list)):\n", "            if isinstance(loss[0], ms.Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):\n", "                loss = loss[0]\n", "\n", "        if isinstance(loss, ms.Tensor) and isinstance(loss.asnumpy(), np.ndarray):\n", "            loss = np.mean(loss.asnumpy())\n", "\n", "        cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1\n", "\n", "        if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):\n", "            raise ValueError(\"epoch: {} step: {}. Invalid loss, terminating training.\".format(\n", "                cb_params.cur_epoch_num, cur_step_in_epoch))\n", "        self.loss.append(loss)\n", "        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:\n", "            print(\"\\repoch: %+3s step: %+3s time: %5.5s, loss is %5.5s\" % (cb_params.cur_epoch_num, cur_step_in_epoch, time.time() - self.epoch_begin_time, loss), flush=True, end='')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we embed a longer passage with the quantum `CBOW`. The code below sets the number of threads of the quantum simulator to 4. When the number of qubits to be simulated is large, more threads can be used to improve the simulation efficiency." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "epoch: 25 step: 20 time: 0.247, loss is 0.103\n", "epoch: 50 step: 20 time: 0.265, loss is 0.049\n", "epoch: 75 step: 20 time: 0.259, loss is 0.031\n", "epoch: 100 step: 20 time: 0.245, loss is 0.022\n", "epoch: 125 step: 20 time: 0.249, loss is 0.019\n", "epoch: 150 step: 20 time: 0.270, loss is 0.020\n", "epoch: 175 step: 20 time: 0.305, loss is 0.020\n", "epoch: 200 step: 20 time: 0.234, loss is 0.023\n", "epoch: 225 step: 20 time: 0.236, loss is 0.026\n", "epoch: 250 step: 20 time: 0.231, loss is 0.021\n", "epoch: 275 step: 20 time: 0.240, loss is 0.024\n", "epoch: 300 step: 20 time: 0.281, loss is 0.022\n", "epoch: 325 step: 20 time: 0.235, loss is 0.018\n", "epoch: 350 step: 20 time: 0.255, loss is 0.018\n", "Total time used: 91.56754469871521\n" ] } ], "source": [ "import mindspore as ms\n", "ms.set_context(mode=ms.PYNATIVE_MODE, device_target=\"CPU\")\n", "corpus = \"\"\"We are about to study the idea of a computational process.\n", "Computational processes are abstract beings that inhabit computers.\n", "As they evolve, processes manipulate other abstract things called data.\n", "The evolution of a process is directed by a pattern of rules\n", "called a program. People create programs to direct processes. 
In effect,\n", "we conjure the spirits of the computer with our spells.\"\"\"\n", "\n", "ms.set_seed(42)\n", "window_size = 2\n", "embedding_dim = 10\n", "hidden_dim = 128\n", "word_dict, sample = GenerateWordDictAndSample(corpus, window=window_size)\n", "train_x, train_y = GenerateTrainData(sample, word_dict)\n", "\n", "train_loader = ds.NumpySlicesDataset({\n", " \"around\": train_x,\n", " \"center\": train_y\n", "}, shuffle=False).batch(3)\n", "net = CBOW(len(word_dict), embedding_dim, window_size, 3, 4, hidden_dim)\n", "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n", "net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)\n", "loss_monitor = LossMonitorWithCollection(500)\n", "model = ms.Model(net, net_loss, net_opt)\n", "model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the loss value during convergence:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(loss_monitor.loss, '.')\n", "plt.xlabel('Steps')\n", "plt.ylabel('Loss')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The method of printing the parameters of the quantum embedded layer is as follows:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 8.11327994e-02, -3.34400564e-01, -1.23247825e-01, 5.81944704e-01,\n", " -4.20968421e-03, 3.15563884e-05, 2.42589042e-01, 8.80479038e-01,\n", " -1.43023849e-01, -6.37480104e-03, 2.73182592e-03, 1.65943671e-02,\n", " 2.39036694e-01, -2.39808977e-01, -6.56178296e-01, 2.62607052e-03,\n", " -9.76558731e-05, -7.48617807e-03, 4.85512346e-01, 8.62547606e-02,\n", " 1.09600239e-02, -1.94667071e-01, 5.48206130e-03, 2.82003220e-05,\n", " 2.83775508e-01, -3.44718695e-01, 2.57234443e-02, -1.58091113e-01,\n", " -5.39550185e-03, -1.15225427e-02, 2.88938046e-01, -5.74903965e-01,\n", " -2.53041506e-01, -1.81123063e-01, -5.67151117e-04, -3.33190081e-03,\n", " 3.47066782e-02, 2.39473388e-01, 1.34246838e+00, -9.32823777e-01,\n", " 1.55618461e-03, 1.34847098e-04, 7.36262277e-02, -1.90044902e-02,\n", " -1.26371592e-01, 4.32286650e-01, -3.66644454e-05, -1.36820097e-02,\n", " 7.11344108e-02, -3.02037269e-01, -1.80939063e-01, 4.20952231e-01,\n", " -6.96726423e-03, -3.31268320e-03, 2.85857711e-02, 2.78895229e-01,\n", " -2.74261057e-01, 1.94433972e-01, -1.66424108e-03, -2.27207807e-03,\n", " 6.26490265e-02, -1.98727295e-01, -1.25026256e-01, -1.52513385e-01,\n", " -5.60277607e-03, -7.44100334e-03, 4.44238521e-02, -6.64802119e-02,\n", " 1.55135123e-02, -1.33805767e-01, 1.74699686e-02, -1.28326667e-02],\n", " dtype=float32)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net.embedding.weight.asnumpy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Classical Word 
Embedding Layer\n", "\n", "Here, we construct a classical CBOW neural network with a classical word embedding layer and compare it with the quantum one.\n", "\n", "First, we construct the classical CBOW neural network, with hyperparameters similar to those of the quantum CBOW." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "class CBOWClassical(nn.Cell):\n", "    def __init__(self, num_embedding, embedding_dim, window, hidden_dim):\n", "        super(CBOWClassical, self).__init__()\n", "        self.dim = 2 * window * embedding_dim\n", "        self.embedding = nn.Embedding(num_embedding, embedding_dim, True)\n", "        self.dense1 = nn.Dense(self.dim, hidden_dim)\n", "        self.dense2 = nn.Dense(hidden_dim, num_embedding)\n", "        self.relu = ops.ReLU()\n", "        self.reshape = ops.Reshape()\n", "\n", "    def construct(self, x):\n", "        embed = self.embedding(x)\n", "        embed = self.reshape(embed, (-1, self.dim))\n", "        out = self.dense1(embed)\n", "        out = self.relu(out)\n", "        out = self.dense2(out)\n", "        return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate the dataset for the classical CBOW neural network." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "train_x shape: (58, 4)\n", "train_y shape: (58,)\n" ] } ], "source": [ "train_x = []\n", "train_y = []\n", "for i in sample:\n", "    around, center = i\n", "    train_y.append(word_dict[center])\n", "    train_x.append([])\n", "    for j in around:\n", "        train_x[-1].append(word_dict[j])\n", "train_x = np.array(train_x).astype(np.int32)\n", "train_y = np.array(train_y).astype(np.int32)\n", "print(\"train_x shape: \", train_x.shape)\n", "print(\"train_y shape: \", train_y.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Train the classical CBOW network." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "epoch: 25 step: 20 time: 0.022, loss is 0.627\n", "epoch: 50 step: 20 time: 0.028, loss is 0.011\n", "epoch: 75 step: 20 time: 0.026, loss is 0.003\n", "epoch: 100 step: 20 time: 0.022, loss is 0.002\n", "epoch: 125 step: 20 time: 0.017, loss is 0.001\n", "epoch: 150 step: 20 time: 0.021, loss is 0.001\n", "epoch: 175 step: 20 time: 0.027, loss is 0.000\n", "epoch: 200 step: 20 time: 0.019, loss is 0.000\n", "epoch: 225 step: 20 time: 0.019, loss is 0.000\n", "epoch: 250 step: 20 time: 0.019, loss is 0.000\n", "epoch: 275 step: 20 time: 0.018, loss is 0.000\n", "epoch: 300 step: 20 time: 0.025, loss is 0.000\n", "epoch: 325 step: 20 time: 0.018, loss is 0.000\n", "epoch: 350 step: 20 time: 0.017, loss is 0.000\n", "Total time used: 8.476526975631714\n" ] } ], "source": [ "ms.set_context(mode=ms.GRAPH_MODE, device_target=\"CPU\")\n", "\n", "train_loader = ds.NumpySlicesDataset({\n", " \"around\": train_x,\n", " \"center\": train_y\n", "}, shuffle=False).batch(3)\n", "net = CBOWClassical(len(word_dict), embedding_dim, window_size, hidden_dim)\n", "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n", "net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)\n", "loss_monitor = LossMonitorWithCollection(500)\n", "model = ms.Model(net, net_loss, net_opt)\n", "model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the loss value during convergence:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(loss_monitor.loss, '.')\n", "plt.xlabel('Steps')\n", "plt.ylabel('Loss')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results above show that the quantum word embedding model, run on a quantum simulator, also completes the word embedding task: its loss converges to about 0.02, compared with below 0.001 for the classical model, and the simulated training takes longer (about 92 s versus about 8.5 s). Because the number of qubits grows only logarithmically with the vocabulary size, quantum embedding may become attractive when the data is too large for classical computers to handle comfortably." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", " \n", " \n", "\n", "\n", "
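<table>\n<tr><th>Software</th><th>Version</th></tr>\n<tr><td>mindquantum</td><td>0.9.11</td></tr>\n<tr><td>scipy</td><td>1.10.1</td></tr>\n<tr><td>numpy</td><td>1.23.5</td></tr>\n<tr><th>System</th><th>Info</th></tr>\n<tr><td>Python</td><td>3.9.16</td></tr>\n<tr><td>OS</td><td>Linux x86_64</td></tr>\n<tr><td>Memory</td><td>8.3 GB</td></tr>\n<tr><td>CPU Max Thread</td><td>8</td></tr>\n<tr><td>Date</td><td>Mon Jan 1 01:34:10 2024</td></tr>\n</table>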
\n" ], "text/plain": [ "" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from mindquantum.utils.show_info import InfoTable\n", "\n", "InfoTable('mindquantum', 'scipy', 'numpy')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reference\n", "\n", "[1] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. [Efficient Estimation of Word Representations in Vector Space](https://arxiv.org/pdf/1301.3781.pdf)" ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 2 }