{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quantum Neural Networks in Natural Language Processing\n", "\n", "[![Download Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/resource/_static/logo_notebook.png)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/r1.7/mindquantum/zh_cn/mindspore_qnn_for_nlp.ipynb) [![Download Sample Code](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/resource/_static/logo_download_code.png)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/r1.7/mindquantum/zh_cn/mindspore_qnn_for_nlp.py) [![View Source](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/r1.7/docs/mindquantum/docs/source_zh_cn/qnn_for_nlp.ipynb)\n", "\n", "## Overview\n", "\n", "Word embedding is a key step in natural language processing: it maps word vectors from a high-dimensional space into a continuous vector space of much lower dimension. As the amount of corpus data fed to a neural network grows, training becomes increasingly difficult. By exploiting quantum-mechanical properties such as superposition and entanglement, a quantum neural network can process this classical corpus data and take part in training, with the aim of improving convergence accuracy. In this tutorial we build a simple hybrid quantum-classical neural network to carry out a word embedding task.\n", "\n", "## Environment Preparation\n", "\n", "Set the number of threads used by the system. On a server with many CPU cores, all cores are used by default if this is not set, which can slow the simulation down or even make it hang." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "os.environ['OMP_NUM_THREADS'] = '1'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the modules this tutorial depends on." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import time\n", "from mindquantum.core import QubitOperator\n", "import mindspore.ops as ops\n", "import mindspore.dataset as ds\n", "from mindspore import nn\n", "from mindspore.train.callback import LossMonitor\n", "from mindspore import Model\n", "from mindquantum.framework import MQLayer\n", "from mindquantum import Hamiltonian, Circuit, RX, RY, X, H, UN" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"本教程实现的是一个[CBOW模型](https://blog.csdn.net/u010665216/article/details/78724856),即利用某个词所处的环境来预测该词。例如对于“I love natural language processing”这句话,我们可以将其切分为5个词,\\[\"I\", \"love\", \"natural\", \"language\", \"processing”\\],在所选窗口为2时,我们要处理的问题是利用\\[\"I\", \"love\", \"language\", \"processing\"\\]来预测出目标词汇\"natural\"。这里我们以窗口为2为例,搭建如下的量子神经网络,来完成词嵌入任务。\n", "\n", "![quantum word embedding](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/docs/mindquantum/docs/source_zh_cn/images/qcbow.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "这里,编码线路会将\"I\"、\"love\"、\"language\"和\"processing\"的编码信息编码到量子线路中,待训练的量子线路由四个Ansatz线路构成,最后我们在量子线路末端对量子比特做$\\text{Z}$基矢上的测量,具体所需测量的比特的个数由所需嵌入空间的维数确定。\n", "\n", "## 数据预处理\n", "\n", "我们对所需要处理的语句进行处理,生成关于该句子的词典,并根据窗口大小来生成样本点。" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def GenerateWordDictAndSample(corpus, window=2):\n", " all_words = corpus.split()\n", " word_set = list(set(all_words))\n", " word_set.sort()\n", " word_dict = {w: i for i, w in enumerate(word_set)}\n", " sampling = []\n", " for index, _ in enumerate(all_words[window:-window]):\n", " around = []\n", " for i in range(index, index + 2*window + 1):\n", " if i != index + window:\n", " around.append(all_words[i])\n", " sampling.append([around, all_words[index + window]])\n", " return word_dict, sampling" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'I': 0, 'language': 1, 'love': 2, 'natural': 3, 'processing': 4}\n", "word dict size: 5\n", "samples: [[['I', 'love', 'language', 'processing'], 'natural']]\n", "number of samples: 1\n" ] } ], "source": [ "word_dict, sample = GenerateWordDictAndSample(\"I love natural language processing\")\n", "print(word_dict)\n", "print('word dict size: ', len(word_dict))\n", "print('samples: ', sample)\n", "print('number of samples: ', len(sample))" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "根据如上信息,我们得到该句子的词典大小为5,能够产生一个样本点。\n", "\n", "## 编码线路\n", "\n", "为了简单起见,我们使用的编码线路由$\\text{RX}$旋转门构成,结构如下。\n", "\n", "![encoder circuit](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/docs/mindquantum/docs/source_zh_cn/images/encoder.png)\n", "\n", "我们对每个量子门都作用一个$\\text{RX}$旋转门。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def GenerateEncoderCircuit(n_qubits, prefix=''):\n", " if prefix and prefix[-1] != '_':\n", " prefix += '_'\n", " circ = Circuit()\n", " for i in range(n_qubits):\n", " circ += RX(prefix + str(i)).on(i)\n", " return circ" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
q0: ──RX(e_0)──\n",
       "               \n",
       "q1: ──RX(e_1)──\n",
       "               \n",
       "q2: ──RX(e_2)──\n",
       "
\n" ], "text/plain": [ "q0: ──RX(e_0)──\n", " \n", "q1: ──RX(e_1)──\n", " \n", "q2: ──RX(e_2)──" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "GenerateEncoderCircuit(3, prefix='e')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The two states of a two-level qubit are usually denoted $\\left|0\\right>$ and $\\left|1\\right>$. By the superposition principle, a qubit can also be in a superposition of these two states:\n", "\n", "$$\\left|\\psi\\right>=\\alpha\\left|0\\right>+\\beta\\left|1\\right>$$\n", "\n", "An $n$-qubit quantum state lives in a $2^n$-dimensional Hilbert space. For the 5-word dictionary above, only $\\lceil \\log_2 5 \\rceil=3$ qubits are needed to complete the encoding, which also illustrates an advantage of quantum computing.\n", "\n", "For example, the word \"love\" in the dictionary above has label 2, whose binary representation is `010`. We only need to set `e_0`, `e_1` and `e_2` in the encoder circuit to $0$, $\\pi$ and $0$ respectively. Let us verify this." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Label is: 2\n", "Binary label is: 010\n", "Parameters of encoder is: \n", " [0. 3.14159 0. ]\n", "Encoder circuit is: \n", " q0: ──RX(e_0)──\n", " \n", "q1: ──RX(e_1)──\n", " \n", "q2: ──RX(e_2)──\n", "Encoder parameter names are: \n", " ['e_0', 'e_1', 'e_2']\n", "Amplitude of quantum state is: \n", " [0. 0. 1. 0. 0. 0. 0. 
0.]\n", "Label in quantum state is: 2\n" ] } ], "source": [ "from mindquantum.simulator import Simulator\n", "from mindspore import context\n", "from mindspore import Tensor\n", "\n", "n_qubits = 3  # number of qubits of this quantum circuit\n", "label = 2  # label to encode\n", "label_bin = bin(label)[-1: 1: -1].ljust(n_qubits, '0')  # binary form of label, least significant bit first\n", "label_array = np.array([int(i)*np.pi for i in label_bin]).astype(np.float32)  # parameter values of encoder\n", "encoder = GenerateEncoderCircuit(n_qubits, prefix='e')  # encoder circuit\n", "encoder_params_names = encoder.params_name  # parameter names of encoder\n", "\n", "print(\"Label is: \", label)\n", "print(\"Binary label is: \", label_bin)\n", "print(\"Parameters of encoder is: \\n\", np.round(label_array, 5))\n", "print(\"Encoder circuit is: \\n\", encoder)\n", "print(\"Encoder parameter names are: \\n\", encoder_params_names)\n", "\n", "# final quantum state prepared by the encoder with the given parameters\n", "state = encoder.get_qs(pr=dict(zip(encoder_params_names, label_array)))\n", "amp = np.round(np.abs(state)**2, 3)  # probability of each basis state\n", "\n", "print(\"Amplitude of quantum state is: \\n\", amp)\n", "print(\"Label in quantum state is: \", np.argmax(amp))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The check above shows that, for data with label 2, the basis state with the largest probability in the final quantum state is also at index 2, so the resulting quantum state is indeed an encoding of the input label. We summarize the process of turning data into encoder parameter values in the following function." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def GenerateTrainData(sample, word_dict):\n", "    n_qubits = int(np.ceil(np.log2(1 + max(word_dict.values()))))\n", "    data_x = []\n", "    data_y = []\n", "    for around, center in sample:\n", "        data_x.append([])\n", "        for word in around:\n", "            label = word_dict[word]\n", "            label_bin = bin(label)[-1: 1: -1].ljust(n_qubits, '0')\n", "            label_array = [int(i)*np.pi for i in label_bin]\n", "            data_x[-1].extend(label_array)\n", "        data_y.append(word_dict[center])\n", "    return np.array(data_x).astype(np.float32), np.array(data_y).astype(np.int32)" ] }, { "cell_type": 
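"markdown", "metadata": {}, "source": [ "The rule of multiplying each bit by $\\pi$ in the function above can be justified with a short worked equation. This assumes the standard definition $\\text{RX}(\\theta)=e^{-i\\theta X/2}$ and a qubit that starts in $\\left|0\\right>$, as in the verification cell; neither is spelled out in the text above.\n", "\n", "$$\\text{RX}(\\theta)\\left|0\\right>=\\cos\\frac{\\theta}{2}\\left|0\\right>-i\\sin\\frac{\\theta}{2}\\left|1\\right>$$\n", "\n", "Setting $\\theta=0$ leaves the qubit in $\\left|0\\right>$, while $\\theta=\\pi$ gives $-i\\left|1\\right>$, which is $\\left|1\\right>$ up to a global phase. Each bit of the label therefore selects the matching basis state of its qubit." ] }, { "cell_type": 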
"code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([[0. , 0. , 0. , 0. , 3.1415927, 0. ,\n", " 3.1415927, 0. , 0. , 0. , 0. , 3.1415927]],\n", " dtype=float32),\n", " array([3], dtype=int32))" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "GenerateTrainData(sample, word_dict)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "根据上面的结果,我们将4个输入的词编码的信息合并为一个更长向量,便于后续神经网络调用。\n", "\n", "## Ansatz线路\n", "\n", "Ansatz线路的选择多种多样,我们选择如下的量子线路作为Ansatz线路,它的一个单元由一层$\\text{RY}$门和一层$\\text{CNOT}$门构成,对此单元重复$p$次构成整个Ansatz线路。\n", "\n", "![ansatz circuit](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r1.7/docs/mindquantum/docs/source_zh_cn/images/ansatz.png)\n", "\n", "定义如下函数生成Ansatz线路。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def GenerateAnsatzCircuit(n_qubits, layers, prefix=''):\n", " if prefix and prefix[-1] != '_':\n", " prefix += '_'\n", " circ = Circuit()\n", " for l in range(layers):\n", " for i in range(n_qubits):\n", " circ += RY(prefix + str(l) + '_' + str(i)).on(i)\n", " for i in range(l % 2, n_qubits, 2):\n", " if i < n_qubits and i + 1 < n_qubits:\n", " circ += X.on(i + 1, i)\n", " return circ" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
q0: ──RY(a_0_0)────────●────────RY(a_1_0)───────\n",
       "\n",
       "q1: ──RY(a_0_1)────────X────────RY(a_1_1)────●──\n",
       "\n",
       "q2: ──RY(a_0_2)────────●────────RY(a_1_2)────X──\n",
       "\n",
       "q3: ──RY(a_0_3)────────X────────RY(a_1_3)────●──\n",
       "\n",
       "q4: ──RY(a_0_4)────RY(a_1_4)─────────────────X──\n",
       "
\n" ], "text/plain": [ "q0: ──RY(a_0_0)────────●────────RY(a_1_0)───────\n", " │ \n", "q1: ──RY(a_0_1)────────X────────RY(a_1_1)────●──\n", " │ \n", "q2: ──RY(a_0_2)────────●────────RY(a_1_2)────X──\n", " │ \n", "q3: ──RY(a_0_3)────────X────────RY(a_1_3)────●──\n", " │ \n", "q4: ──RY(a_0_4)────RY(a_1_4)─────────────────X──" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "GenerateAnsatzCircuit(5, 2, 'a')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Measurement\n", "\n", "We take the measurement results on different qubits as the dimension-reduced data. The procedure is similar to the bit encoding. For example, to reduce the word vector to a 5-dimensional vector, the data of the 3rd dimension is produced as follows:\n", "\n", "- The binary representation of 3 is `00011`.\n", "- Measure the expectation value of the Hamiltonian $Z_0Z_1$ on the final state of the quantum circuit.\n", "\n", "The following function returns the Hamiltonians (hams) needed to produce the data of each dimension, where `n_qubits` is the number of qubits of the circuit and `dims` is the dimension of the word embedding:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def GenerateEmbeddingHamiltonian(dims, n_qubits):\n", "    hams = []\n", "    for i in range(dims):\n", "        s = ''\n", "        for j, k in enumerate(bin(i + 1)[-1:1:-1]):\n", "            if k == '1':\n", "                s = s + 'Z' + str(j) + ' '\n", "        hams.append(Hamiltonian(QubitOperator(s)))\n", "    return hams" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.0 [Z0] , 1.0 [Z1] , 1.0 [Z0 Z1] , 1.0 [Z2] , 1.0 [Z0 Z2] ]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "GenerateEmbeddingHamiltonian(5, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quantum Word Embedding Layer\n", "\n", "The quantum word embedding layer combines the encoder circuit and the trainable circuit described above with the measurement Hamiltonians to embed `num_embedding` words into `embedding_dim`-dimensional word vectors. We also add Hadamard gates at the very beginning of the circuit to prepare the initial state as a uniform superposition, which is intended to improve the expressive power of the quantum neural network.\n", "\n", "Below we define the quantum embedding layer, which returns a quantum circuit simulation operator." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def QEmbedding(num_embedding, embedding_dim, window, layers, n_threads):\n", "    n_qubits = int(np.ceil(np.log2(num_embedding)))\n", "    hams = GenerateEmbeddingHamiltonian(embedding_dim, n_qubits)\n", "    circ = Circuit()\n", "    
circ = UN(H, n_qubits)\n", "    encoder_param_name = []\n", "    ansatz_param_name = []\n", "    for w in range(2 * window):\n", "        encoder = GenerateEncoderCircuit(n_qubits, 'Encoder_' + str(w))\n", "        ansatz = GenerateAnsatzCircuit(n_qubits, layers, 'Ansatz_' + str(w))\n", "        encoder.no_grad()\n", "        circ += encoder\n", "        circ += ansatz\n", "        encoder_param_name.extend(encoder.params_name)\n", "        ansatz_param_name.extend(ansatz.params_name)\n", "    grad_ops = Simulator('projectq', circ.n_qubits).get_expectation_with_grad(hams,\n", "                                                                              circ,\n", "                                                                              None,\n", "                                                                              None,\n", "                                                                              encoder_param_name,\n", "                                                                              ansatz_param_name,\n", "                                                                              n_threads)\n", "    return MQLayer(grad_ops)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The overall training model is similar to a classical network: it consists of one embedding layer and two fully connected layers, except that the embedding layer here is a quantum neural network. Below we define the quantum neural network CBOW." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "class CBOW(nn.Cell):\n", "    def __init__(self, num_embedding, embedding_dim, window, layers, n_threads,\n", "                 hidden_dim):\n", "        super(CBOW, self).__init__()\n", "        self.embedding = QEmbedding(num_embedding, embedding_dim, window,\n", "                                    layers, n_threads)\n", "        self.dense1 = nn.Dense(embedding_dim, hidden_dim)\n", "        self.dense2 = nn.Dense(hidden_dim, num_embedding)\n", "        self.relu = ops.ReLU()\n", "\n", "    def construct(self, x):\n", "        embed = self.embedding(x)\n", "        out = self.dense1(embed)\n", "        out = self.relu(out)\n", "        out = self.dense2(out)\n", "        return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we train on a somewhat longer sentence. First we define `LossMonitorWithCollection` to monitor convergence and collect the loss values along the way." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "class LossMonitorWithCollection(LossMonitor):\n", "    def __init__(self, per_print_times=1):\n", "        super(LossMonitorWithCollection, self).__init__(per_print_times)\n", "        self.loss = []\n", "\n", "    def begin(self, run_context):\n", "        self.begin_time = time.time()\n", "\n", "    def end(self, run_context):\n", "        self.end_time = 
time.time()\n", "        print('Total time used: {}'.format(self.end_time - self.begin_time))\n", "\n", "    def epoch_begin(self, run_context):\n", "        self.epoch_begin_time = time.time()\n", "\n", "    def epoch_end(self, run_context):\n", "        cb_params = run_context.original_args()\n", "        self.epoch_end_time = time.time()\n", "        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:\n", "            print('')\n", "\n", "    def step_end(self, run_context):\n", "        cb_params = run_context.original_args()\n", "        loss = cb_params.net_outputs\n", "\n", "        if isinstance(loss, (tuple, list)):\n", "            if isinstance(loss[0], Tensor) and isinstance(loss[0].asnumpy(), np.ndarray):\n", "                loss = loss[0]\n", "\n", "        if isinstance(loss, Tensor) and isinstance(loss.asnumpy(), np.ndarray):\n", "            loss = np.mean(loss.asnumpy())\n", "\n", "        cur_step_in_epoch = (cb_params.cur_step_num - 1) % cb_params.batch_num + 1\n", "\n", "        if isinstance(loss, float) and (np.isnan(loss) or np.isinf(loss)):\n", "            raise ValueError(\"epoch: {} step: {}. 
Invalid loss, terminating training.\".format(\n", "                cb_params.cur_epoch_num, cur_step_in_epoch))\n", "        self.loss.append(loss)\n", "        if self._per_print_times != 0 and cb_params.cur_step_num % self._per_print_times == 0:\n", "            print(\"\\repoch: %+3s step: %+3s time: %5.5s, loss is %5.5s\" % (cb_params.cur_epoch_num, cur_step_in_epoch, time.time() - self.epoch_begin_time, loss), flush=True, end='')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we use the quantum version of `CBOW` to embed the words of a long sentence. Before running, execute `export OMP_NUM_THREADS=4` in the terminal to set the number of threads of the quantum simulator to 4. When the quantum system to be simulated has many qubits, more threads can be set to improve simulation efficiency." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "epoch: 25 step: 20 time: 0.336, loss is 3.154\n", "epoch: 50 step: 20 time: 0.449, loss is 2.945\n", "epoch: 75 step: 20 time: 0.325, loss is 0.226\n", "epoch: 100 step: 20 time: 0.370, loss is 0.016\n", "epoch: 125 step: 20 time: 0.377, loss is 0.002\n", "epoch: 150 step: 20 time: 0.399, loss is 0.006\n", "epoch: 175 step: 20 time: 0.370, loss is 0.166\n", "epoch: 200 step: 20 time: 0.345, loss is 0.139\n", "epoch: 225 step: 20 time: 0.350, loss is 3.355\n", "epoch: 250 step: 20 time: 0.334, loss is 1.059\n", "epoch: 275 step: 20 time: 0.339, loss is 0.035\n", "epoch: 300 step: 20 time: 0.334, loss is 0.024\n", "epoch: 325 step: 20 time: 0.344, loss is 0.010\n", "epoch: 350 step: 20 time: 0.344, loss is 0.009\n", "Total time used: 126.26282787322998\n" ] } ], "source": [ "import mindspore as ms\n", "from mindspore import context\n", "from mindspore import Tensor\n", "context.set_context(mode=context.PYNATIVE_MODE, device_target=\"CPU\")\n", "corpus = \"\"\"We are about to study the idea of a computational process.\n", "Computational processes are abstract beings that inhabit computers.\n", "As they evolve, processes manipulate other abstract things called data.\n", "The evolution of a process is directed by a pattern of rules\n", 
"called a program. People create programs to direct processes. In effect,\n", "we conjure the spirits of the computer with our spells.\"\"\"\n", "\n", "ms.set_seed(42)\n", "window_size = 2\n", "embedding_dim = 10\n", "hidden_dim = 128\n", "word_dict, sample = GenerateWordDictAndSample(corpus, window=window_size)\n", "train_x, train_y = GenerateTrainData(sample, word_dict)\n", "\n", "train_loader = ds.NumpySlicesDataset({\n", " \"around\": train_x,\n", " \"center\": train_y\n", "}, shuffle=False).batch(3)\n", "net = CBOW(len(word_dict), embedding_dim, window_size, 3, 4, hidden_dim)\n", "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n", "net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)\n", "loss_monitor = LossMonitorWithCollection(500)\n", "model = Model(net, net_loss, net_opt)\n", "model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "打印收敛过程中的损失函数值:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(loss_monitor.loss, '.')\n", "plt.xlabel('Steps')\n", "plt.ylabel('Loss')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "通过如下方法打印量子嵌入层的量子线路中的参数:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([ 1.5175925e+00, -5.5282825e-01, 1.9824509e-01, -2.3327057e+00,\n", " 8.4526891e-01, -1.3019586e+00, 9.3813318e-01, -1.0318477e-01,\n", " 4.4351882e-01, 1.8607093e+00, 6.0036021e-01, -3.0638957e-01,\n", " 9.3188483e-01, 6.0410827e-01, -1.8905094e-01, 6.5970606e-01,\n", " -1.2129487e+00, -3.1650740e-01, -2.5501034e+00, 3.6324959e-02,\n", " 4.0066850e-01, 7.5752664e-01, -5.6982380e-01, -5.6846058e-01,\n", " -9.0591955e-01, 3.3477244e-01, -6.1832809e-01, 2.1618415e-01,\n", " 1.0225463e-01, 4.0966314e-01, -9.0604734e-01, 1.3528558e+00,\n", " -5.3387892e-01, -3.2625124e-02, 6.8196923e-02, 4.1799426e-01,\n", " 2.6094767e-01, -3.3765252e+00, -1.9021339e+00, -1.1502613e+00,\n", " -2.0344164e+00, 8.0160522e-01, -2.8717926e-01, 3.3720109e-01,\n", " -2.1616800e+00, 1.1822585e+00, -7.0481867e-01, 4.0014455e-01,\n", " -2.8856799e-01, 8.4199363e-01, -5.8137196e-01, -1.9842222e+00,\n", " 1.7555025e-01, 4.1823694e-01, -3.1270559e+00, 2.6714945e+00,\n", " 2.3251233e+00, 3.0707479e-01, -5.3547442e-01, 3.0258337e-01,\n", " -1.5764916e+00, 3.0099937e-01, -2.9257689e+00, -1.1786047e+00,\n", " -5.7270378e-01, 2.0587114e-03, -1.5863895e+00, -2.1442556e+00,\n", " -1.7923084e-01, -1.2772868e+00, 4.1606693e-04, -9.2881303e-03],\n", " dtype=float32)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net.embedding.weight.asnumpy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 经典版词向量嵌入层\n", "\n", "这里我们利用经典的词向量嵌入层来搭建一个经典的CBOW神经网络,并与量子版本进行对比。\n", "\n", 
"首先,搭建经典的CBOW神经网络,其中的参数跟量子版本的类似。" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "class CBOWClassical(nn.Cell):\n", " def __init__(self, num_embedding, embedding_dim, window, hidden_dim):\n", " super(CBOWClassical, self).__init__()\n", " self.dim = 2 * window * embedding_dim\n", " self.embedding = nn.Embedding(num_embedding, embedding_dim, True)\n", " self.dense1 = nn.Dense(self.dim, hidden_dim)\n", " self.dense2 = nn.Dense(hidden_dim, num_embedding)\n", " self.relu = ops.ReLU()\n", " self.reshape = ops.Reshape()\n", "\n", " def construct(self, x):\n", " embed = self.embedding(x)\n", " embed = self.reshape(embed, (-1, self.dim))\n", " out = self.dense1(embed)\n", " out = self.relu(out)\n", " out = self.dense2(out)\n", " return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "生成适用于经典CBOW神经网络的数据集。" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "train_x shape: (58, 4)\n", "train_y shape: (58,)\n" ] } ], "source": [ "train_x = []\n", "train_y = []\n", "for i in sample:\n", " around, center = i\n", " train_y.append(word_dict[center])\n", " train_x.append([])\n", " for j in around:\n", " train_x[-1].append(word_dict[j])\n", "train_x = np.array(train_x).astype(np.int32)\n", "train_y = np.array(train_y).astype(np.int32)\n", "print(\"train_x shape: \", train_x.shape)\n", "print(\"train_y shape: \", train_y.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "我们对经典CBOW网络进行训练。" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "epoch: 25 step: 20 time: 0.031, loss is 3.155\n", "epoch: 50 step: 20 time: 0.033, loss is 3.027\n", "epoch: 75 step: 20 time: 0.033, loss is 3.010\n", "epoch: 100 step: 20 time: 0.033, loss is 2.955\n", "epoch: 125 step: 20 time: 0.032, loss is 0.630\n", "epoch: 150 step: 20 time: 0.034, loss is 
0.059\n", "epoch: 175 step: 20 time: 0.033, loss is 0.008\n", "epoch: 200 step: 20 time: 0.031, loss is 0.003\n", "epoch: 225 step: 20 time: 0.032, loss is 0.001\n", "epoch: 250 step: 20 time: 0.030, loss is 0.001\n", "epoch: 275 step: 20 time: 0.032, loss is 0.000\n", "epoch: 300 step: 20 time: 0.030, loss is 0.000\n", "epoch: 325 step: 20 time: 0.030, loss is 0.000\n", "epoch: 350 step: 20 time: 0.029, loss is 0.000\n", "Total time used: 11.819875240325928\n" ] } ], "source": [ "context.set_context(mode=context.GRAPH_MODE, device_target=\"CPU\")\n", "\n", "train_loader = ds.NumpySlicesDataset({\n", "    \"around\": train_x,\n", "    \"center\": train_y\n", "}, shuffle=False).batch(3)\n", "net = CBOWClassical(len(word_dict), embedding_dim, window_size, hidden_dim)\n", "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')\n", "net_opt = nn.Momentum(net.trainable_params(), 0.01, 0.9)\n", "loss_monitor = LossMonitorWithCollection(500)\n", "model = Model(net, net_loss, net_opt)\n", "model.train(350, train_loader, callbacks=[loss_monitor], dataset_sink_mode=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the loss values recorded during training:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(loss_monitor.loss, '.')\n", "plt.xlabel('Steps')\n", "plt.ylabel('Loss')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "由上可知,通过量子模拟得到的量子版词嵌入模型也能很好的完成嵌入任务。当数据集大到经典计算机算力难以承受时,量子计算机将能够轻松处理这类问题。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 参考文献\n", "\n", "[1] Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean. [Efficient Estimation of Word Representations in\n", "Vector Space](https://arxiv.org/pdf/1301.3781.pdf)" ] } ], "metadata": { "kernelspec": { "display_name": "MindSpore", "language": "python", "name": "mindspore" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 2 }