Is the MP model the only basis for neural networks? Zhou Zhihua's group at Nanjing University proposes FT, a new neuron model
2020-04-21
From arXiv
Author: Zhang Shaoqun, Zhou Zhihua
Compiled by Machine Heart
In this paper, doctoral student Zhang Shaoqun and Professor Zhou Zhihua of Nanjing University propose a new neuron model, the Flexible Transmitter (FT), which has flexible plasticity and supports the processing of complex data. According to the paper, this work provides a new basic building block for neural networks and demonstrates the feasibility of developing neural networks with neuronal plasticity.
Most current neural networks are based on the MP model, an abstract, simplified model built from the structure and working principles of biological neurons. Such models typically formalize a neuron as "an activation function applied to the weighted sum of input signals."
Recently, doctoral student Zhang Shaoqun and Professor Zhou Zhihua of Nanjing University proposed the Flexible Transmitter (FT) model, a new type of biologically inspired neuron with flexible plasticity.
The FT model uses a pair of parameters to model the neurotransmitter between neurons and sets up a neurotransmitter-regulated memory unit to record the long-term learning information of the neuron concerned. The FT model is therefore formalized in this work as a two-variable, two-valued function, of which the MP neuron model is a special case. The FT model can handle more complex data, including time-series signals.
To demonstrate the capability and potential of the FT model, the authors propose the Flexible Transmitter Network (FTNet). FTNet is built on the most common fully connected feedforward architecture and uses FT neurons as its basic building blocks. FTNet permits gradient computation and can be trained by a backpropagation algorithm extended to the complex domain. Experimental results on a series of tasks demonstrate the excellent performance of FTNet. This work provides another basic building block for neural networks and shows the feasibility of developing neural networks with neuronal plasticity.
Paper link: https://arxiv.org/pdf/2004.03839v2.pdf
The widely used MP model
The basic computational unit of a neural network is the neuron, which corresponds to the cell of a biological nervous system. Although neural network research has been going on for more than 50 years, and new neural network algorithms and architectures keep emerging, research on neuron modeling itself remains insufficient.
The most famous and most commonly used neuron representation is the MP model [12], shown in Figure 1 below:
Figure 1: MP model
The MP neuron receives input signals x_i from n other neurons, transmitted through weighted connections w_i. The total input received by the neuron is compared with the neuron's threshold, and the neuron's output is then produced by the activation function f; that is, y = f(Σ_i w_i·x_i − θ). As Figure 1 shows, x_i denotes the signal from another neuron, w_i the corresponding connection weight, θ the neuron's threshold, and f the activation function, which is usually continuous and differentiable, such as the sigmoid function commonly used in shallow networks and the ReLU function commonly used in deep networks.
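To make the formula concrete, here is a minimal sketch of an MP neuron in Python (illustrative only, not code from the paper; the input values, weights, threshold, and the sigmoid choice are arbitrary):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def mp_neuron(x, w, theta, f=sigmoid):
        # MP model: activation applied to the weighted input sum minus the
        # threshold, y = f(sum_i w_i * x_i - theta)
        return f(np.dot(w, x) - theta)

    # Toy usage with arbitrary numbers.
    y = mp_neuron(x=np.array([0.5, -1.0, 2.0]),
                  w=np.array([0.1, 0.4, -0.3]),
                  theta=0.2)
    print(y)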
Despite this simple description, the MP model has been very successful. In practice, however, the structure of a neuron cell is far more complex, so exploring neuron models with other biologically plausible forms is a fundamental problem. Much effort has gone into modeling the firing behavior of cells, producing the spiking neuron model and spiking neural networks that use spiking neurons as the basic computational unit [9, 18].
Is there another form of neuron model?
Researchers at Nanjing University considered another interesting perspective and proposed a new neuron model.
Neuroscience research [2, 7] has revealed that a synapse ensures one-way communication between two neurons: information flows from the presynaptic cell to the postsynaptic cell. Synapses usually form between the axon of the presynaptic cell and a dendrite of the postsynaptic cell. In a common synaptic structure, there is a gap of about 20 nanometers between dendrite and axon (known in neuroscience as the "synaptic cleft"), as shown in Figure 2.
Figure 2: A biological neuron (left) and its synaptic structure (right).
This means that, although the two are closely related, the axonal transmission strength of the presynaptic cell differs from the dendritic response of the postsynaptic cell. It is therefore necessary to distinguish the presynaptic and postsynaptic parts in a neuron model.
Unlike the MP model, which simply treats the whole synaptic structure as a single learnable real-valued parameter, and unlike spiking neuron models, which describe the synapse with leaky-integrator differential equations, this work uses a pair of correlated parameters (W, V) to represent the axon transmission strength and the dendrite concentration, respectively. This is the flexible transmitter.
In addition, many experimental studies [8, 6] have pointed out that neurons retain a memory of past learning behavior: biological potentials are persistently strengthened or weakened according to recent activity patterns, known as long-term potentiation (LTP) and long-term depression (LTD). In this work, a dedicated memory variable, the neurotransmitter-regulated memory unit, is set up to record the neuron's long-term learning behavior.
Flexible transmitter model
These findings from neuroscience show that the response of neuron A upon receiving a stimulus signal from neuron B depends not only on the axon transmission strength of neuron B, but also on the dendrite concentration of neuron A, which is tied to neuron A's memory unit, as shown in Figure 2.
Inspired by this, the study proposes the flexible transmitter model, shown in Figure 3 below:
Figure 3: Illustration of the FT model, where (W, V) is the transmitter parameter pair and m_t is the memory strength of neuron A at time t.
In contrast to the MP model, the interaction in the FT model consists of two parts: W·x_t, where x_t is the stimulus signal sent to the neuron through the corresponding axon transmission strength W; and V·m_{t−1}, where m_{t−1} is the memory strength associated with the dendrite concentration V at time t−1. In other words, the FT model expresses synaptic plasticity with the transmitter parameter pair (W, V) instead of the single real-valued weight of the MP model.
On the output side, an FT neuron at time t likewise produces two parts, s_t and m_t, where s_t is the bioelectric/chemical stimulus signal generated by the neuron and m_t is the neuron's current memory strength. At the end of the time step, the stimulus signal s_t is transmitted to the next neuron, and the neuron's memory strength is updated to m_t.
The FT model thus uses the parameter pair (W, V) to represent synaptic plasticity and the dedicated variable m_t to represent the neurotransmitter-regulated memory unit. The FT model can then be formalized as a two-variable, two-valued function with parameter pair (W, V), as follows:
(s_t, m_t) = f(W·x_t, V·m_{t−1})    (Formula 1)
The researchers call this the flexible transmitter model. This modeling approach clearly gives FT neurons not only greater biological fidelity but also more potential to handle data with complex structure.
Flexible Transmitter Network
FTNet adopts a fully connected network architecture in which FT neurons replace the usual MP neurons. The researchers also developed a practical and efficient backpropagation algorithm for training FTNet.
Implementation of the FT model
According to Formula 1, the FT model is essentially governed by the two-variable function f and the parameter pair (W, V). Both the input and the output of the FT model consist of two parts, and the relationship between them can be complicated. Most existing neuron models rely on single-valued functions, which are hard to apply directly here. An interesting solution is to use a complex-valued function to represent the neuron's input and output. The resulting neuron model is:
s_t + i·m_t = f(W·x_t + i·V·m_{t−1})    (Formula 2)
In complex analysis, the real and imaginary parts of a complex function's output come as a pair; that is, s_t and m_t share the same complex function f and the same parameter pair (W, V).
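As a rough sketch of this pairing (not the authors' implementation; the scalar parameters w and v and the tanh activation are placeholder choices), a single FT update step can be written with complex arithmetic:

    import numpy as np

    def ft_step(x_t, m_prev, w, v):
        # Pack the two inputs into one complex number: the real part carries
        # the weighted stimulus w*x_t, the imaginary part the weighted
        # memory v*m_{t-1}.
        z = w * x_t + 1j * (v * m_prev)
        out = np.tanh(z)           # numpy's tanh accepts complex arguments
        return out.real, out.imag  # (s_t, m_t): stimulus out, memory update

    # Toy usage: feed a short signal through one FT neuron.
    m = 0.0
    for x in [0.3, -0.1, 0.7]:
        s, m = ft_step(x, m, w=0.5, v=0.8)
        print(s, m)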
A simple FTNet architecture
The FT neuron is a basic building unit for neural networks. To evaluate its potential, the researchers consider the simplest fully connected feedforward architecture, replacing the original MP neurons with FT neurons to obtain FTNet. Based on Formula 2, a layer of FT neurons has the general vectorized form:
s_t + i·m_t = f(W·x_t + i·V·m_{t−1})    (Formula 3)
where W and V are now weight matrices and x_t, s_t, and m_t are vectors. By applying the vectorized representation of Formula 3 layer by layer, we obtain the multi-layer fully connected feedforward architecture of FTNet.
Two questions now arise: 1) what form should the complex function f take? 2) how can its parameters be learned?
To address these questions, the complex function f in Formula 2 is decomposed into two parts, a transformation function τ: C → C and an activation function σ: C → C, with f = σ ∘ τ. This composition separates the complex structure of f from the nonlinear activation: the transformation function τ handles the algebraic operations on the complex domain and is usually differentiable, while σ is the activation function. FTNet therefore permits gradient computation and can accommodate some traditional activation functions.
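Putting the pieces together, one fully connected FT layer might look like the sketch below. The affine form of τ and the elementwise complex tanh for σ are assumptions for illustration, not necessarily the paper's exact choices:

    import numpy as np

    def ft_layer(x_t, m_prev, W, V):
        # tau: a differentiable transformation on the complex domain
        # (here an affine map; an assumed, illustrative choice).
        z = W @ x_t + 1j * (V @ m_prev)
        # sigma: elementwise complex activation (tanh, as in the experiments).
        out = np.tanh(z)
        return out.real, out.imag  # (s_t, m_t) for the whole layer

    # Toy usage: a layer of 4 FT neurons fed by 3 inputs.
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(4, 3))
    V = rng.normal(scale=0.1, size=(4, 4))
    m = np.zeros(4)
    s, m = ft_layer(rng.normal(size=3), m, W, V)

Stacking such layers, each keeping its own memory vector, yields the multi-layer FTNet described above.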
Complex backpropagation algorithm
To train FTNet, the researchers propose complex backpropagation (CBP), an extension of the backpropagation algorithm to the complex domain. They also give detailed implementations of CBP for single-layer and two-layer FTNets as examples; see Appendix B of the paper for details.
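The analytic derivation of CBP is given in Appendix B of the paper and is not reproduced here. Purely as a stand-in, the sketch below trains the toy ft_layer from the previous snippet with finite-difference gradients, which numerically approximate what CBP computes in closed form; all names and hyperparameters are illustrative:

    def sequence_loss(W, V, xs, ys):
        # Run the FT layer over a sequence and accumulate squared error.
        m, total = np.zeros(V.shape[0]), 0.0
        for x, y in zip(xs, ys):
            s, m = ft_layer(x, m, W, V)
            total += np.sum((s - y) ** 2)
        return total

    def fd_grad(param, f, eps=1e-6):
        # Central finite differences, one entry at a time (slow but simple).
        g = np.zeros_like(param)
        for idx in np.ndindex(param.shape):
            old = param[idx]
            param[idx] = old + eps; hi = f()
            param[idx] = old - eps; lo = f()
            param[idx] = old
            g[idx] = (hi - lo) / (2 * eps)
        return g

    # One gradient-descent step on toy data
    # (reusing rng, W, V, and ft_layer from the previous sketch).
    xs = [rng.normal(size=3) for _ in range(8)]
    ys = [np.zeros(4)] * 8  # dummy targets
    W -= 0.01 * fd_grad(W, lambda: sequence_loss(W, V, xs, ys))
    V -= 0.01 * fd_grad(V, lambda: sequence_loss(W, V, xs, ys))

Note that W and V are real-valued parameters even though the forward pass is complex; the "complex" part of CBP lies in propagating derivatives through the complex-valued computation, which the numerical approximation above sidesteps.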
Experiments
The researchers compared FTNet with several common neural networks on simulated data and on real-world datasets.
Simulated signals
The researchers first explored the performance of FTNet under different configurations on simulated data.
Experiments show that the tanh activation function maintains the best performance, whether the FT0 or the FT1 architecture is used. By contrast, the sigmoid and modReLU activations perform worse, and zReLU performs slightly better than pReLU.
The researchers suspect the reason is that, for complex activation functions, the radius (magnitude) may be more sensitive and more important than the phase. Accordingly, in the real-world tasks below they configure FTNet with the tanh activation and a learning rate of 0.01.
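For readers unfamiliar with these complex-valued activations, the definitions commonly used in the complex-network literature are sketched below (background knowledge, not restated from this article):

    import numpy as np

    def modrelu(z, b):
        # modReLU: shrink or pass the magnitude, preserve the phase
        # (b is a learnable bias on the modulus).
        mag = np.abs(z)
        return np.maximum(mag + b, 0.0) * z / np.maximum(mag, 1e-12)

    def zrelu(z):
        # zReLU: pass z only when both real and imaginary parts are
        # nonnegative, i.e., the phase lies in the first quadrant.
        keep = (z.real >= 0) & (z.imag >= 0)
        return np.where(keep, z, 0)

    # Toy usage on a small complex vector.
    z = np.array([1 + 2j, -1 + 0.5j, 0.3 - 0.2j])
    print(modrelu(z, b=-0.5), zrelu(z))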
Univariate time-series prediction: forecasting car licensing volume in Yancheng
The researchers ran experiments on the dataset of the Yancheng automobile licensing prediction competition, a real-world univariate time-series prediction task.
Table 1: Mean squared error (MSE) and model settings on the Yancheng car licensing prediction task.
As Table 1 shows, the FT1 model is highly competitive.
Multivariate time-series prediction: the HDUK traffic prediction task
The researchers evaluated FTNet on the HDUK dataset, a typical multivariate time-series prediction dataset. Experiments show that FTNet outperforms the other neural networks under the same settings.
Table 2: MSE and accuracy of each model on the HDUK traffic prediction task.
Image recognition on the pixel-by-pixel MNIST dataset
Table 3: Accuracy of each model on the pixel-by-pixel MNIST task.
Experiments show that FTNet outperforms the other neural networks compared.
About the authors
Zhang Shaoqun, the study's first author, is a member of the LAMDA group in the Department of Computer Science and Technology at Nanjing University, advised by Zhou Zhihua. His research interests are time-series analysis and computational neuroscience. Professor Zhou Zhihua is the corresponding author of the study.
First author Zhang Shaoqun