Recommended: 25 Must-Know Terms and Concepts for Deep Learning Beginners (Part 1)


Original Translation | Starting from Neural Networks: 25 Must-Know Terms and Concepts for Deep Learning Beginners (Part 1)


"Artificial Intelligence, deep learning, machine learning — whatever you're doing if you don't understand it — learn it. Because otherwise you're going to be a dinosaur within 3 years."

 — Mark Cuban

Mark Cuban's words may sound drastic, but his message is spot on! We are in the middle of a great revolution, one set off by big data and powerful computing.

 

Take a minute to imagine the early 20th century. Back then, many people did not understand what electricity was. For decades, even centuries, people had done things in one particular way, and then suddenly everything around them seemed to change.

Tasks that once required many people could now be done by a single person with electricity. Today we are going through a similar journey, and this time the protagonists are machine learning and deep learning.

 

If you have not yet grasped the enormous power of deep learning, it is time to start learning! This article introduces some of the common terms and concepts in the field, beginning with neural networks.

 

Basics of Neural Networks:

(1) Neuron – Just as the neuron is the basic unit of the brain, a neuron is also the basic unit in a neural network. Think about how the brain works when we receive new information.

First the brain processes the information, then it produces an output. Likewise, in a neural network a neuron receives an input, processes it, and generates an output that is either passed on to other neurons for further processing or is the final output.

 

(2) Weights – When an input reaches a neuron, it is multiplied by a weight. For example, if a neuron has two inputs, each input is assigned its own associated weight. We initialize the weights randomly and update them during model training.

After training, the neural network assigns higher weights to the inputs it considers more important, while less important inputs receive relatively smaller weights. A weight of zero means the feature is insignificant.

 

Let us assume the input is a and its associated weight is w1. After passing through the node, the input becomes a*w1.

 

(3) Bias – In addition to the weight, another linear component is applied to the input, called the bias. The bias is added to the product of the input and the weight, and it changes the range of that weighted input. After adding the bias, the result becomes a*w1 + bias, which is the final linear component of the input transformation.

 

(4) Activation Function – Once the linear component is applied to the input, a non-linear function is applied to it. This is done by applying the activation function to the linear combination.

The activation function translates the input signal into an output signal. The output after applying the activation function looks like f(a*w1 + b), where f() is the activation function.

 

Suppose we have n inputs, given as x1 to xn, with corresponding weights w1 to wn and a bias b. Each weight is multiplied by its corresponding input, and the products are then summed together with the bias. Let the result be called u:

u=∑w*x+b

 

Substituting u into the activation function, we finally obtain the neuron's output y = f(u).
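As a concrete illustration, here is a minimal Python sketch of this single-neuron computation; the input values, weights, and bias below are invented for the example, and f is the sigmoid function defined in the next section:

    import numpy as np

    def f(u):
        # Sigmoid activation, defined formally in the next section
        return 1.0 / (1.0 + np.exp(-u))

    x = np.array([0.5, -1.2, 2.0])   # n = 3 inputs, x1..xn
    w = np.array([0.8, 0.1, -0.4])   # associated weights w1..wn (random in practice)
    b = 0.5                          # bias

    u = np.dot(w, x) + b             # u = ∑w*x + b, the linear component
    y = f(u)                         # the neuron's output
    print(u, y)                      # u ≈ -0.02, y ≈ 0.495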

 

Commonly Applied Activation Functions

The most commonly applied activation functions are Sigmoid, ReLU, and Softmax.

 

  • Sigmoid – Sigmoid is one of the most commonly used activation functions. It is defined as:

sigmoid(x) = 1 / (1 + e^(-x))

The sigmoid transformation generates a smoother range of values between 0 and 1. We often need to observe how the output changes as the input values change slightly, and a smooth curve allows us to do that, which is why sigmoid is preferred over step functions.
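A short sketch of sigmoid in Python, with a few arbitrary sample inputs to show the squashing into (0, 1):

    import numpy as np

    def sigmoid(x):
        # Maps any real-valued input into the open interval (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    print(sigmoid(np.array([-10.0, -1.0, 0.0, 1.0, 10.0])))
    # approximately [0.0000 0.2689 0.5000 0.7311 1.0000] -- a smooth S-curve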

 

  • ReLU (Rectified Linear Unit) – Instead of sigmoid, recent networks prefer to use the ReLU activation function for their hidden layers. The function is defined as:

f(x) = max(x, 0)

The output of the function is x when x > 0, and 0 when x <= 0.

 

The main benefit of using ReLU is that its derivative is a constant for all inputs greater than 0, and a constant derivative helps the network train faster.
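A minimal Python sketch of ReLU and its derivative (the sample inputs are chosen arbitrarily):

    import numpy as np

    def relu(x):
        # Outputs x for x > 0 and 0 for x <= 0
        return np.maximum(x, 0.0)

    x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
    print(relu(x))                   # [0.  0.  0.  0.5 3. ]

    # The derivative is the constant 1 for every input greater than 0,
    # which is what makes training faster than with saturating functions.
    print((x > 0).astype(float))     # [0. 0. 0. 1. 1.]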

 

  • Softmax – The softmax activation function is normally used in the output layer for classification problems. It is similar to the sigmoid function, with the only difference being that in softmax the outputs are normalized so that they sum to 1.

The sigmoid function works when we have a binary output problem; when we face a multiclass classification problem, softmax makes it easy to assign a value to each class, and those values can readily be interpreted as probabilities.

 

It may be easier to see it this way: suppose you are trying to identify a digit that looks like an 8 but is actually a 6. The function assigns a value to each digit, as in the sketch below. We can easily see that the highest probability is assigned to 6, and the next highest to 8, and so on...
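Here is a small Python sketch of that example; the raw scores for the digits 0 through 9 are invented for illustration:

    import numpy as np

    def softmax(z):
        # Exponentiate, then normalize so the outputs sum to 1
        e = np.exp(z - np.max(z))    # subtracting the max improves numerical stability
        return e / e.sum()

    # Hypothetical raw scores for digits 0..9: highest for 6, second highest for 8
    scores = np.array([0.1, 0.1, 0.2, 0.1, 0.1, 0.2, 4.0, 0.1, 2.5, 0.3])
    probs = softmax(scores)
    print(round(probs.sum(), 6))     # 1.0 -- normalized to a probability distribution
    print(probs.argmax())            # 6 -- the most probable digit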

 

(5) Neural Network – Neural networks form the backbone of deep learning. The goal of a neural network is to find an approximation of an unknown function. It is formed of interconnected neurons.

These neurons have weights and a bias that are updated during network training depending on the error. The activation function applies a non-linear transformation to the linear combination, which then generates an output; the combinations of activated neurons produce the final output.

 

Of the many definitions of a neural network, Liping Yang's is the most fitting:

 “Neural networks are made up of numerous interconnected conceptualized artificial neurons, which pass data between themselves, and which have associated weights that are tuned based upon the network's ‘experience’.

Neurons have activation thresholds which, if met by a combination of their associated weights and the data passed to them, cause the neurons to fire; combinations of fired neurons result in ‘learning’.”
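Putting the pieces above together, here is a minimal sketch of a forward pass through a tiny fully connected network; the layer sizes are arbitrary and the random weights are stand-ins for values that training would actually tune based on the error:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([0.5, -1.2, 2.0])                  # input vector

    # Randomly initialized weights and biases, updated during training
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 3 inputs -> 4 neurons
    W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # output layer: 4 -> 2 neurons

    h = sigmoid(W1 @ x + b1)                        # linear component + non-linearity
    y = sigmoid(W2 @ h + b2)                        # combination of activated neurons
    print(y)                                        # the network's output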

More deep learning content is coming in the next installment!

English Original

25 Must Know Terms & concepts for Beginners in Deep Learning

Artificial Intelligence, deep learning, machine learning — whatever you’re doing if you don’t understand it — learn it. Because otherwise you’re going to be a dinosaur within 3 years.

 — Mark Cuban

This statement from Mark Cuban might sound drastic – but its message is spot on! We are in middle of a revolution – a revolution caused by Big Huge data and a ton of computational power.

 

For a minute, think how a person would feel in early 20th century if he / she did not understand electricity.

You would have been used to doing things in a particular manner for ages and all of a sudden things around you started changing.

Things which required many people can now be done with one person and electricity. We are going through a similar journey with machine learning & deep learning today.

 

So, if you haven’t explored or understood the power of deep learning – you should start it today. I have written this article to help you understand common terms used in deep learning.

 

Who should read this article?

 

If you are someone who wants to learn or understand deep learning, this article is meant for you. In this article, I will explain various terms used commonly in deep learning.

 

If you are wondering why I am writing this article – I am writing it because I want you to start your deep learning journey without hassle or without getting intimidated.

When I first began reading about deep learning, there were several terms I had heard about, but it was intimidating when I tried to understand them. There are several words which are recurring when we start reading about any deep learning application.

 

In this article, I have created something like a deep learning dictionary for you which you can refer to whenever you need the basic definition of the most common terms used. I hope after this article these terms wouldn’t haunt you anymore.

 

Terms related to topics:

 

To help you understand various terms, I have broken them into 3 different groups. If you are looking for a specific term, you can skip to that section. If you are new to the domain, I would recommend that you go through them in the order I have written them.

 

  • Basics of Neural Networks

  • Common Activation Functions

  • Convolutional Neural Networks

  • Recurrent Neural Networks

 

Basics of Neural Networks

 

(1) Neuron – Just like a neuron forms the basic element of our brain, a neuron forms the basic structure of a neural network. Just think of what we do when we get new information.

When we get the information, we process it and then we generate an output. Similarly, in case of a neural network, a neuron receives an input, processes it and generates an output which is either sent to other neurons for further processing or it is the final output.

(2) Weights – When input enters the neuron, it is multiplied by a weight. For example, if a neuron has two inputs, then each input will have an associated weight assigned to it. We initialize the weights randomly and these weights are updated during the model training process.

The neural network after training assigns a higher weight to the input it considers more important as compared to the ones which are considered less important. A weight of zero denotes that the particular feature is insignificant.

 

Let’s assume the input to be a, and the weight associated to be w1. Then after passing through the node the input becomes a*w1.

(3) Bias – In addition to the weights, another linear component is applied to the input, called as the bias. It is added to the result of weight multiplication to the input.

The bias is basically added to change the range of the weight multiplied input. After adding the bias, the result would look like a*w1 + bias. This is the final linear component of the input transformation.

 

(4) Activation Function – Once the linear component is applied to the input, a non-linear function is applied to it. This is done by applying the activation function to the linear combination.

The activation function translates the input signals to output signals. The output after application of the activation function would look something like f(a*w1 + b), where f() is the activation function.

 

In the below diagram we have “n” inputs given as x1 to xn, and corresponding weights w1 to wn. We have a bias given as b. The weights are first multiplied by their corresponding inputs and are then added together along with the bias. Let this be called u.

 

u=∑w*x+b

 

The activation function is applied to u, i.e. f(u), and we receive the final output from the neuron as y = f(u).

Commonly applied Activation Functions

 

The most commonly applied activation functions are – Sigmoid, ReLU and softmax.

 

(a) Sigmoid – One of the most common activation functions used is Sigmoid. It is defined as:

sigmoid(x) = 1 / (1 + e^(-x))

The sigmoid transformation generates a more smooth range of values between 0 and 1.

We might need to observe the changes in the output with slight changes in the input values. Smooth curves allow us to do that and are hence preferred over step functions.

 

(b) ReLU (Rectified Linear Units) – Instead of sigmoids, the recent networks prefer using ReLU activation functions for the hidden layers. The function is defined as:

f(x) = max(x, 0)

The output of the function is X when X > 0 and 0 for X <= 0.

The major benefit of using ReLU is that it has a constant derivative value for all inputs greater than 0. The constant derivative value helps the network to train faster.

 

(c) Softmax – Softmax activation functions are normally used in the output layer for classification problems. It is similar to the sigmoid function, with the only difference being that the outputs are normalized to sum up to 1.

The sigmoid function would work in case we have a binary output, however in case we have a multiclass classification problem, softmax makes it really easy to assign values to each class which can be easily interpreted as probabilities.

 

It’s very easy to see it this way – Suppose you’re trying to identify a 6 which might also look a bit like 8.

The function would assign values to each number as below. We can easily see that the highest probability is assigned to 6, with the next highest assigned to 8 and so on…

(5) Neural Network – Neural Networks form the backbone of deep learning. The goal of a neural network is to find an approximation of an unknown function.

It is formed by interconnected neurons. These neurons have weights, and bias which is updated during the network training depending upon the error.

The activation function puts a nonlinear transformation to the linear combination which then generates the output. The combinations of the activated neurons give the output.

 

A neural network is best defined by “Liping Yang” as –

 

“Neural networks are made up of numerous interconnected conceptualized artificial neurons, which pass data between themselves, and which have associated weights which are tuned based upon the network’s “experience.”

Neurons have activation thresholds which, if met by a combination of their associated weights and data passed to them, are fired; combinations of fired neurons result in “learning”.
