Data warehouse, dimensional modeling, dimension tables
Generally, only five V’s of big data are discussed. Here we will also cover a sixth one and explain why it should not be ignored.
The five most commonly discussed V’s of big data are:
- Volume
- Variety
- Velocity
- Veracity
- Valence
The sixth V that we will discuss here is Value.
Now let’s briefly discuss each of the V’s.
1) Volume
The name “big data” itself contains the word “big”: the data is voluminous.
By common estimates, data is produced at a rate of petabytes per minute. It is hard even to imagine the time, cost, and energy needed to store such an amount of data and extract meaning from it.
Dealing with this massive volume also brings a number of challenges. The first is storage: the space required to store the data efficiently is itself large. We also need to retrieve that data fast enough and move it to processing units in time to get results when we need them. This brings additional challenges in networking, bandwidth, and storage cost.
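To get a feel for the bandwidth challenge described above, here is a rough back-of-the-envelope sketch. The data size (one decimal petabyte) and the link speed (10 Gbit/s) are assumed figures chosen only for illustration, and the function name `transfer_days` is hypothetical. The calculation ignores protocol overhead, so real transfers would take longer.

```python
def transfer_days(size_bytes, link_bits_per_second):
    """Return the days needed to move size_bytes over a link at full speed."""
    seconds = size_bytes * 8 / link_bits_per_second  # bytes -> bits, then divide by rate
    return seconds / 86_400                          # 86,400 seconds in a day

PETABYTE = 10**15    # decimal petabyte, in bytes (assumed data size)
LINK = 10 * 10**9    # assumed 10 Gbit/s network link

days = transfer_days(PETABYTE, LINK)
print(f"{days:.1f} days")
```

Even under these idealized assumptions, moving a single petabyte takes more than a week, which is why storage and processing are usually placed close to each other.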
2) Variety
Variety is a form of scalability. Here, scale refers not to size but to increased diversity.
On the internet and in our daily lives, we come across many different types of data: text files, embedded images, videos, and so on.
Variety is therefore another important thing to deal with, because the data being produced varies widely in form.
3) Velocity
By velocity, we refer to the high speed at which data is created, and accordingly the speed at which it needs to be stored and analyzed. Processing speed should match the speed at which data is produced: a business that cannot take advantage of data as it is generated, because of timing problems, often misses opportunities. Velocity is an important factor; being able to keep up with the velocity of big data and analyze it as it is generated can even affect the quality of human life. For example, sensors and smart devices monitoring the human body can detect abnormalities in real time and trigger immediate action, potentially saving lives.
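The sensor example above can be sketched as a small streaming check. This is only a minimal illustration, not a medical algorithm: the function name `flag_abnormal`, the window size, the tolerance, and the sample heart-rate values are all assumptions made up for the example. The point is that each reading is examined as it arrives, rather than after the whole data set has been collected.

```python
from collections import deque
from statistics import mean

def flag_abnormal(readings, window=5, tolerance=30.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Each reading is handled as it arrives, so the check works on a live
    stream and never needs the full data set in memory.
    """
    recent = deque(maxlen=window)  # holds only the last `window` normal readings
    for index, value in enumerate(readings):
        if len(recent) == window and abs(value - mean(recent)) > tolerance:
            yield index, value     # abnormal: report immediately
        else:
            recent.append(value)   # normal: update the baseline

# Made-up heart-rate samples (beats per minute) with one sudden spike.
heart_rate = [72, 74, 71, 73, 75, 74, 140, 73, 72]
alerts = list(flag_abnormal(heart_rate))
print(alerts)
```

Because `flag_abnormal` is a generator, a caller could act on each alert the moment it is produced instead of waiting for the stream to end.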
4) Veracity
By veracity, we refer to the quality of the data. Big data is varied and generated at high speed, so it is likely to be noisy and uncertain. It can be full of biases and abnormalities, and it can be imprecise.
“There is no value in data if it is not accurate.”
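One simple way to start assessing data quality is to count how many records are missing a value or carry an implausible one. The sketch below assumes a hypothetical list of temperature readings and an assumed plausible range of -50 to 60 degrees Celsius; the function name `quality_report` and the field name `temp_c` are invented for the example.

```python
def quality_report(records, field, low, high):
    """Count missing, out-of-range, and valid values of `field` in records."""
    missing = sum(1 for r in records if r.get(field) is None)
    out_of_range = sum(
        1 for r in records
        if r.get(field) is not None and not (low <= r[field] <= high)
    )
    valid = len(records) - missing - out_of_range
    return {"missing": missing, "out_of_range": out_of_range, "valid": valid}

# Made-up sensor readings: one null, one absent field, one implausible value.
readings = [
    {"sensor": "a", "temp_c": 21.5},
    {"sensor": "b", "temp_c": None},
    {"sensor": "c", "temp_c": 999.0},
    {"sensor": "d"},
    {"sensor": "e", "temp_c": 19.0},
]
report = quality_report(readings, "temp_c", low=-50.0, high=60.0)
print(report)
```

A check like this catches only missing and out-of-range values; bias and subtler inaccuracies need domain-specific validation.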
5) Valence
Valence refers to connectedness. The more connected the data is, the higher its valence. The term comes from chemistry, where valence electrons sit in the outermost shell, have the highest energy level, and are responsible for bonding with other atoms. Higher valence results in greater bonding, that is, greater connectedness.
For a collection of data, valence measures the ratio of actual connections between data items to the number of connections that could possibly occur within the collection.
The most important aspect of valence is that data connectivity increases over time.
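The valence ratio can be sketched in a few lines. This example makes two assumptions that the text does not spell out: connections are undirected, and any pair of distinct items could be connected, so a collection of n items has n(n-1)/2 possible connections. The function name `valence` and the sample items are hypothetical.

```python
def valence(item_count, connections):
    """Ratio of actual connections to possible pairwise connections."""
    possible = item_count * (item_count - 1) // 2
    if possible == 0:
        return 0.0  # fewer than two items: nothing can be connected
    # Treat connections as undirected: (a, b) and (b, a) are the same link.
    # Self-links such as (a, a) are ignored.
    actual = len({frozenset(pair) for pair in connections if pair[0] != pair[1]})
    return actual / possible

# The same four items at two points in time: links accumulate.
early = [("ann", "bob")]
later = early + [("bob", "cat"), ("cat", "dan"), ("ann", "cat")]
print(valence(4, early), valence(4, later))
```

With four items there are six possible connections, so the ratio rises from 1/6 in the early snapshot to 4/6 in the later one, illustrating how connectivity grows over time.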
Above, we discussed the five V’s of big data, often referred to as the dimensions of big data. Each of them shows us the challenges associated with a different dimension of big data, namely size, complexity, speed, quality, and connectedness.
At the heart of big data, the challenge is turning all of the other dimensions into truly useful business “value”.
This value is our sixth V. The main purpose behind collecting, storing, and analyzing data, and everything else we do with it, is to extract “value” from big data.
Conclusion:
In this article, we discussed all the dimensions of big data and were introduced to the sixth V of big data, value, which is at the heart of all the other processes. If you have further questions, post them in the comment section below. See you in my next article; until then, stay healthy and keep learning!
Translated from: https://www.includehelp.com/big-data/dimensions-of-big-data.aspx