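While doing time series analysis I needed to compute the Hurst exponent. Since the calculation is fairly involved, I was too lazy to write it myself at first and searched GitHub, where most of the results were this snippet: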
from numpy import std, subtract, polyfit, sqrt, log

def hurst(ts):
    """Returns the Hurst Exponent of the time series vector ts"""
    # create the range of lag values
    i = len(ts) // 2
    lags = range(2, i)
    # Calculate the square root of the standard deviation of the lagged differences
    tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]
    # use a linear fit to estimate the Hurst Exponent
    poly = polyfit(log(lags), log(tau), 1)
    # Return the Hurst Exponent from the polyfit output
    return poly[0] * 2.0
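The factor of 2.0 in the return statement can be read off from the scaling relation this snippet appears to rely on. The following is only a sketch, assuming ts is a level series x_t (for example a price path) whose lagged differences scale with the Hurst exponent H; the symbols \tau, x_t and c are notation introduced here and do not come from the snippet.

```latex
\begin{align*}
\operatorname{std}\bigl(x_{t+\tau} - x_t\bigr) &\propto \tau^{H} \\
\texttt{tau}(\tau) = \sqrt{\operatorname{std}\bigl(x_{t+\tau} - x_t\bigr)} &\propto \tau^{H/2} \\
\log \texttt{tau}(\tau) &= \tfrac{H}{2}\,\log \tau + c \\
H &= 2 \times \bigl(\text{slope of } \log \texttt{tau} \text{ against } \log \tau\bigr)
\end{align*}
```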
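When I tested this code, the results were far from what I expected. In theory, a series with a long-term trend should have a Hurst exponent close to 1, but on my constructed test set the result mostly came out close to 0.5. I therefore wrote my own Hurst exponent code by hand, based on the theory of how the exponent is constructed. The theory reference is:
http://www.360doc.com/content/16/0409/15/20041187_549224354.shtml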
The theory is as follows:
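As a compact summary, the standard rescaled-range (R/S) steps can be sketched as below, written to match the variable names Ma, Xta, Ra and Sa used in calcHurst2. The symbols n (sub-series length), C (a constant) and the subscript notation are assumptions of this sketch, and the standard deviation is taken in its population form, as numpy's default np.std computes it.

```latex
\begin{align*}
M_a &= \frac{1}{n}\sum_{i=1}^{n} x_i
  && \text{mean of one sub-series } x_1,\dots,x_n \\
X_{t,a} &= \sum_{i=1}^{t} \bigl(x_i - M_a\bigr), \quad t = 1,\dots,n
  && \text{cumulative deviation from the mean} \\
R_a &= \max_{1 \le t \le n} X_{t,a} - \min_{1 \le t \le n} X_{t,a}
  && \text{range} \\
S_a &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} \bigl(x_i - M_a\bigr)^2}
  && \text{standard deviation} \\
(R/S)_n &= \text{average of } R_a / S_a \text{ over all sub-series of length } n \\
(R/S)_n &\approx C\, n^{H}
  \;\Longrightarrow\;
  \log (R/S)_n = \log C + H \log n
\end{align*}
```

The Hurst exponent H is then the slope of a straight-line fit of \log (R/S)_n against \log n.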
The code is as follows:
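# coding: utf-8
from collections.abc import Iterable
import numpy as np

def calcHurst2(ts):
    """Estimate the Hurst exponent of ts by rescaled-range (R/S) analysis."""
    if not isinstance(ts, Iterable):
        raise TypeError('ts must be an iterable of numbers')
    ts = np.asarray(ts, dtype=float)
    n_min, n_max = 2, len(ts) // 3
    sizes, RSlist = [], []
    for cut in range(n_min, n_max):
        # split ts into `cut` equal sub-series; leftover points are dropped
        children = len(ts) // cut
        children_list = [ts[i*children:(i+1)*children] for i in range(cut)]
        L = []
        for a_children in children_list:
            Ma = np.mean(a_children)
            # cumulative deviation from the sub-series mean
            Xta = np.cumsum(a_children - Ma)
            Ra = np.max(Xta) - np.min(Xta)  # range of the cumulative deviation
            Sa = np.std(a_children)         # standard deviation of the sub-series
            if Sa > 0:                      # skip constant sub-series (R/S undefined)
                L.append(Ra / Sa)
        if L:
            sizes.append(children)
            RSlist.append(np.mean(L))
    # the slope of log(R/S) against log(sub-series length) is the Hurst exponent
    return np.polyfit(np.log(sizes), np.log(RSlist), 1)[0]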
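Running this code on random numbers gives a Hurst exponent close to 0.5, which is consistent with randomness, while the same data after sorting gives a value close to 1, i.e. a long-term trend. However, the data is not subdivided all the way down: the smallest subsets contain 3 elements each, so in theory values above 1 can occur. The excess is small and does not happen often, so such values can be treated as 1.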
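The check described above can be reproduced with a short script. This is a minimal sketch and not the exact code from this post: rs_hurst is a hypothetical, vectorized helper that follows the same R/S steps, and it assumes that log(R/S) is regressed on the log of the actual sub-series length. The seed and the sample size of 1000 are arbitrary choices.

```python
import numpy as np

def rs_hurst(ts):
    """R/S estimate of the Hurst exponent (hypothetical helper for illustration)."""
    ts = np.asarray(ts, dtype=float)
    sizes, rs_values = [], []
    for cut in range(2, len(ts) // 3):
        size = len(ts) // cut
        # one row per sub-series; points that do not fill a row are dropped
        chunks = ts[:cut * size].reshape(cut, size)
        dev = chunks - chunks.mean(axis=1, keepdims=True)
        walk = dev.cumsum(axis=1)                  # cumulative deviation
        r = walk.max(axis=1) - walk.min(axis=1)    # range per sub-series
        s = chunks.std(axis=1)                     # std per sub-series
        ok = s > 0                                 # ignore constant sub-series
        if ok.any():
            sizes.append(size)
            rs_values.append(np.mean(r[ok] / s[ok]))
    # slope of the log-log fit is the estimated exponent
    return np.polyfit(np.log(sizes), np.log(rs_values), 1)[0]

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable
data = rng.random(1000)             # independent uniform random numbers
h_random = rs_hurst(data)           # expected to land in the neighborhood of 0.5
h_sorted = rs_hurst(np.sort(data))  # sorted data behaves like a pure trend, near 1
print("random:", h_random)
print("sorted:", h_sorted)
```

On short series, R/S estimates for random data tend to sit somewhat above 0.5, so the comparison between the two values is more informative than either number on its own.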
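Copyright notice: This is an original article by xiaodongxiexie, licensed under CC 4.0 BY-SA. Please include a link to the original source and this notice when reposting.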