Time Series Indexing Guide

1. The document discusses 1-D time series data indexing: time series data is preprocessed with the DFT and clustering to convert it to spatial data, and the spatial data is then searched to find similar time series.
2. In the preprocessing stage, the DFT transforms the time series data to the frequency domain, and clustering groups similar data points. The preprocessed data is then indexed for searching.
3. In the searching stage, the DFT is applied to a query time series to convert it to spatial data, the nearest cluster is found, and the results are refined to find the most similar data point.

Uploaded by

Senthil Ilangovan

Advanced Database Systems

1-D Time Series Data Indexing

Donghyun Jeong

Contents

1-D Time Series Data
Generate Data
Overall Processing Method
Preprocessing (I), (II), (III)
Searching (I), (II)
Implementation (I), (II)
Demo & Reference

I'll briefly introduce 1-D time series data indexing. Because I could
not obtain real-world data such as stock data, I generated
artificial data to process and to show how the indexing works.
I'll walk through all of the designed and implemented
procedures. (More information can be found in the book
ADVANCED DATABASE SYSTEMS [1].)

1-D Time Series Data

[Figure: Stock Data (example) for ABC NEWS. x-axis: Month; y-axis: Stock (0 to 70).]

An example of 1-D time series data is stock data. I designed
the indexing procedures with computer-generated stock data
instead of real data. The chart above shows one such time series.
Sample Data (a part)
Columns: NAME, M1, ..., M3, ...
Names shown: ABC News, MBS TV, Hallym
Values shown: 57.709375 59.903125 52.584375 45.6375 38.69375 69.34375 62.35625 58.6125 67.8875 75.690625 60.2 61.340625 69.859375 77.396875 73.921875

Generate Data
1-D Time Series Data Generating

As mentioned on the previous page, the time series data (stock
data) was generated by a special program we wrote. The
generating program was built with MFC. The stock data fluctuates
by about 10. I generated 100 samples, each with a made-up
company name, so that the data looks real (see the previous page).
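
The generation step can be sketched in Python as follows. The original generator was an MFC program whose details are not given, so this is only an illustration: the base price range, the uniform noise model, the seed and the `Company i` names are assumptions. Only the 100 samples, the 12 monthly values (M1 to M12) and the fluctuation of about 10 come from the text.

```python
import random

N_SAMPLES = 100   # number of series, as stated in the text
N_MONTHS = 12     # M1..M12, matching the table columns
FLUCTUATION = 10  # values move by roughly this much around a base level


def generate_series(rng, base_low=20.0, base_high=70.0):
    """Return one artificial 12-month series around a random base price.

    The base range and the uniform noise model are assumptions.
    """
    base = rng.uniform(base_low, base_high)
    return [base + rng.uniform(-FLUCTUATION, FLUCTUATION) for _ in range(N_MONTHS)]


def generate_dataset(seed=0):
    """Return a list of (name, series) pairs with hypothetical company names."""
    rng = random.Random(seed)
    return [(f"Company {i}", generate_series(rng)) for i in range(N_SAMPLES)]


dataset = generate_dataset()
```

A fixed seed keeps the artificial data reproducible, which helps when comparing clustering or search results between runs.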

Overall Processing Method

Preprocessing
  1-D Time Series Data -> Spatial Data
    Using DFT, Clustering (Iterative Method)
  Indexing (R-tree like method)

Searching
  1-D Time Series Data -> Spatial Data
  Search nearest data
    Using Euclidean Distance
  Refining Process

1-D time series data indexing has two processing stages:
preprocessing and searching.
1. Preprocessing
First, convert the 1-D time series data to spatial data in the
frequency domain using the DFT (Discrete Fourier Transform).
Second, group similar data using a clustering method. Last,
index the group IDs so that the data can be searched.
2. Searching
After preprocessing, we can search for the data we want. To find
it, we convert the query to spatial data using the DFT, as in
preprocessing. We then calculate the Euclidean distance between
the query's spatial data and the stored spatial data, and choose
the nearest one. Finally, we run the refining process to get the
final result.
See the following sections for more detail.

Preprocessing (I)
DFT (Discrete Fourier Transform)

No false dismissals (Parseval's theorem)

Dfeature(F(x), F(y)) <= D(x, y)

Using the DFT, we can convert 1-D time series data to spatial data
in the frequency domain. After the conversion, we keep three
coefficients (f = 1 to 3). If the candidate set contains data that
does not actually match (false alarms), the refining process
filters it out. But if the wanted data is missing from the
candidate set (a false dismissal), it can never be found. We
convert to spatial data because the Euclidean distance between the
original series is always greater than or equal to the Euclidean
distance between their DFT features. This guarantees no false
dismissals. [2][3]
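
The inequality can be written out as a short derivation. It assumes the unitary form of the DFT (with a 1/sqrt(N) factor) and complex coefficients X_k, Y_k for two series x and y of length N; these symbols are introduced here for illustration and are not in the original slides.

```latex
X_k = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N}
\qquad \text{(unitary DFT, assumed scaling)}

D(x,y)^2 = \sum_{n=0}^{N-1} |x_n - y_n|^2
         = \sum_{k=0}^{N-1} |X_k - Y_k|^2
\qquad \text{(Parseval's theorem)}

D_{feature}(F(x),F(y))^2 = \sum_{k=1}^{3} |X_k - Y_k|^2
  \;\le\; \sum_{k=0}^{N-1} |X_k - Y_k|^2 = D(x,y)^2
```

Dropping terms from a sum of non-negative values can only make it smaller, so any series within distance e of the query in the original space is also within e in the feature space.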
DFT Pseudo Code
// sizeofdata = 12 (monthly values M1..M12); sqr = square root
for k = 0 to sizeofdata - 1
    real_tmp = 0    // real part
    imag_tmp = 0    // imaginary part
    for n = 0 to sizeofdata - 1
        real_tmp = real_tmp + data(n) * cos(2*PI/sizeofdata*k*n)
        imag_tmp = imag_tmp - data(n) * sin(2*PI/sizeofdata*k*n)
    next
    // magnitude of coefficient k in the frequency domain
    power(k) = sqr(real_tmp*real_tmp + imag_tmp*imag_tmp)
next
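
As a hedged illustration, the pseudocode above can be written in Python and used to check the lower-bounding property numerically. Two things here are assumptions rather than part of the original: the 1/sqrt(N) scaling (the pseudocode has no such factor, but the distance guarantee relies on it) and the two sample series, which are made-up values. The function names are hypothetical.

```python
import math


def dft_features(series, n_coeffs=3):
    """Return the magnitudes of DFT coefficients f = 1..n_coeffs.

    The 1/sqrt(N) scaling makes the transform unitary (an assumption added here).
    """
    n = len(series)
    feats = []
    for k in range(1, n_coeffs + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(series))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(series))
        feats.append(math.hypot(re, im) / math.sqrt(n))
    return feats


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Two made-up 12-month series, used only to exercise the functions.
x = [57.7, 59.9, 52.6, 45.6, 38.7, 69.3, 62.4, 58.6, 67.9, 75.7, 60.2, 61.3]
y = [69.9, 77.4, 73.9, 70.1, 66.0, 72.5, 68.3, 74.8, 71.2, 69.4, 75.0, 73.3]

d_feature = euclidean(dft_features(x), dft_features(y))
d_original = euclidean(x, y)
```

With the unitary scaling, `d_feature` never exceeds `d_original`. Using magnitudes instead of complex coefficients keeps the bound, because the difference of two magnitudes is never larger than the magnitude of the difference.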

Preprocessing (II)
Clustering

Calculate cluster centers using K-means

K-means assigns each point to its nearest center using squared Euclidean distance.

Elements that share the same center are grouped together.

I use an iterative method to cluster the data. Specifically, I used the K-means algorithm. K-means produces K
cluster centers using squared Euclidean distance. [4]
----- GROUP LIST (Clustering) -----
Cluster Number + {Elements}
0 { 0 }
1 { 28 44 87 }
2 { 19 30 67 84 }
3 { 3 32 70 }
4 { 4 39 81 91 95 }
5 { 5 37 69 83 86 }
6 { 6 46 48 62 80 93 }
7 { 7 41 71 74 }
8 { 8 36 40 63 }
9 { 9 60 61 78 88 89 90 97 }
10 { 10 31 54 }
11 { 11 72 73 92 }
12 { 12 96 }
13 { 13 42 49 59 85 }
14 { 14 35 47 77 }
15 { 15 26 34 43 55 57 58 65 82 }
16 { 16 50 64 75 }
17 { 17 68 94 98 }
18 { 18 38 51 56 }
19 { 25 53 66 }
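
A minimal K-means sketch in Python, assuming the points are the three DFT features of each series. The initialisation (random sample of the points), the fixed iteration count and the six example points are assumptions for illustration. The original implementation is not shown in the slides.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)), key=lambda c: squared_distance(point, centers[c]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-means: returns (centers, assignment list)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        assignment = [nearest_center(p, centers) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, assignment


# Hypothetical 3-D feature points (DFT1, DFT2, DFT3) forming two obvious groups.
points = [(10.0, 10.0, 5.0), (11.0, 9.0, 6.0), (9.0, 11.0, 4.0),
          (80.0, 40.0, 20.0), (82.0, 41.0, 19.0), (79.0, 39.0, 21.0)]
centers, assignment = kmeans(points, k=2)
```

The `assignment` list plays the role of the group list above: entry i is the cluster number of element i.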

Preprocessing (III)
Clustering
[Figure: clustered data in 3-D space (without Z value). x-axis: DFT1 (X); y-axis: DFT2 (Y).]

After clustering the data as shown above, we can search using the
cluster center values. The 100 data items are reduced to
25 cluster centers.
----- Cluster Center -----
DFT1 DFT2 DFT3
23.340117 66.733917 10.157041
50.151770 25.601297 4.004073
61.989542 15.195794 22.199620
7.714025 28.789136 24.238079
40.608878 13.905752 7.586689
67.583971 34.087372 15.156011
45.037389 37.401815 18.088304
12.462210 29.154218 6.057019
43.047011 20.739349 31.244777
16.180685 15.162112 11.154085
8.316712 9.745249 3.354249
37.904022 31.389206 11.713221
26.594263 6.602002 19.067617
26.666832 18.493810 13.540259
16.295715 23.074414 19.648849
93.036619 44.201457 28.903012
30.942753 10.509283 10.143203
38.378123 18.404003 17.045466
25.086263 30.154396 19.368506
83.322350 16.709450 20.495816

Searching (I)
Search Data

When a user enters a query series on the web, the same DFT step
used in preprocessing converts it to spatial data. With this
spatial data, we choose the nearest cluster. The members of that
cluster may include false alarms, so we obtain the final result by
applying the refining process.
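
The nearest-cluster step can be sketched in Python as below. The three centers are taken from the cluster center list earlier in the document, but the cluster IDs attached to them, the query features and the function name are hypothetical.

```python
import math

# Three of the listed cluster centers (DFT1, DFT2, DFT3), keyed by a
# hypothetical cluster ID.
cluster_centers = {
    0: (23.340117, 66.733917, 10.157041),
    1: (50.151770, 25.601297, 4.004073),
    2: (61.989542, 15.195794, 22.199620),
}


def nearest_cluster(query_features, centers):
    """Return the ID of the center with the smallest Euclidean distance."""
    return min(centers, key=lambda cid: math.dist(query_features, centers[cid]))


query = (48.0, 27.0, 5.0)  # made-up DFT features of a query series
best = nearest_cluster(query, cluster_centers)
```

Comparing the query against the cluster centers only, rather than against every stored series, is what keeps this step cheap. The members of the chosen cluster are then passed to the refining process.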

Searching (II)
Search Data
[Figure: search data, similarity search result, and refining result.]

As shown above, we obtain the similarity search results and the
refining result. In the refining process, we required the value e
to be smaller than 1.0 (e < 1.0). Otherwise the exact data that
we want to find cannot be found.
Search
Found Similarity Search Result
Company Name : GVE
Company Name : Medal co.
Company Name : YAYAWA
Refining Result
Company Name : Medal co.
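
A hedged sketch of the refining step in Python. It assumes that e is a threshold on the Euclidean distance between the query and each candidate in the original (time) domain, which the slides do not state explicitly. The company names come from the sample output above. The series values are made up and shortened to four points for brevity.

```python
import math


def refine(query, candidates, epsilon=1.0):
    """Keep candidates whose full-series distance to the query is below epsilon."""
    return [name for name, series in candidates if math.dist(query, series) < epsilon]


# Made-up 4-point series; only the names are taken from the sample output.
query = [50.0, 52.0, 49.0, 51.0]
candidates = [
    ("GVE", [55.0, 58.0, 44.0, 47.0]),
    ("Medal co.", [50.2, 51.9, 49.1, 50.8]),
    ("YAYAWA", [45.0, 60.0, 52.0, 40.0]),
]
result = refine(query, candidates)
```

The candidates stand for the similarity search result, and `result` for the refining result: false alarms returned by the feature-space search are discarded once the full series are compared.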

Implementation (I)
Random generation & data insertion

User interface
  Visual C++ 6.0
  ASP, MS Office 2000 component (web application)

Database
  SQL Server 7.0

We generate the 1-D time series data using a random generator in
Visual C++. We also designed the user interface using ASP (Active
Server Pages) to search the data on the web. The 1-D time series
data is displayed graphically using the graph components in
MS Office 2000. Strictly speaking, we should have built our own
database system to test the data indexing. Instead, we used a
well-known database system (SQL Server 7.0).

Implementation (II)
Diagram (table description)
[Figure: two tables, one holding the cluster values and one holding the data and DFT values.]

The schema consists of a cluster table and a table holding the data
with its spatial (DFT) values. The timeseries table references the
cluster table through a foreign key on GRP (the cluster group
number).
CREATE TABLE [dbo].[timeseries] (
    [ID] [float] NULL ,
    [M1] [float] NULL , [M2] [float] NULL , [M3] [float] NULL ,
    [M4] [float] NULL , [M5] [float] NULL , [M6] [float] NULL ,
    [M7] [float] NULL , [M8] [float] NULL , [M9] [float] NULL ,
    [M10] [float] NULL , [M11] [float] NULL , [M12] [float] NULL ,
    [DFT1] [float] NULL , [DFT2] [float] NULL , [DFT3] [float] NULL ,
    [GRP] [int] NULL ,
    [NAME] [nvarchar] (255) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[cluster] (
    [ID] [int] NOT NULL ,
    [C1] [float] NULL , [C2] [float] NULL , [C3] [float] NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[cluster] WITH NOCHECK ADD
    CONSTRAINT [PK_cluster] PRIMARY KEY NONCLUSTERED
    ( [ID] ) ON [PRIMARY]
GO
ALTER TABLE [dbo].[timeseries] ADD
    CONSTRAINT [FK_timeseries_cluster] FOREIGN KEY
    ( [GRP] ) REFERENCES [dbo].[cluster] ( [ID] )
GO
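
To illustrate how the GRP foreign key supports the search, here is a small Python sketch using the standard `sqlite3` module in place of SQL Server 7.0. The reduced schema (the M1 to M12 columns are omitted), the inserted rows and the helper function are assumptions for illustration, not the original application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cluster (ID INTEGER PRIMARY KEY, C1 REAL, C2 REAL, C3 REAL);
    CREATE TABLE timeseries (
        ID REAL, DFT1 REAL, DFT2 REAL, DFT3 REAL,
        GRP INTEGER REFERENCES cluster(ID), NAME TEXT
    );
""")
# Made-up rows: two clusters and three series assigned to them.
conn.execute("INSERT INTO cluster VALUES (1, 50.15, 25.60, 4.00)")
conn.execute("INSERT INTO cluster VALUES (2, 61.99, 15.20, 22.20)")
conn.executemany(
    "INSERT INTO timeseries VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 49.0, 26.0, 4.5, 1, "GVE"),
     (2, 51.0, 25.0, 3.5, 1, "Medal co."),
     (3, 62.0, 15.0, 22.0, 2, "YAYAWA")],
)


def members_of_cluster(connection, cluster_id):
    """Return the names of all series assigned to the given cluster."""
    rows = connection.execute(
        "SELECT t.NAME FROM timeseries AS t "
        "JOIN cluster AS c ON t.GRP = c.ID "
        "WHERE c.ID = ? ORDER BY t.ID",
        (cluster_id,),
    )
    return [name for (name,) in rows]


names = members_of_cluster(conn, 1)
```

Once the nearest cluster is known, a single lookup on GRP returns the candidate series for the refining process.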

Demo

The image shows the web interface designed for 1-D time series data.

MAIN PAGE (overall composition)

Result
Showed the possibility of searching 1-D time series data on the Internet.
We do not know how fast it is compared with other applications.
Real data should be used instead of generated data.

We designed and implemented 1-D time series data searching on
the web. We examined the possibility of searching stock data on
the web in a relatively short time. In fact, there is no 1-D time
series data searching product on the web.
The problem is that, although searching 1-D time series data on
the web is possible, there is no analysis method to measure its
performance.

Reference
[1] ADVANCED DATABASE SYSTEMS, Carlo Zaniolo et al., Morgan Kaufmann Publishers, pp. 295-305, 1997.
[2] Fourier Transform of an image, http://www.postech.ac.kr/~yirin/fft/fft.html
[3] C++ ALGORITHMS for DIGITAL SIGNAL PROCESSING, Second Edition, Paul M. Embree, Damon Danieli, pp. 331-339, 1998.
[4] Pattern Recognition with Neural Networks in C++, Abhijit S. Pandya, Robert B. Macy, pp. 213-230, 1995.
