Earticle


International Journal of Database Theory and Application

Journal Information
  • Material Type
    Academic journal
  • Publisher
    Science & Engineering Research Support Center, Republic of Korea (IJDTA)
  • pISSN
    2005-4270
  • Frequency
    Bimonthly
  • Coverage
    2008 ~ 2016
  • Subject Classification
    Engineering > Computer Science
  • Decimal Classification
    KDC 505 DDC 605
Vol.7 No.5 (18 articles)
1

Improved Optimization for Data Disaster Recovery System over Low-Bandwidth Networks

Jian Wan, Xiaolong Hong, Jinlin Zhang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.1-16

※ Access may be restricted because the agreement with the original content provider has expired.

Data generated across many fields are increasing exponentially, creating performance challenges at scale in both diversity and complexity. Solving the bottlenecks of low-bandwidth networks is critically important under all kinds of network conditions. We present an improved optimization approach for a data disaster recovery system (DDRS) over low-bandwidth networks that not only addresses the defects and deficiencies of mainstream DDRSs but also helps operators secure reliable network resources for running multiple services. A novel bandwidth self-adaptive approach (BSAA) for data packing and replication was established to improve overall performance, and a Hidden Markov Model (HMM) for predicting network status was built to ensure system availability and stability. Experiments showed that the resulting DDRS, named InfoDr, effectively optimizes the workload, with better performance and better self-adaptability for multi-service applications. Keywords: Data Disaster Recovery System; Low-bandwidth Network; Deduplication
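The HMM-based network-status prediction mentioned in the abstract can be sketched with a toy two-state model. The states (good/congested), the delay-bucket observations, and every probability below are illustrative assumptions, not values from the paper; `viterbi` is the standard Viterbi decoder, shown only to make the idea concrete.

```python
# Illustrative two-state HMM for network status (all probabilities assumed).
STATES = ["good", "congested"]
START = {"good": 0.7, "congested": 0.3}
TRANS = {"good": {"good": 0.8, "congested": 0.2},
         "congested": {"good": 0.4, "congested": 0.6}}
EMIT = {"good": {"low": 0.7, "mid": 0.2, "high": 0.1},
        "congested": {"low": 0.1, "mid": 0.3, "high": 0.6}}

def viterbi(obs):
    """Most likely hidden state sequence for a list of delay buckets."""
    v = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in STATES:
            # best predecessor state for s, by accumulated probability
            p, prev = max((v[-1][r] * TRANS[r][s], r) for r in STATES)
            col[s] = p * EMIT[s][o]
            ptr[s] = prev
        v.append(col)
        back.append(ptr)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptr in reversed(back):     # follow back-pointers to recover the path
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

A sustained run of high-delay observations flips the decoded state to "congested", which a replication scheduler could use as a signal to throttle transfers.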

2

Critical Analysis of Density-based Spatial Clustering of Applications with Noise (DBSCAN) Techniques

Said Akbar, M.N.A. Khan

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.17-28


Clustering is among the most widely used techniques in data mining: it maximizes intra-cluster similarity and minimizes inter-cluster similarity. DBSCAN is the basic density-based clustering algorithm, in which a cluster is defined as a region of high density separated from regions of lower density. DBSCAN can discover clusters of arbitrary shape and size in large spatial databases. Despite its popularity, DBSCAN has drawbacks: its worst-case time complexity reaches O(n²), it cannot deal with varied densities, and the initial values of its input parameters are hard to choose. In this study, we review and discuss several significant enhancements of the DBSCAN algorithm that tackle these problems, and we analyze the computational time and output of each enhancement against the original DBSCAN. Most variants adopt hybrid techniques and use partitioning to overcome DBSCAN's limitations; some perform better, and each has its own uses and characteristics.
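For reference, the baseline DBSCAN that the survey analyzes can be sketched in a few lines. This naive version recomputes neighborhoods with a linear scan, which is exactly what drives the O(n²) worst case the abstract mentions; the function and parameter names are ours, not the paper's.

```python
from math import dist  # Python 3.8+

def dbscan(points, eps, min_pts):
    """Naive DBSCAN: returns a cluster id (0, 1, ...) per point, -1 = noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):  # linear scan: O(n) per call, O(n^2) overall
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # not a core point: noise (may become border)
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # density-reachable noise -> border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            js = neighbors(j)
            if len(js) >= min_pts:   # j is a core point: keep expanding
                queue.extend(js)
    return labels
```

The two hard-to-choose parameters the abstract refers to are `eps` (neighborhood radius) and `min_pts` (density threshold); a single global pair is also why plain DBSCAN cannot handle clusters of varied densities.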

3

Recognizing Comparative Sentences from Chinese Review Texts

Wei Wang, TieJun Zhao, GuoDong Xin, YongDong Xu

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.29-38


Comparisons play an important role in decision making, as people refer to the comparative opinions expressed by opinion holders in earlier customer reviews. Recognizing comparative sentences in review texts contributes to opinion mining and information recommendation. Our objective is to automatically recognize comparative sentences in Chinese text documents. In this paper, an effective approach based on comparative patterns is proposed to recognize comparative sentences in Chinese. Experiments on customer-generated product reviews show that the proposed approach is effective.

4

Sentiment Analysis Approaches on Different Data set Domain: Survey

Shailendra Kumar Singh, Sanchita Paul, Dhananjay Kumar

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.39-50


The growth of social websites and electronic media has contributed a vast amount of user-generated content, such as customer reviews, comments, and opinions. Sentiment analysis refers to extracting others' (a speaker's or writer's) opinions from given source material (text) using NLP, computational linguistics, and text mining. Sentiment classification of product and service reviews and comments has emerged as the most useful application in this area. This paper presents a comparative study (1997–2012) of sentiment classification techniques applied to different data set domains, such as web discourse, reviews, and news articles. The most popular approaches researchers use for sentiment analysis of opinions about movies, electronics, cars, music, and so on are bag-of-words and feature extraction. Sentiment analysis is used by manufacturers, politicians, news groups, and other organizations to learn the opinions of customers, the public, and social-website users.

5

An Optimized Splitting Attribute Algorithm for Inconsistent Conflict in Context Lattice

Zhou Zhong, Junzhong Gu

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.51-84


With the emergence of cloud computing and the Internet of Things, context-aware applications face new challenges, one of which is the big data produced by huge numbers of context applications and sources. Mainstream applications use not only real-time versions but also historical versions of context data. This paper concerns optimization techniques for storage and reasoning in a context management system (CMS). For storing context data from different sources, an FCA lattice is employed as the storage schema to support modeling and fusion of heterogeneous context data. Context conditions on the data are essential to logical reasoning: under different conditions, context data can be promoted to knowledge, which makes context reasoning straightforward. In a dynamic environment, reasoning services require their input to remain consistent under changing conditions, which can be represented as context attributes, intervals, relations, and so on. To keep consistent knowledge available under such conditions, our previous work analyzed incremental caching and checking of consistent intervals and proposed a context-lattice-based distributed optimized update algorithm. In this paper, building on that algorithm, we optimize the split function; a split is needed when the current lattice has no condition that keeps knowledge consistent. The main aim is to improve the time performance of splitting attributes, intervals, or fuzzy relations. We propose a new parallel split algorithm that computes the priorities of candidates: to reduce time cost, it narrows the split scope by choosing the candidate with the highest priority value, and to reduce full-lattice update time, it generates the sub-lattices split by the candidates concurrently and merges them afterwards. In theory, we analyze the feasibility of the algorithm. In testing, as a new part of the whole update algorithm, it is compared with the naive version and shows better time performance. Moreover, it lets multiple threads execute on the same lattice, avoiding the extra memory cost of copying the lattice for each independent thread.

6

Imbalanced Data Classification Based on AdaBoost-SVM

Li Peng, Bi Ting-ting, Yu Xiao-yang, Li Si-ben

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.85-94


The classification of imbalanced data is one of the most challenging problems in data mining and machine learning research. Imbalanced data sets arise naturally in real applications and truly and objectively describe the essential characteristics of the phenomena involved. When classifying an imbalanced data set, the minority class suffers from a paucity of data while the majority class floods the learner; traditional machine learning methods also encounter problems such as information loss and data-splitting effects. Solving the classification problem for imbalanced data is therefore challenging. Aiming at these problems, this paper proposes a classification algorithm based on AdaBoost-SVM. Experiments on four typical imbalanced data sets from the UCI repository validate the effectiveness of this strategy.
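The AdaBoost half of the proposed AdaBoost-SVM combination follows the standard boosting loop: reweight training samples toward the ones the previous base learner misclassified, which is what helps the minority class get attention. The sketch below uses 1-D threshold stumps as a stand-in for the paper's SVM base learners; it illustrates the reweighting scheme only and is not the authors' implementation.

```python
import math

def train_adaboost(xs, ys, rounds=5):
    """AdaBoost on 1-D data; ys in {-1, +1}. Threshold stumps stand in
    for the paper's SVM base learners (illustrative substitution)."""
    n = len(xs)
    w = [1.0 / n] * n
    model = []                       # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None                  # stump with the lowest weighted error
        for t in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                          if (pol if xi > t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, pol))
        # up-weight misclassified samples, down-weight correct ones
        w = [wi * math.exp(-alpha * yi * (pol if xi > t else -pol))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all rounds' stumps."""
    s = sum(a * (pol if x > t else -pol) for a, t, pol in model)
    return 1 if s >= 0 else -1
```

Substituting an SVM for the stump means fitting it with the current sample weights each round; the alpha-weighted vote at prediction time is unchanged.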

7

A Rough Set Based Classification Model for the Generation of Decision Rules

Vinod Rampure, Akhilesh Tiwari

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.95-108


This paper introduces an important classification approach for analyzing the huge amounts of data stored in databases and other repositories. Numerous classification models are available in the literature for predicting the class of objects whose class label is unknown, but most of them cannot handle imperfect data. In view of this, the present paper proposes a new rough-set-based classification model that derives classification (IF-THEN) rules. The developed model has been applied to a bank-loan applications database to classify applications as safe, unsafe, or risky; it can also be used to analyze data from other domains.

8

A Petri Net Processing Model of STeCEQL

Huiyong Li, Yixiang Chen, Kangli He

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.109-122


The Internet of Vehicles is an important instance of the Internet of Things, and processing its spatial and temporal data in real time is an urgent problem. Complex event processing technology filters the data of interest into events through a complex event query language (EQL) so that the system can respond effectively. STeCEQL is a complex event language for the Internet of Vehicles with spatial and temporal constraints. The processing model of a complex event language is a core issue in complex event processing technology. In this paper, we establish a Petri net processing model of STeCEQL and give processing models for all kinds of STeCEQL expressions. These processing models are synthesized from two basic Petri net structures: the sequence structure and the logical-AND structure. Finally, we prove that the Petri net processing model of STeCEQL is structurally bounded and structurally conservative, but not structurally repetitive.

9

A Novel Text Steganography System in Financial Statements

Md Khairullah

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.123-132


Steganography can be defined as a method of hiding data within a cover medium so that others fail to realize its existence. Image, audio, and video are popular media for steganography, but text is ideal due to its ubiquity and its smaller size compared with those media; size really does matter for mobile communication. However, text communication channels do not necessarily provide sufficient redundancy for covert communication. In this paper, a new approach to steganography in various financial statements is proposed. The main idea is that additional zeroes can be added before a number, and also after its fractional part, without changing the number's value.
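The core trick, hiding bits in value-preserving zeroes, can be sketched directly. The convention below (cover amounts printed with exactly two decimals, one hidden bit per amount, a third '0' decimal digit signaling bit 1) is our illustrative assumption, not the paper's exact encoding.

```python
def embed(amounts, bits):
    """Hide one bit per printed amount. Assumes cover amounts carry exactly
    two decimal places; bit 1 appends a trailing zero (value unchanged)."""
    return [a + "0" if b else a for a, b in zip(amounts, bits)]

def extract(amounts):
    """Recover the bits: a fractional part longer than two digits means 1."""
    return [1 if "." in a and len(a.split(".")[1]) > 2 else 0 for a in amounts]
```

Because `10.00` and `10.000` denote the same quantity, the stego statement stays numerically identical to the cover, which is what makes the channel hard to notice.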

10

Design of Tool Automatic Identification and Database Management System Based on RFID

Xiulin Sui, Yan Teng, YongQiu Chen, XueHui Wang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.133-144


RFID (Radio Frequency Identification) is a non-contact automatic identification technology that requires no human intervention; it is convenient to apply and can identify fast-moving objects. In tool and tool-information handling, the problems of low operational efficiency and a low degree of information-based tool management remain unsolved. In this paper, RFID technology is applied to automatic tool identification and management, and a tool automatic identification and database management system based on RFID is built. The application model, tool-information database design, and upper-computer management software for RFID-based tool identification and management are analyzed, and the overall architecture and software function modules of the system are constructed. The database management system is developed in LabVIEW; the interface between LabVIEW and an Access database is implemented with ActiveX, so the management system can write information to and read it from the Access database. Meanwhile, the communication protocol between the upper and lower computers is programmed on top of VISA, and RS232 is used to realize their communication. Finally, an experiment is conducted using SECO's solid carbide ball-end milling cutter as an example; the experiment shows the system to be highly reliable and efficient.

11

Testing and Evaluation of a Hierarchical SOAP based Medical Web Service

Abhijit Bora, Tulshi Bezboruah

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.145-160


Performance testing of hierarchical web service communications is essential from the perspective of both users and developers, since it directly reflects the behavior of the service. We have therefore developed a SOAP-based research web service suitable for online medical services, in order to study its performance and evaluate the technique used to develop it. We call the service MedWS (prototype research medical web service). Load and stress testing were carried out on MedWS using Mercury LoadRunner to study the performance, stability, scalability, and efficiency of the service. Performance depends on metrics such as hits/s, response time, throughput, errors/s, and the transaction summary; these metrics are tested at different stress levels. Statistical analysis of the recorded data was carried out to study the stability and quality of the application. The present study reveals that the SOAP-based web service, developed in the Java programming language, is stable, scalable, and effective. We present the architecture, the testing procedure, the results of performance testing, and the statistical analysis of MedWS's recorded data.

12

A Novel Text Copy Detection Method based on Semantic Feature

Jianjun Zhang, Xingming Sun, Jin Wang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.161-170


With the rapid development of the Internet, shared resources on the network are becoming easier to obtain and various forms of plagiarism are breeding, so research on text copy detection technology is becoming more important. Traditional copy detection technology is based on term-frequency statistics and does not consider contextual semantics: plagiarism can easily be disguised by replacing synonyms, changing sentence structure, or translating from one language to another, and traditional methods cannot detect it. In this paper, a semantics-based text copy detection method is proposed. Using an improved TF-IDF algorithm, terms are extracted more accurately from each document in the corpus; by associating the documents with their terms one by one, a term category index is built in the database. When a document is checked, the terms are read from the database and matched. Test results show that, compared with the traditional TF-IDF algorithm, the improved method detects plagiarism more accurately.
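As background, the plain TF-IDF-plus-cosine baseline that the paper improves on can be sketched as follows. This is the traditional scheme, not the paper's improved variant; whitespace tokenization is a simplifying assumption.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts): tf * log(N / df), whitespace tokens."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                      # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(x * v.get(t, 0.0) for t, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

The weakness the paper targets is visible here: replacing a word with a synonym changes the token, so the vectors diverge even though the meaning is unchanged.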

13

A Formal Description of XML Tree Pattern Query for XQuery Language

Husheng Liao, Xiaoqing Li, Hang Su

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.171-186


In order to express tree pattern queries in a query plan and use formal methods to analyze their behavioral characteristics, this paper presents a formal description of tree pattern queries based on a functional language and denotational semantics. The description focuses mainly on the behavior of a tree pattern query when matched against an eXtensible Markup Language (XML) document tree. First, we introduce a formal definition of a kind of extended generalized tree pattern (GTP++). Then we present a functional tree pattern description language (XTPL) for GTP++ and give its complete denotational semantics based on a novel data structure, named WTree, which efficiently organizes typical XML query results and provides flexible data access. Finally, we present the formal semantics of identifying tree patterns in path expressions. Through these formal methods, the semantics of tree pattern queries becomes consistent and analyzable. As tree pattern matching is the core operation of XML querying, this formal description provides an initial step toward analyzing the correctness of XML queries and improves the reliability and robustness of query processing methods.

14

By promising developers distributed, collaborative development of complex simulation applications, HLA (High Level Architecture) provides a baseline that supports reusing capabilities available in different simulations, with a significant reduction in cost and time. Along with an improved execution process and reusability, introducing an object-oriented model into the implementation design of HLA also makes design, implementation, and maintenance easier at each level of federate development and execution. The important areas to reconsider in the design are the data exchange model and the HLA communication layer. The data exchange model covers exchange among federates in a federation and between the runtime infrastructure and the federation. The Federate Object Model (FOM) architecture is not completely object-oriented; introducing an Object Service Tier (OST) middleware may offer a degree of FOM agility, that is, the ability of an application to adjust to different FOMs (behaviors for federates). In the HLA communication layer, customary HLA systems are based on bidirectional call/callback interactions between federates; the object-oriented communication layer (OOP-COMM) introduced in the OST anticipates several enhancements over the native procedures, covering communication mechanisms, data encoding, session handling, the distributed environment, and performance analysis. The motivation behind using core object-oriented modeling features and the proposed OST middleware is the reuse of legacy systems, which may further enhance the integration of distributed simulation systems and extension types. This paper therefore provides a multidimensional analysis of the important design aspects of OST middleware in the HLA framework and devises design constructs for the OST using an object-oriented model. It proposes an object-oriented model that provides generalization through the OST middleware in the HLA framework for distributed simulation system environments.

15

Existing classifiers for uncertain data do not consider dynamic costs, so this paper proposes a classification approach based on a dynamic cost-sensitive decision tree for uncertain data built with a genetic algorithm (GDCDTU), which overcomes the limitations of stationary costs and automatically searches a suitable cost space for every sub-dataset. First, the paper presents the dynamic cost-sensitive learning idea and handles the continuous and discrete attributes of uncertain data via probabilistic cardinality. Second, we give the selection method for splitting attributes and the construction process for the cost-sensitive decision tree, encoding the interval number that describes a dynamic cost by its center and radius. Finally, the dynamic cost-sensitive decision tree for uncertain data is constructed, using the genetic algorithm to search for the optimal misclassification cost; the optimum is obtained through crossover, mutation, and selection. Experiments on both artificial and real data sets show that, compared with other decision tree classification algorithms for uncertain data, GDCDTU achieves higher classification accuracy and performance at a lower total cost.

16

An Accurate Identification of Extended XML Tree Pattern for XQuery Language

Husheng Liao, Xiaoqing Li, Junpeng Chen

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.211-226


In order to effectively use high-performance XML tree pattern queries (TPQ) in implementing the XQuery language, it is necessary to analyze the query plan and identify tree patterns within it. In this paper, we extend FXQL, the functional intermediate language used to implement XQuery, with an extended XML generalized tree pattern representation (GTP++). We then propose an XML tree pattern identification approach composed of a suite of query-expression rewriting rules for extracting tree patterns and a GTP++ construction algorithm. With this approach, both explicit and implied propositional logic, as well as various structural constraints and predicates, can be extracted across nested query blocks in XQuery FLWOR expressions. The tree pattern identified by this approach is more holistic and precise than those of previous methods, and the approach broadens the application of XML tree pattern query technology in the implementation of the XQuery language. Experiments show its effectiveness and practicality.

17

Using SPARQL/UPDATE to Extend RDB-to-RDF: A Mapping Approach

Yanping Chen, Qian Yang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.227-238


18

Reliability Analysis and Prediction for Product Design Based on Feature Similarity

Tao Yang, Yu Yang, Yao Jiao

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.7 No.5 2014.10 pp.239-252


During the product design phase, to address the lack of reliability data and the resulting low product reliability, a feature-similarity-based reliability analysis and prediction model for new product design is proposed. Taking the features of the new product as the evaluation objectives, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is first applied to select products with similar features. Then, to map the reliability analysis onto the new product design, the failure structure of the similar-feature products is quantified and a product failure structure matrix (FSM) is established. Afterwards, a group decision making method (GDMM) is presented to determine the improvement factor for the failure causes of the similar-feature products; on that basis, the failure structure of the new product's features is generated to predict the reliability of the newly designed product. Finally, the feasibility and effectiveness of the model are verified through an example of a new smartphone design.
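The TOPSIS step used to select similar-feature products can be sketched generically: vector-normalize the decision matrix, weight it, and rank alternatives by relative closeness to the ideal solution. The criteria, weights, and benefit/cost flags below are illustrative assumptions, not the paper's data.

```python
import math

def topsis(matrix, weights, benefit):
    """TOPSIS closeness scores in [0, 1]; higher = closer to the ideal.
    matrix: alternatives x criteria; benefit[j]: True if larger is better."""
    m = len(weights)
    # vector (root-sum-square) normalization per criterion, then weighting
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    v = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    cols = list(zip(*v))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        dp = math.sqrt(sum((row[j] - best[j]) ** 2 for j in range(m)))
        dn = math.sqrt(sum((row[j] - worst[j]) ** 2 for j in range(m)))
        scores.append(dn / (dp + dn) if dp + dn else 0.0)
    return scores
```

In the paper's setting, the alternatives would be candidate similar products and the criteria their feature similarities; the highest-scoring products feed the failure-structure mapping.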

 