International Journal of Database Theory and Application

Publication information
  • Type
    Academic journal
  • Publisher
    Science & Engineering Research Support Center, Republic of Korea (IJDTA)
  • pISSN
    2005-4270
  • Frequency
    Bimonthly
  • Coverage period
    2008 ~ 2016
  • Subject classification
    Engineering > Computer Science
  • Decimal classification
    KDC 505 DDC 605
Vol.6 No.4 (16 articles)
No
1

NoSQL Database: New Era of Databases for Big data Analytics-Classification, Characteristics and Comparison

A B M Moniruzzaman, Syed Akhter Hossain

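Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.1-13

※ Access may be restricted because the agreement period with the original content provider has ended.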

The digital world is growing very fast and becoming more complex in volume (terabytes to petabytes), variety (structured, unstructured, and hybrid), and velocity (high speed of growth). This global phenomenon is referred to as ‘Big Data’. It is typically considered to be a data collection that has grown so large that it cannot be effectively managed or exploited using conventional data management tools, e.g., classic relational database management systems (RDBMS) or conventional search engines. To handle this problem, traditional RDBMS are complemented by a rich set of specifically designed alternative DBMS, such as NoSQL, NewSQL, and search-based systems. The motivation of this paper is to provide a classification, the characteristics, and an evaluation of NoSQL databases in Big Data analytics. This report is intended to help users, especially organizations, obtain an independent understanding of the strengths and weaknesses of various NoSQL database approaches to supporting applications that process huge volumes of data.

2

Semantics Oriented Web Searching

Junzhong G

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.15-26

Current Web search cannot fully satisfy users’ requirements, partly because it is term based rather than concept based or object oriented. This paper studies how to extend a search function semantically in order to improve its functionality and performance. It shows that semantics-oriented Web searching is a good solution for enabling concept/object-oriented Web search, faceted search, and association and relationship search. Knowledge support for the semantic extension is proposed, and a corresponding ontology solution is designed. Finally, a case study named SmartSearch, in operation at ECNU-ICA since 2011, is presented. The case study shows that semantics-oriented Web search is powerful and that search results are improved by a semantic extension.

3

The student party member information management system based on the B/S (browser/server) mode is a typical management system, and its development consists primarily of two aspects: establishing and maintaining the foreground application and the background database. The system is developed on the ASP.NET platform with the C# language and a SQL Server database, in the integrated environment of the Microsoft .NET Framework combined with Web development technologies. Once completed, the system can manage the full range of information on all student party members in a standardized way. The general design provides a detailed description of the functionality of each module and of the database design. In addition, the paper describes in detail the login page, the main page, and the implementation of several functions such as adding and querying data.

4

A Continuous Information Attribute Reduction Algorithm Based on Hierarchical Granulation

Long Chen, Tengfei Zhang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.39-48

Attribute reduction algorithms based on neighborhood approximation usually use distance as the approximation metric. Using the same distance threshold to construct the neighborhood families of spaces with different dimensions can result in a loss of information. Therefore, an attribute reduction algorithm based on hierarchical granulation is proposed. This algorithm can reduce redundant attributes at the same granularity. Experimental results on UCI data sets show that the algorithm can improve classification power and reduce the loss of information.
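
The baseline that this abstract describes, a neighborhood defined by one distance threshold and applied to every attribute subset, can be sketched as follows. This is a minimal Python illustration of generic neighborhood-based attribute reduction, not the hierarchical granulation algorithm proposed in the paper. The function names, the Euclidean metric, the greedy forward selection, and the toy data are all assumptions made for illustration.

```python
from math import dist

def neighborhood(samples, i, attrs, delta):
    """Indices of samples within distance delta of sample i on the chosen attributes."""
    xi = [samples[i][a] for a in attrs]
    return [
        j for j, row in enumerate(samples)
        if dist(xi, [row[a] for a in attrs]) <= delta
    ]

def dependency(samples, labels, attrs, delta):
    """Fraction of samples whose whole neighborhood shares their class label."""
    consistent = 0
    for i in range(len(samples)):
        if all(labels[j] == labels[i] for j in neighborhood(samples, i, attrs, delta)):
            consistent += 1
    return consistent / len(samples)

def greedy_reduct(samples, labels, delta):
    """Forward selection: add the attribute that raises dependency most; stop when none helps."""
    remaining = list(range(len(samples[0])))
    chosen, best = [], 0.0
    while remaining:
        gains = [(dependency(samples, labels, chosen + [a], delta), a) for a in remaining]
        score, attr = max(gains, key=lambda g: (g[0], -g[1]))
        if score <= best:
            break
        chosen.append(attr)
        remaining.remove(attr)
        best = score
    return chosen, best

# Hypothetical toy data: attribute 0 separates the classes, attribute 1 is constant.
toy_samples = [(0.1, 0.5), (0.2, 0.5), (0.8, 0.5), (0.9, 0.5)]
toy_labels = [0, 0, 1, 1]
reduct = greedy_reduct(toy_samples, toy_labels, delta=0.3)  # ([0], 1.0)
```

Note that the same `delta` is reused as the attribute subset grows from one dimension to two, which is exactly the practice the abstract identifies as a possible source of information loss.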

5

E-R Method Applied to Design the Teacher Information Management System’s Database Model

Yingjian Kang, Dan Zhao

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.49-58

The development and application of a management information system must be supported by a database. The data model is the form in which the data in a database is organized. One of the core problems of database design is to build a good data model that stores data effectively and meets the application requirements of various users. The E-R method is widely used in database model design. This article focuses on applying the E-R method to design the database model of a teacher information management system.

6

Design and Implementation of Rich Client Cloud Storage System

Meng Fanjun, Zhang Zhongwei, Lin Min, Jiong Xie

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.59-70

Traditionally, storage problems have been solved simply by purchasing additional machines. However, this method only expands storage capacity. It is inefficient and costly, and it cannot solve storage problems that stem from the architecture. After further analyzing the causes of the storage issues, we bring cloud computing and cloud storage technology (a rich client-based cloud storage model) into our storage architecture. In the present study, we design and implement a rich client-based cloud storage system. Experimental results show that our modified system can effectively manage huge amounts of data with low development costs and high processing speed. In addition, our design can be easily deployed, with high scalability and stability.

7

A Comparative Study of Different Feature Extraction and Classification Methods for Recognition of Handwritten Kannada

Mamatha Hosalli Ramappa, Srikantamurthy Krishnamurthy

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.71-90

The feature extraction and classification methods used to recognize handwritten characters play an important role in handwritten character recognition applications. A suitable feature extractor and a good classifier are essential for achieving a high recognition rate. When developing a new feature extractor for a script, it helps to know the recognition ability of existing feature extractors. Kannada is a major South Indian language, spoken by about 50 million people. This paper examines a variety of feature extraction approaches and classification methods that have been used in Optical Character Recognition applications designed to recognize handwritten numerals in the Kannada script. The study uses 8 different features, computed from zonal extraction, image fusion, Radon transform, fan beam projections, directional chain code, discrete Fourier transform, run length count, and curvelet transform, along with ten different classifiers: Euclidean distance, Chebyshev distance, Manhattan (city block) distance, cosine distance, K-NN, K-means, K-medoids, linear classifier, artificial immune system, and classifier fusion.

8

Optimizing Theta-Joins in a MapReduce Environment

Changchun Zhang, Jing Li, Lei Wu

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.91-108

Data analysis and processing are important tasks in cloud computing. In this field, the MapReduce framework has become an increasingly popular tool for analyzing large-scale data over large clusters. Compared with parallel relational databases, it has the advantages of excellent scalability and good fault tolerance. However, the performance of join operations using MapReduce is not as good as that of parallel relational databases. How to optimize theta-join operations using MapReduce has therefore attracted the attention of researchers. In this paper, a randomized algorithm named Strict-Even-Join (SEJ) is designed to solve multi-way theta-joins in a single MapReduce job. Moreover, a dynamic programming algorithm is developed to optimize multi-way theta-joins by calling the SEJ algorithm. Experimental results show that our approach is feasible and effective.
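
To make the idea of a randomized theta-join in a single map/reduce round concrete, here is a minimal Python sketch that simulates both phases in memory. It is not the SEJ algorithm from the paper, whose details the abstract does not give. It shows a generic two-way variant in which reducers form a grid over the join matrix, each tuple of R is sent to one random grid row, and each tuple of S to one random grid column. All names, the grid size, and the sample data are assumptions.

```python
import random
from collections import defaultdict

def map_phase(r_rows, s_rows, grid_rows, grid_cols, seed=0):
    """Assign each tuple to a random row (R) or column (S) of a reducer grid."""
    rng = random.Random(seed)
    buckets = defaultdict(lambda: ([], []))  # reducer id -> (R tuples, S tuples)
    for r in r_rows:
        row = rng.randrange(grid_rows)
        for col in range(grid_cols):  # replicate across the whole grid row
            buckets[(row, col)][0].append(r)
    for s in s_rows:
        col = rng.randrange(grid_cols)
        for row in range(grid_rows):  # replicate down the whole grid column
            buckets[(row, col)][1].append(s)
    return buckets

def reduce_phase(buckets, theta):
    """Each reducer joins its local R and S tuples with the predicate theta."""
    out = []
    for r_part, s_part in buckets.values():
        out.extend((r, s) for r in r_part for s in s_part if theta(r, s))
    return out

def theta_join(r_rows, s_rows, theta, grid_rows=2, grid_cols=2, seed=0):
    """Run the simulated map and reduce phases for one theta-join."""
    return reduce_phase(map_phase(r_rows, s_rows, grid_rows, grid_cols, seed), theta)

# Hypothetical inputs: join R and S on the inequality r < s.
R = [1, 5, 9]
S = [2, 6, 7]
pairs = sorted(theta_join(R, S, lambda r, s: r < s))
# pairs == [(1, 2), (1, 6), (1, 7), (5, 6), (5, 7)]
```

Every (r, s) pair meets at exactly one reducer, the one at the intersection of r's row and s's column, so the result matches a nested-loop join regardless of the predicate. The price is replication: each tuple is copied to a full row or column of reducers.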

9

Performance Investigation of Support Vector Regression using Meteorological Data

Somya Jain, MPS Bhatia

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.109-118

Predicting fire behavior is as much an art as it is a science. Forecasting the burned area and the range of the affected field plays a vital role in resource abatement and renewal efforts. Literature studies have shown that machine learning techniques achieve better performance in forecasting and trend analysis. The purpose of this paper is to investigate the relevance of two state-of-the-art machine learning techniques, epsilon Support Vector Regression (E-SVR) and Nu-SVR, for predicting forest fire occurrence and burned area using meteorological data. The goals of this research are (1) to identify the best parameter settings using grid-search and pattern-search techniques; and (2) to compare the prediction accuracy of the models using different data sorting methods, random sampling, and cross-validation. In conclusion, the experiments show that E-SVR performs better under various fitness functions and variance analysis. The study builds predictive models for estimating the risk of fire outbreaks in Montesinho Natural Park.

10

Role of Formalism in Software Reusability’s Effectiveness

Muhammad Ilyas, Mubashir Abbas

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.119-130

Economical, high-quality, and rapid software development are key objectives in software engineering. Different techniques are used to achieve these goals. Reusability is one of the popular software development methodologies; it effectively reduces the time, cost, and effort of software development. It also minimizes the risk of software failure by using already tested components. The objective of minimal time, cost, and effort can be fulfilled and maximized by the effective use of reusability. This effectiveness can be achieved by formalizing each activity in the reusability process. Adopting formalism enhances the effectiveness of the reusability methodology. In this paper, we investigate some factors affecting reusability effectiveness and the role of formalism in that effectiveness. The investigation considers 42 factors, grouped into 10 sections such as Reusability Process, Reusable Test, Formalism, and Extraction of Reusable Components. Our findings are based on a statistical analysis of industrial data that indicates how reusability is taking place and how productivity is gained.

11

Automatic Extraction of Semi-structured Web Data

Fang Dong, Mengchi Liu, Yifeng Li

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.131-144

As a huge data source, the Internet contains a large amount of valuable information, and the data is usually in semi-structured form in HTML web pages. In order to extract web data and organize it with relationships similar to those in the real world, this paper proposes a method for automatic data extraction from the web. Target web pages that contain valuable data are crawled using a combination of keywords and database content matching. The data is then extracted from the crawled pages using HTML structure and visual features. Finally, the extracted data is integrated into the structure of an information network model. Experimental results indicate that this method can be applied to semi-structured data extraction on the web, and that it contributes to the extraction and management of semi-structured web data.

12

The Use of Data Mining Techniques and Support Vector Regression for Financial Forecasting

Liqiang Hou, Shanlin Yang, Zhiqiang Chen

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.145-156

In recent years, data mining techniques such as neural networks and support vector regression have been applied extensively to the task of predicting financial variables. Influenced by various factors, stock volatility shows non-linear characteristics, which makes forecasting it a non-linear problem. Support vector regression (SVR) has proven useful in dealing with non-linear forecasting problems in recent years. The key point in using SVR for forecasting is how to determine appropriate parameters. An improved Artificial Neural Network (ANN) algorithm is used to optimize the parameter set (C, σ), which directly influences the performance of the model. In this way, the model can deal with the nonlinearity and multiple factors of volatility and ensure the stability and accuracy of support vector machine based regression. Finally, we study a case with satisfactory results: the SPA test shows that this model is more accurate than other models, which supports its application.

13

Distributed Workflow Architecture Based on Flexible Data Management

Yinzhou Zhu, Baolin Yin

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.157-168

In recent years, distributed workflow management technology has attracted much attention because of the need for rapid development of process-oriented software systems. However, current workflow management systems do not have sufficient capabilities for controlling the data stream between the workflow engine and applications, which decreases the reusability of workflow components. We therefore enhance the basic data model of the distributed workflow system and propose an architecture based on a flexible data mapping model to improve the reusability of workflow components in distributed workflow management. A real-world scenario built and implemented on this architecture is presented to demonstrate its effectiveness and usefulness.

14

Research on Method of Product Configuration Design Based on Product Family Ontology Model

J. H. Ge, Y. P. Wang, J. Zhang, H. Gao

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.169-178

To meet the design requirement of quick product configuration for consumers, and to address the problems of poor sharing and reuse of product configuration knowledge and inefficient configuration, a product configuration design method based on a product family model and the forecast of user demands is proposed. An ontology model of the product family is built, exploiting the sharing and reuse of ontologies at the semantic and knowledge levels. The product family model is updated dynamically using ontology integration. Once the users' needs are obtained, they are converted effectively into product configuration knowledge through a mapping from the user requirements ontology to the product family ontology model; this completes the ontology mapping of user demands and achieves the conversion from customer requirements to product family functions. The validity of the method is verified using the example of a reducer.

15

ETL Process Modeling In DWH Using Enhanced Quality Techniques

Kushanoor Akbar, Dr. S. Murali Krishna, T. Vidya Sagar Reddy

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.179-198

Large organizations have a lot of data. The data can be stored in many formats, including databases and unstructured files. These data sources must be collected, compared, and made to work as a seamless whole so that the different databases communicate well. A data warehouse is an integrated collection of subject-oriented data in support of decision making. The integration of data sources is achieved through the ETL (Extract, Transform, and Load) process. It is therefore widely recognized that the appropriate design of the ETL process is a key factor in the success of a data warehouse project. A data warehouse is used to provide effective results from multi-dimensional data analysis. Defective data leads to breakdowns in the supply chain, poor business decisions, and inferior customer relationship management. Data quality is the degree to which data meets the specific needs of the customer. The accuracy and correctness of the results depend on the quality of the data. Improving data quality is important in a data warehouse because it is used in decision support, which requires accurate data. This project presents the construction of a data warehouse with a quality decision support system to “Manage results for an organization using customer care center”. Organizations maintain customer care centers to support and handle customer queries, to maintain customer details, and to provide regular information regarding premiums and loans. The project produces a detailed report answering questions such as: How many customers are there in the organization? How many customers have paid their premiums in full, and what are their dues and the total amount paid? In which locations do customers exist? How many customers are high-value customers? What is the total amount credited to the organization quarterly, and what is the percentage gain or loss? In this paper, flat files and relational tables are taken as sources; the data is extracted into a staging area and then loaded into a data warehouse.
The five themes that frame our analysis are Integration, Implementation, Intelligence, Innovation, and Quality. The factors of definition conformance, completeness, validity, accuracy, non-duplication, and accessibility are applied to the data warehouse dynamically to improve the performance of data warehouses.
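
Three of the quality factors named in this abstract (completeness, non-duplication, and validity) can be illustrated with a short Python sketch that scores records in a staging area before they are loaded. This is only a hedged illustration: the abstract does not say how the authors measure these factors, and the field names, the sample records, and the validity rule are all hypothetical.

```python
def completeness(rows, required):
    """Share of rows in which every required field is present and non-empty."""
    ok = sum(all(row.get(f) not in (None, "") for f in required) for row in rows)
    return ok / len(rows)

def non_duplication(rows, key):
    """Ratio of distinct key values to rows; 1.0 means no key is repeated."""
    return len({row.get(key) for row in rows}) / len(rows)

def validity(rows, field, is_valid):
    """Share of rows whose field value passes the supplied rule."""
    return sum(1 for row in rows if is_valid(row.get(field))) / len(rows)

# Hypothetical staged customer records extracted from flat files or tables.
staged = [
    {"customer_id": 1, "location": "Pune", "premium_paid": 1200.0},
    {"customer_id": 2, "location": "", "premium_paid": 800.0},
    {"customer_id": 2, "location": "Delhi", "premium_paid": -50.0},
    {"customer_id": 3, "location": "Chennai", "premium_paid": None},
]

report = {
    "completeness": completeness(staged, ["customer_id", "location", "premium_paid"]),
    "non_duplication": non_duplication(staged, "customer_id"),
    # Assumed rule: a premium must be a non-negative number.
    "validity": validity(
        staged, "premium_paid", lambda v: isinstance(v, (int, float)) and v >= 0
    ),
}
# report == {"completeness": 0.5, "non_duplication": 0.75, "validity": 0.5}
```

Scores like these could gate the load step, for example by rejecting a batch whose completeness falls below a chosen threshold, although any such threshold is a design decision outside what the abstract states.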

16

Cloud Model-based Outlier Detection Algorithm for Categorical Data

Dajiang Lei, Liping Zhang, Lisheng Zhang

Science & Engineering Research Support Center (IJDTA) International Journal of Database Theory and Application Vol.6 No.4 2013.08 pp.199-214

Most existing outlier detection methods target numerical data, but real-world data contains a large amount of categorical data. Some outlier detection algorithms have been designed for categorical data. Outlier detection for categorical data faces two main problems: the similarity measure between categorical data objects and detection efficiency. This paper proposes a cloud model-based outlier detection algorithm for categorical data. The algorithm is data-driven and does not require the user to specify parameters. We use synthetic and real data sets to verify our algorithm and compare it with existing outlier detection algorithms for categorical data. The experimental results demonstrate that the proposed algorithm has a higher detection rate and a lower false alarm rate, while its time complexity is also competitive.
