Organizations use many types of databases and systems. These systems have to support day-to-day operational work, but they become very complicated when it comes to reporting, possibly because the volume of data is so large that it does not fit the systems in use. This naturally creates a dilemma in producing reliable reports and integrated data that can be trusted.
The challenge of managing quality data is a critical component of business assurance. Today many technologies support various aspects of the extract, transform, load (ETL) process, but what is needed is one designed specifically for examining, validating, and cleansing data to assure its quality. Such a technology has to be able to access and extract data from any source, including legacy data and non-relational data systems. It has to provide an audit step when testing data, to ascertain its integrity and identify all anomalies and inconsistencies. It has to convert, clean, restructure, and enrich the data as needed. It also has to consolidate the cleansed data in a final repository, provide analysis, and give both detailed and summary reports. Finally, it has to process data efficiently, on time, and cost-effectively.
Relying on data that is inconsistent, incomplete, or incorrect can become an unacceptable business risk. Data quality is a vital component in guaranteeing the continuity of a company's business. Poor data quality can endanger the efficiency and targets of operational systems. It can also erode the value of the business information systems that people rely on to make decisions. Decisions based on bad data can cause direct financial loss, damage business networks, and destroy the company's credibility. Where companies recognize data as a strategic asset, business leaders hold responsibility for ascertaining the correctness, quality, and reliability of information.
In a database containing more than 20 million records, the primary strength is efficiency: a shorter processing time for retrieving data (Extract), converting it into another form (Transform), and importing it (Load), that is, the ETL process. Likewise, effective data cleansing, conversion, and validation are important for the success of a database implementation.
Data cleansing covers the following processes:
1. Data Definition
The process of retrieving (extracting) data from a database or from another system, in various formats and database types such as Oracle, AS400, dBase, FoxPro, SQL, Excel, text, and others. Data extraction can be conducted by two methods:
• Direct Method
Data is taken by connecting directly to the database, using a read-only snapshot technique, and is then processed.
• Indirect Method
Data is taken indirectly from the database: it is first exported to .txt format and is then processed.
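The two extraction methods can be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite (from the Python standard library) stands in for the source database, and the `customer` table, its columns, and all function names are hypothetical, not taken from any particular product.

```python
import csv
import sqlite3
from pathlib import Path


def build_sample_db(db_path):
    """Create a tiny sample database (hypothetical table and rows)."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO customer VALUES (?, ?)",
                         [(1, "Budi"), (2, "Siti")])
        conn.commit()
    finally:
        conn.close()


def extract_direct(db_path):
    """Direct method: connect straight to the database in read-only mode."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
    finally:
        conn.close()


def export_to_text(db_path, txt_path):
    """Indirect method, step 1: dump the table to a tab-delimited .txt file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
    finally:
        conn.close()
    with open(txt_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t").writerows(rows)


def extract_indirect(txt_path):
    """Indirect method, step 2: read the text file; every value is a string."""
    with open(txt_path, newline="", encoding="utf-8") as f:
        return [tuple(row) for row in csv.reader(f, delimiter="\t")]
```

Note that the indirect method loses the column types, since everything in the text file is a string. That is one reason a verification step on the extracted data is useful.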
2. Verification and Data Integration
Data verification is the process of inspecting the data record by record, according to the data type and format of each field, such as date, ASCII, numeric, etc.
The goal of data verification is to ascertain that all data to be processed conforms to the format of the source data obtained from text files or databases.
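A per-record check of this kind might look like the sketch below. The field names (`customer_no`, `name`, `joined`), the date format, and the `SCHEMA` mapping are assumptions made for the example only.

```python
from datetime import datetime


def is_date(value, fmt="%Y-%m-%d"):
    """True if the text parses as a date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def is_numeric(value):
    """True if the text consists only of ASCII digits."""
    return value.isascii() and value.isdigit()


def is_ascii(value):
    """True if the text contains only ASCII characters."""
    return value.isascii()


# Hypothetical layout: field name -> check for that field's type.
SCHEMA = {"customer_no": is_numeric, "name": is_ascii, "joined": is_date}


def verify_record(record):
    """Return the names of fields that are missing or fail their check."""
    bad = []
    for field, check in SCHEMA.items():
        value = record.get(field)
        if not isinstance(value, str) or not check(value):
            bad.append(field)
    return bad


def verify_all(records):
    """Inspect every record; return (index, bad_fields) for failing ones."""
    report = []
    for index, record in enumerate(records):
        bad = verify_record(record)
        if bad:
            report.append((index, bad))
    return report
```

For example, a record with `customer_no` "10A1" and `joined` "2008-13-01" would be reported with both fields flagged, while a clean record produces no entry.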
3. Data Cleansing
Data cleansing, in this case, means standardizing the entire database so that its contents as a whole follow the same format or standard, for example by determining:
• Upper case / lower case writing
• Writing of titles
• Standardization of abbreviations
• Punctuation marks (e.g. - _ . , ; etc.)
• Conversion of Roman numerals (e.g. V = 5)
• etc.
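The standardization rules listed above could be combined into one function along these lines. This is a minimal sketch: the `STANDARD_FORMS` table is a made-up sample, and the Roman numeral rule is deliberately limited to tokens built from I, V, and X. A real rule would need more context, because ordinary words and abbreviations can also look like Roman numerals.

```python
import re

# Hypothetical lookup table for abbreviations and titles.
STANDARD_FORMS = {"JL": "JALAN", "JLN": "JALAN", "NO": "NOMOR", "DOKTER": "DR"}

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}
# Whole tokens that are Roman numerals from 1 to 39 (I, V, X only).
ROMAN_PATTERN = re.compile(r"(?=[IVX])X{0,3}(?:IX|IV|V?I{0,3})")


def roman_to_int(token):
    """Convert a Roman numeral made of I, V, X to an integer."""
    total = 0
    for ch, nxt in zip(token, token[1:] + " "):
        value = ROMAN_VALUES[ch]
        # A smaller symbol before a larger one is subtracted (IV = 4).
        total += -value if ROMAN_VALUES.get(nxt, 0) > value else value
    return total


def cleanse(text):
    """Apply case, punctuation, abbreviation, and Roman numeral rules."""
    text = text.upper()                    # one case convention
    text = re.sub(r"[-_.,;]", " ", text)   # punctuation becomes a space
    tokens = []
    for token in text.split():
        token = STANDARD_FORMS.get(token, token)
        if ROMAN_PATTERN.fullmatch(token):
            token = str(roman_to_int(token))
        tokens.append(token)
    return " ".join(tokens)
```

With these sample rules, `cleanse("jl. merdeka no.5, blok V")` gives `JALAN MERDEKA NOMOR 5 BLOK 5`.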
4. Data Identification
Identification can be done using conditions based on criteria that have been determined in advance. The results obtained from identifying data include:
• Active customers
• Inactive customers (based on criteria)
• Duplicate data
• Unique data
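As a rough sketch of identification, the function below sorts customer numbers into the four groups above. The criteria are assumptions for illustration: a customer counts as active if the last transaction is at most `max_idle_days` old, and two records count as duplicates if they share the same name and address. The field names are hypothetical.

```python
from collections import Counter
from datetime import date


def identify(records, today, max_idle_days=365):
    """Group customer numbers into active/inactive and duplicate/single."""
    # Count how many records share each (name, address) pair.
    key_counts = Counter((r["name"], r["address"]) for r in records)
    result = {"active": [], "inactive": [], "duplicate": [], "single": []}
    for r in records:
        idle_days = (today - r["last_transaction"]).days
        status = "active" if idle_days <= max_idle_days else "inactive"
        result[status].append(r["customer_no"])

        key = (r["name"], r["address"])
        kind = "duplicate" if key_counts[key] > 1 else "single"
        result[kind].append(r["customer_no"])
    return result
```

The comparison on name and address is exact here, so it works best on data that has already been cleansed into a standard form.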
5. Data Matching
The process of matching data ("Match" or "Not Match") based on several alternatives such as:
• Customer name and customer address
• Customer number
• etc.
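Matching on a chosen set of fields can be sketched as below. The record layout and the `normalize` helper are assumptions for the example; the `keys` argument selects which alternative is used, for instance name and address, or customer number.

```python
def normalize(value):
    """Upper-case the text and collapse repeated whitespace."""
    return " ".join(value.upper().split())


def match_records(source, reference, keys):
    """Label each source record "Match" or "Not Match" against reference."""
    reference_keys = {
        tuple(normalize(r[k]) for k in keys) for r in reference
    }
    results = []
    for record in source:
        key = tuple(normalize(record[k]) for k in keys)
        label = "Match" if key in reference_keys else "Not Match"
        results.append((record["customer_no"], label))
    return results
```

Calling `match_records(source, reference, keys=("name", "address"))` matches on name and address, while `keys=("customer_no",)` matches on customer number only. The two alternatives can give different results for the same record.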
Friday, December 26, 2008
Data Cleansing