
Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/585
Title: Fast Algorithms for Mining Generalized Association Rules and Sequential Patterns in Massive Databases
Authors: Gunaseelan, D
Nadarajan, R
Keywords: Fast
Algorithms
Massive
Databases
Sequential
Issue Date: 30-Jun-2002
Publisher: Anna University
Abstract: Data mining is an emerging field in database technology. The goal of data mining is knowledge discovery, that is, to excavate information from historical organizational databases that can be used to guide business strategies and decision making. In this dissertation, fast and efficient algorithms for mining generalized association rules and sequential patterns in massive databases are presented. An association rule, for example, could be “98% of the customers who buy bread and butter also buy jam”. The problem is to find all such rules whose frequency is greater than some user-defined minimum support. This thesis deals with the algorithmic and systems aspects of scalable data mining algorithms applied to massive databases. The algorithmic aspects focus on the design of efficient and scalable algorithms for two key rule discovery techniques: generalized association rules and generalized sequential patterns. The systems aspects deal with the scalable implementation of these methods on sequential machines. Two incremental update techniques for mining generalized association rules and sequential patterns in massive datasets are presented. The first handles a fixed database with a changing minimum support. The second handles the case where a new increment db (a transactional database) is added to the original database DB, with fixed minimum support and minimum confidence.
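To make the notions of minimum support and minimum confidence concrete, the following is a minimal Python sketch of how the two measures could be computed for the bread, butter and jam rule mentioned above. The function names and the four toy transactions are hypothetical illustrations, not data or code from the thesis.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent + consequent) divided by support of antecedent."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    base = support(transactions, antecedent)
    return support(transactions, both) / base if base else 0.0


# Hypothetical toy database: each transaction is a set of purchased items.
transactions = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
]

rule_support = support(transactions, {"bread", "butter", "jam"})
rule_confidence = confidence(transactions, {"bread", "butter"}, {"jam"})
```

A rule such as {bread, butter} => {jam} would be reported only if both values meet the user-defined thresholds.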
Association rules and sequential patterns have been generated using the partition method. The major advantage of the partition method is that it scans the database exactly two times to compute the large itemsets, by constructing a transaction list for each large itemset. For sequential patterns, large maximal sequences are generated using a parallel partition method. The speed-up and size-up properties show that the proposed parallel partition method is better than the sequential partition method. The method of pattern decomposition can avoid the costly process of candidate set generation and save a great amount of computing time with reduced database size. The problem of mining generalized association rules and sequential patterns using the TID method has been analyzed. With this method, the execution time is reduced and scales linearly. Another important problem, mining generalized association rules in a distributed environment, is also presented here. The computing time of the fast distributed algorithm is O(n), whereas that of the parallel algorithm for mining generalized association rules is O(n^2). Hence, the proposed fast distributed algorithm is more reliable than the previous parallel and sequential algorithms. Extensive experiments have been conducted on the above two problems (generalized association rules and generalized sequential patterns), showing immense improvement over previous approaches, with linear scalability in database size.
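The two-scan idea behind the partition method can be sketched as follows. This is a simplified illustration of the general technique, assuming in-memory partitions and flat (non-generalized) items. It is not the thesis's implementation, and the function names and toy data are hypothetical. The first scan finds itemsets that are large within at least one partition by intersecting transaction (TID) lists. The second scan counts those candidates over the whole database.

```python
from itertools import combinations


def local_large_itemsets(partition, min_support):
    """Itemsets that are large within one partition, found via TID lists."""
    min_count = min_support * len(partition)
    # TID list of an item: positions of the transactions that contain it.
    tidlists = {}
    for tid, transaction in enumerate(partition):
        for item in transaction:
            tidlists.setdefault(frozenset([item]), set()).add(tid)
    current = {s: t for s, t in tidlists.items() if len(t) >= min_count}
    large = dict(current)
    while current:
        longer = {}
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) == len(a) + 1 and union not in longer:
                # The TID list of a union is the intersection of the TID lists.
                tids = tids_a & tids_b
                if len(tids) >= min_count:
                    longer[union] = tids
        large.update(longer)
        current = longer
    return set(large)


def partition_mine(transactions, min_support, n_partitions):
    """Assumes a non-empty list of transactions and n_partitions >= 1."""
    size = -(-len(transactions) // n_partitions)  # ceiling division
    # Scan 1: gather itemsets that are large in at least one partition.
    candidates = set()
    for start in range(0, len(transactions), size):
        chunk = transactions[start:start + size]
        candidates |= local_large_itemsets(chunk, min_support)
    # Scan 2: count every candidate over the whole database.
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    min_count = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= min_count}


# Hypothetical toy database, split into two partitions of two transactions.
database = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
]
large_itemsets = partition_mine(database, min_support=0.5, n_partitions=2)
```

The approach is sound because any itemset that is large in the whole database must be large in at least one partition, so the first scan never misses a globally large itemset and the second scan only has to discard false positives.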
URI: http://localhost:8080/xmlui/handle/123456789/585
Appears in Collections:Computer Applications

Files in This Item:
File: abstract 2.pdf (Description: ABSTRACT; Size: 50.58 kB; Format: Adobe PDF)

