Fast Algorithms for Mining Generalized Association Rules and Sequential Patterns in Massive Databases

Gunaseelan, D; Nadarajan, R

Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/585

Title:	Fast Algorithms for Mining Generalized Association Rules and Sequential Patterns in Massive Databases
Authors:	Gunaseelan, D; Nadarajan, R
Keywords:	Fast Algorithms; Massive Databases; Sequential
Issue Date:	30-Jun-2002
Publisher:	Anna University
Abstract:	Data mining is an emerging field in database technology. The goal of data mining is knowledge discovery, that is, to excavate information from historical organizational databases that can be used to guide business strategies and decision making. In this dissertation, fast and efficient algorithms for mining generalized association rules and sequential patterns in massive databases are presented. An association rule, for example, could be “98% of the customers who buy bread and butter also buy jam”. The problem is to find all such rules whose frequency is greater than some user-defined minimum support. This thesis deals with algorithmic and systems aspects of scalable data mining algorithms applied to massive databases. The algorithmic aspects focus on the design of efficient and scalable algorithms for two key rule discovery techniques: generalized association rules and generalized sequential patterns. The systems aspects deal with the scalable implementation of these methods on sequential machines. Two incremental update techniques for mining generalized association rules and sequential patterns in massive data sets are presented. The first handles a fixed database with a changing minimum support. The second handles the case where a new increment db (a transactional database) is added to the original database DB with fixed minimum support and minimum confidence.
Association rules and sequential patterns have been generated using the partition method. The major advantage of the partition method is that it scans the database exactly two times to compute the large itemsets, by constructing a transaction list for each large itemset. For sequential patterns, large maximal sequences are generated using a parallel partition method. The speed-up and size-up properties show that the proposed parallel partition method is better than the sequential partition method. The method of pattern decomposition can avoid the costly process of candidate set generation and save a great amount of computing time with reduced database size. The problem of mining generalized association rules and sequential patterns using the TID method has been analyzed. By using this method, the execution time has been reduced and scales linearly. Another important problem, mining generalized association rules in a distributed environment, is also presented here. The computing time of the fast distributed algorithm is O(n), whereas that of the parallel algorithm for mining generalized association rules is O(n^2). Hence, the proposed fast distributed algorithm is more reliable than the previous parallel and sequential algorithms. Extensive experiments have been conducted on the above two problems (generalized association rules and generalized sequential patterns), showing immense improvement over the previous approaches, with linear scalability in database size.
URI:	http://localhost:8080/xmlui/handle/123456789/585
Appears in Collections:	Computer Applications
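
The two-scan partition idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it mines plain (not generalized) association rules, and the function names, the sample transactions, the 0.5 minimum support, and the choice of two partitions are all assumptions made for illustration. The first scan finds itemsets that are frequent within each partition by intersecting transaction-ID lists; the second scan counts those candidates over the whole database.

```python
from itertools import combinations


def local_frequent_itemsets(partition, min_support):
    """Itemsets frequent within one partition, found by intersecting tid-lists."""
    min_count = min_support * len(partition)
    # tid-list of each single item: indices of the transactions containing it
    tidlists = {}
    for tid, transaction in enumerate(partition):
        for item in transaction:
            tidlists.setdefault(frozenset([item]), set()).add(tid)
    level = {s: t for s, t in tidlists.items() if len(t) >= min_count}
    frequent = set(level)
    while level:
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item
        for (a, tids_a), (b, tids_b) in combinations(level.items(), 2):
            union = a | b
            if len(union) == len(a) + 1 and union not in next_level:
                tids = tids_a & tids_b  # transactions containing every item of union
                if len(tids) >= min_count:
                    next_level[union] = tids
        frequent |= set(next_level)
        level = next_level
    return frequent


def partition_mine(transactions, min_support, n_partitions=2):
    """Return {itemset: support} for globally frequent itemsets, using two scans."""
    size = -(-len(transactions) // n_partitions)  # ceiling division
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: a globally frequent itemset is frequent in at least one partition,
    # so the union of the local results is a complete candidate set.
    candidates = set()
    for part in parts:
        candidates |= local_frequent_itemsets(part, min_support)
    # Scan 2: count every candidate over the whole database.
    counts = dict.fromkeys(candidates, 0)
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1
    total = len(transactions)
    return {c: n / total for c, n in counts.items() if n >= min_support * total}


def confidence(supports, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from itemset supports."""
    a = frozenset(antecedent)
    return supports[a | frozenset(consequent)] / supports[a]


# Hypothetical sample data, chosen only to exercise the sketch
transactions = [
    ["bread", "butter", "jam"],
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk"],
    ["bread", "butter", "jam", "milk"],
]
supports = partition_mine(transactions, min_support=0.5, n_partitions=2)
rule_confidence = confidence(supports, ["bread", "butter"], ["jam"])
```

In this sample, "milk" is frequent inside the second partition and so becomes a candidate, but the second scan rejects it because it falls below the global minimum support. That is why the second scan is needed: local frequency only guarantees that no globally frequent itemset is missed.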
