Fast Algorithms for Mining Generalized Association Rules and Sequential Patterns in Massive Databases

Gunaseelan, D; Nadarajan, R

Full metadata record

DC Field	Value	Language
dc.contributor.author	Gunaseelan, D	-
dc.contributor.author	Nadarajan, R	-
dc.date.accessioned	2022-05-11T07:30:44Z	-
dc.date.available	2022-05-11T07:30:44Z	-
dc.date.issued	2002-06-30	-
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/585	-
dc.description.abstract	Data mining is an emerging field in database technology. The goal of data mining is knowledge discovery, that is, to excavate information from historical organizational databases that can be used to guide business strategies and decision making. In this dissertation, fast and efficient algorithms for mining generalized association rules and sequential patterns in massive databases are presented. An association rule, for example, could be “98% of the customers who buy bread and butter also buy jam”. The problem is to find all such rules whose frequency is greater than some user-defined minimum support. This thesis deals with algorithmic and systems aspects of scalable data mining algorithms applied to massive databases. The algorithmic aspects focus on the design of efficient and scalable algorithms for two key rule discovery techniques: generalized association rules and generalized sequential patterns. The systems aspects deal with the scalable implementation of these methods on sequential machines. Two incremental update techniques for mining generalized association rules and sequential patterns in massive data sets are presented. The first applies when the database is fixed and the minimum support changes. The second applies when a new increment db (a transactional database) is added to the original database DB, with minimum support and minimum confidence held fixed.
Association rules and sequential patterns have been generated using the partition method. The major advantage of the partition method is that it scans the database exactly twice to compute the large itemsets, by constructing a transaction list for each large itemset. For sequential patterns, large maximal sequences are generated using a parallel partition method. The speed-up and size-up properties show that the proposed parallel partition method is better than the sequential partition method. The method of pattern decomposition can avoid the costly process of candidate set generation and save a great amount of computing time with reduced database size. The problem of mining generalized association rules and sequential patterns using the TID method has been analyzed. With this method, execution time is reduced and scales linearly. Another important problem, mining generalized association rules in a distributed environment, is also presented here. The computing time of the fast distributed algorithm is O(n), whereas that of the parallel algorithm for mining generalized association rules is O(n^2). Hence, the proposed fast distributed algorithm is more reliable than the previous parallel and sequential algorithms. Extensive experiments have been conducted on the above two problems (generalized association rules and generalized sequential patterns), showing immense improvement over the previous approaches, with linear scalability in database size.	en_US
dc.language.iso	en	en_US
dc.publisher	Anna University	en_US
dc.subject	Fast	en_US
dc.subject	Algorithms	en_US
dc.subject	Massive	en_US
dc.subject	Databases	en_US
dc.subject	Sequential	en_US
dc.title	Fast Algorithms for Mining Generalized Association Rules and Sequential Patterns in Massive Databases	en_US
dc.type	Thesis	en_US
Appears in Collections:	Computer Applications
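
Illustrative note: the abstract describes a partition method that computes the large itemsets in exactly two database scans by keeping a transaction list for each itemset. The record does not include the thesis's own algorithm, so the following is only a minimal sketch of the general two-scan partition idea, not the authors' implementation. The function names (`local_large_itemsets`, `partition_mine`, `confidence`), the `num_partitions` parameter, and the in-memory list-of-sets representation are all assumptions made for illustration.

```python
from itertools import combinations
from math import ceil


def local_large_itemsets(partition, min_support):
    """Return itemsets that are large within one partition (hypothetical helper).

    Support is counted by intersecting transaction-id (TID) lists.
    """
    threshold = min_support * len(partition)
    # Build a TID list for every single item in this partition.
    tidlists = {}
    for tid, transaction in enumerate(partition):
        for item in transaction:
            tidlists.setdefault(frozenset([item]), set()).add(tid)
    current = {s: t for s, t in tidlists.items() if len(t) >= threshold}
    large = set(current)
    while current:
        next_level = {}
        # Join pairs of k-itemsets that differ in exactly one item.
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) == len(a) + 1 and union not in next_level:
                tids = tids_a & tids_b  # transactions containing every item
                if len(tids) >= threshold:
                    next_level[union] = tids
        large.update(next_level)
        current = next_level
    return large


def partition_mine(transactions, min_support, num_partitions=2):
    """Two-scan mining sketch: local candidates first, then global counts."""
    n = len(transactions)
    if n == 0:
        return {}
    size = ceil(n / num_partitions)
    # Scan 1: any globally large itemset is large in at least one partition,
    # so the union of locally large itemsets is a complete candidate set.
    candidates = set()
    for start in range(0, n, size):
        chunk = transactions[start:start + size]
        candidates |= local_large_itemsets(chunk, min_support)
    # Scan 2: count each candidate over the whole database.
    counts = dict.fromkeys(candidates, 0)
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1
    return {c: k / n for c, k in counts.items() if k >= min_support * n}


def confidence(supports, antecedent, consequent):
    """Confidence of antecedent -> consequent from a support table."""
    a = frozenset(antecedent)
    return supports[a | frozenset(consequent)] / supports[a]
```

Under these assumptions, `partition_mine(transactions, 0.5)` returns a mapping from each large itemset to its support, and `confidence(supports, {"bread", "butter"}, {"jam"})` gives the confidence of a rule like the bread, butter and jam example in the abstract.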