CE- Ultimate Project Ideas For Data Mining

Kaustubh Katdare · 2011-01-04T22:16:20+00:00


Administrator

Updated: Oct 26, 2024

1. Building a Multiple-Criteria Negotiation Support System
2. An Exploratory Study of Database Integration Processes
3. COFI Approach for Mining Frequent Itemsets
4. Online Random Shuffling of Large Database Tables
5. A Flexible Content Adaptation System Using a Rule-Based Approach
6. Efficient Revalidation of XML Documents
7. Practical Algorithms and Lower Bounds for Similarity Search in Massive Graphs
8. Enhancing the Effectiveness of Clustering with Spectra Analysis
9. Efficient Monitoring Algorithm for Fast News Alerts
10. Top-k Monitoring in Wireless Sensor Networks
11. Frequent Closed Sequence Mining without Candidate Maintenance
12. Maintaining Strong Cache Consistency for the Domain Name System
13. Efficient Skyline and Top-k Retrieval in Subspaces
14. Efficient Process of Top-k Range-Sum Queries over Multiple Streams with Minimized Global Error
15. Fast Nearest Neighbor Condensation for Large Data Sets Classification
16. Wildcard Search in Structured Peer-to-Peer Networks
17. Neural-Based Learning Classifier Systems
18. Discovering Frequent Agreement Subtrees from Phylogenetic Data
19. Watermarking Relational Databases Using Optimization-Based Techniques
20. Extracting Actionable Knowledge from Decision Trees
21. A Requirements Driven Framework for Benchmarking Semantic Web Knowledge Base Systems
22. The Threshold Algorithm: From Middleware Systems to the Relational Engine
23. Rank Aggregation for Automatic Schema Matching
24. Rule Extraction from Support Vector Machines: A Sequential Covering Approach
25. Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions
26. A Note on Linear Time Algorithms for Maximum Error Histograms
27. Toward Exploratory Test-Instance-Centered Diagnosis in High-Dimensional Classification
28. An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data
29. A Method for Estimating the Precision of Placename Matching
30. Efficiently Querying Large XML Data Repositories: A Survey
31. Graph-Based Analysis of Human Transfer Learning Using a Game Testbed
32. Evaluating Universal Quantification in XML
33. Customer Profiling & Segmentation using Data Mining Techniques
34. Efficient Frequent Itemset Mining Using Global Profit Weighted (GPW) Support Threshold
35. Fast Algorithms for Frequent Itemset Mining Using FP-Trees
36. Mining Confident Rules without Support Requirement
37. Mining Frequent Itemsets without Support Threshold
38. A Unified Framework for Utility-Based Measures
39. An Adaptation of the Vector-Space Model for Ontology-Based Information Retrieval
40. The Google Similarity Distance
41. Reverse Nearest Neighbors Search in Ad Hoc Subspaces
42. Quality-Aware Sampling and Its Applications in Incremental Data Mining
43. An Exact Data Mining Method for Finding Center Strings and All Their Instances
44. Negative Samples Analysis in Relevance Feedback
45. Bayesian Networks for Knowledge-Based Authentication
46. Continuous Nearest Neighbor Queries over Sliding Windows
47. The Concentration of Fractional Distances
48. Efficient Approximate Query Processing in Peer-to-Peer Networks
49. Ontology-Based Service Representation and Selection
50. Compressed Hierarchical Mining of Frequent Closed Patterns from Dense Data Sets
51. Semi-supervised Regression with Co-training-Style Algorithms
52. Evaluation of Clustering with Banking Credit Card Segment
53. An Efficient Clustering Algorithm for Huge Dimensional Database
54. Novel Approach for Targeted Association Querying
55. Hiding Sensitive Association Rules with Limited Side Effects
56. A Relation-Based Search Engine in Semantic Web
57. Classifier Ensembles with a Random Linear Oracle
58. An Efficient Web Page Change Detection System Based on an Optimized Hungarian Algorithm
59. Mining Nonambiguous Temporal Patterns for Interval-Based Events
60. Peer-to-Peer in Metric Space and Semantic Space
61. Adaptive Index Utilization in Memory-Resident Structural Joins
62. On Three Types of Covering-Based Rough Sets
63. Discovering Frequent Generalized Episodes When Events Persist for Different Durations
64. Efficient and Scalable Algorithms for Inferring Likely Invariants in Distributed Systems
65. Mining Closed Frequent Itemsets Using the CHARM Algorithm
66. Integrating Constraints and Metric Learning in Semi-Supervised Clustering
67. Foundational Approach to Mining Itemset Utilities from Databases
68. Fast Frequent Pattern Mining
69. Evaluation for Mining Share Frequent Itemsets Containing Infrequent Subsets

Source: #-Link-Snipped-#

Replies

Rupam Das

Member • Sep 4, 2011

"Data Generalization" is a sub domain in Data Mining. Consider that there is a database of census data. Govorment can not publish it as it is because that has sensitive information. So just change the tables in a way that it does not reveal any information. You can go for several topics in this.

1) k-Anonymity
2) Closeness
3) Anti Closeness

It is a recent and active field of data mining.
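
To make the k-anonymity idea concrete, here is a minimal sketch in Python. The rows, the column choice (age and ZIP code as quasi-identifiers), and the helper names `generalize` and `is_k_anonymous` are all made up for illustration; real census data would need a more careful choice of generalization.

```python
from collections import Counter

def generalize(record):
    """Coarsen the quasi-identifiers: age -> 10-year band, ZIP -> first 3 digits."""
    age, zip_code, disease = record
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zip_code[:3] + "**", disease)

def is_k_anonymous(records, k):
    """True if every (age, ZIP) combination is shared by at least k rows."""
    groups = Counter((age, zip_code) for age, zip_code, _ in records)
    return all(count >= k for count in groups.values())

# Hypothetical census-style rows: (age, ZIP code, sensitive attribute)
raw = [
    (23, "56001", "flu"),
    (27, "56004", "asthma"),
    (25, "56009", "flu"),
    (41, "56101", "diabetes"),
    (48, "56107", "flu"),
]
generalized = [generalize(r) for r in raw]

print(is_k_anonymous(raw, 2))          # False: every raw row is unique
print(is_k_anonymous(generalized, 2))  # True: each group has at least 2 rows
```

The sensitive column is left untouched; only the columns that could identify a person are coarsened until each combination is shared by at least k rows.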

Rupam Das

Member • Sep 4, 2011

What Is Data Mining, and How Is It Different from "Views" in Relational Databases or Data Warehouses?

Consider a large database with hundreds of tables and thousands of rows in each table, holding more than two years of transactions. How can you extract meaningful information from this data? Not all of the data stored here will be significant in revealing definitive patterns. So a basic approach is to check each table and try to understand what information it presents. Once that is done, views are written that combine several tables to give a meaningful summary.
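
As a small illustration of such a view, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The tables `customers` and `transactions`, the view `customer_summary`, and the sample rows are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO transactions VALUES (1, 100.0), (1, 50.0), (2, 20.0);

    -- The view joins the two tables and summarizes spending per customer.
    CREATE VIEW customer_summary AS
        SELECT c.name AS name,
               COUNT(*) AS n_transactions,
               SUM(t.amount) AS total
        FROM customers c
        JOIN transactions t ON t.customer_id = c.id
        GROUP BY c.id, c.name;
""")

summary = conn.execute(
    "SELECT name, n_transactions, total FROM customer_summary ORDER BY name"
).fetchall()
print(summary)  # [('Asha', 2, 150.0), ('Ravi', 1, 20.0)]
```

Note that a person still has to decide which tables to join and what to summarize. That manual step is what separates a view from data mining.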

A data warehouse, in simple terms, is a collection of metadata, or descriptions of data, rather than the actual data. These warehouses are software in which the data patterns are stored. A warehouse therefore knows which rows or columns are more significant and what bandwidth the various transactions require. Warehouses are software that makes data delivery and transactions faster. Several filters are also implemented in the warehouse to remove noise, that is, data not required by any significant analysis process.

Data mining, on the other hand, is more closely associated with Business Intelligence, where tables and rows are automatically analyzed to obtain the relationships among the entities of a database.

Hence data mining is the task of generating rules or summaries from large data sets that can be used to analyze new data.

For example, if a telecom company wants to know the golden rules for retaining its customers, it must know the criteria that have led to successful customer retention in the past. With a database of a few million customers, it is impossible to generate these rules by any simple process. Hence data mining or BI (Business Intelligence) must be applied to such a database.
Although data mining does not give definitive answers, it is certainly a probabilistic model. Hence data mining answers a prediction in terms of the probability that a certain event will occur.
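
One common way to express such a probabilistic rule is through its support and confidence. The sketch below is only an illustration: the customer rows, the `got_discount` attribute, and the helper `rule_stats` are hypothetical and not taken from any real telecom data.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule 'antecedent => consequent'."""
    matches = [r for r in rows if antecedent(r)]
    both = [r for r in matches if consequent(r)]
    support = len(both) / len(rows)                         # P(antecedent and consequent)
    confidence = len(both) / len(matches) if matches else 0.0  # P(consequent | antecedent)
    return support, confidence

# Hypothetical past customers of a telecom company
customers = [
    {"got_discount": True, "retained": True},
    {"got_discount": True, "retained": True},
    {"got_discount": True, "retained": False},
    {"got_discount": False, "retained": False},
    {"got_discount": False, "retained": True},
]

support, confidence = rule_stats(
    customers,
    antecedent=lambda r: r["got_discount"],
    consequent=lambda r: r["retained"],
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Here the rule "got a discount => retained" holds for 2 of the 3 discounted customers, so the mined rule predicts retention with a probability of about 0.67 rather than as a certainty.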

Common tasks and areas of data mining are:
1) Data Generalization
2) Clustering and Grouping
3) Noise Removal From data
4) Rule Mining From data
5) Predictive data mining
6) Pattern Analysis in data mining
7) Data redundancy removal
8) Data Mining for Data Warehousing and Business Intelligence
9) Data mining for data preservation
10) Spatial and Multidimensional Data Mining
Source: #-Link-Snipped-#

harika harry

Member • Dec 3, 2011

Hello sir,
I'm Harika, doing my M.Tech at JNTU. I have chosen a topic on high utility pattern mining for incremental databases. I can't understand how to proceed and need some guidance. Can you please help me? The topic is nice and interesting.

ramyaIT

Member • Dec 23, 2011

Hi sir, I am Ramya, in my final year of B.Tech. I need a project on data mining. Please suggest some innovative projects with abstracts.

K!r@nS!ngu

Member • Dec 23, 2011

Distributed Data Mining in Peer-to-Peer Networks

ABSTRACT

Peer-to-peer (P2P) networks are gaining popularity in many applications such as file sharing, e-commerce, and social networking, many of which deal with rich, distributed data sources that can benefit from data mining. P2P networks are, in fact, well-suited to distributed data mining (DDM), which deals with the problem of data analysis in environments with distributed data, computing nodes, and users. This article offers an overview of DDM applications and algorithms for P2P environments, focusing particularly on local algorithms that perform data analysis by using computing primitives with limited communication overhead. The authors describe both exact and approximate local P2P data mining algorithms that work in a decentralized and communication-efficient manner.
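
The abstract does not name a specific algorithm, so the following is only an assumed illustration of what a communication-efficient local primitive can look like: randomized pairwise gossip averaging, where each peer talks only to its direct neighbors and every peer ends up with an estimate of the global mean. The ring topology, the values, and the function name `gossip_average` are hypothetical.

```python
import random

def gossip_average(values, neighbors, rounds, seed=0):
    """Pairwise gossip: a random peer and one of its neighbors average their values."""
    rng = random.Random(seed)
    values = list(values)
    for _ in range(rounds):
        i = rng.randrange(len(values))   # a peer wakes up
        j = rng.choice(neighbors[i])     # and contacts one neighbor only
        avg = (values[i] + values[j]) / 2
        values[i] = values[j] = avg      # both keep the pairwise average
    return values

# Six peers on a ring; each peer knows only its left and right neighbor.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
local = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # one local value per peer

estimates = gossip_average(local, ring, rounds=500)
print(estimates)  # every entry should be close to the global mean, 35.0
```

No peer ever sees the whole data set, and each exchange involves exactly two nodes, which is the kind of limited communication overhead the abstract refers to.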

Personalized Web search for improving retrieval effectiveness

ABSTRACT

Current Web search engines are built to serve all users, independent of the special needs of any individual user. Personalization of Web search is to carry out retrieval for each user incorporating his/her interests. We propose a novel technique to learn user profiles from users' search histories. The user profiles are then used to improve retrieval effectiveness in Web search. A user profile and a general profile are learned from the user's search history and a category hierarchy, respectively. These two profiles are combined to map a user query into a set of categories which represent the user's search intention and serve as a context to disambiguate the words in the user's query. Web search is conducted based on both the user query and the set of categories. Several profile learning and category mapping algorithms and a fusion algorithm are provided and evaluated. Experimental results indicate that our technique to personalize Web search is both effective and efficient.
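
As a rough sketch of the profile-combination step described in the abstract, the code below fuses a user profile and a general profile with a simple weighted sum to map a query to categories. The weighted-sum fusion, the weight `alpha`, the profile contents, and the function name `map_query` are assumptions made for illustration; the abstract does not specify the actual learning or fusion algorithms.

```python
def map_query(query, user_profile, general_profile, alpha=0.7, top_n=2):
    """Rank categories for a query by fusing two term -> {category: weight} profiles."""
    scores = {}
    for term in query.lower().split():
        for profile, weight in ((user_profile, alpha), (general_profile, 1 - alpha)):
            for category, w in profile.get(term, {}).items():
                scores[category] = scores.get(category, 0.0) + weight * w
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]

# Hypothetical profiles: this user mostly searches "apple" in a computing context.
user_profile = {"apple": {"Computers": 0.9, "Food": 0.1}}
general_profile = {"apple": {"Computers": 0.4, "Food": 0.6}}

print(map_query("apple", user_profile, general_profile, top_n=1))  # ['Computers']
print(map_query("apple", {}, general_profile, top_n=1))            # ['Food']
```

With the user's history the ambiguous query "apple" maps to Computers, while the general profile alone would pick Food. This is the disambiguation role the categories play in the abstract.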
