CE- Ultimate Project Ideas For Data Mining

1. Building a Multiple-Criteria Negotiation Support System
2. An Exploratory Study of Database Integration Processes
3. COFI Approach for Mining Frequent Itemsets
4. Online Random Shuffling of Large Database Tables
5. A Flexible Content Adaptation System Using a Rule-Based Approach
6. Efficient Revalidation of XML Documents
7. Practical Algorithms and Lower Bounds for Similarity Search in Massive Graphs
8. Enhancing the Effectiveness of Clustering with Spectra Analysis
9. Efficient Monitoring Algorithm for Fast News Alerts
10. Top-k Monitoring in Wireless Sensor Networks
11. Frequent Closed Sequence Mining without Candidate Maintenance
12. Maintaining Strong Cache Consistency for the Domain Name System
13. Efficient Skyline and Top-k Retrieval in Subspaces
14. Efficient Process of Top-k Range-Sum Queries over Multiple Streams with Minimized Global Error
15. Fast Nearest Neighbor Condensation for Large Data Sets Classification
16. Wildcard Search in Structured Peer-to-Peer Networks
17. Neural-Based Learning Classifier Systems
18. Discovering Frequent Agreement Subtrees from Phylogenetic Data
19. Watermarking Relational Databases Using Optimization-Based Techniques
20. Extracting Actionable Knowledge from Decision Trees
21. A Requirements Driven Framework for Benchmarking Semantic Web Knowledge Base Systems
22. The Threshold Algorithm: From Middleware Systems to the Relational Engine
23. Rank Aggregation for Automatic Schema Matching
24. Rule Extraction from Support Vector Machines: A Sequential Covering Approach
25. Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions
26. A Note on Linear Time Algorithms for Maximum Error Histograms
27. Toward Exploratory Test-Instance-Centered Diagnosis in High-Dimensional Classification
28. An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data
29. A Method for Estimating the Precision of Placename Matching
30. Efficiently Querying Large XML Data Repositories: A Survey
31. Graph-Based Analysis of Human Transfer Learning Using a Game Testbed
32. Evaluating Universal Quantification in XML
33. Customer Profiling & Segmentation using Data Mining Techniques
34. Efficient Frequent Itemset Mining Using Global Profit Weighted (GPW) Support Threshold
35. Fast Algorithms for Frequent Itemset Mining Using FP-Trees
36. Mining Confident Rules without Support Requirement
37. Mining Frequent Itemsets without Support Threshold
38. A Unified Framework for Utility-Based Measures
39. An Adaptation of the Vector-Space Model for Ontology-Based Information Retrieval
40. The Google Similarity Distance
41. Reverse Nearest Neighbors Search in Ad Hoc Subspaces
42. Quality-Aware Sampling and Its Applications in Incremental Data Mining
43. An Exact Data Mining Method for Finding Center Strings and All Their Instances
44. Negative Samples Analysis in Relevance Feedback
45. Bayesian Networks for Knowledge-Based Authentication
46. Continuous Nearest Neighbor Queries over Sliding Windows
47. The Concentration of Fractional Distances
48. Efficient Approximate Query Processing in Peer-to-Peer Networks
49. Ontology-Based Service Representation and Selection
50. Compressed Hierarchical Mining of Frequent Closed Patterns from Dense Data Sets
51. Semi-supervised Regression with Co-training-Style Algorithms
52. Evaluation of Clustering with Banking Credit Card Segment
53. An Efficient Clustering Algorithm for High-Dimensional Databases
54. Novel Approach for Targeted Association Querying
55. Hiding Sensitive Association Rules with Limited Side Effects
56. A Relation-Based Search Engine in Semantic Web
57. Classifier Ensembles with a Random Linear Oracle
58. An Efficient Web Page Change Detection System Based on an Optimized Hungarian Algorithm
59. Mining Nonambiguous Temporal Patterns for Interval-Based Events
60. Peer-to-Peer in Metric Space and Semantic Space
61. Adaptive Index Utilization in Memory-Resident Structural Joins
62. On Three Types of Covering-Based Rough Sets
63. Discovering Frequent Generalized Episodes When Events Persist for Different Durations
64. Efficient and Scalable Algorithms for Inferring Likely Invariants in Distributed Systems
65. Mining Closed Frequent Itemsets Using the CHARM Algorithm
66. Integrating Constraints and Metric Learning in Semi-Supervised Clustering
67. Foundational Approach to Mining Itemset Utilities from Databases
68. Fast Frequent Pattern Mining
69. Evaluation for Mining Share Frequent Itemsets Containing Infrequent Subsets

Source: #-Link-Snipped-#

Replies

  • Rupam Das
    "Data Generalization" is a subdomain of data mining. Consider a database of census data. The government cannot publish it as it is because it contains sensitive information, so the tables are changed in a way that does not reveal information about any individual. You can pick from several topics in this area.

    1) k-Anonymity
    2) Closeness
    3) Anti Closeness

    It is a recent and active field of data mining.
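
As a rough illustration of the k-anonymity idea above, here is a minimal sketch. The census rows, the choice of age and ZIP code as quasi-identifiers, and the helper names `generalize` and `is_k_anonymous` are all assumptions made up for this example, not part of any particular tool.

```python
from collections import Counter

# Hypothetical census rows: (age, zip_code, disease). Age and ZIP are treated
# as quasi-identifiers; disease is the sensitive attribute.
ROWS = [
    (23, "560001", "flu"),
    (27, "560004", "asthma"),
    (25, "560009", "flu"),
    (41, "560102", "diabetes"),
    (45, "560107", "flu"),
    (48, "560103", "asthma"),
]

def generalize(row):
    """Coarsen quasi-identifiers: age to a 10-year band, ZIP to a 3-digit prefix."""
    age, zip_code, disease = row
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", zip_code[:3] + "***", disease)

def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter((age, zip_code) for age, zip_code, _ in rows)
    return all(count >= k for count in groups.values())

generalized = [generalize(r) for r in ROWS]
print(is_k_anonymous(ROWS, 3))         # raw rows: every (age, ZIP) pair is unique
print(is_k_anonymous(generalized, 3))  # generalized rows fall into groups of 3
```

The raw table fails the check because each person is uniquely identifiable by age and ZIP, while the generalized table passes for k = 3. Real schemes also have to weigh how much information the generalization destroys.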
  • Rupam Das
    What Is Data Mining, and How Is It Different from "Views" in Relational Databases or Data Warehouses?

    Consider a large database with hundreds of tables and thousands of rows in each table, holding more than two years of transactions. How can you extract meaningful information from this data? Not all of the stored data will be significant for revealing definitive patterns. So a basic principle is to check each table and try to understand what information it presents. Once that is done, views are written that combine several tables to give a meaningful summary.
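
To make the idea of a view concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the view name `customer_totals` are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);

    -- The view joins the two tables into a per-customer summary.
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, COUNT(o.id) AS order_count, SUM(o.amount) AS total
        FROM customers c JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
""")

summary = conn.execute(
    "SELECT name, order_count, total FROM customer_totals ORDER BY name"
).fetchall()
print(summary)
```

Note that someone had to decide in advance which tables to join and what to aggregate. That manual step is the difference from data mining, where the relationships are discovered automatically.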

    A data warehouse, in simple terms, is a collection of metadata, or descriptions of data, rather than the actual data. Warehouses are software in which data patterns are stored, so they know which rows or columns are more significant and what bandwidth the various transactions require. Warehouses make data delivery and transactions faster. They also implement filters that remove noise and data that no significant analysis process needs.

    Data mining, on the other hand, is more closely associated with Business Intelligence, where tables and rows are analyzed automatically to find the relationships among the entities of a database.

    Hence data mining is the task of generating rules or summaries from large data sets that can be used to analyze new data.

    For example, if a telecom company wants to know the golden rules for retaining its customers, it must know the criteria that led to successful retention in the past. With a database of a few million customers, it is impossible to generate these rules by any simple process. Hence data mining or BI (Business Intelligence) must be applied to such a database.
    Data mining does not give definitive answers; it is probabilistic. It can therefore answer a prediction question in terms of the probability that a certain event will occur.
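
The telecom example can be sketched as a tiny rule-evaluation exercise. The customer records, attribute names, and the `rule_stats` helper below are invented for illustration; support and confidence are the standard measures for judging a rule of the form "condition implies outcome".

```python
# Hypothetical customer records: a set of observed attributes plus the outcome.
CUSTOMERS = [
    ({"long_tenure", "family_plan"}, "retained"),
    ({"long_tenure", "family_plan"}, "retained"),
    ({"long_tenure"}, "retained"),
    ({"long_tenure", "family_plan"}, "churned"),
    ({"family_plan"}, "churned"),
    ({"prepaid"}, "churned"),
    ({"prepaid"}, "retained"),
    ({"prepaid", "family_plan"}, "churned"),
]

def rule_stats(records, condition, outcome):
    """Return (support, confidence) for the rule `condition -> outcome`."""
    # Outcomes of all customers whose attributes include the condition.
    matching = [label for attrs, label in records if condition <= attrs]
    hits = sum(1 for label in matching if label == outcome)
    support = hits / len(records)                      # share of all customers
    confidence = hits / len(matching) if matching else 0.0
    return support, confidence

support, confidence = rule_stats(CUSTOMERS, {"long_tenure"}, "retained")
print(f"support={support:.3f} confidence={confidence:.2f}")
```

Here the confidence is the probabilistic answer mentioned above: among past customers with long tenure, it is the fraction that stayed. A real rule miner would search over many candidate conditions instead of testing one chosen by hand.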

    Common tasks and areas of data mining are:
    1) Data Generalization
    2) Clustering and Grouping
    3) Noise Removal from Data
    4) Rule Mining from Data
    5) Predictive Data Mining
    6) Pattern Analysis in Data Mining
    7) Data Redundancy Removal
    8) Data Mining for Data Warehousing and Business Intelligence
    9) Data Mining for Data Preservation
    10) Spatial and Multidimensional Data Mining
    Source(#-Link-Snipped-#)
  • harika harry
    Hello sir,
    I'm Harika, doing my M.Tech at JNTU. I have chosen a topic on high utility pattern mining for incremental databases. The topic is nice and interesting, but I can't understand how to proceed and need some guidance. Can you please help me?
  • ramyaIT
    Hi sir, I am Ramya, in my final year of B.Tech. I need a project on data mining. Please suggest some innovative projects with abstracts.
  • K!r@nS!ngu
    Distributed Data Mining in Peer-to-Peer Networks

    ABSTRACT

    Peer-to-peer (P2P) networks are gaining popularity in many applications such as file sharing, e-commerce, and social networking, many of which deal with rich, distributed data sources that can benefit from data mining. P2P networks are, in fact, well-suited to distributed data mining (DDM), which deals with the problem of data analysis in environments with distributed data, computing nodes, and users. This article offers an overview of DDM applications and algorithms for P2P environments, focusing particularly on local algorithms that perform data analysis by using computing primitives with limited communication overhead. The authors describe both exact and approximate local P2P data mining algorithms that work in a decentralized and communication-efficient manner.
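
To give a feel for a communication-light primitive of the kind the abstract mentions, here is a sketch of pairwise gossip averaging, in which each peer only ever talks to a direct neighbor. This is a well-known approximate technique chosen for illustration; it is not claimed to be the specific algorithm from the article, and the ring topology, peer values, and function name are assumptions.

```python
import random

def gossip_average(values, neighbors, rounds, seed=0):
    """Approximate the global mean using only neighbor-to-neighbor exchanges."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    state = list(values)
    for _ in range(rounds):
        i = rng.randrange(len(state))   # a random peer wakes up
        j = rng.choice(neighbors[i])    # and contacts one direct neighbor
        mean = (state[i] + state[j]) / 2
        state[i] = state[j] = mean      # both keep the pairwise average
    return state

# Six hypothetical peers arranged in a ring, each holding one local value.
values = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
neighbors = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

estimates = gossip_average(values, neighbors, rounds=2000)
print(estimates)
```

Each exchange preserves the sum of the two values involved, so the peers' estimates drift toward the global mean (35 here) without any node ever seeing all the data. That is the sense in which such algorithms are decentralized and communication-efficient.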

    Personalized Web search for improving retrieval effectiveness

    ABSTRACT

    Current Web search engines are built to serve all users, independent of the special needs of any individual user. Personalization of Web search is to carry out retrieval for each user incorporating his/her interests. We propose a novel technique to learn user profiles from users' search histories. The user profiles are then used to improve retrieval effectiveness in Web search. A user profile and a general profile are learned from the user's search history and a category hierarchy, respectively. These two profiles are combined to map a user query into a set of categories which represent the user's search intention and serve as a context to disambiguate the words in the user's query. Web search is conducted based on both the user query and the set of categories. Several profile learning and category mapping algorithms and a fusion algorithm are provided and evaluated. Experimental results indicate that our technique to personalize Web search is both effective and efficient.
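
The step of combining a user profile and a general profile to map a query to categories can be sketched as follows. The abstract does not give the fusion algorithm, so the linear weighting with `alpha`, the profile contents, and the function name are all assumptions made for illustration.

```python
def map_query_to_categories(query_terms, user_profile, general_profile,
                            alpha=0.7, top_n=2):
    """Rank categories for a query by blending two term-weight profiles.

    Each profile maps category -> {term: weight}. `alpha` is a hypothetical
    mixing weight that favors the user's own history over the general profile.
    """
    scores = {}
    for cat in set(user_profile) | set(general_profile):
        u = sum(user_profile.get(cat, {}).get(t, 0.0) for t in query_terms)
        g = sum(general_profile.get(cat, {}).get(t, 0.0) for t in query_terms)
        scores[cat] = alpha * u + (1 - alpha) * g
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return [c for c in ranked[:top_n] if scores[c] > 0]

# Hypothetical profiles: this user's history links "apple" to computers,
# while the general category hierarchy also links it to food.
user_profile = {"Computers": {"apple": 0.9, "mac": 0.8},
                "Cooking": {"recipe": 0.5}}
general_profile = {"Food": {"apple": 0.6},
                   "Computers": {"apple": 0.4}}

categories = map_query_to_categories(["apple"], user_profile, general_profile)
print(categories)
```

For this user the ambiguous query "apple" maps to Computers ahead of Food, which is the disambiguating context the abstract describes. The search itself would then use both the query and these categories.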
