  • Data Mining

    As you may be aware from some of my posts, I am trying to get data from the internet to analyse in Stata; this is known as data mining. Wikipedia lists a number of software packages for this, both free (open source) and paid for. Does anyone have any experience of this software and of using it with Stata? Can you recommend one, please?
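    To make the task concrete, here is a minimal sketch of the kind of step I mean: pulling a table out of a web page and turning it into CSV text that Stata can read. It assumes Python with only its standard library. The names `TableExtractor`, `html_table_to_csv` and `SAMPLE_HTML` are hypothetical, and the sample page and its values are made-up placeholders rather than real data. A real page would first have to be downloaded and would probably need more careful handling.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by malformed HTML


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Placeholder page content (made up for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>A</td><td>1</td></tr>
  <tr><td>B</td><td>2</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML), end="")
```

    If that output were saved to a file, say scraped.csv (a hypothetical name), it could then be loaded in Stata with `import delimited using "scraped.csv", varnames(1) clear`.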

    The software listed is as follows:
    Free open-source data mining software and applications


    The following applications are available under free/open source licenses. Public access to application source code is also available.
    • Carrot2: Text and search results clustering framework.
    • Chemicalize.org: A chemical structure miner and web search engine.
    • ELKI: A university research project with advanced cluster analysis and outlier detection methods written in the Java language.
    • GATE: A natural language processing and language engineering tool.
    • KNIME: The Konstanz Information Miner, a user-friendly and comprehensive data analytics framework.
    • Massive Online Analysis (MOA): A tool for real-time big data stream mining with concept drift, written in the Java programming language.
    • MEPX: A cross-platform tool for regression and classification problems based on a Genetic Programming variant.
    • ML-Flex: A software package that enables users to integrate with third-party machine-learning packages written in any programming language, execute classification analyses in parallel across multiple computing nodes, and produce HTML reports of classification results.
    • MLPACK library: a collection of ready-to-use machine learning algorithms written in the C++ language.
    • NLTK (Natural Language Toolkit): A suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python language.
    • OpenNN: Open neural networks library.
    • Orange: A component-based data mining and machine learning software suite written in the Python language.
    • R: A programming language and software environment for statistical computing, data mining, and graphics. It is part of the GNU Project.
    • scikit-learn: An open source machine learning library for the Python programming language.
    • Torch: An open source deep learning library for the Lua programming language and scientific computing framework with wide support for machine learning algorithms.
    • UIMA: The UIMA (Unstructured Information Management Architecture) is a component framework for analyzing unstructured content such as text, audio and video – originally developed by IBM.
    • Weka: A suite of machine learning software applications written in the Java programming language.
    Proprietary data-mining software and applications

    The following applications are available under proprietary licenses.