16 Matching Annotations
  1. Jan 2019
    1. By utilizing the Deeplearning4j library [1] for model representation, learning, and prediction, KNIME builds upon a well-performing open source solution with a thriving community.
    2. KNIME includes Python in various processing nodes for data processing, model learning and prediction, and the generation of visualizations.
    3. One of KNIME's strengths is its multitude of nodes for data analysis and machine learning. While its base configuration already offers a variety of algorithms for this task, the plugin system is what enables third-party developers to easily integrate their tools and make them compatible with each other's output.
    4. KNIME allows nodes from different research areas to be mixed to create truly cross-domain workflows.
    5. Apart from the free and open source KNIME Analytics Platform, KNIME also has commercial offerings. The KNIME server provides a platform for sharing workflows. It has a web interface and is connected to a KNIME instance for executing workflows remotely on demand or according to a schedule. Also commercially available are the Big Data Extensions and the KNIME Spark executor.
    6. Old nodes in KNIME are never completely removed from the program but are deprecated, so that workflows built with old versions can still be run and produce the same results years later.
    7. Passing the data between those tools often involves complex scripts for controlling data flow, data transformation, and statistical analysis. Such scripts are not only prone to be platform dependent, they also tend to grow as the experiment progresses and are seldom well documented, which hinders the reproducibility of the experiment. Workflow systems such as KNIME Analytics Platform aim to solve these problems by providing a platform for connecting tools graphically and guaranteeing the same results on different operating systems. As open source software, KNIME allows scientists and programmers to provide their own extensions to the scientific community.
    8. SeqAn implements various applications that can be used for different tasks, for example to map reads, apply read error correction, conduct protein searches, run variant detection, and many more. However, analysts are not interested in a single execution of one tool; they design and execute entire pipelines that use different tools for different tasks. Often they also require downstream analysis steps, e.g. computing statistics or generating reports. Hence it was desirable to add SeqAn applications to the KNIME workflow engine, which offers many additional analysis and data mining features.
    9. The Taverna workbench comes with an integration to myExperiment [13], a website for publishing and sharing scientific workflows. KNIME offers this integration as part of the de.NBI/CIBI plugin [14].
    10. The tools presented above are all used in various areas of the life sciences, but their main task is the orchestration of external tools that exchange files with each other. Natively, KNIME takes a different approach by encouraging a deep tool integration that is compatible with KNIME's table format. With this approach, data are embedded into table cells, allowing for easy tool interoperability without the need for file conversions.
    11. Orchestrating the execution of many command line tools is a task for Galaxy, while an analysis of life science data with subsequent statistical analysis and visualization is best carried out in KNIME or Orange. Orange with its “ad-hoc” execution of nodes caters to scientists doing quick analyses on small amounts of data, while KNIME is built from the ground up for large tables and images. Noteworthy is that none of the mentioned tools provide image processing capabilities as extensive as those of the KNIME Image Processing plugin (KNIP).
    12. Compared to other tools, KNIME focuses on a deeper integration of tools and tries to manage the data that flows through the workflow by itself. Tools like Galaxy and Taverna, on the other hand, rather orchestrate command line tools that exchange files. Orange is very similar to KNIME in that it has extensive machine learning capabilities, but focuses more on the analysis of smaller data sets. We conclude that there are workflow tools for a variety of different use cases and that it is the scientist's task to choose the tool that best fits the problem at hand. While there are certainly overlaps, each tool excels at its intended purpose.
    13. KNIME's network mining extension also has an integration with the open source bioinformatics software platform Cytoscape [9], which can be used to visualize molecular interaction networks and biological pathways. Installing the KNIME Connector plugin in Cytoscape enables users to exchange networks between the two tools.
    14. In conclusion, the KNIME Image Processing extensions not only enable scientists to easily mix and match image processing algorithms with tools from other domains (e.g. machine learning) and scripting languages (e.g. R or Python), or to perform a cross-domain analysis using heterogeneous data types (e.g. molecules or sequences); they also open the door to explorative design of bioimage analysis workflows and their application to hundreds of thousands of images.
    15. To further foster this “write once, run anywhere” framework, several independent projects collaborated closely to create ImageJ-Ops, an extensible Java framework for image processing algorithms. ImageJ-Ops allows image processing algorithms to be used within a wide range of scientific applications, particularly KNIME and ImageJ. Consequently, users need not choose between those applications, but can take advantage of both worlds seamlessly.
    16. Most notably, integrating with ImageJ2 and FIJI allows scientists to easily turn ImageJ2 plugins into KNIME nodes without having to write a single line of code.
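
Annotations 10 and 12 contrast tools that orchestrate file-exchanging programs with KNIME's approach of passing data between nodes as table cells. The difference can be illustrated with a minimal Python sketch. Everything in it is a hypothetical illustration rather than KNIME, Galaxy, or Taverna code: the function names (`length_tool`, `gc_tool`, `length_node`, `gc_node`), the CSV layout, and the sample sequences are assumptions made for the example, and a list of dictionaries merely stands in for a typed table.

```python
import csv
import tempfile
from pathlib import Path


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


# Style 1 (hypothetical): tools exchange files, so every step
# serializes its output and the next step parses it again.
def length_tool(in_path, out_path):
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=["id", "sequence", "length"])
        writer.writeheader()
        for row in csv.DictReader(src):
            writer.writerow({**row, "length": len(row["sequence"])})


def gc_tool(in_path, out_path):
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=["id", "sequence", "length", "gc"])
        writer.writeheader()
        for row in csv.DictReader(src):
            writer.writerow({**row, "gc": gc_fraction(row["sequence"])})


def run_file_pipeline(rows):
    with tempfile.TemporaryDirectory() as tmp:
        raw, step1, step2 = (Path(tmp) / name for name in ("raw.csv", "s1.csv", "s2.csv"))
        with open(raw, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["id", "sequence"])
            writer.writeheader()
            writer.writerows(rows)
        length_tool(raw, step1)
        gc_tool(step1, step2)
        with open(step2, newline="") as f:
            # CSV values come back as strings and must be converted again.
            return [
                {"id": r["id"], "sequence": r["sequence"],
                 "length": int(r["length"]), "gc": float(r["gc"])}
                for r in csv.DictReader(f)
            ]


# Style 2 (hypothetical): nodes pass an in-memory table whose cells
# keep their types, so no intermediate files or conversions are needed.
def length_node(table):
    return [{**row, "length": len(row["sequence"])} for row in table]


def gc_node(table):
    return [{**row, "gc": gc_fraction(row["sequence"])} for row in table]


def run_table_pipeline(rows):
    return gc_node(length_node(rows))


ROWS = [{"id": "s1", "sequence": "GATTACA"}, {"id": "s2", "sequence": "GGCC"}]
```

Both pipelines compute the same columns, but the file-based variant needs explicit type conversion when reading results back, which is the kind of file conversion overhead the quoted passages say the table-based approach avoids.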