Hypothesis

16 Matching Annotations

Jan 2019
www.sciencedirect.com

KNIME for reproducible cross-domain analysis of life science data

1. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  By utilizing the Deeplearning4j library [1] for model representation, learning, and prediction, KNIME builds upon a well-performing open source solution with a thriving community.
  
  KNIME ML/AI deep learning integration
2. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  KNIME includes Python in various processing nodes for data processing, model learning and prediction, and the generation of visualizations.
  
  KNIME Python integration
3. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  One of KNIME's strengths is its multitude of nodes for data analysis and machine learning. While its base configuration already offers a variety of algorithms for this task, the plugin system is the factor that enables third-party developers to easily integrate their tools and make them compatible with the output of each other.
  
  KNIME integration ML/AI
4. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  KNIME allows nodes from different research areas to be mixed to create truly cross-domain workflows.
  
  KNIME integration
5. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Apart from the free and open source KNIME Analytics Platform, KNIME also has commercial offerings. The KNIME server provides a platform for sharing workflows. It has a web interface and is connected to a KNIME instance for executing workflows remotely on demand or according to a schedule. Also commercially available are the Big Data Extensions and the KNIME Spark executor.
  
  KNIME bioinformatics open source commercial
6. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Old nodes in KNIME are never completely removed from the program but are deprecated, so that workflows built with old versions can still be run and produce the same results years later.
  
  KNIME
7. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Passing the data between those tools often involves complex scripts for controlling data flow, data transformation, and statistical analysis. Such scripts are not only prone to be platform dependent, they also tend to grow as the experiment progresses and are seldom well documented, a fact that hinders the reproducibility of the experiment. Workflow systems such as KNIME Analytics Platform aim to solve these problems by providing a platform for connecting tools graphically and guaranteeing the same results on different operating systems. As open source software, KNIME allows scientists and programmers to provide their own extensions to the scientific community.
  
  KNIME integration
8. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  SeqAn implements various applications that can be used for different tasks, for example to map reads, apply read error correction, conduct protein searches, run variant detection, and many more. However, analysts are not interested in a single execution of one tool; they design and execute entire pipelines that use different tools for the different tasks in the pipeline. Often they also require some downstream analysis steps, e.g. computing statistics, generating reports, and so on. Hence it was desirable to add SeqAn applications to the KNIME workflow engine, which offers many additional analysis and data mining features.
  
  KNIME SeqAn integration
9. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  The Taverna workbench comes with an integration to myExperiment [13], a website for publishing and sharing scientific workflows. KNIME offers this integration as part of the de.NBI/CIBI plugin [14].
  
  KNIME
10. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  The tools presented above are all used in various areas of the life sciences, but their main task is the orchestration of external tools that exchange files with each other. Natively, KNIME goes a different way by encouraging a deep tool integration that is compatible with KNIME's table format. With this approach data are embedded into table cells allowing for easy tool interoperability without the need for file conversions.
  
  KNIME Galaxy integration
11. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Orchestrating the execution of many command line tools is a task for Galaxy, while an analysis of life science data with subsequent statistical analysis and visualization is best carried out in KNIME or Orange. Orange with its “ad-hoc” execution of nodes caters to scientists doing quick analyses on small amounts of data, while KNIME is built from the ground up for large tables and images. Noteworthy is that none of the mentioned tools provide image processing capabilities as extensive as those of the KNIME Image Processing plugin (KNIP).
  
  KNIME Galaxy integration image processing ImageJ
12. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Compared to other tools, KNIME focuses on a deeper integration of tools and tries to manage the data that flows in the workflow by itself. Tools like Galaxy and Taverna, on the other hand, rather orchestrate command line tools that exchange files. Orange is very similar to KNIME in that it has extensive machine learning capabilities, but focuses more on the analysis of smaller data sets. We conclude that there are workflow tools for a variety of different use cases and that it is the scientist's task to choose the tool that best fits the problem at hand. While there are certainly overlaps, each tool excels at its intended purpose.
  
  KNIME Galaxy integration
13. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  KNIME's network mining extension also has an integration with the open source bioinformatics software platform Cytoscape [9], which can be used to visualize molecular interaction networks and biological pathways. Installing the KNIME Connector plugin in Cytoscape enables users to exchange networks between the two tools.
  
  KNIME Cytoscape integration
14. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  In conclusion, the KNIME Image Processing extensions not only enable scientists to easily mix and match image processing algorithms with tools from other domains (e.g. machine learning) and scripting languages (e.g. R or Python), or to perform a cross-domain analysis using heterogeneous data types (e.g. molecules or sequences), they also open the doors for explorative design of bioimage analysis workflows and their application to process hundreds of thousands of images.
  
  KNIME image processing ImageJ
15. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  To further foster this “write once, run anywhere” framework, several independent projects collaborated closely to create ImageJ-Ops, an extensible Java framework for image processing algorithms. ImageJ-Ops allows image processing algorithms to be used within a wide range of scientific applications, particularly KNIME and ImageJ. Consequently, users need not choose between those applications, but can take advantage of both worlds seamlessly.
  
  KNIME image processing ImageJ
16. Maciej_Motyka 02 Jan 2019
  
  in Public
  
  Most notably, integrating with ImageJ2 and FIJI allows scientists to easily turn ImageJ2 plugins into KNIME nodes, without having to script or program a single line of code.
  
  image processing KNIME ImageJ
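
Annotation 6 notes that old nodes are deprecated rather than removed, so that saved workflows keep running. The following is a minimal Python sketch of that idea, not KNIME's actual implementation: the `NodeSpec` and `NodeRegistry` classes, the node ids, and the rule that deprecated nodes are hidden from the palette for new workflows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class NodeSpec:
    """A registered node implementation (hypothetical structure)."""
    node_id: str
    run: Callable[[List[float]], List[float]]
    deprecated: bool = False


class NodeRegistry:
    """Keeps every node ever registered, including deprecated ones."""

    def __init__(self) -> None:
        self._nodes: Dict[str, NodeSpec] = {}

    def register(self, spec: NodeSpec) -> None:
        self._nodes[spec.node_id] = spec

    def palette(self) -> List[str]:
        # Assumption: only non-deprecated nodes are offered for new workflows.
        return sorted(s.node_id for s in self._nodes.values() if not s.deprecated)

    def resolve(self, node_id: str) -> NodeSpec:
        # Saved workflows can still look up deprecated nodes by id.
        return self._nodes[node_id]


def run_workflow(registry: NodeRegistry, node_ids: List[str],
                 data: List[float]) -> List[float]:
    """Run a linear workflow by applying each referenced node in order."""
    for node_id in node_ids:
        data = registry.resolve(node_id).run(data)
    return data


registry = NodeRegistry()
# The old behavior stays available unchanged, but is marked deprecated.
registry.register(NodeSpec(
    "normalize_v1", lambda xs: [x / max(xs) for x in xs], deprecated=True))
# The newer node uses a different formula and is the one shown to new users.
registry.register(NodeSpec(
    "normalize_v2",
    lambda xs: [(x - min(xs)) / (max(xs) - min(xs)) for x in xs]))

# A workflow saved years ago still references the old node id.
old_workflow = ["normalize_v1"]
result = run_workflow(registry, old_workflow, [1.0, 2.0, 4.0])
```

Because the old implementation is kept as-is instead of being replaced in place, the saved workflow reproduces its original output even though new workflows would be built with `normalize_v2`.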

Tags

KNIME

bioinformatics

deep learning

Python

Cytoscape

open source

Galaxy

SeqAn

ML/AI

image processing

ImageJ

integration

commercial

Annotators

Maciej_Motyka

URL

sciencedirect.com/science/article/pii/S0168165617315651
