BIRT Exchange - “Products and Services for BIRT”, “Get the most out of BIRT”.
BIRT uses JDBC to access databases. To access an ODBC-only database on Windows, you need a JDBC-ODBC bridge driver. The features of the JDBC driver determine what BIRT can do with a database, e.g. which standard and non-standard SQL data types are supported.
The fictitious company of the sample database is “Classic Models Inc.”.
“OWB 11.2 is installed as part of every Oracle Database 11g Release 2 installation”.
OWB is tied to a specific version of Oracle Database, so you can't use OWB 11.2 with an Oracle 12 database. As OWB is discontinued, there will be no newer version for newer Oracle databases ( e.g. Oracle 12,.. ).
“Qlik Sense” is a local server application for Win64 computers. You may operate with the desktop client “Qlik Sense Desktop” - which is a modified browser - or may use a standard Internet browser instead, by calling URLs:
The OpenSource commandline application VisiData - “An interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, into a lightweight utility which can handle millions of rows with ease”.
The OpenSource OpenRefine, GitHub "OpenRefine" ( formerly: “Google Refine” ) - “A powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data”.
The OpenSource library TensorFlow, GitHub "tensorflow" - “TensorFlow is an Open Source Software Library for Machine Intelligence”, “Open source software library for numerical computation using data flow graphs”.
GitHub "google/skflow" - “Simplified interface for TensorFlow (mimicking Scikit Learn)”.
The free dashboard tool “Power BI Desktop for Windows” for Win7, Win8 - “Analytics tools at your fingertips”.
The publishing service “Power BI”, just for users of the commercial “Office 365” service?! - “The easy way to see your important data in one place. With a few clicks, connect to data from applications you use and get started with pre-built dashboards from experts”.
“Microsoft Power BI” is a dashboard tool, not a reporting tool!
It is powered by “Microsoft SQL Server Analysis Services (SSAS)”.
It is now easy to pull data from HTML pages ( e.g. data tables of Wikipedia ).
Data visualisation:
Built-in graphics ( by a graphics engine similar / identical to that known from Microsoft Excel, Microsoft SQL Server Reporting Services and Microsoft SQL Server Analysis Services ).
There is an API by which third parties may offer their own visualisations.
Indeed, with Microsoft Excel 2016 you may create the same dashboards, but you can't publish them on the web.
You may save a “Microsoft Power BI Desktop” project as a single “.pbix” file. Experts told me that saving a project needs about 3 times as much RAM as the loaded data occupies. So you might be able to load big data, but not be able to save it. This might be a problem not only with the 32-bit edition ( on Windows PCs with 2 or 3 GByte of RAM ), but also with the 64-bit edition ( i.e. a project may be saved on a Win64 computer with 16 GB RAM, but not on a Win64 computer with 4 GB RAM ).
You might create test cases ( e.g. at business intelligence trainings ) with “Microsoft Power BI”, as an alternative to NBI.
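The rule of thumb above can be turned into a quick check. This is only a sketch of the arithmetic: the factor 3 is the hearsay value quoted above, not a documented figure, and the function names are hypothetical. It also ignores the RAM used by Windows and other programs.

```python
def required_ram_gb(data_size_gb: float, factor: float = 3.0) -> float:
    """Estimate the RAM needed to save a project.

    factor=3.0 is the rule of thumb reported by experts (an assumption,
    not an official Power BI specification).
    """
    return data_size_gb * factor


def can_save_project(data_size_gb: float, available_ram_gb: float,
                     factor: float = 3.0) -> bool:
    """Return True if the estimated RAM demand fits into the available RAM."""
    return required_ram_gb(data_size_gb, factor) <= available_ram_gb


if __name__ == "__main__":
    # A project holding 4 GB of loaded data on two different machines.
    for ram in (4, 16):
        print(f"{ram} GB RAM: can save = {can_save_project(4.0, ram)}")
```

Under this assumption, 4 GB of loaded data would need about 12 GB for saving, which fits on the 16 GB machine but not on the 4 GB machine.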
“Microsoft Power BI Desktop” also supports the “M” and “DAX” query languages, by the “Advanced Editor” ( “Home / Edit Queries ( = Query Editor ) / Advanced Editor” ).
If you install “Microsoft R” prior to “Microsoft Power BI Desktop”, you may use “R” for data processing and graphical reporting → See Mathematical Engineering.
I was told by experts, in 2016-03:
The now legacy versions of “Microsoft Power BI Desktop” were able to process 10.000 data records for visualisation.
Current versions of “Microsoft Power BI Desktop” are able to process 10.0000 data records for visualisation.
With “R” called by “Microsoft Power BI Desktop”, you may visualize 150.000 data records.
To get a BI report, connect with a client ( Internet browser, smartphone app,.. ) to the “Microsoft Power BI” cloud service. The basic “Power BI” account is free.
This way, you may access any cloud document.
The cloud service may access a registered on-premise Windows server at your company ( or in your company's datacenter ) by a VPN connection, if that computer runs the free “Power BI Gateway - Personal” service.
Experts told me that the free VPN service shipped with Windows has not been under development for some time. Therefore it isn't advisable to use it in production environments.
Experts told me that “Tabular” is like “PowerBI” / “QlikView” / “Tableau”, but without a graphical user interface and graphical reporting. Instead, “Tabular” may be accessed by any tool via the DAX and MDX query languages.
At the TDWI 2018 conference in Munich, Germany, Joshua Görner ( TDWI Konferenz 2018, Sprecherdetails "Joshua Görner" ) was a speaker. I attended his seminar “Beyond Jupyter Notebooks: Build your own Data Science Platform in less than 1 hour”.
Minio - “Private Cloud Storage. Minio is a high performance distributed object storage server, designed for large-scale private cloud infrastructure. Minio is widely deployed across the world with over 105.1M+ docker pulls”.
Joshua Görner suggested using the Python web framework “Flask” to build data science applications. “Flask” makes it easy to provide “Functions as a Service” for data science applications. See Python 10/10 - Web Application Frameworks.
The talk “Mi 9.1: A Hands-on Introduction to Your First Data Science Project” by A. Wider, E. Grasmeder at the conference TDWI Konferenz 2018 by TDWI Germany e.V., 2018-06-25 - 2018-06-27, introduced me to the “TWDE Datalab”.
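A minimal sketch of what such a “Function as a Service” might look like with “Flask”. This is not code from the seminar: the route "/score", the JSON field "values" and the toy score() function ( a plain mean standing in for a real model ) are assumptions made for illustration.

```python
from statistics import mean


def score(values):
    """Toy 'model': the mean of the inputs, a placeholder for a real model."""
    if not values:
        raise ValueError("values must not be empty")
    return mean(float(v) for v in values)


def create_app():
    """Build the Flask app that exposes score() over HTTP."""
    # Imported here so that score() can be reused and tested without Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/score", methods=["POST"])
    def score_endpoint():
        # Expects a JSON body such as {"values": [1, 2, 3]} (hypothetical format).
        payload = request.get_json(silent=True)
        values = payload.get("values", []) if isinstance(payload, dict) else []
        try:
            result = score(values)
        except (TypeError, ValueError) as exc:
            return jsonify(error=str(exc)), 400
        return jsonify(score=result)

    return app


if __name__ == "__main__":
    # Starts Flask's development server on localhost, port 5000.
    create_app().run(port=5000)
```

With the server running, a POST request with the JSON body {"values": [1, 2, 3]} to http://localhost:5000/score should return the computed score as JSON. Keeping the model function separate from the web layer makes it easy to swap in a trained model later.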
The online cloud based predictive analysis service BigML - “Machine Learning for everyone. Easily add data-driven decisions and predictive power to your company”, “NOW FREE. Unlimited tasks ( up to 16 MB/task )”.
A free Jupyter Notebook server; most packages of Anaconda are installed, but it is not 100% compatible ( different versions of Python packages, a few missing packages ).
SAP HANA Cloud Platform Help "Signing Up for a Developer Account" - “The name of your new developer account contains your user ID ( e.g. 'p0123456789' ) and the suffix trial ( e.g. 'p0123456789trial' )”. However, you still have to log in with your SAP ID or e-mail address, not your developer ID.
As of 2016-02, you may install developer editions on SAP HANA Cloud Platform, Microsoft Azure Cloud and Amazon Web Services only. Experts expect that after 2016-05 there will be a downloadable developer edition and / or an entry-level commercial edition of SAP HANA.
“Azure Event Hubs provides highly scalable publish-subscribe event ingestors. An event hub can collect millions of events per second, so that you can process and analyze the massive amounts of data produced by your connected devices and applications”.
“Together, Event Hubs and Stream Analytics provide an end-to-end solution for real-time analytics. Event Hubs lets you feed events into Azure in real-time, and Stream Analytics jobs can process those events in real-time”.
“For example, you can send web clicks, sensor readings, or online log events to Event Hubs. You can then create Stream Analytics jobs to use Event Hubs as the input data streams for real-time filtering, aggregating, and correlation”.
“In a real-world scenario, you could have hundreds of these sensors generating events as a stream. Ideally, a gateway device would run code to push these events to Azure Event Hubs or Azure IoT Hubs. Your Stream Analytics job would ingest these events from Event Hubs and run real-time analytics queries against the streams. Then, you could send the results to one of the supported outputs”
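To make “real-time filtering, aggregating” more concrete, here is a sketch of a tumbling-window average over sensor events in plain Python. It does not use the Azure Event Hubs or Stream Analytics APIs; the event layout ( timestamp, sensor ID, value ), the function name and the 60-second window are assumptions, and the events are processed from an in-memory list instead of a live stream.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds=60, threshold=None):
    """Average sensor values per (window start, sensor ID).

    events: iterable of (timestamp_seconds, sensor_id, value) tuples.
    threshold: if given, events with a value below it are filtered out.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for timestamp, sensor_id, value in events:
        if threshold is not None and value < threshold:
            continue  # filtering step
        # Each event falls into exactly one fixed, non-overlapping window.
        window_start = int(timestamp // window_seconds) * window_seconds
        acc = sums[(window_start, sensor_id)]
        acc[0] += value
        acc[1] += 1
    # Aggregation step: one average per window and sensor.
    return {key: total / count for key, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    sample = [(0, "a", 10.0), (30, "a", 20.0), (61, "a", 30.0), (10, "b", 5.0)]
    for (start, sensor), avg in tumbling_window_average(sample).items():
        print(f"window {start:>3}s  sensor {sensor}  average {avg}")
```

In a real streaming job the windows would be emitted continuously as time advances; the batch version here only shows the grouping logic.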
“Data Ingest” ( German: “etwas ( Nahrung ) zu sich nehmen”, i.e. “to take in ( food )”; “Daten einfügen”, i.e. “to insert data” ) - If you process data by machine learning or by statistical methods, the resulting data is added ( “ingested” ) to the original data. The term “insert” is not used, as it is already used with SQL.
“SQLstream: Blaze enables companies to be data-driven in real time. Real-time data can be discovered, analyzed, and aggregated instantly, and delivered as a continuous ingest into Hadoop, data warehouses, and other enterprise systems”.
Elastic "Learn, Docs" - "Kibana Dashboard" - “A Kibana dashboard displays a set of saved visualizations in groups that you can arrange freely. You can save a dashboard to share or reload at a later time”.
Kibana 4 Tutorial - “A web frontend to analyze data held in an elasticsearch cluster”.
“3.4 The Eliteness Model and BM25. We now re-introduce term frequencies into our model... We suppose that for any document-term pair, there is a hidden property which we refer to as eliteness. This can be interpreted as a form of aboutness: if the term is elite in the document, in some sense the document is about the concept denoted by the term. Now we assume that actual occurrences of the term in the document depend on eliteness, and that there may be an association between eliteness (to the term) and relevance (to the query)”.
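The eliteness argument leads to a term weight that grows with term frequency but saturates. A sketch of the commonly used BM25 scoring formula follows; the function name, the defaults k1 = 1.2 and b = 0.75, and the non-negative IDF variant log( 1 + ( N - n + 0.5 ) / ( n + 0.5 ) ) are assumptions chosen for illustration, not taken from the quoted text.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each document (a list of tokens) against the query terms.

    k1 controls term-frequency saturation, b controls length normalisation.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue  # absent terms contribute nothing
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Longer-than-average documents are penalised via b.
            norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [["cat", "sat", "mat"], ["dog", "barked"], ["cat", "cat", "dog"]]
    print(bm25_scores(["cat"], docs))
```

A document that repeats the query term scores higher than one of the same length that contains it once, but the gain from each further occurrence shrinks.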
EN.Wikipedia "Full text search", DE.Wikipedia "Volltextrecherche" ( translated from German ) - “Full-text search is used to obtain information quickly and to locate it in known as well as unknown documents ( which are nevertheless present on the media ). Full-text search therefore serves to find, discover and extract unknown, non-trivial and important information from large amounts of unstructured texts/files, and is thus also an important subfield of text mining. It is an immediate solution that lets you answer a specific question without systems such as document management and data mining”.
While commercial editions of OpenSource BI software tools start at 10.000 EUR/a ... 50.000 EUR/a for single customers, the costs are totally different if the customer, as an OEM partner, wants to integrate or bundle the commercial edition of the OpenSource BI software tool with their own software solution for end customers.
I was told by experts that some business intelligence software tools are not based on multidimensional databases, but on SQL, NoSQL,...
Speicherguide "Big Data heißt: Loslassen von strukturierten Daten" ( translated from German ) - “Poly-structured data is a new term that covers structured and unstructured data as well as the machine-generated data that has recently been appearing in massive volumes, such as sensor data and web logs”.