Can I create Viewpoint reports without having to set up Spark, HDFS, and Kafka?



The following explains how to configure Viewpoint to generate reports from the INSIGHT mongo database. Out-of-the-box Viewpoint is configured to generate reports from the REPORTING mongo database. However, there are cases where it is desirable to be able to use the insight database either in place of or in addition to the the reporting database. One such case is where a collection exists in the insight database where the data is already formatted in a way that fits a needed report. Another case is where a cluster administrator does not wish to go through the overhead and complexity of setting up a full spark cluster, HDFS, Kafka cluster. (A spark cluster is required to populate the reporting mongo database)

All that is required is to set up a connection to insight mongo in the Apache Drill instance that Insight is communicating with and then to reference this connection in a viewpoint report. Apache Drill can have multiple database connections so it is possible to allow insight to draw from both the insight and mongo databases at the same time. Some reports can be set up to use insight's database and other can be set up to use the reporting databse.

Adding a connection to the Insight Database to Apache Drill

First determine the host, port, username and password required to connect to the insight mongo database. These are found in the insight configuration file. For example
 

  #############################################
  [root]# vim /opt/insight/etc/config.groovy

  ...
  mongo.username="insight_user"
  mongo.password="changeme!"
  mongo.host="localhost"
  mongo.port=27017

  ##############################################


Now open the drill web UI in a web browser and login in as the drilladmin user. Note that the drilladmin user is a linux user on the box where drill is running. By default Drill runs on port 8047. For example:

http://localhost:8047

Click on the "Storage" tab at the top. A page will appear with the current Storage Plugins. At the bottom will be a text field for "Storage Name". Enter "mongo2". Click the create button next to it. A configuration page will appear with a large text area. In this text area enter the following, substituting <user>, <password>, <host>, and <port> with the values from /opt/insight/etc/config.groovy as explained above.

#########################################
{
  "type": "mongo",
  "connection": "mongodb://<user>:<password>@<host>:<port>/insight",
  "enabled": true
}

##########################################
For example:
##########################################
{
  "type": "mongo",
  "connection": "mongodb://insight_user:changeme!@localhost:27017/insight",
  "enabled": true
}
##########################################
Next test that this new connection worked. Click the "Query" in the menu at the top of the Drill page. Enter the following SQL. Note that this will return 5 entries from the workload_view collection in the insight mongo database.

SELECT * FROM mongo2.insight.workload_view LIMIT 5;

Click Submit. This should display a table showing five rows representing five jobs that have run on the cluster. If the table appears you have set up the connection to the insight mongo database successfully. Remember that the insight mongo connection is called "mongo2". This will be helpful when you create a report.

 

Note that in some environments the password must be URL encoded. In others no such URL encoding is needed. If you have an error connecting try URL encoding the password at a site like  https://www.urlencoder.org/  For example:

Before URL encoding: changeme!

After URL encoding: changeme%21



Creating a reporting in Viewpoint

Log into Viewpoint as an admin. Go to REPORTING > REPORTS and click CREATE REPORT. Fill out a name and click SWITCH TO ADVANCED. A text area will appear that will allow you to enter SQL. You can enter whatever you like but here is some sample SQL that will query for up to a certain number (specified by LIMIT) of jobs in the insight workload_view collection that have completed between a start date and and end date.

Notice  that the fields that are prefixed with dollar signs (e.g. $start_date, $end_date, $order_type, and $limit) get filled in when you run the report based on the settings the the person running the report selects. More on this later.

##########################################
SELECT
    t.user_name AS ´User´,
    t.completion_datetime AS ´Completion Date´
FROM mongo2.insight.workload_view t
WHERE
    t.completion_datetime > TO_TIMESTAMP($start_date/1000) AND
    t.completion_datetime < TO_TIMESTAMP($end_date/1000)
ORDER BY t.completion_datetime $order_type
LIMIT $limit

##########################################

In the Variables Default Values section select the default values for $start_date, $end_date, $order_type, and $limit. For "Interval" select something large, like a Month or a Quarter so that the time range will be large enough that some jobs show up. This will populate $start_date and $end_date. For Limit pick something small, like 10 so that the report doesn't take forever to load. This will affect $limit. Likewise selecting Sort By will populate $order_type in the SQL above.

Notice the TO_TIMESTAMP($start_date/1000). $start_date and $end_date get popululated based on the date interval selected when you run the report. If Viewpoint changes $start_date and $end_date to unix epoch timestamps in MILLISECONDS (e.g. 1570566960000). Drill's TO_TIMESTAMP function changes a unix epoch timestamp in SECONDS to a timestamp object that can be used in "where" clauses with Mongo dates. So we have to divide by 1000 to turn the epoch time in MILLISECONDS to an epoch time in SECONDS.

In the Layout portion of the report editor select table for this report. This can be changed of course but using table initially is the easiest way to verify that the jobs included in the report are the ones you expect.

Click GENERATE PREVIEW. A table should appear showing the results of this query. Click SAVE AND CLOSE.

 

Testing the report


Now test the report. Go to REPORTING > REPORTS. The new report should be there. Click the name of the report and the report should appear. Clicking the gear icon next to the name of the report will allow you to change the time interval, limit, and sorting direction.

Last update:
2020-03-09 20:11
Author:
Nate Seeley
Revision:
1.4
Average rating:0 (0 Votes)

You can comment this FAQ

Chuck Norris has counted to infinity. Twice.