TIBCO Scribe® Online Connector For Apache HBase

The TIBCO Scribe® Online Connector for Apache HBase is based on the CData driver for Apache HBase. It allows you to integrate Apache HBase with CRM, accounting, eCommerce, and marketing systems. If you use Apache HBase to store data for an application you have developed, you can integrate that application in TIBCO Scribe® Online without building a custom Connector. This Connector is based on the Scribe.Connector.AdoNet library and the CData ApacheHBase ADO.NET provider.

Possible use cases include: 

Connector Specifications

                          Supported

Agent Types
  On Premise              X
  Cloud                   X

Replication Services
  Source
  Target

Integration Services
  Source                  X
  Target                  X

Migration Services
  Source
  Target

Maps
  Integration             X
  Request-Reply           X
  Message

This Connector is available from the TIBCO Scribe® Online Marketplace. See Marketplace TIBCO Scribe® Certified Connectors for more information.

Supported Entities

Apache HBase database tables and views are exposed as entities.

Special Operations

Supports the Execute Block to execute stored procedures, where each stored procedure is represented as an entity and each input parameter is represented as a field within that entity. See Execute Block and the CData Apache HBase documentation for additional information.

Setup Considerations

Requires Apache HBase 2017 or higher.

Selecting An Agent Type For Apache HBase

Refer to TIBCO Scribe® Online Agents for information on available Agent types and how to select the best Agent for your Solution.

Connecting To Apache HBase

Note: Best practice is to create Connections with credentials that limit permissions in the target system, following the principle of least privilege. Using Administrator level credentials in a Connection provides Administrator level access to the target system for TIBCO Scribe® Online users. Depending on the entities supported, a TIBCO Scribe® Online user could alter user accounts in the target system.

  1. Select More > Connections from the menu.
  2. From the Connections page select Add to open the Add a New Connection dialog.
  3. Select the Connector from the drop-down list in the Connection Type field, and then enter the following information for this Connection:
    • Name — This can be any meaningful name, up to 25 characters.
    • Alias — An alias for this Connection name. The alias is generated from the Connection name, and can be up to 25 characters. The Connection alias can include letters, numbers, and underscores. Spaces and special characters are not accepted. You can change the alias. For more information, see Connection Alias.
    • Server — Enter the IPv4 address or URL of the Apache HBase server instance.
    • Port — Optional. Enter the port number for the Apache HBase server. If not set, the default value of 5432 is used. Must be in the 1023 to 65535 range.
    • AuthScheme — Enter one of the following authentication methods: 
      • NONE — No authentication.
      • BASIC — Basic authentication.
      • NEGOTIATE — Use Kerberos authentication.

      See the Auth Scheme section in the CData documentation for Apache HBase for more information.

    • User — Name of the database user with access to connect to Apache HBase.
    • Password — Password for the database user. Not required if AuthScheme is set to NONE.
    • Additional Parameters — Optional field where you can specify one or more connection string parameters. See the Connection String Options section of the CData documentation for a list of parameters that can be used and their default values. Note that in some cases the CData Apache HBase ADO.NET Provider does not fully support all of the possible parameters.

      Syntax for the Additional Parameters field is as follows:

      • All blank characters, except those within a value or within quotation marks, are ignored
      • Preceding and trailing spaces are ignored unless enclosed in single or double quotes, such as Keyword=" value"
      • Semicolons (;) within a value must be delimited by quotation marks
      • Use a single quote (') if the value begins with a double quote (")
      • Use a double quote (") if the value begins with a single quote (')
      • Parameters are case-insensitive
      • If a KEYWORD=VALUE pair occurs more than once in the connection string, the value associated with the last occurrence is used
      • If a keyword contains an equal sign (=), it must be preceded by an additional equal sign to indicate that the equal sign is part of the keyword
      • Parameters that are handled by other fields or default settings in the Connection dialog are ignored if used in the Additional Parameters field, including: 
        • Server
        • Port
        • AuthScheme
        • User
        • Password
        • Logfile — This parameter is not visible in the Connection dialog, but is set by the Connector. The default size is a maximum of 10MB. Any CData log files generated by this setting are stored in the default TIBCO Scribe® Online Agent Logs directory, C:\Program Files (x86)\Scribe Software\TIBCO Scribe® Online Agent\logs\. The format for CData log file names is as follows: <ConnectorName><GUID of the Connection><DateTimeStamp>.log

          Note: For information on setting log file verbosity, see Verbosity in the CData Help.

        • MaxLogFileSize — This parameter is set by the Connector to a maximum of 10MB.
        • Other
        • RTK
  4. Select Test to ensure that the Agent can connect to your database. Be sure to test the Connection against all Agents that use this Connection. See Testing Connections.
  5. Select OK to save the Connection.
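
The syntax rules for the Additional Parameters field can be sketched in a few lines of Python. This is only an illustration of the rules listed in step 3, not code used by the Connector: the helper names `quote_value` and `build_additional_parameters` are hypothetical, and the sketch assumes that values do not mix both quote characters.

```python
# Keywords already set by the Connection dialog or the Connector; they are
# ignored if they appear in the Additional Parameters field.
RESERVED = {"server", "port", "authscheme", "user", "password",
            "logfile", "maxlogfilesize", "other", "rtk"}


def quote_value(value):
    """Quote a value only when the syntax rules call for it."""
    needs_quotes = (
        ";" in value                      # semicolons must sit inside quotes
        or value != value.strip()         # keep leading or trailing spaces
        or value[:1] in ('"', "'")        # value itself starts with a quote
    )
    if not needs_quotes:
        return value
    # Use the quote character that differs from the one the value starts with.
    quote = "'" if value.startswith('"') else '"'
    return quote + value + quote


def build_additional_parameters(pairs):
    """Build the field text from (keyword, value) pairs (hypothetical helper)."""
    merged = {}
    for keyword, value in pairs:
        name = keyword.strip()
        key = name.lower()                # parameters are case-insensitive
        if key in RESERVED:
            continue                      # handled elsewhere in the dialog
        merged.pop(key, None)             # the last occurrence wins
        # An equal sign inside a keyword is escaped by doubling it.
        merged[key] = (name.replace("=", "=="), value)
    return ";".join(f"{name}={quote_value(value)}" for name, value in merged.values())


print(build_additional_parameters([
    ("TypeDetectionScheme", "RowScan"),
    ("RowScanDepth", "500"),
    ("Server", "ignored-host"),           # reserved, so it is dropped
]))
```

Dropping the reserved keywords before building the string mirrors the rule that those parameters are ignored in this field, so the text you paste into the dialog contains only parameters that can take effect.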

Metadata Notes

Tables and Views from the Apache HBase database are exposed as entities. HBase supports a bytes-in/bytes-out interface using Put and Result. Anything that can be converted to an array of bytes can be stored as a value. Because this database is untyped, the selection of a suitable type of data occurs when the metadata is returned. Metadata parser behavior can be controlled using the following parameters:

Naming

Connection metadata must have unique entity, relationship, and field names. If your Connection metadata has duplicate names, review the source system to determine if the duplicates can be renamed.

TypeDetectionScheme

This parameter controls how the data type of the field is determined. See the Type Detection Scheme Connection String option in the CData documentation. Possible values include:

RowScanDepth

The number of rows to scan to determine columns and their data types for the table. Default value is 10000 rows. See the Row Scan Depth Connection String option in the CData documentation.

WARNING: For large tables, if this value is too low, data types may not be as accurate as they could be.

Metadata Parser Example

Assume there is a table of employees with one column family, labeled docs, with columns Age, Salary, Name and Birthday, and some corresponding values inside.

If TypeDetectionScheme is set to RowScan, the data is interpreted and displayed in TIBCO Scribe® Online as shown in the following table.

Field          Data Type       Value

docsAge        Integer32       56
docsSalary     Integer32       80000
docsName       String(2000)    John
docsBirthday   DateTime        2018-10-09T21:00:00Z
RowKey         String(255)     123
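
To make the idea of row-scan type detection concrete, here is a minimal Python sketch. It is not the CData provider's algorithm: the function names, the candidate types, and the single DateTime format are assumptions chosen for illustration. It samples up to `row_scan_depth` rows of raw byte values and picks, per column, the narrowest type that fits every sampled value.

```python
from datetime import datetime


def detect_type(values):
    """Return a type name that fits every sampled value of one column."""
    def all_parse(parser):
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if not values:
        return "String"
    if all_parse(int):
        return "Integer32"                # range checks omitted in this sketch
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%dT%H:%M:%SZ")):
        return "DateTime"
    return "String"


def scan_columns(rows, row_scan_depth=10000):
    """Sample at most row_scan_depth rows and infer a type per column."""
    samples = {}
    for row in rows[:row_scan_depth]:
        for column, raw in row.items():
            # HBase returns bytes; this sketch assumes UTF-8 encoded text.
            samples.setdefault(column, []).append(raw.decode("utf-8"))
    return {column: detect_type(values) for column, values in samples.items()}


rows = [
    {"docsAge": b"56", "docsName": b"John", "docsBirthday": b"2018-10-09T21:00:00Z"},
    {"docsAge": b"n/a", "docsName": b"Mary", "docsBirthday": b"2019-01-15T08:30:00Z"},
]
print(scan_columns(rows))                     # every row is sampled
print(scan_columns(rows, row_scan_depth=1))   # only the first row is sampled
```

With a depth of 1, the second row's non-numeric age is never seen, so the column is typed from the first row alone. This is the kind of inaccuracy the RowScanDepth warning describes.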

Relationships

Apache HBase Connector As IS Source

Consider the following when using the Apache HBase Connector as an Integration Services source.

Filtering

The CData Provider supports two types of filtering:

Net Change

Net Change is supported for date and timestamp values if metadata discovery is enabled by setting the TypeDetectionScheme Connection String to RowScan.

When a datetime is configured on the Query Block on the Block Properties Net Change Tab to query for new and updated records, that configuration is treated as an additional filter. The Net Change datetime filter is applied as an AND after any other filters specified on the Block Properties Filter Tab. TIBCO Scribe® Online builds a query combining both the Net Change filter and the filters on the Filter tab. See Net Change And Filters for an example.
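
The way the Net Change datetime is appended to the other filters can be pictured with the following Python sketch. It is an illustration only: the helper `build_where`, the field `docsModified`, the `>=` comparison, and the literal format are assumptions, and the actual query text that TIBCO Scribe® Online generates may differ.

```python
from datetime import datetime, timezone


def build_where(filters, net_change_field=None, last_run=None):
    """Join Filter tab conditions, then append the Net Change condition with AND."""
    conditions = list(filters)
    if net_change_field and last_run is not None:
        stamp = last_run.strftime("%Y-%m-%dT%H:%M:%SZ")
        conditions.append(f"{net_change_field} >= '{stamp}'")
    return " AND ".join(f"({condition})" for condition in conditions)


last_run = datetime(2018, 10, 9, 21, 0, 0, tzinfo=timezone.utc)
print(build_where(["docsAge > 30"], "docsModified", last_run))
```

Because the Net Change condition is combined with AND, a record must satisfy both the Filter tab conditions and the datetime condition to be returned.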

Some Connectors for TIBCO Scribe® Online only support one filter. For those Connectors you can use either Net Change or one filter on the Filter tab, not both.

Note: The Net Change date is ignored when previewing data on the Preview tab. Filters on the Block Properties Filters tab are used to filter the data on the Preview tab.

DateTime Properties

Apache HBase stores all data as strings. Data type conversions are handled by the CData provider. When the RowScan option is specified for the TypeDetectionScheme Connection String, additional Connection Strings can be used to modify the behavior of the DateTime conversion.

Note: If a DateTime column contains only a time value, such as 12:32:55, it is treated as today's date with the time value appended.
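
The time-only behavior described in the note can be reproduced with a short Python sketch. The function name and the `HH:MM:SS` format are assumptions for illustration; the provider's own parsing rules may accept other formats.

```python
from datetime import date, datetime


def normalize_time_only(value, today=None):
    """Combine a bare HH:MM:SS value with today's date.

    `today` can be passed in so the result is reproducible in tests.
    """
    today = today or date.today()
    parsed = datetime.strptime(value, "%H:%M:%S").time()
    return datetime.combine(today, parsed)


print(normalize_time_only("12:32:55"))
```

A practical consequence is that the same stored value yields a different DateTime on each day it is read, which matters if such a column is used in a filter or for Net Change.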

Double Properties

The CData Provider uses the local machine’s culture to convert and read double values. Depending on the regional settings, the number formatting may differ. For example, all strings in the following list are read as the same value:

Note: Regardless of how these numbers are represented by the CData Provider, they are stored in the database as strings with different delimiters. This affects filtering when the TypeDetectionScheme Connection String is set to None.
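
The following Python sketch shows why culture matters when reading doubles. It does not use the provider's conversion logic: `parse_double` and the sample strings are hypothetical, and the separators are passed explicitly rather than taken from the machine's regional settings.

```python
def parse_double(text, decimal_sep=".", group_sep=","):
    """Parse a number written with the given decimal and grouping separators."""
    # Remove grouping first, then normalize the decimal separator.
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# The same quantity written under three different regional conventions.
print(parse_double("1,234.5"))
print(parse_double("1.234,5", decimal_sep=",", group_sep="."))
print(parse_double("1 234,5", decimal_sep=",", group_sep=" "))

# As plain strings the values are not equal, so a string comparison
# would not match them against each other.
print("1,234.5" == "1.234,5")
```

All three calls produce the same number, while the raw strings differ. This is why filtering on such a column behaves differently when values are compared as untyped strings.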

Native Query

The Apache HBase Connector supports SQL queries in Native Query Blocks to create customized queries for Apache HBase. The query can be as simple or complex as you need it to be; however, it should return a single result set. The native query text is sent to Apache HBase exactly as it is entered without any modifications.

You can use SELECT, UPDATE, INSERT, and DELETE statements. If support for Enhanced SQL is enabled, you can use Joins and Aggregate functions. For additional details, see the SQL Compliance section of the CData documentation.

After entering the SQL query, you must select Test to validate the query. Invalid queries are not accepted by the Connector. See Native Query Block and Creating Native Queries For Microsoft SQL Server for additional information.

When entering a query for Apache HBase in the Native Query Block, note the following: 

When testing a Native Query in a Map, if the source datastore does not return any data, TIBCO Scribe® Online cannot build the schema for the underlying metadata and the Map cannot be saved. To allow TIBCO Scribe® Online to build the schema, do the following:

  1. Create a single temporary record in the source datastore that matches the Native Query.
  2. Test the Native Query and ensure that it is successful.
  3. Save the Map.
  4. Remove the temporary record from the source datastore.

Apache HBase Connector As IS Target

Consider the following when using the Apache HBase Connector as an Integration Services target.

Update And Insert

Batch Processing

Batch processing is not supported.

TIBCO Scribe® Online API Considerations

To create connections with the TIBCO Scribe® Online API, the Apache HBase Connector requires the following information:

Connector Name

Apache HBase

Connector ID

F96010A2-5783-45D8-A248-38F3DC736B25

TIBCO Scribe® Online Connection Properties

In addition, this Connector uses the Connection properties shown in the following table.

Note: Connection property names are case-sensitive.

Name               Data Type   Required   Secured   Usage

Server             string      Yes        No
Port               string      No         No        Integer
AuthScheme         string      Yes        No        Supported values: NONE, BASIC, NEGOTIATE
User               string      No         No        User can be empty in Apache HBase
Password           string      No         Yes       Password can be empty in Apache HBase
ConnectionString   string      No         No
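
Before sending a request to the TIBCO Scribe® Online API, it can help to validate the Connection properties locally. The Python sketch below checks the property names and values described in this section. The helper name and the key/value list it returns are assumptions made for illustration, not the documented request format, and secured properties such as Password may need additional handling that is not covered here.

```python
CONNECTOR_ID = "F96010A2-5783-45D8-A248-38F3DC736B25"
REQUIRED = ("Server", "AuthScheme")
OPTIONAL = ("Port", "User", "Password", "ConnectionString")
AUTH_SCHEMES = ("NONE", "BASIC", "NEGOTIATE")


def build_connection_properties(values):
    """Validate property names (case-sensitive) and return key/value entries."""
    allowed = REQUIRED + OPTIONAL
    unknown = [name for name in values if name not in allowed]
    if unknown:
        raise ValueError(f"Unknown or wrongly cased property names: {unknown}")
    missing = [name for name in REQUIRED if not values.get(name)]
    if missing:
        raise ValueError(f"Missing required properties: {missing}")
    if values["AuthScheme"] not in AUTH_SCHEMES:
        raise ValueError(f"Unsupported AuthScheme: {values['AuthScheme']}")
    port = values.get("Port")
    if port is not None and not port.isdigit():
        raise ValueError("Port is passed as a string but must contain an integer")
    # Assumed shape: one entry per property; check the API reference before use.
    return [{"key": name, "value": value} for name, value in values.items()]


properties = build_connection_properties({
    "Server": "hbase.example.com",        # hypothetical host
    "Port": "8080",                       # hypothetical port value
    "AuthScheme": "NONE",
})
print(CONNECTOR_ID, properties)
```

Rejecting names such as `server` or `authscheme` up front reflects the note that Connection property names are case-sensitive.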

 

License Agreement

The TIBCO Scribe® Online End User License Agreement for the Apache HBase Connector describes TIBCO's and your legal obligations and requirements. TIBCO suggests that you read the End User License Agreement.

More Information

For additional information on this Connector, refer to the Knowledge Base and Discussions in the TIBCO Community.