2.6 Emitter: Exporting Data to External Systems
Data created by Momentum components is stored on a distributed data lake, which Momentum can access efficiently for processing and analysis. However, some use cases require exporting processed data from Momentum to a third-party external system. The emitter exports data from Momentum to various external systems. To keep Impulse, the Momentum data warehouse, an independent system, it is also treated as an external system. In other words, data from the lake must be exported to Impulse like any other external system.
This section explains how to configure an emitter that can be attached to a data pipeline (see the Setting Up Data Pipeline section) to automate data ingestion, transformation, and export to an external system.
To configure an emitter:
- Expand Emitter in the main menu (left-hand menu panel) and click “Emitter Home”.
- Click “Create New Emitter” in the top menu bar and fill out the config form as explained below.
- Emitter Name: a user defined unique name to identify your emitter.
- Group Name: Used in IoT scenarios to group multiple devices so that their data is collected within the same group. For all other purposes, enter any name.
- Memory per Core: The amount of RAM allocated to each parallel process. 1GB is sufficient for most cases; allocate more for larger datasets.
- Max Core: The number of CPU cores to use for parallelism. 1 core should be enough for small datasets.
- Storage Type: Select the external storage system to which you want to export your data. The form fields differ depending on the type of storage system. In this example, the form fields for the “Impulse” storage system are explained.
- URL: Enter the hostname of the Impulse server. For security, use the local hostname rather than a public DNS name or IP address. Make sure the local hostname or IP address is accessible from the Momentum server.
- Port: The default Impulse port is 18888. Use a different port if your admin has installed Impulse to listen on a different port. Make sure Momentum can access this port.
- Warehouse Alias: The alias of the target warehouse. The alias must already exist in the data warehouse; see Creating a Warehouse to create a warehouse and its alias.
- Table Name: The name of the table to which you want to export data. If the table does not exist, it will be created.
- Username: An authorized username that has at least write permission to the data warehouse.
- Password: The authorized user’s password.
- Primary Partition Column: This must be a datetime column. If there is no datetime column, enter __none__ .
- Partition Column Datetime Format: Specify the datetime format of the primary partition column, for example: yyyy-MM-dd HH:mm:ss. The pattern should follow the Joda-Time format guidelines, https://www.joda.org/joda-time/key_format.html. If your dataset does not have a datetime column and you specified __none__ in the primary partition column field, enter __none__ here as well.
- Partition Granularity: The granularity level of your partitions. Choose one of all, none, second, minute, fifteen_minute, thirty_minute, hour, day, week, month, quarter or year.
- Missing Datetime Placeholder: If there is no datetime column, or the datetime value is null or invalid, the datetime is replaced by the value provided in this field. Leave the default to replace it with the current datetime.
- Save Mode: Select the appropriate mode:
Overwrite: Delete all previous records and create a new set of records.
Append: Create new rows and append them to the existing dataset.
Combine and Overwrite: Combine the new data with the existing data and then overwrite.
Incremental: Update the existing row if a matching primary key is found; otherwise create a new row.
- Partition Overwrite Period Start Date: For the Combine and Overwrite save mode, specify the start date of the partition that you want to overwrite with the new data. Otherwise, leave it blank.
- Partition Overwrite Period End Date: For the Combine and Overwrite save mode, specify the end date of the partition that you want to overwrite with the new data. Otherwise, leave it blank.
- Status: Select active.
- Click Submit to save the emitter config.
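Before submitting, it can help to check the form values against the rules described in the list above. The following is a minimal sketch in Python, not part of Momentum: the `EmitterForm` class, its field names, the `validate` and `can_reach` helpers, and the default granularity of `day` are all hypothetical and exist only for illustration. Only the allowed granularity values, the save mode names, the `__none__` convention, and the default port 18888 come from the field descriptions.

```python
import socket
from dataclasses import dataclass

# Values taken from the field descriptions above.
GRANULARITIES = {
    "all", "none", "second", "minute", "fifteen_minute", "thirty_minute",
    "hour", "day", "week", "month", "quarter", "year",
}
SAVE_MODES = {"Overwrite", "Append", "Combine and Overwrite", "Incremental"}
NONE = "__none__"


@dataclass
class EmitterForm:
    """Hypothetical in-memory copy of the emitter form (not a Momentum API)."""
    emitter_name: str
    url: str
    port: int = 18888                      # default Impulse port
    primary_partition_column: str = NONE
    partition_datetime_format: str = NONE
    partition_granularity: str = "day"     # assumed default, for illustration
    save_mode: str = "Append"
    overwrite_start_date: str = ""
    overwrite_end_date: str = ""


def validate(form: EmitterForm) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not form.emitter_name.strip():
        problems.append("Emitter Name is required")
    if not 1 <= form.port <= 65535:
        problems.append("Port must be between 1 and 65535")
    if form.partition_granularity not in GRANULARITIES:
        problems.append(f"Unknown granularity: {form.partition_granularity}")
    if form.save_mode not in SAVE_MODES:
        problems.append(f"Unknown save mode: {form.save_mode}")
    # No datetime column means the format field must also be __none__.
    if (form.primary_partition_column == NONE
            and form.partition_datetime_format != NONE):
        problems.append("Datetime format must be __none__ without a partition column")
    # Overwrite period dates only apply to Combine and Overwrite.
    has_dates = bool(form.overwrite_start_date or form.overwrite_end_date)
    if has_dates and form.save_mode != "Combine and Overwrite":
        problems.append("Overwrite period dates apply only to Combine and Overwrite")
    return problems


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Check that a TCP connection to host:port can be opened from this machine."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_reach` on the Momentum server with the values intended for the URL and Port fields gives a quick indication of whether the Impulse host is reachable; it checks TCP connectivity only, not credentials or the warehouse alias.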
An emitter does not run independently. It always runs as part of a pipeline.
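The difference between the Overwrite, Append and Incremental save modes can be illustrated on a small in-memory dataset. This is only a conceptual sketch in Python and says nothing about how the emitter or Impulse implements these modes; the function names, the `id` primary key column, and the sample rows are assumptions made for the example. Combine and Overwrite is left out because its result depends on the partition overwrite period.

```python
def overwrite(existing, incoming):
    """Overwrite: previous records are dropped; only the new records remain."""
    return list(incoming)


def append(existing, incoming):
    """Append: new rows are added after the existing rows, with no key matching."""
    return list(existing) + list(incoming)


def incremental(existing, incoming, key="id"):
    """Incremental: update a row when its primary key matches, else add a new row."""
    merged = {row[key]: row for row in existing}
    for row in incoming:
        merged[row[key]] = row  # replaces a matching key or inserts a new one
    return list(merged.values())


# Hypothetical sample data: 'id' plays the role of the primary key.
existing = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
incoming = [{"id": 2, "qty": 9}, {"id": 3, "qty": 1}]

print(len(overwrite(existing, incoming)))    # 2 rows: only the incoming data
print(len(append(existing, incoming)))       # 4 rows: id 2 appears twice
print(len(incremental(existing, incoming)))  # 3 rows: id 2 updated, id 3 added
```

In this sketch, Append can leave duplicate keys in the target table, whereas Incremental keeps one row per key, which is why Incremental requires a primary key to match on.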
Editing An Existing Emitter
- Expand the Emitter menu in the left-hand menu panel and click “Emitter Home”.
- On the emitter home page, select the checkbox of the emitter you want to edit.
- Click “Edit” at the top of the emitter home page.
- Edit the form fields as necessary and click Submit to save the changes.