Merged
167 changes: 167 additions & 0 deletions docs/user_guides/fs/data_source/creation/crm_sales_analytics.md
@@ -0,0 +1,167 @@
# How-To set up a CRM, Sales & Analytics Data Source

## Introduction

The `CRM, Sales & Analytics` data source lets you connect Hopsworks to supported business applications and marketing platforms.
The following sources are available:

- Facebook Ads
- Freshdesk
- Google Ads
- Google Analytics
- HubSpot
- Pipedrive
- Salesforce
- Shopify

In this guide, you will configure a Data Source in Hopsworks by saving the credentials required by the selected source.

!!! note
    Currently, it is only possible to create data sources in the Hopsworks UI.
    You cannot create a data source programmatically.

## Prerequisites

Before you begin, make sure you have:

- A unique name for the data source in Hopsworks.
- Read credentials for the external system you want to connect.
- Any source-specific identifiers required by that system, such as account, customer, property, or domain identifiers.
- For Google Ads and Google Analytics, a service account JSON keyfile that can be uploaded to the Hopsworks project.
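
Before uploading a keyfile, it can help to confirm locally that it looks like a service account key. The following is a minimal sketch, assuming the usual layout of Google service account JSON keys (`type`, `project_id`, `private_key`, `client_email`). The function name `check_service_account_keyfile` is hypothetical and not part of Hopsworks; the check neither contacts Google nor proves that the key is still valid.

```python
import json

# Fields that Google service account key files normally contain (assumed layout).
# This is a local sanity check only; it does not contact Google or Hopsworks.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_keyfile(text: str) -> list[str]:
    """Return a list of problems found in the keyfile contents (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    # Report every expected field that is absent or empty.
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if not data.get(name)]

    # A key of another credential type (for example a user credential) is flagged.
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")
    return problems
```

To use it, read the keyfile as text and pass the contents to the function; an empty list means none of the checked fields is missing.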

## Creation in the UI

### Step 1: Set up new Data Source

Head to the Data Source View on Hopsworks (1) and set up a new data source (2).

<figure markdown>
![Data Source Creation](../../../../assets/images/guides/fs/data_source/data_source_overview.png)
<figcaption>The Data Source View in the User Interface</figcaption>
</figure>

### Step 2: Select storage and source

Choose `CRM, Sales & Analytics` as the storage type.
Then enter a unique **Name**, an optional **Description**, and select the source you want to configure.

<figure markdown>
![CRM, Sales & Analytics - Facebook Ads](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_facebook_ads.png)
<figcaption>CRM, Sales & Analytics data source selection</figcaption>
</figure>

### Step 3: Enter source-specific credentials

The required fields depend on the selected source.

#### Facebook Ads

Required fields:

- **Access Token**
- **Account Id**

<figure markdown>
![Facebook Ads Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_facebook_ads.png)
<figcaption>Facebook Ads data source form</figcaption>
</figure>

#### Freshdesk

Required fields:

- **API Key**
- **Domain**

<figure markdown>
![Freshdesk Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_freshdesk.png)
<figcaption>Freshdesk data source form</figcaption>
</figure>

#### Google Ads

Required fields:

- **Authentication JSON Keyfile**
- **Developer Token**
- **Customer Id**
- **Impersonated Email**

The JSON keyfile can be selected either from an existing project file or uploaded as a new file.

<figure markdown>
![Google Ads Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_google_ads.png)
<figcaption>Google Ads data source form</figcaption>
</figure>

#### Google Analytics

Required fields:

- **Authentication JSON Keyfile**
- **Property Id**

The JSON keyfile can be selected either from an existing project file or uploaded as a new file.

<figure markdown>
![Google Analytics Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_google_analytics.png)
<figcaption>Google Analytics data source form</figcaption>
</figure>

#### HubSpot

Required fields:

- **API Key**

<figure markdown>
![HubSpot Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_hubspot.png)
<figcaption>HubSpot data source form</figcaption>
</figure>

#### Pipedrive

Required fields:

- **API Key**

<figure markdown>
![Pipedrive Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_pipedrive.png)
<figcaption>Pipedrive data source form</figcaption>
</figure>

#### Salesforce

Required fields:

- **Security Token**
- **Username**
- **Password**

<figure markdown>
![Salesforce Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_salesforce.png)
<figcaption>Salesforce data source form</figcaption>
</figure>

#### Shopify

Required fields:

- **Shop URL**
- **Private App Password**

<figure markdown>
![Shopify Data Source](../../../../assets/images/guides/fs/data_source/crm_sales_analytics_shopify.png)
<figcaption>Shopify data source form</figcaption>
</figure>
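
The required fields listed above for each source can be summarized as a lookup table. The sketch below is only an illustrative checklist helper for preparing credentials before opening the form; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, not a Hopsworks API, and the field labels are copied from the sections above.

```python
# Required form fields per source, as listed in this guide.
REQUIRED_FIELDS = {
    "Facebook Ads": ["Access Token", "Account Id"],
    "Freshdesk": ["API Key", "Domain"],
    "Google Ads": [
        "Authentication JSON Keyfile",
        "Developer Token",
        "Customer Id",
        "Impersonated Email",
    ],
    "Google Analytics": ["Authentication JSON Keyfile", "Property Id"],
    "HubSpot": ["API Key"],
    "Pipedrive": ["API Key"],
    "Salesforce": ["Security Token", "Username", "Password"],
    "Shopify": ["Shop URL", "Private App Password"],
}


def missing_fields(source: str, provided: dict[str, str]) -> list[str]:
    """Return the required fields for `source` that are absent or blank."""
    required = REQUIRED_FIELDS[source]  # raises KeyError for an unknown source
    return [name for name in required if not provided.get(name, "").strip()]
```

For example, calling `missing_fields("Shopify", {"Shop URL": "example.myshopify.com"})` reports that **Private App Password** still needs to be collected.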

### Step 4: Save the credentials

After entering the required fields for the selected source:

1. Click **Save Credentials**.
2. Click **Next: Select resource** to continue configuring the data source for downstream use.

## Next Steps

Move on to the [usage guide for data sources](../usage.md) to see how you can use your newly created data source.
71 changes: 71 additions & 0 deletions docs/user_guides/fs/data_source/creation/rest_api.md
@@ -0,0 +1,71 @@
# How-To set up a REST API Data Source

## Introduction

The `REST API` data source lets you connect Hopsworks to external HTTP APIs.
You can use it to store the base connection details, optional headers, and the authentication method required by the target API.

In this guide, you will configure a REST API Data Source in the Hopsworks UI.

!!! note
    Currently, it is only possible to create data sources in the Hopsworks UI.
    You cannot create a data source programmatically.

## Prerequisites

Before you begin, make sure you have:

- A unique name for the data source in Hopsworks.
- The **Base URL** of the target API.
- Any headers you want to send with requests.
- The authentication details required by the target API.

## Creation in the UI

### Step 1: Set up new Data Source

Head to the Data Source View on Hopsworks (1) and set up a new data source (2).

<figure markdown>
![Data Source Creation](../../../../assets/images/guides/fs/data_source/data_source_overview.png)
<figcaption>The Data Source View in the User Interface</figcaption>
</figure>

### Step 2: Enter REST API settings

Select `REST API` as the storage type.
Then provide the common connection settings shown in the form:

1. **Name:** A unique name for the data source.
2. **Description:** Optional description.
3. **Base URL:** The base endpoint for the external API.
4. **Headers:** Optional header key-value pairs. Use the `+` button to add headers.
5. **Authentication:** Select the authentication mode required by the API.

The following authentication modes are available in the UI:

- `NONE`
- `BEARER_TOKEN`
- `API_KEY`
- `HTTP_BASIC`
- `OAUTH2_CLIENT`

<figure markdown>
![REST API Data Source](../../../../assets/images/guides/fs/data_source/rest_api_creation.png)
<figcaption>REST API data source form</figcaption>
</figure>
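
To decide which authentication mode the target API needs, it helps to know what each mode usually corresponds to at the HTTP level. The sketch below shows the conventional request headers for each mode; it is an illustration under stated assumptions, not a description of how Hopsworks builds requests. The function `auth_headers`, its keyword arguments, and the default `X-API-Key` header name are hypothetical, and `OAUTH2_CLIENT` is assumed to mean the OAuth 2.0 client credentials flow, in which an access token is obtained first and then sent as a bearer token.

```python
import base64


def auth_headers(mode: str, **cred: str) -> dict[str, str]:
    """Return the HTTP headers conventionally used for an authentication mode."""
    if mode == "NONE":
        return {}
    if mode == "BEARER_TOKEN":
        return {"Authorization": f"Bearer {cred['token']}"}
    if mode == "API_KEY":
        # The header name differs between APIs; X-API-Key is an assumed default.
        return {cred.get("header_name", "X-API-Key"): cred["api_key"]}
    if mode == "HTTP_BASIC":
        # Basic auth sends base64("username:password").
        raw = f"{cred['username']}:{cred['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if mode == "OAUTH2_CLIENT":
        # Assumes the access token was already fetched from the token endpoint
        # using the client id and secret.
        return {"Authorization": f"Bearer {cred['access_token']}"}
    raise ValueError(f"unknown authentication mode: {mode}")
```

Comparing these headers with the target API's own documentation is usually enough to pick the matching mode in the form.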

!!! note
    The screenshot shows the form with `NONE` selected.
    When you choose another authentication mode, the form will prompt for the additional credentials required by that method.

### Step 3: Save the credentials

After entering the connection details:

1. Click **Save Credentials**.
2. Click **Next: Select resource** to continue configuring the data source for downstream use.

## Next Steps

Move on to the [usage guide for data sources](../usage.md) to see how you can use your newly created REST API data source.
2 changes: 2 additions & 0 deletions docs/user_guides/fs/data_source/index.md
@@ -36,6 +36,8 @@ Cloud agnostic storage systems:
2. [Snowflake](creation/snowflake.md): Query Snowflake databases and tables using SQL.
3. [Kafka](creation/kafka.md): Read data from a Kafka cluster into a Spark Structured Streaming Dataframe.
4. [HopsFS](creation/hopsfs.md): Easily connect and read from directories of Hopsworks' internal File System.
5. [CRM, Sales & Analytics](creation/crm_sales_analytics.md): Connect to supported CRM, sales, and analytics platforms.
6. [REST API](creation/rest_api.md): Connect to external HTTP APIs with configurable headers and authentication.

## AWS

19 changes: 19 additions & 0 deletions docs/user_guides/fs/data_source/usage.md
@@ -154,6 +154,25 @@ Example for any data warehouse/SQL based external sources, we set the desired SQ
This enables users to create feature groups within Hopsworks without the hassle of data migration.
For more information on the `Connector API`, read the detailed guide about [external feature groups](../feature_group/create_external.md).

## Ingesting Data into a Managed Feature Group

Data Sources can also be used to create a managed feature group and ingest data from the source into Hopsworks.
In this workflow, Hopsworks creates a sink-enabled feature group together with an ingestion job that copies data from the source into the feature group.

This is different from an external feature group:

- An **external feature group** keeps the data in the external source and stores only metadata in Hopsworks.
- A **managed feature group with ingestion enabled** copies the source data into Hopsworks and can keep it synchronized through recurring ingestion jobs.

This workflow is especially useful when you want to:

- Materialize source data inside Hopsworks.
- Schedule recurring ingestions.
- Use full-load or incremental ingestion strategies.
- Build managed feature groups from SQL, CRM, or REST API sources.

For the full workflow, including schema selection, ingestion job configuration, loading strategies, and REST pagination, see [Ingest Data with dltHub][ingest-data-with-dlthub].
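
The difference between the full-load and incremental strategies mentioned above can be sketched conceptually. This is not how Hopsworks or dltHub implements ingestion; `full_load`, `incremental_load`, and the `cursor_field` parameter are hypothetical names, and rows are modeled as plain dictionaries purely for illustration.

```python
def full_load(target: list[dict], source_rows: list[dict]) -> list[dict]:
    """Full load: discard the current target and copy the whole source snapshot."""
    return list(source_rows)


def incremental_load(
    target: list[dict], source_rows: list[dict], cursor_field: str
) -> list[dict]:
    """Incremental load: append only rows newer than the latest one already stored."""
    # The cursor is the highest value of `cursor_field` seen so far (None if empty).
    last_seen = max((row[cursor_field] for row in target), default=None)
    new_rows = [
        row
        for row in source_rows
        if last_seen is None or row[cursor_field] > last_seen
    ]
    return target + new_rows
```

A full load is simpler and always reflects the current source, while an incremental load moves less data on each run but relies on a column, such as an update timestamp, that increases reliably.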

## Writing Training Data

Data Sources are also used while writing training data to external sources.
2 changes: 1 addition & 1 deletion docs/user_guides/fs/feature_group/create.md
@@ -2,7 +2,7 @@
description: Documentation on how to create a Feature Group and the different APIs available to insert data to a Feature Group in Hopsworks.
---

# How to create a Feature Group
# How to create a Feature Group { #create-feature-group }

## Introduction

2 changes: 1 addition & 1 deletion docs/user_guides/fs/feature_group/create_external.md
@@ -2,7 +2,7 @@
description: Documentation on how to create an external feature group in Hopsworks and the different APIs available to interact with them.
---

# How to create an External Feature Group
# How to create an External Feature Group { #create-external-feature-group }

## Introduction

1 change: 1 addition & 0 deletions docs/user_guides/fs/feature_group/index.md
@@ -4,6 +4,7 @@ This section serves to provide guides and examples for the common usage of abstr

- [Create a Feature Group](create.md)
- [Create an external Feature Group](create_external.md)
- [Ingest Data with dltHub](ingest_with_dlthub.md)
- [Deprecating Feature Group](deprecation.md)
- [Data Types and Schema management](data_types.md)
- [Statistics](statistics.md)