Microsoft SharePoint

📘
Check out the API docs here for more information.

Set up

Source

We offer two different options for making a Sharepoint connection:

Microsoft Sharepoint

We recommend using this method. It uses Kleene's multi-tenanted oAuth application for authentication. To be able to successfully run extracts using this source please check within Azure App Services that there is not a restriction that would prevent access. If you need to have IP whitelisting for compliance reasons please contact Kleene and we will help to set this up.

To set up the source you will just need your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.

Sharepoint (own application)

If you are not able to use an Azure application hosted within a separate directory then you are able to connect using your own oAuth application. To do this you will need to first create the application within Azure.

Search App Registrations within Azure
Click New registration (top left under the title)
Give it a name e.g. sharepoint-microsoft
Choose Single tenant account type
For the redirect uri please add these 3 web ones
https://app.dev.kleene.ai/callback.html
https://app.kleene.ai/callback.html
Add a client secret and keep a note of it somewhere secure

To set up the source you will need to enter your Application (client) ID that can be found on the Overview section of your application within Azure App Registrations, the client secret you created for this application and your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.

Extract

The Sharepoint connector has 13 different reports, allowing you to pull CSV and XLSX file types, either from your private OneDrive folders, or your shared sites (Sharepoint). If you are extracting an XLSX file you can either choose an all sheets report which will ingest data from every sheet in the XLSX file or a single sheet report where the name of the sheet must be specified in the 'Sheet Name' input field.

⬇️ Report	Incremental	✏️ Description	📄 Link to API endpoint
OneDrive CSV File	N/A	Retrieves the specified CSV file from OneDrive	Get OneDrive CSV File
OneDrive CSV Folder	File Last Updated At	Retrieves all CSV files from the specified folder within OneDrive	Get OneDrive CSV Folder
OneDrive XLSX File	N/A	Retrieves any sheet with the specified sheet name from the specified XLSX file within OneDrive	Get OneDrive XLSX File
OneDrive XLSX File All Sheets	N/A	Retrieves all sheets from the specified XLSX file within OneDrive	Get OneDrive XLSX File All Sheets
OneDrive XLSX Folder	File Last Updated At	Retrieves any sheet with the specified sheet name from any XLSX file from the specified folder within OneDrive	Get OneDrive XLSX Folder
OneDrive XLSX Folder All Sheets	File Last Updated At	Retrieves all sheets from any XLSX file from the specified folder within OneDrive	Get OneDrive XLSX Folder All Sheets
Sharepoint CSV File	N/A	Retrieves the specified CSV file from Sharepoint	Get Sharepoint CSV File
Sharepoint CSV Folder	File Last Updated At	Retrieves all CSV files from the specified folder within Sharepoint	Get Sharepoint CSV Folder
Sharepoint XLSX File	N/A	Retrieves any sheet with the specified sheet name from the specified XLSX file within Sharepoint	Get Sharepoint XLSX File
Sharepoint XLSX File All Sheets	N/A	Retrieves all sheets from the specified XLSX file within Sharepoint	Get Sharepoint XLSX File All Sheets
Sharepoint XLSX Folder	File Last Updated At	Retrieves any sheet with the specified sheet name from any XLSX file from the specified folder within Sharepoint	Get Sharepoint XLSX Folder
Sharepoint XLSX Folder All Sheets	File Last Updated At	Retrieves all sheets from all XLSX file from the specified folder within Sharepoint	Get Sharepoint XLSX Folder All Sheets
Sharepoint XLSX Range	N/A	See detailed explanation below	Get Sharepoint XLSX Range
Sharepoint Lists	LAST_MODIFIED_DATE	Retrieves details about all lists in a Sharepoint site	Get Sharepoint Lists with metadata

If the 'Headers Included' checkbox is ticked the first row of each file will always be considered the column headers. If it is not ticked the column headers will be numbered.

OneDrive

To pull either an entire folder or a single file, you will first need to know the path.

If your file is within 1 folder, then your path will be <folder_name>/<file_name>, if your file is not within a folder, then your path will be <file_name>.

For the example file below (file.csv), the path would be folder/file.csv

Sharepoint

To pull individual files from shared sites (SharePoint), you will need the following details. Some of the details you can get from the URL when you are on your shared sites in your browser. E.g. if this is the URL: https://benminorodata.sharepoint.com/sites/Test then your base URL is benminorodata.sharepoint.comand your site is Test.

Base URL - As above
Site - As above
Base Folder - This will be the base folder, in which the file sits, which is usually visible from the left-hand side menu when you are on the site home page. See below. e.g. Documents, or docs

Path - This would be the path to the file from the base folder. E.g. Book.xlsx or folder_name/file.csv

📘
Ingesting an entire folder
If you are looking to ingest an entire folder, you can just add the path to that folder instead of the file name. When this option is selected, the extract will look for all xlsx or csv files within that folder and will ingest them.

If you file sites within your base folder, and within another folder called January, then the file name is data.csv, then your path would be January/data.csv

Following on from the example given above for SharePoint, consider the same SharePoint space:

https://benminodata.sharepoint.com/sites/Test

Where the base folder will Documents and the path as Example/test_files.

Below is example for the test_files folder containing .xlsx files, as well as how to connect to just one .xlsx file within that same folder. Be sure to select the Report as either SharePoint XLSX Folder or SharePoint XLSX File depending on your use case.

The case is nearly identical when working with .csv files, where instead the Report in the extract is selected as either SharePoint CSV folder or SharePoint CSV File.

Range report

For the XLSX Range report, you can provide the specific range of cells you are looking to pull from a specified sheet. E.g. for the following range, you will sheet A1:C4 as the range.

You can also provide columns to exclude if you have PII in your worksheets. To do this, simply provide a comma-separated list of columns. e.g. Col1,Col2.

Limitations

For XLSX files all sheets will either be ingested or the first sheet with the specified sheet name
If any of your column headers surpasses the maximum character threshold for column headers in your warehouse the column headers will be truncated. This threshold is 255 and 157 for Snowflake and Redshift respectively.
The order of columns from the Sheet is not persisted in the warehouse.

📘

Set up

Source

Extract

OneDrive

Sharepoint

📘Ingesting an entire folder

Range report

Limitations

📘
Ingesting an entire folder