Microsoft SharePoint
Version 0.0.1
Check out the API docs here for more information.
Set up
Source
We offer two different options for making a Sharepoint connection:
Microsoft Sharepoint
We recommend using this method. It uses Kleene's multi-tenanted oAuth application for authentication. To be able to successfully run extracts using this source please check within Azure App Services that there is not a restriction that would prevent access. If you need to have IP whitelisting for compliance reasons please contact Kleene and we will help to set this up.
To set up the source you will just need your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.
Sharepoint (own application)
If you are not able to use an Azure application hosted within a separate directory then you are able to connect using your own oAuth application. To do this you will need to first create the application within Azure.
- Search App Registrations within Azure
- Click New registration (top left under the title)
- Give it a name e.g. sharepoint-microsoft
- Choose Single tenant account type
- For the redirect uri please add these 3 web ones
https://dev.kleene.ai/callback.html
https://app.kleene.ai/callback.html - Add a client secret and keep a note of it somewhere secure
To set up the source you will need to enter your Application (client) ID that can be found on the Overview section of your application within Azure App Registrations, the client secret you created for this application and your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.
Extract
The Sharepoint connector has 13 different reports, allowing you to pull CSV and XLSX file types, either from your private OneDrive folders, or your shared sites (Sharepoint). If you are extracting an XLSX file you can either choose an all sheets report which will ingest data from every sheet in the XLSX file or a single sheet report where the name of the sheet must be specified in the 'Sheet Name' input field.
⬇️ Report | Incremental | ✏️ Description | 📄 Link to API endpoint |
---|---|---|---|
OneDrive CSV File | N/A | Retrieves the specified CSV file from OneDrive | Get OneDrive CSV File |
OneDrive CSV Folder | File Last Updated At | Retrieves all CSV files from the specified folder within OneDrive | Get OneDrive CSV Folder |
OneDrive XLSX File | N/A | Retrieves any sheet with the specified sheet name from the specified XLSX file within OneDrive | Get OneDrive XLSX File |
OneDrive XLSX File All Sheets | N/A | Retrieves all sheets from the specified XLSX file within OneDrive | Get OneDrive XLSX File All Sheets |
OneDrive XLSX Folder | File Last Updated At | Retrieves any sheet with the specified sheet name from any XLSX file from the specified folder within OneDrive | Get OneDrive XLSX Folder |
OneDrive XLSX Folder All Sheets | File Last Updated At | Retrieves all sheets from any XLSX file from the specified folder within OneDrive | Get OneDrive XLSX Folder All Sheets |
Sharepoint CSV File | N/A | Retrieves the specified CSV file from Sharepoint | Get Sharepoint CSV File |
Sharepoint CSV Folder | File Last Updated At | Retrieves all CSV files from the specified folder within Sharepoint | Get Sharepoint CSV Folder |
Sharepoint XLSX File | N/A | Retrieves any sheet with the specified sheet name from the specified XLSX file within Sharepoint | Get Sharepoint XLSX File |
Sharepoint XLSX File All Sheets | N/A | Retrieves all sheets from the specified XLSX file within Sharepoint | Get Sharepoint XLSX File All Sheets |
Sharepoint XLSX Folder | File Last Updated At | Retrieves any sheet with the specified sheet name from any XLSX file from the specified folder within Sharepoint | Get Sharepoint XLSX Folder |
Sharepoint XLSX Folder All Sheets | File Last Updated At | Retrieves all sheets from all XLSX file from the specified folder within Sharepoint | Get Sharepoint XLSX Folder All Sheets |
Sharepoint XLSX Range | N/A | See detailed explanation below | Get Sharepoint XLSX Range |
Sharepoint Lists | LAST_MODIFIED_DATE | Retrieves details about all lists in a Sharepoint site | Get Sharepoint Lists with metadata |
If the 'Headers Included' checkbox is ticked the first row of each file will always be considered the column headers. If it is not ticked the column headers will be numbered.
OneDrive
To pull either an entire folder or a single file, you will first need to know the path.
If your file is within 1 folder, then your path will be <folder_name>/<file_name>
, if your file is not within a folder, then your path will be <file_name>
.
For the example file below (file.csv), the path would be folder/file.csv
Sharepoint
To pull individual files from shared sites (SharePoint), you will need the following details. Some of the details you can get from the URL when you are on your shared sites in your browser. E.g. if this is the URL: https://benminorodata.sharepoint.com/sites/Test
then your base URL is benminorodata.sharepoint.com
and your site is Test
.
- Base URL - As above
- Site - As above
- Base Folder - This will be the base folder, in which the file sits, which is usually visible from the left-hand side menu when you are on the site home page. See below. e.g.
Documents
, ordocs
- Path - This would be the path to the file from the base folder. E.g.
Book.xlsx
orfolder_name/file.csv
Ingesting an entire folder
If you are looking to ingest an entire folder, you can just add the path to that folder instead of the file name. When this option is selected, the extract will look for all xlsx or csv files within that folder and will ingest them.
If you file sites within your base folder, and within another folder called January
, then the file name is data.csv
, then your path would be January/data.csv
Following on from the example given above for SharePoint, consider the same SharePoint space:
https://benminodata.sharepoint.com/sites/Test
Where the base folder will Documents
and the path as Example/test_files
.
Below is example for the test_files
folder containing .xlsx files, as well as how to connect to just one .xlsx file within that same folder. Be sure to select the Report
as either SharePoint XLSX Folder
or SharePoint XLSX File
depending on your use case.
The case is nearly identical when working with .csv files, where instead the Report
in the extract is selected as either SharePoint CSV folder
or SharePoint CSV File
.
Range report
For the XLSX Range report, you can provide the specific range of cells you are looking to pull from a specified sheet. E.g. for the following range, you will sheet A1:C4 as the range.
You can also provide columns to exclude if you have PII in your worksheets. To do this, simply provide a comma-separated list of columns. e.g. Col1,Col2.
Limitations
- For XLSX files all sheets will either be ingested or the first sheet with the specified sheet name
- If any of your column headers surpasses the maximum character threshold for column headers in your warehouse the column headers will be truncated. This threshold is 255 and 157 for Snowflake and Redshift respectively.
- The order of columns from the Sheet is not persisted in the warehouse.
Updated 8 months ago