Microsoft SharePoint

Version 0.0.1

πŸ“˜

Check out the API docs here for more information.

Set up


Source

We offer two different options for making a Sharepoint connection:

Microsoft Sharepoint

We recommend using this method. It uses Kleene's multi-tenanted oAuth application for authentication. To be able to successfully run extracts using this source please check within Azure App Services that there is not a restriction that would prevent access. If you need to have IP whitelisting for compliance reasons please contact Kleene and we will help to set this up.

To set up the source you will just need your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.

Sharepoint (own application)

If you are not able to use an Azure application hosted within a separate directory then you are able to connect using your own oAuth application. To do this you will need to first create the application within Azure.

  1. Search App Registrations within Azure
  2. Click New registration (top left under the title)
  3. Give it a name e.g. sharepoint-microsoft
  4. Choose Single tenant account type
  5. For the redirect uri please add these 3 web ones
    https://dev.kleene.ai/callback.html
    https://app.kleene.ai/callback.html
  6. Add a client secret and keep a note of it somewhere secure

To set up the source you will need to enter your Application (client) ID that can be found on the Overview section of your application within Azure App Registrations, the client secret you created for this application and your SharePoint username and password. You will also need to make sure this user has access to the files you are looking to ingest.

Extract

The Sharepoint connector has 13 different reports, allowing you to pull CSV and XLSX file types, either from your private OneDrive folders, or your shared sites (Sharepoint). If you are extracting an XLSX file you can either choose an all sheets report which will ingest data from every sheet in the XLSX file or a single sheet report where the name of the sheet must be specified in the 'Sheet Name' input field.

⬇️ ReportIncremental✏️ DescriptionπŸ“„ Link to API endpoint
OneDrive CSV FileN/ARetrieves the specified CSV file from OneDriveGet OneDrive CSV File
OneDrive CSV FolderFile Last Updated AtRetrieves all CSV files from the specified folder within OneDriveGet OneDrive CSV Folder
OneDrive XLSX FileN/ARetrieves any sheet with the specified sheet name from the specified XLSX file within OneDriveGet OneDrive XLSX File
OneDrive XLSX File All SheetsN/ARetrieves all sheets from the specified XLSX file within OneDriveGet OneDrive XLSX File All Sheets
OneDrive XLSX FolderFile Last Updated AtRetrieves any sheet with the specified sheet name from any XLSX file from the specified folder within OneDriveGet OneDrive XLSX Folder
OneDrive XLSX Folder All SheetsFile Last Updated AtRetrieves all sheets from any XLSX file from the specified folder within OneDriveGet OneDrive XLSX Folder All Sheets
Sharepoint CSV FileN/ARetrieves the specified CSV file from SharepointGet Sharepoint CSV File
Sharepoint CSV FolderFile Last Updated AtRetrieves all CSV files from the specified folder within SharepointGet Sharepoint CSV Folder
Sharepoint XLSX FileN/ARetrieves any sheet with the specified sheet name from the specified XLSX file within SharepointGet Sharepoint XLSX File
Sharepoint XLSX File All SheetsN/ARetrieves all sheets from the specified XLSX file within SharepointGet Sharepoint XLSX File All Sheets
Sharepoint XLSX FolderFile Last Updated AtRetrieves any sheet with the specified sheet name from any XLSX file from the specified folder within SharepointGet Sharepoint XLSX Folder
Sharepoint XLSX Folder All SheetsFile Last Updated AtRetrieves all sheets from all XLSX file from the specified folder within SharepointGet Sharepoint XLSX Folder All Sheets
Sharepoint XLSX RangeN/ASee detailed explanation belowGet Sharepoint XLSX Range
Sharepoint ListsLAST_MODIFIED_DATERetrieves details about all lists in a Sharepoint siteGet Sharepoint Lists with metadata

If the 'Headers Included' checkbox is ticked the first row of each file will always be considered the column headers. If it is not ticked the column headers will be numbered.

OneDrive

To pull either an entire folder or a single file, you will first need to know the path.

If your file is within 1 folder, then your path will be <folder_name>/<file_name>, if your file is not within a folder, then your path will be <file_name>.

For the example file below (file.csv), the path would be folder/file.csv

Sharepoint

To pull individual files from shared sites (SharePoint), you will need the following details. Some of the details you can get from the URL when you are on your shared sites in your browser. E.g. if this is the URL: https://benminorodata.sharepoint.com/sites/Test then your base URL is benminorodata.sharepoint.comand your site is Test.

  • Base URL - As above
  • Site - As above
  • Base Folder - This will be the base folder, in which the file sits, which is usually visible from the left-hand side menu when you are on the site home page. See below. e.g. Documents, or docs

  • Path - This would be the path to the file from the base folder. E.g. Book.xlsx or folder_name/file.csv

πŸ“˜

Ingesting an entire folder

If you are looking to ingest an entire folder, you can just add the path to that folder instead of the file name. When this option is selected, the extract will look for all xlsx or csv files within that folder and will ingest them.

If you file sites within your base folder, and within another folder called January, then the file name is data.csv, then your path would be January/data.csv


Following on from the example given above for SharePoint, consider the same SharePoint space:

https://benminodata.sharepoint.com/sites/Test

Where the base folder will Documents and the path as Example/test_files.

Below is example for the test_files folder containing .xlsx files, as well as how to connect to just one .xlsx file within that same folder. Be sure to select the Report as either SharePoint XLSX Folder or SharePoint XLSX File depending on your use case.



The case is nearly identical when working with .csv files, where instead the Report in the extract is selected as either SharePoint CSV folder or SharePoint CSV File.



Range report

For the XLSX Range report, you can provide the specific range of cells you are looking to pull from a specified sheet. E.g. for the following range, you will sheet A1:C4 as the range.

You can also provide columns to exclude if you have PII in your worksheets. To do this, simply provide a comma-separated list of columns. e.g. Col1,Col2.

Limitations


  • For XLSX files all sheets will either be ingested or the first sheet with the specified sheet name
  • If any of your column headers surpasses the maximum character threshold for column headers in your warehouse the column headers will be truncated. This threshold is 255 and 157 for Snowflake and Redshift respectively.
  • The order of columns from the Sheet is not persisted in the warehouse.