Microsoft developer toolkit for Python: Fabric/Power BI, MS Graph (Entra), and SharePoint.
Until 2026-06-17 this package was developed in a private repository. The git history was squashed when the project was opened to the public, so commits before that date are not preserved.
pip install msdev-kitOr install from GitHub:
pip install git+https://github.com/Bernardo-Rufino/msdev-kit.gitFor local development:
git clone https://github.com/Bernardo-Rufino/msdev-kit.git
cd msdev-kit
pip install -e .Requirements: Python >= 3.10, an Azure app registration with a client ID and client secret.
from msdev_kit import Auth
from msdev_kit.fabric import Workspace
from msdev_kit.graph import GraphClient
from msdev_kit.sharepoint import SharePointClient
# authenticate
auth = Auth(tenant_id="...", client_id="...", client_secret="...")
# fabric: list workspaces
ws = Workspace(auth.get_token('fabric'))
workspaces = ws.list_workspaces_for_user()
# graph: look up a user
graph = GraphClient(auth)
user_id = graph.get_user_id('user@company.com')
# sharepoint: download a file
sp = SharePointClient(auth, sp_hostname='company', sp_site_path='sites/DataTeam')
sp.download_file('/Reports/monthly.xlsx', local_dir='./downloads')All classes use a shared Auth object. You can use different service principals for different services — instantiate one Auth per SPN:
from msdev_kit import Auth
# service principal auth
fabric_auth = Auth(tenant_id="...", client_id="spn-a", client_secret="...")
graph_auth = Auth(tenant_id="...", client_id="spn-b", client_secret="...")| Service | Scope | Usage |
|---|---|---|
pbi (default) |
Power BI API | auth.get_token() or auth.get_token('pbi') |
fabric |
Fabric API | auth.get_token('fabric') |
graph |
MS Graph API | auth.get_token('graph') |
azure |
Azure Management API | auth.get_token('azure') |
For scenarios requiring user context (e.g., RLS-enabled datasets):
token = auth.get_token_for_user('pbi') # opens browser for login
token = auth.get_token_for_user('fabric')Set up credentials via environment variables or a .env file:
TENANT_ID='<YOUR_TENANT_ID>'
CLIENT_ID='<YOUR_CLIENT_ID>'
CLIENT_SECRET='<YOUR_CLIENT_SECRET>'from msdev_kit import Auth
from msdev_kit.fabric import Workspace, Dataset, Report, Dataflow, Pipeline
auth = Auth(tenant_id, client_id, client_secret)Fabric classes take a token string — call auth.get_token('fabric') or auth.get_token('pbi') depending on the API.
- Workspace — workspaces, users, permissions
- Dataset — semantic models, DAX queries, permissions
- Report — metadata, definitions, visuals, measures
- Dataflow — Gen1, Gen2, Gen2 CI/CD management
- Pipeline — Data Pipeline management
- Other modules — Capacity, Admin, KQL, Notebook, Database
Manage Power BI workspaces, users, and permissions.
ws = Workspace(auth.get_token('pbi'))
workspaces = ws.list_workspaces_for_user()
ws.add_user('user@company.com', workspace_id, 'Member', 'User')| Method | Description |
|---|---|
list_workspaces_for_user(...) |
List all workspaces the user has access to, with optional filters. |
get_workspace_details(workspace_id) |
Get details for a specific workspace. |
list_users(workspace_id) |
List all users in a workspace. |
list_reports(workspace_id) |
List all reports in a workspace. |
add_user(user_principal_name, workspace_id, access_right, user_type) |
Add a user or service principal to a workspace. |
update_user(user_principal_name, workspace_id, access_right) |
Update a user's role on a workspace. |
remove_user(user_principal_name, workspace_id) |
Remove a user from a workspace. |
batch_update_user(user, workspaces_list) |
Batch update a user across multiple workspaces. |
Manage datasets (semantic models), permissions, and execute DAX queries.
ds = Dataset(auth.get_token('pbi'))
result = ds.execute_query(workspace_id, dataset_id, "EVALUATE Sales")| Method | Description |
|---|---|
list_datasets(workspace_id) |
List all datasets in a workspace. |
get_dataset_details(workspace_id, dataset_id) |
Get details of a specific dataset. |
get_dataset_name(workspace_id, dataset_id) |
Resolve the display name of a dataset. Tries PBI API first, falls back to Fabric semantic models API. |
execute_query(workspace_id, dataset_id, query) |
Execute a DAX query. Runs a COUNTROWS pre-check to detect truncation. |
list_users(workspace_id, dataset_id) |
List users with access to a dataset. |
add_user(user_principal_name, workspace_id, dataset_id, access_right) |
Grant a user access to a dataset. |
update_user(user_principal_name, workspace_id, dataset_id, access_right) |
Update a user's access to a dataset. |
remove_user(user_principal_name, workspace_id, dataset_id) |
Remove a user's access to a dataset. |
list_dataset_related_reports(workspace_id, dataset_id) |
List all reports linked to a dataset. |
export_dataset_related_reports(workspace_id, dataset_id) |
Export all reports linked to a dataset as .pbix files. |
Retrieve report metadata, definitions, visuals, and report-level measures.
rpt = Report(auth.get_token('pbi'))
pages = rpt.list_report_pages(workspace_id, report_id)| Method | Description |
|---|---|
list_reports(workspace_id) |
List all reports in a workspace. |
get_report_metadata(workspace_id, report_id) |
Get metadata for a specific report. |
get_report_name(workspace_id, report_id) |
Get a report's display name. |
list_report_pages(workspace_id, report_id) |
List all pages in a report. |
get_legacy_report_pages_and_visuals(json_data, workspace_id, report_id) |
Parse a PBIR-Legacy report JSON and extract pages/visuals into a DataFrame. |
get_legacy_report_json(workspace_id, report_id, operations) |
Get and decode the full report definition for PBIR-Legacy reports. |
export_report(workspace_id, report_id, ...) |
Export a report as a .pbix file. |
get_report_measures(workspace_id, report_id, operations) |
Extract report-level measures and generate a DAX Query View script. |
rebind_report(workspace_id, report_id, new_dataset_id, ...) |
Rebind a report to a new dataset and migrate Read access. |
Manage Power BI and Fabric dataflows, including Gen1, Gen2, and Gen2 CI/CD.
df = Dataflow(auth.get_token('fabric'))
# upgrade Gen1 to Gen2 CI/CD
result = df.upgrade_to_gen2_cicd(
workspace_id='<workspace_id>',
dataflow_id='<gen1_dataflow_id>',
display_name='my_dataflow_cicd',
source_type='gen1'
)| Method | Description |
|---|---|
list_dataflows(workspace_id) |
List all dataflows (Gen1, Gen2, Gen2 CI/CD), merged and deduplicated. |
get_dataflow_details(workspace_id, dataflow_id) |
Get details of a specific dataflow. |
get_dataflow_name(workspace_id, dataflow_id) |
Resolve the display name of a dataflow. |
create_dataflow(workspace_id, dataflow_content) |
Create a new Power BI dataflow. |
delete_dataflow(workspace_id, dataflow_id, type='pbi') |
Delete a dataflow. Use type='fabric' for Fabric API. |
export_dataflow_json(workspace_id, dataflow_id, dataflow_name) |
Export a dataflow definition as JSON. |
get_dataflow_gen2_definition(workspace_id, dataflow_id) |
Get the definition of a Dataflow Gen2 CI/CD item. |
create_dataflow_gen2_from_definition(workspace_id, display_name, definition) |
Create a Dataflow Gen2 CI/CD from a definition. |
update_dataflow_gen2_from_definition(workspace_id, dataflow_id, display_name, definition) |
Update an existing Dataflow Gen2 CI/CD definition. |
get_data_destinations(workspace_id, dataflow_id) |
Get data destination details for each table in a dataflow. |
change_data_destination(workspace_id, dataflow_id, destination_type, ...) |
Change data destination (Lakehouse/Warehouse). Modes: preview, replace, create. |
create_dataflow_with_new_destination(workspace_id, dataflow_id, ...) |
Create a new Gen2 CI/CD dataflow with a different data destination. |
upgrade_to_gen2_cicd(...) |
Upgrade a Gen1 or Gen2 (standard) dataflow to Gen2 CI/CD. |
Manage Fabric Data Pipelines.
pipe = Pipeline(auth.get_token('fabric'))
activities = pipe.get_pipeline_activities(workspace_id, 'My Pipeline')| Method | Description |
|---|---|
list_pipelines(workspace_id) |
List all Fabric Data Pipelines in a workspace. |
get_pipeline(workspace_id, pipeline_id) |
Get the metadata of a specific pipeline. |
get_pipeline_definition(workspace_id, pipeline_id) |
Get the full definition of a pipeline. |
update_pipeline_definition(workspace_id, pipeline_id, definition) |
Update an existing pipeline definition. |
get_pipeline_activities(workspace_id, pipeline_id_or_name) |
Get activities from a pipeline. Accepts ID or display name. |
find_pipelines_by_dataflow(workspace_id, dataflow_id_or_name) |
Find pipelines that reference a specific dataflow. |
replace_dataflow_id_in_pipeline(workspace_id, pipeline_id, old_id, new_id) |
Replace a dataflow ID in all RefreshDataflow activities. |
Example: replacing a dataflow destination and updating pipelines
When change_data_destination(mode='replace') is used on a standard Gen2 dataflow, the original is deleted and a new CI/CD dataflow is created with a new ID. Pipelines referencing the old ID must be updated:
from msdev_kit import Auth
from msdev_kit.fabric import Dataflow, Pipeline
auth = Auth(tenant_id, client_id, client_secret)
dataflow = Dataflow(auth.get_token('pbi'))
pipeline = Pipeline(auth.get_token('fabric'))
workspace_id = '<workspace_id>'
old_dataflow_id = '<dataflow_id>'
# 1. replace dataflow destination (creates new CI/CD, deletes original)
result = dataflow.change_data_destination(
workspace_id=workspace_id,
dataflow_id=old_dataflow_id,
destination_type='Warehouse',
destination_workspace_id=workspace_id,
destination_item_id='<warehouse_id>',
mode='replace'
)
new_dataflow_id = result['content']['id']
# 2. find pipelines referencing the old dataflow ID
matches = pipeline.find_pipelines_by_dataflow(workspace_id, old_dataflow_id)
# 3. update each pipeline to use the new ID
for m in matches['content']:
pipeline.replace_dataflow_id_in_pipeline(
workspace_id, m['pipeline_id'], old_dataflow_id, new_dataflow_id
)| Module | Class | Description |
|---|---|---|
| Capacity | Capacity |
Monitor and manage Power BI and Fabric capacities. |
| Operations | Operations |
Track long-running Fabric API operations. |
| Admin | Admin |
Power BI Admin API operations. |
| KQL | KQLDatabase |
Query Kusto (KQL) databases in Microsoft Fabric. |
| Notebook | Notebook |
Manage Fabric notebooks (list, get metadata). |
| Database | Database |
Query and write to SQL databases (Lakehouse, Warehouse) via ODBC. |
Manage Entra ID (Azure AD) users and groups via the MS Graph API.
from msdev_kit import Auth
from msdev_kit.graph import GraphClient
auth = Auth(tenant_id, client_id, client_secret)
graph = GraphClient(auth)
# look up a user and add them to a group
user_id = graph.get_user_id('user@company.com')
group_id = graph.get_group_id('Data Team')
graph.add_group_member(group_id, user_id)
# list all members of a group
members = graph.list_group_members(group_id)| Method | Description |
|---|---|
get_user_id(email) |
Resolve user object ID by UPN/email, with mail fallback. |
get_group_id(group_name) |
Resolve Entra group object ID by display name. |
list_group_members(group_id) |
Paginated member list (id, displayName, mail, UPN). |
add_group_member(group_id, user_id) |
Add user to group. Silently ignores already-member errors. |
remove_group_member(group_id, user_id) |
Remove user from group. Silently ignores 404/403. |
Manage SharePoint files and folders via MS Graph API (no ACS/Office365 dependency).
from msdev_kit import Auth
from msdev_kit.sharepoint import SharePointClient
auth = Auth(tenant_id, client_id, client_secret)
sp = SharePointClient(auth, sp_hostname='company', sp_site_path='sites/DataTeam')
# download a file
sp.download_file('/Reports/monthly.xlsx', local_dir='./downloads')
# upload a file (from path or bytes)
sp.upload_file('/Reports/updated.xlsx', source='./local/updated.xlsx')
sp.upload_file('/Reports/data.csv', source=csv_bytes, content_type='text/csv')
# create nested folders
sp.create_folder('/Reports/2026/Q1')| Method | Description |
|---|---|
download_file(file_path, local_dir) |
Download a file from the default document library. Returns local file path. |
upload_file(remote_path, source, content_type?) |
Upload/overwrite a file. source is a local file path (str) or raw bytes. |
create_folder(folder_path) |
Create a folder and all intermediate folders. |
Hostname and site path inputs are normalized automatically:
| Input | Normalized to |
|---|---|
company |
company.sharepoint.com |
company.sharepoint.com |
company.sharepoint.com |
https://company.sharepoint.com |
company.sharepoint.com |
DataTeam |
sites/DataTeam |
sites/DataTeam |
sites/DataTeam |
/sites/DataTeam |
sites/DataTeam |
- The Power BI REST API has a 200 requests per hour rate limit.
- Not all users can be updated via the API. See Microsoft docs: Dataset permissions.
- Dataset query limits (executeQueries API):
- Max 100,000 rows or 1,000,000 values (rows x columns) per query, whichever is hit first.
- Max 15 MB of data per query.
- 120 query requests per minute per user.
- Only DAX queries are supported (no MDX, INFO functions, or DMV).
- Datasets hosted in Azure Analysis Services or with a live connection to on-premises AAS are not supported.
- Service Principals are not supported for datasets with RLS or SSO enabled.
End-to-end scripts for the most common scenarios live in examples/.
They share a small _setup.py that builds Auth and the service clients from
environment variables (or ./utils/.env). Run any of them from the repo root:
python -m examples.workspaces
python -m examples.dataflowsExternal contributions are welcome via pull requests. Please:
- Open an issue first for non-trivial changes so the design can be discussed.
- Use
feature/<name>/fix/<name>branches. Direct pushes tomainare blocked by repository rulesets. - Keep PRs focused. Add or update tests under
tests/for any behavior change. - Run
pytestlocally before opening a PR; theCollaborationworkflow runs the same suite on every PR and must pass before merging.
PyPI releases are published automatically from main by the Publish to PyPI
workflow, gated by a manual approval on the pypi environment.