cluedin 1.0.0

Today, I'm happy to announce the first release of cluedin, a Python library for working with CluedIn, a data management platform.

The library is available on PyPI and GitHub. It is open source and free to use.

This first release focuses on authentication and the GraphQL API; I plan to add more features in later releases.

Installation

Install the library with pip:

pip install cluedin

Usage

Authentication

To get a CluedIn access token, create a context object and pass it to the cluedin.load_token_into_context(context) function, which stores the token in the context under the access_token key:

import cluedin

context = {
    "protocol": "http",  # optional; falls back to `https` if omitted
    "domain": "cluedin.local",
    "organization": "foobar",
    "user": "admin@foobar.com",
    "password": "Foobar23!"
}

cluedin.load_token_into_context(context)

print(context['access_token'])
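
Hardcoded credentials are fine for a local demo, but for shared scripts it may be safer to read them from the environment. Here is a minimal sketch, assuming environment variable names (CLUEDIN_DOMAIN and so on) that are hypothetical and not defined by the library; only the context keys come from the example above.

```python
import os


def context_from_env(env=None):
    """Build a context dict from environment variables.

    The CLUEDIN_* variable names are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    context = {
        "domain": env["CLUEDIN_DOMAIN"],
        "organization": env["CLUEDIN_ORGANIZATION"],
        "user": env["CLUEDIN_USER"],
        "password": env["CLUEDIN_PASSWORD"],
    }
    # "protocol" is optional: when omitted, the library falls back to https.
    if "CLUEDIN_PROTOCOL" in env:
        context["protocol"] = env["CLUEDIN_PROTOCOL"]
    return context
```

The resulting dictionary can then be passed to cluedin.load_token_into_context(context) exactly like the hand-written one.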

GraphQL

To run a GraphQL request, pass the context object, the query, and its variables to the cluedin.gql.gql(context, query, variables) function:

query = """
  query searchEntities($cursor: PagingCursor, $query: String, $pageSize: Int) {
    search(
      query: $query
      sort: FIELDS
      cursor: $cursor
      pageSize: $pageSize
      sortFields: {field: "id", direction: ASCENDING}
    ) {
      totalResults
      cursor
      entries {
        id
        name
        entityType
        properties
      }
    }
  }
"""

variables = {
    "query": "entityType:/Infrastructure/User",
    "pageSize": 1
}

response = cluedin.gql.gql(context, query, variables)

To iterate over all pages of results, use the cluedin.gql.entries(context, query, variables) generator, which yields entries one by one. The example below collects them into a pandas DataFrame:

import pandas as pd

query = """
  query searchEntities($cursor: PagingCursor, $query: String, $pageSize: Int) {
    search(
      query: $query
      sort: FIELDS
      cursor: $cursor
      pageSize: $pageSize
      sortFields: {field: "id", direction: ASCENDING}
    ) {
      totalResults
      cursor
      entries {
        id
        name
        entityType
        properties
      }
    }
  }
"""

variables = {
    "query": "*",
    "pageSize": 10000
}

# The generator yields one entry (a dict) at a time across all pages.
entries = list(cluedin.gql.entries(context, query, variables))

# Each dict becomes a row; its keys (id, name, entityType, properties) become columns.
df = pd.DataFrame(entries)
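
For intuition, cursor-based paging of this kind can be sketched as a small generator. This is not the library's implementation: fetch_page is a hypothetical callable standing in for a real GraphQL request, and the stop conditions (an empty page or a missing cursor) are assumptions made for the sketch.

```python
def iter_entries(fetch_page, variables):
    """Yield entries page by page, following the cursor each page returns.

    fetch_page is a hypothetical callable: it takes the variables dict and
    returns the `search` part of a response, {"cursor": ..., "entries": [...]}.
    """
    variables = dict(variables)  # copy so the caller's dict is not mutated
    while True:
        page = fetch_page(variables)
        entries = page.get("entries") or []
        if not entries:
            break  # assumption: an empty page means there are no more results
        yield from entries
        cursor = page.get("cursor")
        if not cursor:
            break  # assumption: no cursor means this was the last page
        variables["cursor"] = cursor


# A fake two-page backend standing in for the real API.
PAGES = {
    None: {"cursor": "page-2", "entries": [{"id": "1"}, {"id": "2"}]},
    "page-2": {"cursor": None, "entries": [{"id": "3"}]},
}


def fake_fetch(variables):
    return PAGES[variables.get("cursor")]


ids = [e["id"] for e in iter_entries(fake_fetch, {"query": "*", "pageSize": 2})]
print(ids)  # ['1', '2', '3']
```

Because the generator is lazy, the caller can stop early without fetching the remaining pages.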

I hope you find this library helpful. If you have any questions or suggestions, please let me know.

Release notes

Authentication

cluedin.auth.get_token_response(context) – to get access token responses.
cluedin.load_token_into_context(context) – to load JWT access tokens into context objects.

GraphQL

cluedin.gql.gql(context, query, variables) – to run GraphQL requests.
cluedin.gql.entries(context, query, variables) – a generator to return paged results from GraphQL requests.

URLs

cluedin.urls.get_protocol(context) – returns the protocol from the context (defaults to https if the context does not provide one).
cluedin.urls.get_org_url(context) – returns the organization's (a.k.a. tenant's) URL (like https://foobar.cluedin.local).
cluedin.urls.get_auth_url(context) – returns the authentication URL (like https://foobar.cluedin.local/auth).
cluedin.urls.get_api_url(context) – returns the API URL (like https://foobar.cluedin.local/api/api).
cluedin.urls.get_graphql_url(context) – returns the GraphQL API URL (like https://foobar.cluedin.local/api/api/graphql).
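
The example URLs above suggest how these values are composed from the context. The following is an illustrative reimplementation inferred only from those examples, not the library's source; the function names are deliberately different from the real ones.

```python
def protocol(context):
    # Fall back to https when the context has no "protocol" key.
    return context.get("protocol", "https")


def org_url(context):
    # <protocol>://<organization>.<domain>
    return f'{protocol(context)}://{context["organization"]}.{context["domain"]}'


def auth_url(context):
    return f"{org_url(context)}/auth"


def api_url(context):
    return f"{org_url(context)}/api/api"


def graphql_url(context):
    return f"{api_url(context)}/graphql"


context = {"domain": "cluedin.local", "organization": "foobar"}
print(org_url(context))      # https://foobar.cluedin.local
print(graphql_url(context))  # https://foobar.cluedin.local/api/api/graphql
```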

Utilities

cluedin.utils.load(filename) – to load JSON files into objects.
cluedin.utils.save(obj, filename, sort_keys=True) – to save objects into JSON files.
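
Helpers like these are typically thin wrappers around the standard json module. Here is a rough equivalent using only the standard library; the names save_json and load_json, the indentation, and the encoding are assumptions of this sketch rather than details of the library.

```python
import json
import os
import tempfile


def save_json(obj, filename, sort_keys=True):
    # sort_keys=True gives stable output, which makes saved files easier to diff.
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(obj, f, indent=2, sort_keys=sort_keys)


def load_json(filename):
    with open(filename, encoding="utf-8") as f:
        return json.load(f)


# Round trip: save a context-like dict to a temporary file and read it back.
settings = {"organization": "foobar", "domain": "cluedin.local"}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "context.json")
    save_json(settings, path)
    restored = load_json(path)

print(restored == settings)  # True
```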

PyPI package: https://pypi.org/project/cluedin/1.0.0/