
Add OpenTelemetry instrumentation for LiteLLM#88

Open
whoIam0987 wants to merge 3 commits into alibaba:main from whoIam0987:mingzhi/litellm

Conversation


@whoIam0987 whoIam0987 commented Dec 15, 2025

Description

Instrument LiteLLM with the GenAI util.
Fixes # (issue)

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.

  • Modified related unit tests.

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for the style guide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

CLAassistant commented Dec 15, 2025

CLA assistant check
All committers have signed the CLA.

@whoIam0987 whoIam0987 changed the title [WIP] Add OpenTelemetry instrumentation for LiteLLM Add OpenTelemetry instrumentation for LiteLLM Feb 5, 2026
@ralf0131 ralf0131 requested a review from Copilot February 5, 2026 13:50
Copilot AI (Contributor) left a comment

Pull request overview

This pull request adds OpenTelemetry instrumentation for the LiteLLM library, which provides a unified interface to 100+ LLM providers. The instrumentation captures telemetry data for LLM operations including completions, embeddings, streaming, tool calls, and retry mechanisms.

Changes:

  • Added new instrumentation package for LiteLLM with support for sync/async completion and embedding APIs
  • Implemented streaming response wrappers with proper span lifecycle management
  • Added comprehensive test suite covering various LiteLLM features including tool calls, retries, and error handling

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 19 comments.

__init__.py: Main instrumentor class that wraps LiteLLM functions and manages telemetry handlers
_wrapper.py: Completion wrappers for sync/async calls with streaming support
_embedding_wrapper.py: Embedding operation wrappers for sync/async calls
_stream_wrapper.py: Stream response wrappers handling chunk accumulation and finalization
_utils.py: Utility functions for message conversion, provider parsing, and invocation creation
version.py, package.py: Package metadata and dependency declarations
pyproject.toml: Package configuration with build system and dependencies
README.rst: Documentation for installation, configuration, and usage
test_*.py: Comprehensive test suite covering completions, embeddings, streaming, tools, retries, and errors
test-requirements.txt: Test dependency specifications


Comment on lines 64 to 65

# Create invocation object
Copilot AI Feb 5, 2026

Duplicate comment. The comment "Create invocation object" appears twice consecutively on lines 63 and 65. Remove one of these duplicate lines.

Suggested change
# Create invocation object

whoIam0987 (Author)

Suggestion adopted.

Comment on lines 165 to 166

# Create invocation object
Copilot AI Feb 5, 2026

Duplicate comment. The comment "Create invocation object" appears twice consecutively on lines 164 and 166. Remove one of these duplicate lines.

Suggested change
# Create invocation object

whoIam0987 (Author)

Suggestion adopted.

Comment on lines 86 to 87

# For streaming, we need special handling
Copilot AI Feb 5, 2026

Duplicate comment. The comment "For streaming, we need special handling" appears twice consecutively on lines 85 and 87. Remove one of these duplicate lines.

Suggested change
# For streaming, we need special handling

whoIam0987 (Author)

Suggestion adopted.

Comment on lines 360 to 368
stream_wrapper = AsyncStreamWrapper(
    stream=response,
    span=invocation.span,  # For TTFT tracking
    callback=lambda span,
    last_chunk,
    error: self._handle_stream_end_with_handler(
        invocation, last_chunk, error, stream_wrapper
    ),
)
Copilot AI Feb 5, 2026

Potential circular reference in lambda callback. The lambda function references stream_wrapper (line 366) before it's assigned (line 360). While Python closures can handle this due to lazy evaluation, this creates a circular reference that could potentially cause issues with garbage collection. Consider restructuring to avoid the circular reference, perhaps by creating the callback after the AsyncStreamWrapper is instantiated or using a different callback pattern.

whoIam0987 (Author)

Set the callback after instantiating the stream_wrapper.
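That fix can be sketched as follows. `AsyncStreamWrapper` here is a minimal stand-in (illustrative only; the real wrapper also tracks TTFT and accumulates chunks): the wrapper is constructed first and the callback attached afterwards, so the lambda never closes over a name that is still unassigned at construction time.

```python
class AsyncStreamWrapper:
    """Minimal stand-in for the PR's stream wrapper (illustrative only)."""

    def __init__(self, stream, span, callback=None):
        self.stream = stream
        self.span = span
        self.callback = callback  # may be attached after construction

    def finish(self, last_chunk=None, error=None):
        # Called once the stream is exhausted or fails.
        if self.callback is not None:
            self.callback(self.span, last_chunk, error)


events = []
# Construct first...
stream_wrapper = AsyncStreamWrapper(stream=iter(()), span="chat qwen-max")
# ...then attach the callback, which can now safely reference stream_wrapper.
stream_wrapper.callback = lambda span, last_chunk, error: events.append(
    (span, last_chunk, error)
)
stream_wrapper.finish(last_chunk="final-chunk")
print(events)  # → [('chat qwen-max', 'final-chunk', None)]
```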

Comment on lines 33 to 34
* ``ENABLE_LITELLM_INSTRUMENTOR``: Enable/disable instrumentation (default: true)
* ``ARMS_LITELLM_INSTRUMENTATION_ENABLED``: Alternative enable/disable flag (default: true)
Copilot AI Feb 5, 2026

The documentation claims that the ENABLE_LITELLM_INSTRUMENTOR environment variable can be used to enable/disable instrumentation, but this variable is never referenced in the actual code. Only ARMS_LITELLM_INSTRUMENTATION_ENABLED is actually implemented. Either implement support for this environment variable or remove it from the documentation to avoid confusion.

Suggested change
* ``ENABLE_LITELLM_INSTRUMENTOR``: Enable/disable instrumentation (default: true)
* ``ARMS_LITELLM_INSTRUMENTATION_ENABLED``: Alternative enable/disable flag (default: true)
* ``ARMS_LITELLM_INSTRUMENTATION_ENABLED``: Enable/disable instrumentation (default: true)

whoIam0987 (Author)

All occurrences of ARMS_LITELLM_INSTRUMENTATION_ENABLED have been replaced with ENABLE_LITELLM_INSTRUMENTOR.

suppress_token = context.attach(
    context.set_value(SUPPRESS_LLM_SDK_KEY, True)
)
except Exception:
Copilot AI Feb 5, 2026

'except' clause does nothing but pass and there is no explanatory comment.

whoIam0987 (Author)

Added an explanatory comment.

Comment on lines 244 to 245
except Exception:
    pass
Copilot AI Feb 5, 2026

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
    pass
except Exception as decode_error:
    # Ignore JSON parsing errors and fall back to the raw string,
    # but log at debug level for diagnosability.
    logger.debug(
        "Failed to JSON-decode tool call arguments %r: %s",
        arguments,
        decode_error,
    )

whoIam0987 (Author)

Added an explanatory comment.

Comment on lines 305 to 306
except Exception:
    pass
Copilot AI Feb 5, 2026

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
    pass
except Exception as handler_error:
    # Swallow exceptions from telemetry failure reporting, but log them for diagnostics.
    logger.debug(
        "Error while reporting LLM failure in _handle_stream_end_with_handler: %s",
        handler_error,
    )

whoIam0987 (Author)

Suggestion adopted.

suppress_token = context.attach(
    context.set_value(SUPPRESS_LLM_SDK_KEY, True)
)
except Exception:
Copilot AI Feb 5, 2026

'except' clause does nothing but pass and there is no explanatory comment.

whoIam0987 (Author)

Added an explanatory comment.

    context.set_value(SUPPRESS_LLM_SDK_KEY, True)
)
except Exception:
    pass
Copilot AI Feb 5, 2026

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
pass
# Failed to attach suppression context; proceed without suppression.
logger.exception(
    "Failed to attach suppression context for LiteLLM instrumentation"
)

whoIam0987 (Author)

Added an explanatory comment.

@Cirilla-zmh Cirilla-zmh (Collaborator) left a comment

Thanks for the contribution! Please address the remaining comments so we can move this PR forward.

@@ -0,0 +1,8 @@
litellm>=1.0.0
Collaborator

Could you please add these tests as GitHub workflows?

Just refer to:

- Testing
- When adding a new instrumentation, remember to update `tox.ini`: add the appropriate rules to the `envlist`, `command_pre`, and `commands` sections

## Running tests locally
1. Go to your Python Agent repository directory: `git clone git@github.com:alibaba/loongsuite-python-agent.git && cd loongsuite-python-agent`
2. Make sure `tox` is installed: `pip install tox`
3. Run `tox` with no arguments to run the tests for all packages. Read more about [tox](https://tox.readthedocs.io/en/latest/).
Some tests can be slow because of the pre-steps that install dependencies. To help with this, you can run tox once and then run the tests with the dependencies previously installed in the tox dir, as follows:
1. First run (e.g., opentelemetry-instrumentation-aiopg)
```console
tox -e py312-test-instrumentation-aiopg
```
2. Run the tests again without the pre-steps:
```console
.tox/py312-test-instrumentation-aiopg/bin/pytest instrumentation/opentelemetry-instrumentation-aiopg
```

"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
Collaborator

We no longer support Python 3.8 in loongsuite.

pytest
pytest-asyncio
openai
-e aliyun-semantic-conventions
Collaborator

I believe this doesn't work here. Could you check again?

    Test handling when max_tokens is exceeded.
    """

    os.environ["DASHSCOPE_API_KEY"] = os.environ.get(
Collaborator

pytest.vcr can help record LLM-related requests and responses so that you can replay them the next time you run the same tests.

Please check:

# ==================== VCR Configuration ====================
@pytest.fixture(scope="module")
def vcr_config():
    """Configure VCR for recording and replaying HTTP requests"""
    return {
        "filter_headers": [
            ("authorization", "Bearer test_api_key"),
            ("api-key", "test_api_key"),
        ],
        "decode_compressed_response": True,
        "before_record_response": scrub_response_headers,
    }


class LiteralBlockScalar(str):
    """Format string as literal block scalar, preserving whitespace and not interpreting escape characters"""


def literal_block_scalar_presenter(dumper, data):
    """Represent scalar string as literal block using '|' syntax"""
    return dumper.represent_scalar("tag:yaml.org,2002:str", data, style="|")


yaml.add_representer(LiteralBlockScalar, literal_block_scalar_presenter)


def process_string_value(string_value):
    """Format JSON or return long string as LiteralBlockScalar"""
    try:
        json_data = json.loads(string_value)
        return LiteralBlockScalar(json.dumps(json_data, indent=2))
    except (ValueError, TypeError):
        if len(string_value) > 80:
            return LiteralBlockScalar(string_value)
    return string_value


def convert_body_to_literal(data):
    """Search for body strings in data and attempt to format JSON"""
    if isinstance(data, dict):
        for key, value in data.items():
            # Handle response body case (e.g., response.body.string)
            if key == "body" and isinstance(value, dict) and "string" in value:
                value["string"] = process_string_value(value["string"])
            # Handle request body case (e.g., request.body)
            elif key == "body" and isinstance(value, str):
                data[key] = process_string_value(value)
            else:
                convert_body_to_literal(value)
    elif isinstance(data, list):
        for idx, choice in enumerate(data):
            data[idx] = convert_body_to_literal(choice)
    return data


class PrettyPrintJSONBody:
    """Make request and response body recordings more readable"""

    @staticmethod
    def serialize(cassette_dict):
        cassette_dict = convert_body_to_literal(cassette_dict)
        return yaml.dump(
            cassette_dict, default_flow_style=False, allow_unicode=True
        )

    @staticmethod
    def deserialize(cassette_string):
        return yaml.load(cassette_string, Loader=yaml.Loader)


@pytest.fixture(scope="module", autouse=True)
def fixture_vcr(vcr):
    """Register VCR serializer"""
    vcr.register_serializer("yaml", PrettyPrintJSONBody)
    return vcr


def scrub_response_headers(response):
    """
    Scrub sensitive response headers. Note they are case-sensitive!
    """
    # Clean response headers as needed
    if "Set-Cookie" in response["headers"]:
        response["headers"]["Set-Cookie"] = "test_set_cookie"
    return response

@pytest.mark.vcr()
def test_model_call_basic(instrument_no_content, span_exporter, request):
    """Test basic model call"""
    # Initialize agentscope
    agentscope.init(project="test_basic")
    # Create model
    model = DashScopeChatModel(
        api_key=request.config.option.api_key,
        model_name="qwen-max",
    )
    # Prepare messages
    messages = [{"role": "user", "content": "Hello!"}]

    # Call model
    async def call_model():
        response = await model(messages)
        if hasattr(response, "__aiter__"):
            result = []
            async for chunk in response:
                result.append(chunk)
            return result[-1] if result else response
        return response

    response = asyncio.run(call_model())
    assert response is not None
    # Verify spans
    spans = span_exporter.get_finished_spans()
    assert len(spans) >= 1, f"Expected at least 1 span, got {len(spans)}"
    # Find chat model span
    chat_spans = [span for span in spans if span.name.startswith("chat ")]
    assert len(chat_spans) >= 1, (
        f"No chat spans found. Available spans: {[s.name for s in spans]}"
    )
    # Verify span attributes
    chat_span = chat_spans[0]
    _assert_chat_span_attributes(
        chat_span,
        request_model="qwen-max",
        expect_input_messages=False,  # Do not capture content by default
        expect_output_messages=False,  # Do not capture content by default
        expect_time_to_first_token=True,
    )
    print("✓ Model call (basic) completed successfully")


5 participants